[Netarchivesuite-users] NAS 7.2 and some questions

Colin Samuel Rosenthal csr at kb.dk
Fri Nov 12 10:28:19 CET 2021


One more question: is this the complete settings file? Settings files usually start with:

<settings>
    <common>
      <environmentName>


WaybackIndexer starts by sending a FileListJob batch job to the ArcRepository application (via JMS). It should be possible to see this job being sent in the WaybackIndexer log and being received in the ArcRepository, BitarchiveMonitor and Bitarchive applications. Finally, the result is returned to WaybackIndexer, and entries for any new files are created in the archivefile table in the database. Can you follow any of this workflow in the application logs?
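
One way to follow the workflow above across several applications is to search each application's log directory for mentions of the job. The following is only a minimal sketch: the directory paths are hypothetical placeholders, and it assumes that the relevant log lines contain the string "FileListJob", which may not match how your installation logs the batch job.

```python
from pathlib import Path


def find_job_mentions(log_dirs, needle="FileListJob"):
    """Return (application, file name, line number, line) for each log line containing needle."""
    hits = []
    for app, log_dir in log_dirs.items():
        directory = Path(log_dir)
        if not directory.is_dir():
            continue  # skip applications whose log directory is not present on this machine
        for log_file in sorted(directory.glob("*.log*")):
            if not log_file.is_file():
                continue
            with log_file.open(encoding="utf-8", errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    if needle in line:
                        hits.append((app, log_file.name, number, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    # Hypothetical log locations: replace them with the real paths of your installation.
    example_dirs = {
        "WaybackIndexer": "/path/to/waybackindexer/log",
        "ArcRepository": "/path/to/arcrepository/log",
        "BitarchiveMonitor": "/path/to/bitarchivemonitor/log",
        "Bitarchive": "/path/to/bitarchive/log",
    }
    for app, name, number, line in find_job_mentions(example_dirs):
        print(f"{app} {name}:{number}: {line}")
```

If the job shows up in the WaybackIndexer log but in none of the others, that would suggest the message is not reaching the ArcRepository side.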


cheers,

Colin


--
Colin Rosenthal PhD
Senior IT Consultant
Royal Danish Library (Aarhus)


________________________________
From: NetarchiveSuite-users <netarchivesuite-users-bounces at ml.sbforge.org> on behalf of Colin Samuel Rosenthal <csr at kb.dk>
Sent: Tuesday, November 2, 2021 1:39 PM
To: 'netarchivesuite-users at ml.sbforge.org'; 'netarchivesuite-users-bounces at ml.sbforge.org'
Cc: García Arratia, Juan Carlos; Monzón, Fernando
Subject: Re: [Netarchivesuite-users] NAS 7.2 and some questions


Can you also send the complete log file and start log from when you start the WaybackIndexer application?


cheers,

Colin


--
Colin Rosenthal PhD
Senior IT Consultant
Royal Danish Library (Aarhus)


________________________________
From: NetarchiveSuite-users <netarchivesuite-users-bounces at ml.sbforge.org> on behalf of Soleto Ruiz de Clavijo, Miguel <miguel.soleto at externos.bne.es>
Sent: Friday, October 29, 2021 12:43 PM
To: 'netarchivesuite-users at ml.sbforge.org'; 'netarchivesuite-users-bounces at ml.sbforge.org'
Cc: García Arratia, Juan Carlos; Monzón, Fernando
Subject: [Netarchivesuite-users] NAS 7.2 and some questions


Hi everybody!

It’s Miguel, from BNE. Sorry for the delay in writing this, but we wanted to test our installation first.



Here is a table with our PRE environment:



MACHINE    APPLICATION                   VERSION         COMMENTS
---------  ----------------------------  --------------  ----------------------------
Server 1   Postgres                      13              8 GB RAM
           Broker                        5.1

Server 2   GUIApplication                7.2             24 GB RAM
           ArcRepositoryApplication
           BitarchiveMonitorApplication
           HarvestJobManagerApplication

Server 3   BitArchive                    7.2             8 GB RAM
                                                         Replica A

Server 4   IndexServerApplication        7.2             8 GB RAM
           ViewerProxyApplication

Server 5   SolrWayback                   4.0.6           24 GB RAM
                                                         Apache-Tomcat 9

Server 6   WaybackIndexerApplication     7.2             8 GB RAM
           AggregatorApplication

Server 7   OpenWayback                   2.4             Apache-Tomcat 9 with 2 nodes
           CWEB                          6

Server 8   Spider                        7.2 - NAS       Channel: HIGHPRIORITY
                                         7.2 - Heritrix

Server 9   Spider                        7.2 - NAS       Channel: HIGHPRIORITY
                                         7.2 - Heritrix

Server 10  Spider                        7.2 - NAS       Channel: HIGHPRIORITY
                                         7.2 - Heritrix

Server 11  Spider                        7.2 - NAS       Channel: HIGHPRIORITY
                                         7.2 - Heritrix

Server 12  Spider                        7.2 - NAS       Channel: HIGHPRIORITY
                                         7.2 - Heritrix

Server 13  Spider                        7.2 - NAS       Channel: "MASIVA"
                                         7.2 - Heritrix

Server 14  Spider                        7.2 - NAS       Channel: "MASIVA"
                                         7.2 - Heritrix

Server 15  Spider                        7.2 - NAS       Channel: "MASIVA"
                                         7.2 - Heritrix

Server 16  Spider                        7.2 - NAS       Channel: "MASIVA"
                                         7.2 - Heritrix

Server 17  Spider                        7.2 - NAS       Channel: "MASIVA 2"
                                         7.2 - Heritrix




The system seems to work OK on all servers with Red Hat 8, but I have some questions:

·        Our current Production environment (NAS version 5.4.2) was installed by someone else before me. I have compared the start scripts of that version with the new ones, and the difference is that in the Production environment we have this: “java -Xmx32768m” after the CLASSPATH export. Is it necessary to assign memory to the process? Is it better if we do that?

·        I finally managed to configure WaybackIndexerApplication to work with Postgres (the settings are in the attached file), but it does nothing. There is a session in the “idle in transaction” state with this query: “select archivefil0_.filename as filename0_, archivefil0_.indexed as indexed0_, archivefil0_.indexedDate as indexedD3_0_, archivefil0_.indexingFailedAttempts as indexing4_0_, archivefil0_.originalIndexFileName as original5_0_ from ArchiveFile archivefil0_ where archivefil0_.indexed=false and archivefil0_.indexingFailedAttempts<3 order by archivefil0_.indexingFailedAttempts ASC”. Is there any documentation about this tool? I don’t think I understand exactly how it works…

·        We have made some collections in our tests, and there is one domain in OpenWayback with 3 crawls at exactly the same time (same day, hour, minute and second), and 2 others with a difference of 2 seconds between them. I don’t think this is normal, but I can’t figure out why it happened… Any ideas?
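
To make the query quoted in the second point easier to read: it selects the files that are not yet indexed and have fewer than 3 failed indexing attempts, ordered with the fewest failures first. Below is a minimal sketch of that selection using Python's sqlite3 module. The table is a simplified assumption with only three of the columns named in the query, the file names are invented sample data, and the extra ordering by filename is added only to make the output deterministic.

```python
import sqlite3


def pending_files(conn, max_failed_attempts=3):
    """Return names of files that are not indexed and have failed fewer than max_failed_attempts times."""
    rows = conn.execute(
        "SELECT filename FROM ArchiveFile "
        "WHERE indexed = 0 AND indexingFailedAttempts < ? "
        # The quoted query orders only by indexingFailedAttempts;
        # filename is added here as a tie-breaker for a stable result.
        "ORDER BY indexingFailedAttempts ASC, filename ASC",
        (max_failed_attempts,),
    )
    return [row[0] for row in rows]


def demo_connection():
    """Build an in-memory table with invented sample rows (assumed, simplified schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ArchiveFile ("
        "filename TEXT PRIMARY KEY, "
        "indexed INTEGER NOT NULL, "
        "indexingFailedAttempts INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO ArchiveFile VALUES (?, ?, ?)",
        [
            ("a.warc.gz", 1, 0),  # already indexed, so not selected
            ("b.warc.gz", 0, 2),  # not indexed, 2 failures, still eligible
            ("c.warc.gz", 0, 0),  # not indexed, no failures, listed first
            ("d.warc.gz", 0, 3),  # 3 failures, excluded by the "< 3" condition
        ],
    )
    return conn


if __name__ == "__main__":
    print(pending_files(demo_connection()))
```

With an empty table the same query returns no rows, so there would be nothing to index.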



We will continue testing to make sure that everything is working fine. Let me know if you have any questions about our installation.



Thank you all for always helping us! See you next Tuesday!



Best regards,

Miguel.
