[Netarchivesuite-users] NAS 4.2 harvest template update/add problem

Colin Rosenthal csr at statsbiblioteket.dk
Mon Sep 2 11:02:52 CEST 2013


Hi Meelis,
Our harvest templates include a <long name="queue-total-budget"> element in
the frontier, as in this example:

  <newObject name="frontier" class="org.archive.crawler.frontier.BdbFrontier">
    <float name="delay-factor">1.0</float>
    <integer name="max-delay-ms">1000</integer>
    <integer name="min-delay-ms">300</integer>
    <integer name="max-retries">3</integer>
    <long name="retry-delay-seconds">300</long>
    <integer name="preference-embed-hops">1</integer>
    <integer name="total-bandwidth-usage-KB-sec">1500</integer>
    <integer name="max-per-host-bandwidth-usage-KB-sec">500</integer>
    <string name="queue-assignment-policy">dk.netarkivet.harvester.harvesting.DomainnameQueueAssignmentPolicy</string>
    <string name="force-queue-assignment"/>
    <boolean name="pause-at-start">false</boolean>
    <boolean name="pause-at-finish">false</boolean>
    <boolean name="source-tag-seeds">false</boolean>
    <boolean name="recovery-log-enabled">false</boolean>
    <boolean name="hold-queues">true</boolean>
    <integer name="balance-replenish-amount">3000</integer>
    <integer name="error-penalty-amount">100</integer>
    <long name="queue-total-budget">-1</long>
    <string name="cost-policy">org.archive.crawler.frontier.UnitCostAssignmentPolicy</string>
    <long name="snooze-deactivate-ms">300000</long>
    <integer name="target-ready-backlog">50</integer>
    <string name="uri-included-structure">org.archive.crawler.util.BdbUriUniqFilter</string>
  </newObject>

Is it possible that yours is missing this element?
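
One quick way to check a template file outside NetarchiveSuite is to look
for the node named in the error message. The following is only a rough
sketch, assuming Python is available; the function name and the two sample
strings are made up for illustration, and this is not the check that
NetarchiveSuite itself performs in HeritrixTemplate.

```python
import xml.etree.ElementTree as ET

# Path from the error message, written relative to the <crawl-order> root,
# because ElementTree searches start from the root element.
REQUIRED_PATH = (
    "./controller/newObject[@name='frontier']"
    "/long[@name='queue-total-budget']"
)


def has_queue_total_budget(template_xml):
    """Return True if the template contains the queue-total-budget node.

    Hypothetical helper: accepts the template as str or bytes.
    """
    root = ET.fromstring(template_xml)
    if root.tag != "crawl-order":
        return False
    return root.find(REQUIRED_PATH) is not None


# Illustrative, heavily shortened templates (not real harvest templates).
GOOD = """<crawl-order><controller>
  <newObject name="frontier" class="org.archive.crawler.frontier.BdbFrontier">
    <long name="queue-total-budget">-1</long>
  </newObject>
</controller></crawl-order>"""

BAD = """<crawl-order><controller>
  <newObject name="frontier" class="org.archive.crawler.frontier.BdbFrontier">
    <integer name="error-penalty-amount">100</integer>
  </newObject>
</controller></crawl-order>"""

if __name__ == "__main__":
    for label, xml_text in (("GOOD", GOOD), ("BAD", BAD)):
        print(label, has_queue_total_budget(xml_text))
```

To check a real template, read the file in binary mode and pass its contents
to the function. If it returns False, adding the <long
name="queue-total-budget"> line from the example above to the frontier is
the first thing to try.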

regards,
Colin Rosenthal
IT Developer
State and University Library

On 09/02/2013 10:42 AM, Bjarne Andersen wrote:
> I have tried adding a new template to our 4.2 test system, and that works fine.
> The error suggests that the XML file is missing a required element. Could there be an error in the XML?
>
> Missing node:
> /crawl-order/controller/newObject[@name='frontier']/long[@name='queue-total-budget']
>
> -
> Bjarne
> ________________________________________
> From: netarchivesuite-users-bounces at ml.sbforge.org [netarchivesuite-users-bounces at ml.sbforge.org] on behalf of Meelis Mihhailov [meelis at nlib.ee]
> Sent: 2 September 2013 10:16
> To: Netarchive Suite Users
> Subject: [Netarchivesuite-users] NAS 4.2 harvest template update/add problem
>
> Hi all!
>
> I managed to update NAS to version 4.2, but I cannot add or update harvest
> templates. We are using a PostgreSQL database.
>
> I tried to create a new harvest template (using the default as a base), but
> no luck :(
>
> Can anyone help me with this problem?
>
> The error I get:
>
> -----------------------------------------------------------------------
>     dk.netarkivet.common.exceptions.ArgumentNotValid: Template error:
> Missing node:
> /crawl-order/controller/newObject[@name='frontier']/long[@name='queue-total-budget']
>          at
> dk.netarkivet.common.exceptions.ArgumentNotValid.checkTrue(ArgumentNotValid.java:166)
>          at
> dk.netarkivet.harvester.datamodel.HeritrixTemplate.<init>(HeritrixTemplate.java:281)
>          at
> dk.netarkivet.harvester.datamodel.HeritrixTemplate.<init>(HeritrixTemplate.java:340)
>          at
> org.apache.jsp.Definitions_002dupload_002dharvest_002dtemplate_jsp._jspService(org.apache.jsp.Definitions_002dupload_002dharvest_002dtemplate_jsp:178)
>          at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:109)
>          at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
>          at
> org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:389)
>          at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:486)
>          at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:380)
>          at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
>          at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
>          at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:401)
>          at
> org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
>          at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
>          at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
>          at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
>          at
> org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
>          at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
>          at org.mortbay.jetty.Server.handle(Server.java:322)
>          at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
>          at
> org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:945)
>          at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:756)
>          at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:218)
>          at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
>          at
> org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
>          at
> org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
>
> I did check the permissions for the directory and everything seems to be
> OK.
>
> -----------------------------------------------------------------------
>
>
> Meelis Mihhailov
> ----------------
> National Library Of Estonia
> meelis at nlib.ee
> _______________________________________________
> NetarchiveSuite-users mailing list
> NetarchiveSuite-users at ml.sbforge.org
> http://ml.sbforge.org/mailman/listinfo/netarchivesuite-users


