[Netarchivesuite-users] DK UMBRA setup

Tue Hejlskov Larsen tlr at kb.dk
Tue May 7 15:57:32 CEST 2019


Hi all

Here is our DK Umbra setup in production

Best regards
Tue


Netarchive Umbra installation
Table of Contents
·         1 Related documents<https://itwiki.kb.dk/wiki/Netarkiv_Umbra_installation#Relaterede_dokumenter>
·         2 Dependencies<https://itwiki.kb.dk/wiki/Netarkiv_Umbra_installation#Afh.C3.A6ngigheder>
·         3 Prerequisites<https://itwiki.kb.dk/wiki/Netarkiv_Umbra_installation#Foruds.C3.A6tninger>
·         4 Additional software<https://itwiki.kb.dk/wiki/Netarkiv_Umbra_installation#Ekstra_software>
·         5 Installation<https://itwiki.kb.dk/wiki/Netarkiv_Umbra_installation#Installation>
·         6 Control<https://itwiki.kb.dk/wiki/Netarkiv_Umbra_installation#Kontrol>
·         7 Start umbra:<https://itwiki.kb.dk/wiki/Netarkiv_Umbra_installation#Start_umbra_:>
·         8 Deploy of new version<https://itwiki.kb.dk/wiki/Netarkiv_Umbra_installation#Deploy_af_ny_version>
·         9 Changelog<https://itwiki.kb.dk/wiki/Netarkiv_Umbra_installation#Changelog>
o    9.1 2019.04.11, ABR: corrected the Installation command received from Colin<https://itwiki.kb.dk/wiki/Netarkiv_Umbra_installation#2019.04.11.2C_ABR:_tilrettet_Installation.27s_kommando_modtaget_fra_Colin>
o    9.2 2019.04.10, TLR: corrected the Installation command received from Colin<https://itwiki.kb.dk/wiki/Netarkiv_Umbra_installation#2019.04.10.2C_TLR:_tilrettet_Installation.27s_kommando_modtaget_fra_Colin>
Related documents
Dependencies
Prerequisites
·         git
o    sudo yum install git
·         python3.6 incl. python-pip
o    On RHEL7, rh-python36 and rh-python36-python-pip are installed via RHSCL (see RedHat_Software_Collections#Python<https://itwiki.kb.dk/wiki/RedHat_Software_Collections#Python>).
o    NB: the environment can then be set via: source /opt/rh/rh-python36/enable
·         rabbitmq including rabbitmq management
o    see RabbitMQ_Install<https://itwiki.kb.dk/wiki/RabbitMQ_Install>
·         chromium browser:
o    on RHEL7 with EPEL (EPEL:_Extra_Packages_for_Enterprise_Linux<https://itwiki.kb.dk/wiki/EPEL:_Extra_Packages_for_Enterprise_Linux>): sudo yum install chromium
·         google chrome browser:
o    on RHEL7 see Netarkiv_google-chrome_installation<https://itwiki.kb.dk/wiki/Netarkiv_google-chrome_installation>

notes:
·         The Difference between Google Chrome and Chromium on Linux: https://chromium.googlesource.com/chromium/src/+/HEAD/docs/chromium_browser_vs_google_chrome.md
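
The prerequisites listed above can be checked in one pass before installing. The following is a minimal sketch and not part of the original wiki page: the helper name missing_commands is hypothetical, and the command names in the example call are assumptions about what the listed packages put on the PATH.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every command from the argument list that is
# not found on the PATH, one per line. Prints nothing if all are present.
missing_commands() (
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
)

# Example call (command names are assumptions, adjust them to your host):
# source /opt/rh/rh-python36/enable
# missing_commands git python pip rabbitmqctl google-chrome
```

An empty result means every command was found; anything printed still needs to be installed or added to the PATH.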
Additional software

Additional software (possibly) to be used in #Control<https://itwiki.kb.dk/wiki/Netarkiv_Umbra_installation#Kontrol> below:

sudo yum install xclock firefox

Installation

Umbra on RHEL7:

sudo -i
source /opt/rh/rh-python36/enable
pip install -U https://github.com/netarchivesuite/umbra/archive/1.0-KBDK-RC1.zip

run-chrome.sh

Create the script ~/run-chrome.sh with this content:

#!/usr/bin/env bash

# Script that shows you how to add extra flags for the chrome(ium) browser.

SCRIPT_DIR=$(dirname "$(readlink -f "${BASH_SOURCE[0]}")")

mkdir -p "${SCRIPT_DIR}/chrome_logs"

LOGFILE="$SCRIPT_DIR/chrome_logs/chrome.$$.log"

# Notice that we use "$@" to pass on the script arguments.
# Notice that we DO NOT start the browser in the background; instead we use exec so that the browser takes over this process.
# If we did not, the pid that the caller sees, and thus kills, would be the pid of the script, not the pid of the browser, which might allow the browser to live on.
# This way, the browser dies, and it dies well.
# But we still keep a logfile of what the browser experienced, named for the pid of the browser.
exec google-chrome \
    --headless \
    --no-sandbox \
    --disable-3d-apis \
    --disable-accelerated-video \
    --disable-background-mode \
    --disable-gpu \
    --disable-plugins \
    --disable-plugins-discovery \
    --disable-preconnect \
    --disable-translate \
    --disable-local-storage \
    --full-memory-crash-report \
    --mute-audio \
    --disable-gpu-early-init \
    --enable-logging=stderr \
    --log-level=0 \
    --disable-dev-shm-usage \
    "$@" \
    &>> "$LOGFILE"

Make sure ~/run-chrome.sh is executable:

chmod a+x ~/run-chrome.sh
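
The comments in run-chrome.sh rely on exec keeping the process id: the browser replaces the wrapper script instead of running as its child. The following sketch illustrates that behaviour with plain bash processes standing in for the browser; the function names are hypothetical and nothing here comes from umbra itself.

```shell
#!/usr/bin/env bash
# Print two PIDs: the wrapper shell's own, then the PID of the program it
# starts. With exec the program replaces the wrapper, so both are the same.
pids_with_exec() {
    bash -c 'echo $$; exec bash -c "echo \$\$"'
}

# Without exec the program runs as a child with a PID of its own. The
# trailing "true" stops bash from exec'ing the last command implicitly.
pids_without_exec() {
    bash -c 'echo $$; bash -c "echo \$\$"; true'
}

pids_with_exec      # two identical numbers
pids_without_exec   # two different numbers
```

In the second case a caller that only knows the wrapper's PID would kill the wrapper and could leave the child running, which is the situation the script comments warn about.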

logging.conf

Create ~/logging.conf with this content:

[loggers]
keys=root,umbra

[handlers]
keys=consoleHandler

[formatters]
keys=umbraFormatter

[logger_root]
level=INFO
handlers=consoleHandler

[logger_umbra]
level=INFO
handlers=consoleHandler
qualname=umbra.controller.AmqpBrowserController
propagate=0

[handler_consoleHandler]
class=StreamHandler
level=DEBUG
formatter=umbraFormatter
args=(sys.stdout,)

[formatter_umbraFormatter]
format=%(asctime)s %(process)d %(levelname)s %(threadName)s %(pathname)s %(name)s.%(funcName)s(%(filename)s:%(lineno)d) %(message)s
datefmt=

start.sh

And last but not least, ~/start.sh:

#!/usr/bin/env bash

ulimit -c 0

source /opt/rh/rh-python36/enable

AMQP='amqp://guest:guest@localhost:5672/%2f'

drain-queue --url "$AMQP"

umbra \
    --max-browser 5 \
    --executable "$HOME/run-chrome.sh" \
    --url "$AMQP" \
    --log_config_file "$HOME/logging.conf" \
    | tee -a "${HOME}/umbra.log"

Make sure ~/start.sh is executable with the command:

chmod a+x ~/start.sh
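
The AMQP URL in start.sh ends in %2f, which is the percent-encoded form of "/", the name of RabbitMQ's default virtual host. As a sketch of how such a URL is put together (the helper name amqp_url is hypothetical, and only "/" is encoded here, on the assumption that the virtual host name contains no other special characters):

```shell
#!/usr/bin/env bash
# Hypothetical helper: build an AMQP URL from its parts.
# Only "/" in the virtual host name is percent-encoded (as %2f); other
# special characters are NOT handled in this sketch.
amqp_url() (
    user=$1 pass=$2 host=$3 port=$4 vhost=$5
    encoded_vhost=$(printf '%s' "$vhost" | sed 's|/|%2f|g')
    printf 'amqp://%s:%s@%s:%s/%s\n' "$user" "$pass" "$host" "$port" "$encoded_vhost"
)

# The values used in start.sh: user and password "guest", default vhost "/".
amqp_url guest guest localhost 5672 /
# prints amqp://guest:guest@localhost:5672/%2f
```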

Control

Performed by Digital Cultural Heritage administrators.

Python:

source /opt/rh/rh-python36/enable
python -V

must display Python 3.6.3

Umbra:

Verify that the installation works by running

umbra -h

to confirm that umbra is installed.

RabbitMQ:

Open http://localhost:15672/ with firefox and log in with guest / guest to confirm that RabbitMQ is working as intended. Under Connections, check that an AMQP connection appears once umbra has been started as described below and the harvester has started with the correct settings file.
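
Before opening the browser it can help to confirm that RabbitMQ is listening at all. The following is a minimal sketch, not part of the original page: the helper name port_open is hypothetical, and it relies on bash's /dev/tcp redirection, which is assumed to be enabled in the installed bash.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if a TCP connection to host:port can be opened.
# Uses bash's built-in /dev/tcp redirection, so no extra tools are needed.
port_open() {
    bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# 5672 is the AMQP port from start.sh, 15672 the management interface.
for port in 5672 15672; do
    if port_open localhost "$port"; then
        echo "port $port: open"
    else
        echo "port $port: closed"
    fi
done
```

An open port only shows that something is listening; the login in the management interface is still the real check.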

Start umbra:

Start umbra by running ~/start.sh

Umbra will log to ~/umbra.log and to stdout.

The output will look approximately like this:

/opt/rh/rh-python36/root/usr/lib/python3.6/site-packages/brozzler/model.py:38: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  return yaml.load(f)
/opt/rh/rh-python36/root/usr/lib/python3.6/site-packages/brozzler/model.py:38: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  return yaml.load(f)
/home/test/logging.conf
2019-04-11 17:07:39,340 27871 INFO MainThread /opt/rh/rh-python36/root/usr/bin/umbra root.<module>(umbra:50) umbra 2.1.dev10 starting up
2019-04-11 17:07:39,346 27871 INFO AmqpConsumerThread /opt/rh/rh-python36/root/usr/lib/python3.6/site-packages/umbra/controller.py umbra.controller.AmqpBrowserController._consume_amqp(controller.py:211) connecting to amqp exchange=umbra at amqp://guest:guest@localhost:5672/%2f
2019-04-11 17:07:39,367 27871 INFO AmqpConsumerThread /opt/rh/rh-python36/root/usr/lib/python3.6/site-packages/umbra/controller.py umbra.controller.AmqpBrowserController._wait_for_and_browse_urls(controller.py:132) aquired browser on port 40598
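
The two messages that matter in this output are "starting up" and "aquired browser on port" (spelled as umbra prints it). As a sketch, a small check against the log file could look like this; the helper name umbra_started is hypothetical, and it assumes each log record is written on a single line.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the log shows both the startup message and
# at least one acquired browser. "aquired" is spelled as in umbra's output.
umbra_started() {
    logfile=$1
    grep -q 'starting up' "$logfile" && grep -q 'aquired browser on port' "$logfile"
}

# Example call:
# umbra_started "$HOME/umbra.log" && echo "umbra is up"
```

Because start.sh appends to umbra.log with tee -a, messages from earlier runs also match; check the timestamps if in doubt.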

Also try:

 sudo rabbitmqctl status



Det Kgl. Bibliotek



Tue Hejlskov Larsen
IT analytiker, Driftsleder for Netarkivet | IT-analyst, Netarchive Operation Manager

Det Kgl. Bibliotek | Royal Danish Library
Afdelingen for Digital Kulturarv | Department of Digital Cultural Heritage

P.O. Box 2149 | DK-1016 København K
tel +45 9132 4720 | Fax +45 3393 2218 | tlr at kb.dk<mailto:chh at kb.dk> | www.kb.dk<http://www.kb.dk/>

Besøgsadresse | Visiting address | Søren Kierkegaards Plads 1
Leveringsadresse | Delivery address | Christians Brygge 8 | 1219 København K

EAN 5798 000 79 52 97 | Bank 0216 4069032583 | CVR 28 98 88 42
IBAN DK2002164069032583 | Swiftcode DABADKKK


