[Netarchivesuite-users] DK UMBRA setup
Tue Hejlskov Larsen
tlr at kb.dk
Tue May 7 15:57:32 CEST 2019
Hi all
Here is our DK Umbra setup in production
Best regards
Tue
Netarchive Umbra installation
Table of Contents
· 1 Related documents<https://itwiki.kb.dk/wiki/Netarkiv_Umbra_installation#Relaterede_dokumenter>
· 2 Dependencies<https://itwiki.kb.dk/wiki/Netarkiv_Umbra_installation#Afh.C3.A6ngigheder>
· 3 Prerequisites<https://itwiki.kb.dk/wiki/Netarkiv_Umbra_installation#Foruds.C3.A6tninger>
· 4 Additional software<https://itwiki.kb.dk/wiki/Netarkiv_Umbra_installation#Ekstra_software>
· 5 Installation<https://itwiki.kb.dk/wiki/Netarkiv_Umbra_installation#Installation>
· 6 Control<https://itwiki.kb.dk/wiki/Netarkiv_Umbra_installation#Kontrol>
· 7 Start umbra:<https://itwiki.kb.dk/wiki/Netarkiv_Umbra_installation#Start_umbra_:>
· 8 Deployment of a new version<https://itwiki.kb.dk/wiki/Netarkiv_Umbra_installation#Deploy_af_ny_version>
· 9 Changelog<https://itwiki.kb.dk/wiki/Netarkiv_Umbra_installation#Changelog>
o 9.1 2019.04.11, ABR: corrected the Installation command received from Colin<https://itwiki.kb.dk/wiki/Netarkiv_Umbra_installation#2019.04.11.2C_ABR:_tilrettet_Installation.27s_kommando_modtaget_fra_Colin>
o 9.2 2019.04.10, TLR: corrected the Installation command received from Colin<https://itwiki.kb.dk/wiki/Netarkiv_Umbra_installation#2019.04.10.2C_TLR:_tilrettet_Installation.27s_kommando_modtaget_fra_Colin>
Related documents
Dependencies
Prerequisites
· git
o sudo yum install git
· python3.6 incl. python-pip
o On RHEL7, rh-python36 and rh-python36-python-pip are installed via RHSCL (see RedHat_Software_Collections#Python<https://itwiki.kb.dk/wiki/RedHat_Software_Collections#Python>).
o NB: the environment can then be set via: source /opt/rh/rh-python36/enable
· rabbitmq including rabbitmq management
o see RabbitMQ_Install<https://itwiki.kb.dk/wiki/RabbitMQ_Install>
· chromium browser:
o on RHEL7 with EPEL (EPEL:_Extra_Packages_for_Enterprise_Linux<https://itwiki.kb.dk/wiki/EPEL:_Extra_Packages_for_Enterprise_Linux>): sudo yum install chromium
· google chrome browser:
o on RHEL7 see Netarkiv_google-chrome_installation<https://itwiki.kb.dk/wiki/Netarkiv_google-chrome_installation>
Notes:
· The difference between Google Chrome and Chromium on Linux: https://chromium.googlesource.com/chromium/src/+/HEAD/docs/chromium_browser_vs_google_chrome.md
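The prerequisites above can be checked in one pass before umbra is installed. The following is only a sketch and not part of the original wiki page: the function name missing_commands is made up for illustration, and the command names in the example call are assumptions about what the packages above put on PATH.

```shell
# Print the name of every given command that is not found on PATH.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Example call (the command names are assumptions, adjust to your system):
#   source /opt/rh/rh-python36/enable
#   missing_commands git python pip rabbitmqctl google-chrome
```

An empty result means every listed command was found; any name that is printed still needs to be installed or added to PATH.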
Additional software
Additional software that may be used in #Control<https://itwiki.kb.dk/wiki/Netarkiv_Umbra_installation#Kontrol> below:
sudo yum install xclock firefox
Installation
umbra on RHEL7:
sudo -i
source /opt/rh/rh-python36/enable
pip install -U https://github.com/netarchivesuite/umbra/archive/1.0-KBDK-RC1.zip
run-chrome.sh
Create the script "~/run-chrome.sh" with this content:
#!/usr/bin/env bash
# Script that shows you how to add extra flags for the chrome(ium) browser.
SCRIPT_DIR=$(dirname "$(readlink -f "${BASH_SOURCE[0]}")")
mkdir -p "${SCRIPT_DIR}/chrome_logs"
LOGFILE="$SCRIPT_DIR/chrome_logs/chrome.$$.log"
# Notice that we use "$@" to include the script arguments.
# Notice that we DO NOT start the browser in the background; in fact we use exec to make the browser process take over this process.
# If we did not, the pid that umbra knows, and thus kills, would be the pid of the script, not the pid of the browser, which might allow the browser to live on.
# This way, the browser dies, and it dies well.
# But we still keep a logfile of what the browser experienced, named for the pid of the browser.
exec google-chrome \
    --headless \
    --no-sandbox \
    --disable-3d-apis \
    --disable-accelerated-video \
    --disable-background-mode \
    --disable-gpu \
    --disable-plugins \
    --disable-plugins-discovery \
    --disable-preconnect \
    --disable-translate \
    --disable-local-storage \
    --full-memory-crash-report \
    --mute-audio \
    --disable-gpu-early-init \
    --enable-logging=stderr \
    --log-level=0 \
    --disable-dev-shm-usage \
    "$@" \
    &>> "$LOGFILE"
Make sure that ~/run-chrome.sh is executable:
chmod a+x ~/run-chrome.sh
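The comments in run-chrome.sh rely on exec replacing the wrapper script with the browser, so that the pid of the wrapper becomes the pid of the browser. The small demonstration below is only a sketch of that behaviour using two plain shells instead of a browser; the function name show_exec_pid is made up for illustration.

```shell
# Start a wrapper shell that prints its own pid and then exec's a second
# shell, which prints its pid as well. Because exec replaces the process
# instead of starting a child, both lines show the same number.
show_exec_pid() {
    sh -c 'echo "$$"; exec sh -c "echo \$\$"'
}

show_exec_pid
```

Without exec, the second shell would run as a child with a different pid, and killing the wrapper would not necessarily stop it.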
logging.conf
Create ~/logging.conf with this content:
[loggers]
keys = root, umbra
[handlers]
keys = consoleHandler
[formatters]
keys = umbraFormatter
[logger_root]
level = INFO
handlers = consoleHandler
[logger_umbra]
level = INFO
handlers = consoleHandler
qualname = umbra.controller.AmqpBrowserController
propagate = 0
[handler_consoleHandler]
class = StreamHandler
level = DEBUG
formatter = umbraFormatter
args = (sys.stdout,)
[formatter_umbraFormatter]
format = %(asctime)s %(process)d %(levelname)s %(threadName)s %(pathname)s %(name)s.%(funcName)s(%(filename)s:%(lineno)d) %(message)s
datefmt =
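Python's logging.config.fileConfig looks up one [logger_NAME], [handler_NAME] or [formatter_NAME] section for every name in the corresponding keys list, so a misspelled header or a stray space inside the brackets is easy to miss. The check below is a rough sketch for catching such mismatches before umbra is started; the function name missing_sections is made up for illustration and it only understands the simple layout used in the file above.

```shell
# Print each section that the "keys" lists in a logging config file refer to
# but that has no exactly matching [section] header in that file.
# The function body is a subshell, so its variables stay local.
missing_sections() (
    conf=$1
    for kind in loggers handlers formatters; do
        prefix=${kind%s}    # loggers -> logger, handlers -> handler, ...
        # Read the "keys = a, b" line inside the [loggers]-style section.
        keys=$(awk -v want="[$kind]" '
            $0 == want { in_section = 1; next }
            /^\[/      { in_section = 0 }
            in_section && /^keys[ \t]*=/ {
                sub(/^keys[ \t]*=[ \t]*/, ""); print; exit
            }' "$conf")
        for key in $(printf '%s' "$keys" | tr ',' ' '); do
            grep -qxF "[${prefix}_${key}]" "$conf" ||
                printf '%s\n' "[${prefix}_${key}]"
        done
    done
)

# Example call:
#   missing_sections ~/logging.conf
```

An empty result means every listed logger, handler and formatter has a section of its own.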
start.sh
And last but not least, create ~/start.sh:
#!/usr/bin/env bash
ulimit -c 0
source /opt/rh/rh-python36/enable
AMQP='amqp://guest:guest@localhost:5672/%2f'
drain-queue --url "$AMQP"
umbra \
    --max-browser 5 \
    --executable $HOME/run-chrome.sh \
    --url "$AMQP" \
    --log_config_file $HOME/logging.conf \
    | tee -a ${HOME}/umbra.log
Make sure that ~/start.sh is executable with the command
chmod a+x ~/start.sh
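The AMQP URL in start.sh packs the user, password, host, port and virtual host into one string; the trailing %2f is the percent-encoded form of "/", which is RabbitMQ's default virtual host. The helper below is only a sketch for reading such a URL apart when adapting the script to another broker. The function name amqp_part is made up for illustration, and it assumes the URL always has the full user:password@host:port/vhost shape with no "/" or "@" in the credentials.

```shell
# Print one component of an AMQP URL of the form
#   amqp://user:password@host:port/vhost
# The function body is a subshell, so its variables stay local.
amqp_part() (
    rest=${1#*://}              # user:password@host:port/vhost
    creds=${rest%%@*}           # user:password
    hostport=${rest#*@}         # host:port/vhost
    hostport=${hostport%%/*}    # host:port
    vhost=${rest#*/}            # vhost, still percent-encoded
    case $2 in
        user)  printf '%s\n' "${creds%%:*}" ;;
        host)  printf '%s\n' "${hostport%%:*}" ;;
        port)  printf '%s\n' "${hostport##*:}" ;;
        vhost) case $vhost in
                   %2f|%2F) printf '/\n' ;;    # %2f is the encoded "/"
                   *)       printf '%s\n' "$vhost" ;;
               esac ;;
    esac
)

amqp_part 'amqp://guest:guest@localhost:5672/%2f' port     # prints 5672
amqp_part 'amqp://guest:guest@localhost:5672/%2f' vhost    # prints /
```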
Control
Performed by Digital Cultural Heritage administrators.
python:
source /opt/rh/rh-python36/enable
python -V
must display Python 3.6.3
Umbra:
Verify that the installation works by running
umbra -h
to confirm that umbra is installed.
Rabbitmq:
Open http://localhost:15672/ with firefox and log in with guest / guest to confirm that rabbitmq is working as intended. Under Connections, check that there is an AMQP connection once umbra has been started (see below) and the harvester has been started with the correct settings file.
Start umbra:
Start umbra by running ~/start.sh
Umbra will log to ~/umbra.log and to stdout.
The output will look approximately like this:
/opt/rh/rh-python36/root/usr/lib/python3.6/site-packages/brozzler/model.py:38: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  return yaml.load(f)
/opt/rh/rh-python36/root/usr/lib/python3.6/site-packages/brozzler/model.py:38: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  return yaml.load(f)
/home/test/logging.conf
2019-04-11 17:07:39,340 27871 INFO MainThread /opt/rh/rh-python36/root/usr/bin/umbra root.<module>(umbra:50) umbra 2.1.dev10 starting up
2019-04-11 17:07:39,346 27871 INFO AmqpConsumerThread /opt/rh/rh-python36/root/usr/lib/python3.6/site-packages/umbra/controller.py umbra.controller.AmqpBrowserController._consume_amqp(controller.py:211) connecting to amqp exchange=umbra at amqp://guest:guest@localhost:5672/%2f
2019-04-11 17:07:39,367 27871 INFO AmqpConsumerThread /opt/rh/rh-python36/root/usr/lib/python3.6/site-packages/umbra/controller.py umbra.controller.AmqpBrowserController._wait_for_and_browse_urls(controller.py:132) aquired browser on port 40598
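Since start.sh appends everything to ~/umbra.log, the log can be used to see which browser ports umbra has picked. The one-line filter below is a sketch based only on the message wording in the sample output above; the function name browser_ports is made up for illustration, and other umbra versions may phrase the message differently.

```shell
# Print the port number from every "browser on port N" line in a log file.
browser_ports() {
    sed -n 's/.*browser on port \([0-9][0-9]*\).*/\1/p' "$1"
}

# Example call:
#   browser_ports ~/umbra.log
```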
Also try:
sudo rabbitmqctl status
Det Kgl. Bibliotek
Tue Hejlskov Larsen
IT analytiker, Driftsleder for Netarkivet | IT-analyst, Netarchive Operation Manager
Det Kgl. Bibliotek | Royal Danish Library
Afdelingen for Digital Kulturarv | Department of Digital Cultural Heritage
P.O. Box 2149 | DK-1016 København K
tel +45 9132 4720 | Fax +45 3393 2218 | tlr at kb.dk<mailto:chh at kb.dk> | www.kb.dk<http://www.kb.dk/>
Besøgsadresse | Visiting address | Søren Kierkegaards Plads 1
Leveringsadresse | Delivery address | Christians Brygge 8 | 1219 København K
EAN 5798 000 79 52 97 | Bank 0216 4069032583 | CVR 28 98 88 42
IBAN DK2002164069032583 | Swiftcode DABADKKK