Ingest
The "ingest" group contains the operations used in the ingest process.
Harvest identifiers from OAI-PMH
This operation harvests OAI-PMH identifiers from the specified endpoint, restricted to the set and date range passed as parameters. Optionally, the metadata prefix can also be given as a parameter (defaults to "oai_dc"). For every identifier retrieved, a feeder event is generated, which can be used to launch a follow-up workflow for each identifier.
The endpoint parameter should contain both the base URL (e.g. https://docuteam.ch) and the path (oai/request). The verb statement (?verb=ListIdentifiers) is added automatically by the action. fromDate and toDate should be given as UTC date-times in ISO 8601 format (e.g. 2000-01-01T00:00:00Z).
The created event has the event_type "harvestOaiPmhDate" and contains the endpoint, set and identifier of the harvested record.
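To illustrate the kind of request this operation issues, here is a minimal Python sketch of how a ListIdentifiers URL can be assembled from the parameters. The query parameter names (verb, metadataPrefix, set, from, until) come from the OAI-PMH protocol; the function name build_list_identifiers_url and the set name "example_set" are hypothetical, and the action's actual implementation may differ.

```python
from urllib.parse import urlencode


def build_list_identifiers_url(endpoint, oai_set, from_date, to_date, prefix="oai_dc"):
    """Assemble an OAI-PMH ListIdentifiers request URL (illustrative only)."""
    query = urlencode({
        "verb": "ListIdentifiers",   # appended automatically, not part of the endpoint
        "metadataPrefix": prefix,    # defaults to "oai_dc" when not given
        "set": oai_set,
        "from": from_date,           # lower bound of the date range
        "until": to_date,            # upper bound of the date range
    })
    return f"{endpoint}?{query}"


# "example_set" and the date range are assumed values for demonstration.
url = build_list_identifiers_url(
    "https://docuteam.ch/oai/request",
    "example_set",
    "2000-01-01T00:00:00Z",
    "2000-12-31T23:59:59Z",
)
print(url)
```

Note that the endpoint is passed without any query string, since the verb and the remaining arguments are appended to it.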
docuteam-actions harvestOaiPmhData -c [/path/to/]config.json
Options:
--version Show version number [boolean]
--debug Set log level to debug [boolean]
-c, --config Configuration file path [string] [required]
--help Show help [boolean]
-e, --endpoint OAI-PMH endpoint used to harvest [string] [required]
-s, --set OAI-PMH set to harvest [string] [required]
-m, --prefix OAI-PMH metadata format [string]
-f, --fromDate From date time in ISO format [string] [required]
-t, --toDate To date time in ISO format [string] [required]
Harvest records from OAI-PMH
This operation harvests a single OAI-PMH record from the specified endpoint. Identifier and metadata prefix are passed as parameters. The record is then stored as a file named oai.xml in the folder specified by the "path" parameter.
The "path" parameter can be an absolute path or a path relative to the folder where the action is executed. If the specified path does not exist, it will be created.
The endpoint parameter should contain both the base URL (e.g. https://docuteam.ch) and the path (oai/request). The verb statement (?verb=GetRecord) is added automatically by the action.
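The behaviour described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names build_get_record_url and store_record, the identifier "oai:example.org:1" and the folder names are hypothetical, and no real request is sent. The query parameter names (verb, identifier, metadataPrefix) come from the OAI-PMH protocol.

```python
import tempfile
from pathlib import Path
from urllib.parse import urlencode


def build_get_record_url(endpoint, identifier, prefix):
    """Assemble an OAI-PMH GetRecord request URL (illustrative only)."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": prefix,
    })
    return f"{endpoint}?{query}"


def store_record(xml_text, path):
    """Write the response to <path>/oai.xml, creating the folder if needed."""
    # A relative path resolves against the current working directory.
    target_dir = Path(path)
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / "oai.xml"
    target.write_text(xml_text, encoding="utf-8")
    return target


print(build_get_record_url("https://docuteam.ch/oai/request", "oai:example.org:1", "oai_dc"))

# Demonstrate the storage step in a throwaway folder with a placeholder response.
with tempfile.TemporaryDirectory() as tmp:
    stored = store_record("<record/>", Path(tmp) / "harvest" / "record-1")
    print(stored.name)
```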
docuteam-actions harvestOaiPmhRecord -c [/path/to/]config.json
Options:
--version Show version number [boolean]
--debug Set log level to debug [boolean]
-c, --config Configuration file path [string] [required]
--help Show help [boolean]
-e, --endpoint OAI-PMH endpoint used to harvest [string] [required]
-i, --identifier OAI-PMH identifier [string] [required]
-m, --prefix OAI-PMH metadata format [string] [required]
-p, --path Path where to store the oai.xml response [string] [required]
Register URN with the German National Library
This operation registers a URN and associated URLs with the German National Library. Both the URN (accessor URN) and the URLs (accessor registrationURL) must already be present in the mets.xml file (stored in the root node of the SIP).
The root node may contain only one URN to be registered, but it may contain multiple URLs to register; at least one URL is required.
The operation first checks whether the URN is already registered. If this is the case, it will try to add the URLs to the already registered URN with URL priority 2. If the URN is not yet registered, it will register both the URN and the associated URLs. The first URL will be the primary URL (priority 1). All other URLs will have priority 2.
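The priority rules in the preceding paragraph can be summarised in a short Python sketch. The function name plan_url_priorities and the example URLs are hypothetical; the sketch only models the priority assignment and does not call the German National Library API.

```python
def plan_url_priorities(urls, urn_already_registered):
    """Return (url, priority) pairs following the rules described above."""
    if not urls:
        # At least one URL must be present for the registration.
        raise ValueError("at least one URL is required")
    if urn_already_registered:
        # URLs added to an already registered URN all get priority 2.
        return [(url, 2) for url in urls]
    # New registration: the first URL is primary (priority 1), all others get priority 2.
    return [(url, 1 if index == 0 else 2) for index, url in enumerate(urls)]


example_urls = ["https://example.org/a", "https://example.org/b"]
print(plan_url_priorities(example_urls, urn_already_registered=False))
print(plan_url_priorities(example_urls, urn_already_registered=True))
```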
When calling the operation, the namespace for which the URN will be registered needs to be given as a parameter. The credentials for the different namespaces, together with the base URL of the API offered by the German National Library, need to be stored in the config.json file.
docuteam-actions registerDnbUrnForRootNode -c [/path/to/]config.json
Options:
--version Show version number [boolean]
--debug Set log level to debug [boolean]
-c, --config Configuration file path [string] [required]
--help Show help [boolean]
-n, --namespace URN namespace [string] [required]
-d, --data Data file path [string] [required]