Ingest
The package "ingest" contains steps for the ingest process.
Ingest: convert BAR-SIP
Converts a BAR-SIP into a SIP that conforms to the Matterhorn profile.
java ch.docuteam.actions.ingest.BARSIPConverter \
[path/to/]BAR-SIP [targetFolder]
Parameter | Description |
---|---|
[path/to/]BAR-SIP | name of the SIP; if no path is given, it will be expected to be in the location defined by the actions.workbench.inbox property |
[targetFolder] | directory where to move the created SIP to; if omitted, the SIP will be moved to the location defined by the actions.workbench.work property |
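Several steps in this package resolve a bare SIP name against a workbench property in the same way. The following is a minimal sketch of that lookup, not the actual implementation. It assumes that the properties file holds plain `key=value` lines without spaces around `=`, and that "a path is given" means the argument contains a slash. The helper names `get_property` and `resolve_sip` are hypothetical.

```shell
# Illustrative sketch only: get_property and resolve_sip are hypothetical
# helpers, not part of docuteam actions.

# Print the value of a key from a key=value properties file.
get_property() {
    # $1 = properties file, $2 = key (matched exactly)
    awk -F= -v k="$2" '$1 == k { print substr($0, length(k) + 2); exit }' "$1"
}

# Print the location of a SIP: the argument itself if it contains a path,
# otherwise the argument inside the folder named by the given property.
resolve_sip() {
    # $1 = SIP argument, $2 = properties file, $3 = property with the default folder
    case "$1" in
        */*) printf '%s\n' "$1" ;;
        *)   printf '%s/%s\n' "$(get_property "$2" "$3")" "$1" ;;
    esac
}
```

For example, `resolve_sip my.zip actions.properties actions.workbench.inbox` would print the SIP's location inside the configured inbox folder.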
Ingest: create SIP from eCH-0160 SIP
Creates a SIP that conforms to the Matterhorn profile from an eCH-0160 SIP.
java ch.docuteam.actions.ingest.CreateSIPFromECH0160SIP \
--sip=[path/to/]SIP \
--levelsFilePath=/path/to/levels.xml \
[--mappingFile=[path/to/]mappingFile] \
[--output-folder=/path/to/folder]
Parameter | Description |
---|---|
--sip=[path/to/]SIP | location of the SIP to convert; default lookup folder is actions.workbench.inbox |
--levelsFilePath=/path/to/levels.xml | path to the file levels.xml |
[--mappingFile=[path/to/]mappingFile] | optional, file from which to read the mapping; defaults to a default mapping file (defined by the mapping module) |
[--output-folder=/path/to/folder] | optional, the output folder; defaults to actions.workbench.work |
Ingest: check workbench space
Checks if there is enough space for SIP processing (i.e. for working copies).
java ch.docuteam.actions.ingest.CheckWorkbenchSpace \
[path/to/]SIP [numberOfCopies]
Parameter | Description |
---|---|
[path/to/]SIP | name of the SIP. If no path is given, it will be expected to be in the location defined by the actions.workbench.work property |
[numberOfCopies] | optional, number of copies to calculate with; defaults to 3 |
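The check can be pictured as a simple comparison. The sketch below is an assumption about the arithmetic, not the tool's code: it takes the required space to be the SIP size multiplied by the number of copies and compares it with the free space. The helper name `has_enough_space` is hypothetical.

```shell
# Hypothetical helper: succeeds if (SIP size * copies) fits into the free space.
# All sizes are in kilobytes; the number of copies defaults to 3, as in the
# parameter table above.
has_enough_space() {
    sip_kb=$1
    free_kb=$2
    copies=${3:-3}
    [ $((sip_kb * copies)) -le "$free_kb" ]
}

# Possible usage with real measurements (du and df are standard POSIX tools):
#   sip_kb=$(du -sk "$sip" | cut -f1)
#   free_kb=$(df -Pk "$workdir" | awk 'NR == 2 { print $4 }')
#   has_enough_space "$sip_kb" "$free_kb" || echo "not enough space" >&2
```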
Ingest: cleanup working copies
Deletes existing SIPs in actions.workbench.work. Optionally, you can also delete SIPs with the same name in actions.workbench.preparation.
java ch.docuteam.actions.ingest.Cleanup \
[path/to/]SIP [prep]
Parameter | Description |
---|---|
[path/to/]SIP | name of the SIP. If no path is given, it will be expected to be in the location defined by the actions.workbench.work property |
[prep] | if true, SIPs of the same name in actions.workbench.preparation will be removed as well; defaults to false |
Ingest: create EAD file
Creates EAD data from individual nodes of a given SIP.
java ch.docuteam.actions.ingest.CreateEADFile \
[path/to/]SIP [targetFilename]
Parameter | Description |
---|---|
[path/to/]SIP | name of the SIP; if no path is given, it will be expected to be in the location defined by the actions.workbench.work property |
[targetFilename] | optional, name of the output file; defaults to EAD.xml within the SIP's subfolder in the location defined by the actions.workbench.output property |
Ingest: extent calculator
Sets the number of files in the "Extent" metadata field and the unit to the default value "File(s)".
java ch.docuteam.actions.ingest.ExtentCalculator \
[path/to/]SIP
Parameter | Description |
---|---|
[path/to/]SIP | name of the SIP; if no path is given, it will be expected to be in the location defined by the actions.workbench.work property |
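As a rough illustration of the counted value, the sketch below counts the regular files under an unpacked SIP folder. This is an assumption made for illustration only: the actual step works on the SIP's metadata, and how the value and the unit "File(s)" are stored is not shown here. The helper name `count_files` is hypothetical.

```shell
# Hypothetical helper: print the number of regular files below a folder.
# Assumes the SIP is available as an unpacked directory and that file names
# contain no newline characters.
count_files() {
    find "$1" -type f | wc -l | tr -d ' '
}
```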
Ingest: migrate files
Migrates the files of a SIP according to the migration specifications in the configuration file migration-config.xml.
java ch.docuteam.actions.ingest.SIPFileMigrator \
[path/to/]SIP keepOriginals [path/to/migration-config.xml]
Parameter | Description |
---|---|
[path/to/]SIP | name of the SIP; if no path is given, it will be expected to be in the location defined by the actions.workbench.work property |
keepOriginals | { true |
[path/to/migration-config.xml] | optional, path to a specific migration configuration file (defaults to ./config/migration-config.xml ) |
[skipAlreadyMigratedFiles] | optional, { true |
Ingest: remove SIP from inbox
Moves an existing SIP from actions.workbench.inbox to a specified folder or deletes it if no destination folder is specified.
java ch.docuteam.actions.ingest.SIPRemoveFromInbox \
[path/to/]SIP [targetFolder]
Parameter | Description |
---|---|
[path/to/]SIP | path of the SIP; if no path is given, it will be expected to be in the location defined by the actions.workbench.inbox property |
[targetFolder] | directory where to move the SIP to; if omitted, the SIP will be deleted |
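The move-or-delete behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not the tool's code: the helper name `remove_from_inbox` is hypothetical, and the target folder is assumed to exist already.

```shell
# Hypothetical helper mirroring the described behaviour: move the SIP when a
# target folder is given, delete it otherwise.
remove_from_inbox() {
    sip=$1
    target=${2:-}
    if [ -n "$target" ]; then
        mv -- "$sip" "$target"/     # target folder is assumed to exist
    else
        rm -rf -- "$sip"            # no target: the SIP is deleted
    fi
}
```

Because the delete branch is irreversible, passing an empty second argument by accident has the same effect as omitting it.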
Ingest: replace file
Replaces a file in a SIP. The metadata is retained or added. Currently, only SIPs containing a single file can be processed with this step.
java ch.docuteam.actions.ingest.ReplaceFile \
[path/to/]SIP [targetFolder]
Parameter | Description |
---|---|
[path/to/]SIP | path of the SIP; if no path is given, it will be expected to be in the location defined by the actions.workbench.work property |
[targetFolder] | path to the file to be used as replacement of the current SIP content |
Ingest: get MARC from REST and add to SIP
For every object (file/folder) of a SIP, the process gets a MARC description from a REST webservice and adds it to the descriptive metadata.
The URL of the webservice is configured in actions.properties with the "aleph.webservice.url" property.
The URL should contain a placeholder documentNumber, which is replaced by the specific document number. The latter is extracted for each object based on its filename:
- For a filename of BAU_5_000000444.wav, the document number 000000444 will be extracted.
- For a folder name of DIRECTORY_X_000000555, the document number 000000555 will be extracted.
If the HTTP request fails or the filename is invalid, the operation stops and leaves the SIP unchanged. Existing MARC metadata is overwritten when the operation succeeds.
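The extraction rule is only given by the two examples above. The sketch below infers one rule that fits both: take the part after the last underscore, without the file extension, and accept it only if it consists of digits. The helper names `document_number` and `build_url`, the example.org URL in the usage note, and the exact spelling of the placeholder are assumptions for illustration.

```shell
# Hypothetical helpers; the rule is inferred from the two examples above and
# may differ from what the actual step accepts.

# Print the document number contained in a file or folder name.
document_number() {
    name=${1##*/}        # drop any leading directories
    name=${name%.*}      # drop the file extension, if any
    number=${name##*_}   # keep what follows the last underscore
    case "$number" in
        ''|*[!0-9]*) return 1 ;;          # anything but digits: invalid name
        *) printf '%s\n' "$number" ;;
    esac
}

# Replace the placeholder in a URL template with the document number.
build_url() {
    # $1 = URL template, $2 = document number (digits only)
    printf '%s\n' "$1" | sed "s/documentNumber/$2/"
}
```

With a made-up template such as `https://example.org/marc?doc=documentNumber`, `build_url` would insert the number returned by `document_number BAU_5_000000444.wav` in place of the placeholder.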
java ch.docuteam.actions.marc.AddMarcFromRestByIdFromNodeName \
--sip=[path/to/]SIP
Parameter | Description |
---|---|
--sip=[path/to/]SIP | location of the SIP to convert; default lookup folder is actions.workbench.work |
Ingest: add OAIDC from REST by ID from filename
Takes a SIP and adds OAI DC information to its root folder.
The OAI DC information is requested from a web service, defined by the property "oai.webservice.url". The URL is expected to have a placeholder {identifier} which is replaced with the root node’s name, for example:
- “Abbreviation-ShelfmarkTIFF”, e.g. “bbb-0027TIFF”, becomes the {identifier} "bbb/0027"
If the node within the SIP has an invalid name or the request for the OAI DC information fails, the operation aborts and the SIP file is not changed. When the operation is called on a SIP file that already contains OAI DC information, an exception is thrown.
Additional file resources defined in the <dc:relation/> metadata are downloaded and appended to the SIP under a new subfolder labeled "TEI-Handschriftenbeschreibungen".
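The derivation of {identifier} from the root node's name is documented only through the single example "bbb-0027TIFF". The sketch below assumes the simplest rule consistent with it: strip a trailing "TIFF" and replace the first hyphen with a slash. The helper name `oai_identifier` is hypothetical, and names outside this pattern are treated as invalid.

```shell
# Hypothetical helper; the rule is inferred from one example and may not
# cover every name the actual step accepts.
oai_identifier() {
    name=${1%TIFF}                       # drop the trailing "TIFF"
    [ "$name" != "$1" ] || return 1      # suffix missing: invalid name
    case "$name" in
        *-*) printf '%s/%s\n' "${name%%-*}" "${name#*-}" ;;   # first "-" becomes "/"
        *)   return 1 ;;                 # no hyphen: invalid name
    esac
}
```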
java ch.docuteam.actions.oai_dc.AddOAIDCFromRESTByIDFromFilename \
--sip=[path/to/]SIP
Parameter | Description |
---|---|
--sip=[path/to/]SIP | name of the SIP; if no path is given, it will be expected to be in the location defined by the actions.workbench.work property |
Ingest: convert an EDIDOC package into a Matterhorn METS SIP
Creates a SIP that conforms to the Matterhorn METS profile from an EDIDOC package.
java ch.docuteam.actions.ingest.CreateSIPFromEdidocSIP \
--sip=[path/to/]SIP \
--levelsFilePath=path/to/levels.xml \
[--mappingFile=path/to/mappingFile] \
[--outputFolder=/path/to/folder] \
[--steuerXml=/path/to/file]
Parameter | Description |
---|---|
--sip=[path/to/]SIP | location of the package to convert; default lookup folder is actions.workbench.inbox |
--levelsFilePath=path/to/levels.xml | path to the level configuration file, to be found on the classpath |
[--mappingFile=path/to/mappingFile] | file from which to read the mapping; defaults to ./config/edidoc-mapping.xml, to be found on the classpath |
[--outputFolder=/path/to/folder] | indicate the output folder; defaults to actions.workbench.work |
[--steuerXml=/path/to/file] | path to the EDIDOC archives extension XML file |
Ingest: update xml file in SIP using xslt
Updates an XML file within the SIP by applying an XSLT transformation to it.
java ch.docuteam.actions.ingest.ModifyFileWithXSL \
--sip=[path/to/]SIP \
--xml=path/to/file.xml \
--xsl=path/to/transformation.xsl
Parameter | Description |
---|---|
--sip=[path/to/]SIP | name of the SIP; if no path is given, it will be expected to be in the location defined by the actions.workbench.work property |
--xml=path/to/file.xml | path to the XML file within the SIP to be transformed (relative to the SIP's root node) |
--xsl=path/to/transformation.xsl | path to the XSL script to be used in the transformation (if relative, the script is assumed to reside in $ACTIONS_HOME/xslt) |
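The rule for relative --xsl paths can be sketched as follows. This is an illustration of the described lookup, not the tool's code, and the helper name `resolve_xsl` is hypothetical.

```shell
# Hypothetical helper: print the location of the XSL script. Absolute paths
# are used unchanged, relative ones are looked up in $ACTIONS_HOME/xslt.
resolve_xsl() {
    case "$1" in
        /*) printf '%s\n' "$1" ;;
        *)  printf '%s/xslt/%s\n' "${ACTIONS_HOME:?ACTIONS_HOME is not set}" "$1" ;;
    esac
}
```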