Version: 6.7

Quality Assurance

The package "qualityassurance" contains steps for checking SIPs.

Quality Assurance: extract SIP into workfolder

Extracts a zipped SIP into the work folder. An optional second parameter can be used to specify a different destination folder.

java ch.docuteam.actions.qualityassurance.SIPExtractor \
[path/to/]SIP [targetdir]

[path/to/]SIP: name of the SIP; if no path is given, it will be expected to be in the location defined by the actions.workbench.inbox property
[targetdir]: target directory; absolute path of the directory to unzip the SIP to. Optional, defaults to
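
The extraction step can be pictured with the following minimal sketch. It is not the SIPExtractor implementation: the class `SipUnzipSketch` and its methods are hypothetical, and the guard against entries that point outside the target directory is an added assumption about what a careful unzip routine would do.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper for illustration; not part of docuteam actions.
final class SipUnzipSketch {

    // Resolves a zip entry name against the target directory and rejects
    // names that would end up outside of it (e.g. "../outside.txt").
    static Path resolveSafely(Path targetDir, String entryName) {
        Path base = targetDir.toAbsolutePath().normalize();
        Path resolved = base.resolve(entryName).normalize();
        if (!resolved.startsWith(base)) {
            throw new IllegalArgumentException("Entry escapes target directory: " + entryName);
        }
        return resolved;
    }

    // Unzips every entry of zipFile below targetDir.
    static void unzip(Path zipFile, Path targetDir) {
        try (ZipInputStream zip = new ZipInputStream(Files.newInputStream(zipFile))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                Path out = resolveSafely(targetDir, entry.getName());
                if (entry.isDirectory()) {
                    Files.createDirectories(out);
                } else {
                    Files.createDirectories(out.getParent());
                    // Copies the bytes of the current entry only; the zip stream stays open.
                    Files.copy(zip, out, StandardCopyOption.REPLACE_EXISTING);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this sketch, the target directory plays the role of the optional `[targetdir]` parameter.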

Quality Assurance: check fixity of SIP

Checks the files in a SIP for conformity with the checksums stored in the METS file. The results of the check are written to the METS file as PREMIS events.

java ch.docuteam.actions.qualityassurance.SIPFixityCheck \
[path/to/]SIP

[path/to/]SIP: name of the SIP; if no path is given, it will be expected to be in the location defined by the property
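
The comparison at the core of a fixity check can be sketched as follows. This is an illustration only, not the SIPFixityCheck code: the class `FixitySketch` and its methods are hypothetical, and the algorithm name passed in (for example "SHA-256") is an assumption; in practice it would be whatever the METS file records for each file.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for illustration; not part of docuteam actions.
final class FixitySketch {

    // Returns the lower-case hex digest of the data for the given algorithm.
    static String digestHex(byte[] data, String algorithm) {
        try {
            MessageDigest md = MessageDigest.getInstance(algorithm);
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(data)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("Unknown algorithm: " + algorithm, e);
        }
    }

    // True if the data still matches the recorded checksum (hex, case-insensitive).
    static boolean matches(byte[] data, String algorithm, String recordedHex) {
        return digestHex(data, algorithm).equalsIgnoreCase(recordedHex.trim());
    }

    // Convenience variant that reads the whole file into memory (fine for a sketch).
    static boolean matches(Path file, String algorithm, String recordedHex) {
        try {
            return matches(Files.readAllBytes(file), algorithm, recordedHex);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A mismatch would correspond to a failed fixity event for that file.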

Quality Assurance: check file path length

Checks whether the length of absolute paths of a SIP is greater than a specified value.

java ch.docuteam.actions.qualityassurance.FilePathLengthCheck \
/absolute/path/to/folder maxAllowedFilePathLength

/absolute/path/to/folder: absolute path of the folder that should be checked
maxAllowedFilePathLength: the maximum allowed number of characters of the canonical file path
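
As a rough sketch of this kind of check, the following walks a folder and collects every canonical path that is longer than the limit. The class `PathLengthSketch` and its methods are hypothetical and only illustrate the comparison; how the actual step reports its findings is not shown here.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper for illustration; not part of docuteam actions.
final class PathLengthSketch {

    // True if the path has more characters than the allowed maximum.
    static boolean exceeds(String canonicalPath, int maxAllowedFilePathLength) {
        return canonicalPath.length() > maxAllowedFilePathLength;
    }

    // Returns the canonical paths below (and including) the folder that exceed the limit.
    static List<String> tooLongPaths(Path folder, int maxAllowedFilePathLength) {
        try (Stream<Path> walk = Files.walk(folder)) {
            return walk.map(PathLengthSketch::canonical)
                       .filter(p -> exceeds(p, maxAllowedFilePathLength))
                       .sorted()
                       .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static String canonical(Path path) {
        try {
            return path.toFile().getCanonicalPath();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A path whose length equals the limit passes in this sketch; only strictly longer paths are reported.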

Quality Assurance: check SIP path length

Checks the file path lengths within a SIP against a specified limit.

java ch.docuteam.actions.qualityassurance.SIPPathLengthCheck \
[path/to/]SIP maxAllowedFilePathLength

[path/to/]SIP: name of the SIP; if no path is given, it will be expected to be in the location defined by the property
maxAllowedFilePathLength: the maximum allowed number of characters of the canonical file path

Quality Assurance: get PID

Connects to a Fedora repository and retrieves a single PID. This PID later becomes the basis for storage in the repository. The value is stored in the <mets:OBJID/> element.

java ch.docuteam.actions.qualityassurance.SIPConfirmation \
[path/to/]SIP [PIDNamespace[:###]]

[path/to/]SIP: name of the SIP; if no path is given, it will be expected to be in the location defined by the property
[PIDNamespace[:###]]: namespace for a new PID, or a complete PID to use for the object; if omitted, the standard namespace from the submission agreement will be used; if the submission agreement cannot be found, the default namespace of the Fedora repository will be used.
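
The way the second argument is interpreted, and the fallback order when it is omitted, can be sketched as below. The class `PidArgumentSketch` and its methods are hypothetical, and treating `null` as "not available" is an assumption made for the illustration; the actual lookup of the submission agreement and the repository default is not shown.

```java
// Hypothetical helper for illustration; not part of docuteam actions.
final class PidArgumentSketch {

    // True if the argument has the form "namespace:id", i.e. it is a complete PID.
    static boolean isCompletePid(String argument) {
        int colon = argument.indexOf(':');
        return colon > 0 && colon < argument.length() - 1;
    }

    // Returns the namespace part of the argument (everything before the first colon).
    static String namespaceOf(String argument) {
        int colon = argument.indexOf(':');
        return colon < 0 ? argument : argument.substring(0, colon);
    }

    // Fallback order: explicit argument, then submission agreement, then repository default.
    // A null value means that the respective source is not available.
    static String resolveNamespace(String argument, String agreementNamespace, String repositoryDefault) {
        if (argument != null && !argument.isEmpty()) {
            return namespaceOf(argument);
        }
        if (agreementNamespace != null) {
            return agreementNamespace;
        }
        return repositoryDefault;
    }
}
```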

Quality Assurance: convert to safe filenames

Renames files with special characters. Safe file names contain only characters from A-Z, a-z, 0-9, and "_.-".

java ch.docuteam.actions.qualityassurance.SIPConvertToSafeFileNames \
[path/to/]SIP

[path/to/]SIP: name of the SIP; if no path is given, it will be expected to be in the location defined by the property
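
The safe character set described above maps directly onto a regular expression. The following sketch is illustrative only: the class `SafeFileNameSketch` is hypothetical, and replacing each unsafe character with an underscore is an assumption, since the replacement rule of the actual step is not stated here.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of docuteam actions.
final class SafeFileNameSketch {

    // Matches any single character outside A-Z, a-z, 0-9, "_", "." and "-".
    private static final Pattern UNSAFE = Pattern.compile("[^A-Za-z0-9_.-]");

    // True if the name consists of safe characters only.
    static boolean isSafe(String fileName) {
        return !UNSAFE.matcher(fileName).find();
    }

    // Replaces every unsafe character with an underscore (assumed replacement).
    static String toSafe(String fileName) {
        return UNSAFE.matcher(fileName).replaceAll("_");
    }
}
```

With this rule, two different original names can map to the same safe name, so a real implementation would also have to handle collisions.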

Quality Assurance: check file extensions

Checks the file extensions in a SIP and adds them if necessary.

An optional parameter indicates whether existing but incorrect extensions should be replaced.

java ch.docuteam.actions.qualityassurance.SIPFileExtensionCheck \
--sip=[path/to/]SIP [--replaceExistingExtensions={true|false}]

--sip: name of the SIP; if no path is given, it will be expected to be in the location defined by the property
--replaceExistingExtensions: optional, {true|false}

Quality Assurance: delete backup files

Deletes files from the SIP that match a specific name pattern.

java ch.docuteam.actions.qualityassurance.SIPDeleteBackupFiles \
[path/to/]SIP [filenamePattern filenamePattern ...]

[path/to/]SIP: name of the SIP; if no path is given, it will be expected to be in the location defined by the property
[filenamePattern filenamePattern ...]: a list of filename patterns (not case-sensitive; '*' is a wildcard, but is only allowed at the beginning or end of the pattern). Files matching any one of these patterns will be deleted
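
The pattern rules above (case-insensitive, '*' only at the beginning or end) can be sketched as follows. The class `BackupPatternSketch` is hypothetical, and rejecting a '*' in the middle of a pattern with an exception is an assumption about how invalid patterns might be handled.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical helper for illustration; not part of docuteam actions.
final class BackupPatternSketch {

    // Matches a file name against one pattern, ignoring case.
    // "*abc" means "ends with abc", "abc*" means "starts with abc",
    // "*abc*" means "contains abc", and "abc" means "equals abc".
    static boolean matches(String fileName, String pattern) {
        String name = fileName.toLowerCase(Locale.ROOT);
        String p = pattern.toLowerCase(Locale.ROOT);
        boolean leading = p.startsWith("*");
        boolean trailing = p.length() > 1 && p.endsWith("*");
        String core = p.substring(leading ? 1 : 0, p.length() - (trailing ? 1 : 0));
        if (core.contains("*")) {
            throw new IllegalArgumentException("'*' is only allowed at the beginning or end: " + pattern);
        }
        if (leading && trailing) {
            return name.contains(core);
        }
        if (leading) {
            return name.endsWith(core);
        }
        if (trailing) {
            return name.startsWith(core);
        }
        return name.equals(core);
    }

    // True if the file name matches any one of the patterns.
    static boolean matchesAny(String fileName, List<String> patterns) {
        return patterns.stream().anyMatch(p -> matches(fileName, p));
    }
}
```

For example, the patterns `thumbs.db`, `*.bak` and `~$*` would cover typical thumbnail caches, backup copies and temporary lock files.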

Quality Assurance: check SIP against submission agreement

Checks whether the file formats comply with the specifications in the submission agreement. There are two modes. In the first mode (removeBadFiles = false), every file that does not match the submission agreement is listed (as WARN log entries) and an error code is displayed. In the second mode (removeBadFiles = true), every file that does not match the submission agreement is deleted from the SIP and the modified mets.xml is saved; the original SIP remains unchanged as a backup.

java ch.docuteam.actions.qualityassurance.SIPSubmissionAgreementCheck \
[path/to/]SIP [removeBadFiles] [operationSA] [operationDSS]

[path/to/]SIP: name of the SIP; if no path is given, it will be expected to be in the location defined by the property
[removeBadFiles]: optional, {true|false}
[operationSA]: optional, ID of an external submission agreement to be used for this action (instead of the agreement which is part of the SIP)
[operationDSS]: optional, ID of an external data submission section to be used for this action (instead of the agreement which is part of the SIP)
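
The difference between the two modes can be sketched with a simple in-memory model. Everything here is hypothetical: the class `AgreementCheckSketch`, the map from file name to format, and the use of MIME types as format identifiers are assumptions for the illustration, not the way the submission agreement is actually evaluated.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration; not part of docuteam actions.
final class AgreementCheckSketch {

    // Returns the files whose format is not allowed by the agreement.
    // With removeBadFiles = false the map is left untouched (list-only mode);
    // with removeBadFiles = true the offending files are also removed from the map.
    static List<String> check(Map<String, String> formatByFile,
                              Set<String> allowedFormats,
                              boolean removeBadFiles) {
        List<String> badFiles = new ArrayList<>();
        Iterator<Map.Entry<String, String>> it = formatByFile.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, String> entry = it.next();
            if (!allowedFormats.contains(entry.getValue())) {
                badFiles.add(entry.getKey());
                if (removeBadFiles) {
                    it.remove();
                }
            }
        }
        return badFiles;
    }
}
```

In list-only mode, a non-empty result would correspond to the WARN entries and the error code described above.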

Quality Assurance: SIP virus check

Every file present in the SIP is scanned for viruses. The ClamAV virus scanner is used for virus checking.

This check requires a running ClamAV service. Depending on the second argument, infected files will be discarded or automatically deleted.

java ch.docuteam.actions.qualityassurance.SIPVirusCheck \
[path/to/]SIP deleteInfected

[path/to/]SIP: name of the SIP; if no path is given, it will be expected to be in the location defined by the property
deleteInfected: if true, the operation automatically removes infected files from the SIP

Quality Assurance: remove by level of description

Removes a certain description level from a SIP.

java ch.docuteam.actions.qualityassurance.RemoveByLevelOfDescription \
[path/to/]folder levelOfDescription

[path/to/]folder: path of the folder to rename; if no path is given, it will be expected to be in the location defined by the property
levelOfDescription: name of the level of description to be removed from the SIP

Quality Assurance: add/update file format information

Runs format identification on all files of the SIP and adds or updates the resulting information.

An optional parameter indicates whether existing information should be replaced or kept (default: false).

java ch.docuteam.actions.qualityassurance.SIPFormatIdentificationCheck \
--sip=[path/to/]SIP [--replaceExistingFormatInfo={true|false}]

--sip: name of the SIP; if no path is given, it will be expected to be in the location defined by the property
--replaceExistingFormatInfo: optional, {true|false}

Quality Assurance: remove files by format (PUID or MIME type)

Deletes all files of the SIP that match a given file format. Formats can be indicated either by MIME type or by PRONOM Unique Identifier (PUID).

java ch.docuteam.actions.qualityassurance.SIPDeleteFilesByFormat \
--sip=[path/to/]SIP [--mimetype=...] [--puid=...]

--sip: name of the SIP; if no path is given, it will be expected to be in the location defined by the property
--mimetype: optional; comma-separated list of MIME types to be deleted from this package
--puid: optional; comma-separated list of PRONOM identifiers (PUID) to be deleted from this package
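
How the two comma-separated lists might be turned into a deletion decision is sketched below. The class `FormatFilterSketch` and its methods are hypothetical, and the exact-match comparison of MIME types and PUIDs is an assumption; the identification of each file's format is taken as given.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper for illustration; not part of docuteam actions.
final class FormatFilterSketch {

    // Splits a comma-separated option value into trimmed, non-empty items.
    // A missing option (null) yields an empty set.
    static Set<String> parseList(String value) {
        if (value == null || value.trim().isEmpty()) {
            return Collections.emptySet();
        }
        Set<String> items = new LinkedHashSet<>();
        for (String item : value.split(",")) {
            String trimmed = item.trim();
            if (!trimmed.isEmpty()) {
                items.add(trimmed);
            }
        }
        return items;
    }

    // A file is deleted if its MIME type or its PUID is in one of the lists.
    static boolean shouldDelete(String mimeType, String puid,
                                Set<String> mimeTypes, Set<String> puids) {
        return mimeTypes.contains(mimeType) || puids.contains(puid);
    }
}
```

Because both options are optional, a file is only affected when at least one of the two lists names its format.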