File migration

Configuration of file migrations

The file migration-config.xml defines the rules for format migrations and the tools used to carry them out. It is used in the step Ingest: migrate files.

<?xml version="1.0" encoding="UTF-8"?>
<config>
    <application id="1"
                 name="ImageMagick"
                 executable="D:\docuteam\apps\ImageMagick\convert.exe"
                 parameter="-compress#none#{[arg1]}#{[arg2]}" />
...
</config>

In this example, ImageMagick is defined as application number 1. The program to be executed is convert.exe, located in the folder D:\docuteam\apps\ImageMagick. The parameter string -compress#none#{[arg1]}#{[arg2]} is passed to the program call, with "{[arg1]}" replaced by the source file and "{[arg2]}" by the target file.
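
The placeholder substitution described above can be sketched in Python. This is an illustration only, not docuteam's actual implementation: the function name build_command and the two file paths are hypothetical, and the sketch assumes that "#" separates the individual command-line arguments, which the example suggests but the text does not state.

```python
def build_command(executable, parameter, source, target):
    """Expand a parameter template into an argument list (illustrative only)."""
    # Assumption: '#' separates the individual command-line arguments.
    args = parameter.split("#")
    # Replace the two placeholders with the source and target file paths.
    args = [a.replace("{[arg1]}", source).replace("{[arg2]}", target) for a in args]
    return [executable] + args


# Hypothetical source and target paths, chosen only for this example.
command = build_command(
    r"D:\docuteam\apps\ImageMagick\convert.exe",
    "-compress#none#{[arg1]}#{[arg2]}",
    r"C:\in\photo.jpg",
    r"C:\out\photo.tif",
)
print(command)
```

Under these assumptions, the resulting list holds the executable followed by -compress, none, the source file and the target file, in that order.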

The second part of migration-config.xml contains the rules for format migrations.

    <puid name="fmt/41"
          applicationID="1"
          targetExtension="tif"
          targetPronom="fmt/353" />

This example specifies that files with the PUID (PRONOM Persistent Unique Identifier) fmt/41 (Raw JPEG Stream) are converted to files with the PUID fmt/353 (Tagged Image File Format). The conversion uses the application defined above as application number 1 (here ImageMagick).
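
How a rule refers to its application through applicationID can be sketched in Python with the standard library XML parser. This is only an illustration, not docuteam's implementation, and it assumes that the application and puid elements are both direct children of config.

```python
import xml.etree.ElementTree as ET

# Both example elements, assumed to be direct children of <config>.
CONFIG = r"""<config>
    <application id="1"
                 name="ImageMagick"
                 executable="D:\docuteam\apps\ImageMagick\convert.exe"
                 parameter="-compress#none#{[arg1]}#{[arg2]}" />
    <puid name="fmt/41"
          applicationID="1"
          targetExtension="tif"
          targetPronom="fmt/353" />
</config>"""

root = ET.fromstring(CONFIG)

# Index the application definitions by their id attribute.
applications = {app.get("id"): app for app in root.findall("application")}

# Follow the rule's applicationID to the application it refers to.
rule = root.find("puid")
application = applications[rule.get("applicationID")]

print(rule.get("name"), "->", rule.get("targetPronom"), "via", application.get("name"))
```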

Besides PUIDs, rules can also be defined for MIME types and file extensions. Migration according to the PUID has first priority. If this is not successful, the migration is attempted according to the MIME type. If this also fails, the file extension is used:

    <puid      name="fmt/41"
               applicationID="1"
               targetExtension="tif"
               targetPronom="fmt/353" />

    <mimeType  name="image/jpeg"
               applicationID="1"
               targetExtension="tif"
               targetPronom="fmt/353" />

    <extension name="jpg"
               applicationID="1"
               targetExtension="tif"
               targetPronom="fmt/353" />

File formats that are not listed (whether by PUID, MIME type, or file extension) are not migrated.
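
The priority order described above (PUID, then MIME type, then file extension) can be sketched as follows. This is a minimal illustration under stated assumptions, not docuteam's implementation: the function name candidate_rules is hypothetical, the rule elements are assumed to be direct children of config, and the values used for the unlisted format are made up. The function returns all matching rules in priority order, so a caller could try them one after another.

```python
import xml.etree.ElementTree as ET

# The three rules from the example above, assumed to sit directly under <config>.
CONFIG = """<config>
    <puid      name="fmt/41"     applicationID="1" targetExtension="tif" targetPronom="fmt/353" />
    <mimeType  name="image/jpeg" applicationID="1" targetExtension="tif" targetPronom="fmt/353" />
    <extension name="jpg"        applicationID="1" targetExtension="tif" targetPronom="fmt/353" />
</config>"""


def candidate_rules(root, puid=None, mime_type=None, extension=None):
    """Return matching rules as (kind, attributes) pairs in priority order."""
    # Priority as described: PUID first, then MIME type, then file extension.
    lookups = [("puid", puid), ("mimeType", mime_type), ("extension", extension)]
    matches = []
    for tag, value in lookups:
        if value is None:
            continue
        for rule in root.findall(tag):
            if rule.get("name") == value:
                matches.append((tag, dict(rule.attrib)))
    return matches


root = ET.fromstring(CONFIG)

# A JPEG identified by all three properties: the PUID rule is listed first.
print([kind for kind, _ in candidate_rules(root, "fmt/41", "image/jpeg", "jpg")])

# Values not listed in this configuration yield no rule, so nothing is migrated.
print(candidate_rules(root, "fmt/999", "application/x-unknown", "xyz"))
```

An empty result corresponds to the case in which the file format is not listed and the file is therefore not migrated.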