Migration F3 -> F6

Context

Since the last release of Fedora 3 was published in 2015, docuteam has closely followed the development of subsequent major versions of the Fedora Repository software. Version 6 offers a suitable solution that allows switching to a modern platform without drawbacks in the underlying technologies. At the same time, a new standard for archival description, "Records in Contexts" (RiC), has evolved. RiC does not require linked data technologies, but it benefits from them, and Fedora 6 has them at its core.

In light of these two developments, we updated our tools, and added new ones, to support the new technological platform as well as the new conceptual model of RiC.

For existing users of docuteam cosmos, migrating the long-term repository is an activity which has to be carefully planned, executed and verified. Here's an outline of the procedure and tasks involved.

Infrastructure preparation

The migration assumes that the Fedora 6 repository stack is installed and running in parallel with the existing Fedora 3 stack until all data transfer and verification have been successfully concluded. The Fedora 6 stack therefore needs its own instance for installation: there will be no in-place upgrade of Fedora 3, and the two versions cannot be operated on the same server instance.

Data analysis

In parallel with the setup of the new infrastructure, the existing objects in the Fedora 3 repository will be validated with regard to both content and structure. This validation produces a report listing the objects that do not correspond to the generic object model of docuteam. This will be the case for clients that have added additional or non-standard metadata, or that have, for example, modified repository objects manually. docuteam will advise in these cases to ensure that such deviations are either resolved prior to the migration or taken into consideration during the actual procedure.

Data migration

Once the new repository stack is in place and any issues discovered during the data analysis are resolved, we will launch the actual data migration. The process consists of two workflows. The first checks for PIDs not yet migrated and creates an event for each one. The second processes a given PID: it extracts the DIP for that PID, transforms the XML (METS, EAD, PREMIS) into RDF (RiC, PREMIS), and creates the respective resources in the LD platform (Fedora 6). While the data migration is ongoing, regular operation can continue, either by performing double storage operations in F3 and F6 or by launching the migration repeatedly.
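
The interplay of the two workflows can be illustrated with a minimal Python sketch. It is not the docuteam implementation: `MigrationEvent`, `create_pending_events`, `migrate_pid` and the three callables passed in are hypothetical names chosen for illustration, and the repositories are replaced by in-memory dictionaries.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class MigrationEvent:
    """Hypothetical event requesting the migration of one PID."""
    pid: str


def create_pending_events(
    f3_pids: Iterable[str], migrated_pids: Iterable[str]
) -> list[MigrationEvent]:
    """Workflow 1: create one event per PID that is not yet migrated."""
    already_migrated = set(migrated_pids)
    pending = sorted(set(f3_pids) - already_migrated)
    return [MigrationEvent(pid) for pid in pending]


def migrate_pid(
    event: MigrationEvent,
    extract_dip: Callable[[str], str],
    transform_to_rdf: Callable[[str], str],
    create_resources: Callable[[str, str], None],
) -> None:
    """Workflow 2: extract the DIP, transform XML to RDF, store the result."""
    dip_xml = extract_dip(event.pid)        # assumed: returns the DIP as XML
    rdf = transform_to_rdf(dip_xml)         # assumed: METS/EAD/PREMIS -> RiC/PREMIS
    create_resources(event.pid, rdf)        # assumed: writes to Fedora 6


if __name__ == "__main__":
    # In-memory stand-ins for the Fedora 3 and Fedora 6 repositories.
    f3_store = {"demo:1": "<mets/>", "demo:2": "<mets/>"}
    f6_store: dict[str, str] = {}

    for event in create_pending_events(f3_store, f6_store):
        migrate_pid(
            event,
            extract_dip=f3_store.__getitem__,
            transform_to_rdf=lambda xml: f"rdf-from:{xml}",
            create_resources=f6_store.__setitem__,
        )

    # A second run finds nothing left to migrate.
    print(create_pending_events(f3_store, f6_store))
```

Because the first workflow only creates events for PIDs that are missing on the target side, the migration in this sketch can be launched repeatedly without processing the same object twice.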

Verification

Migration verification is done by comparing each original Fedora Object XML (FOXML) file to a FOXML file created through the box API from the data in Fedora 6.
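
One possible way to compare two FOXML documents is to canonicalize both and compare the results, so that differences in attribute order or indentation do not count as mismatches. The sketch below uses the `canonicalize` function from the Python standard library (Python 3.8 or later). The function name `foxml_equivalent` is hypothetical, and whether the actual verification normalizes the files in this way is an assumption.

```python
from xml.etree.ElementTree import canonicalize


def foxml_equivalent(original_xml: str, regenerated_xml: str) -> bool:
    """Return True if both XML documents have the same canonical form.

    strip_text=True removes leading and trailing whitespace in text
    content, so pure formatting differences are ignored.
    """
    original = canonicalize(original_xml, strip_text=True)
    regenerated = canonicalize(regenerated_xml, strip_text=True)
    return original == regenerated


if __name__ == "__main__":
    a = '<obj PID="demo:1" state="A"><label>Report</label></obj>'
    b = '<obj state="A" PID="demo:1">\n  <label>Report</label>\n</obj>'
    print(foxml_equivalent(a, b))
```

A real comparison would likely also need to decide how to treat values that legitimately differ between the two systems; this sketch does not attempt that.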

Cleanup, teardown F3

Once the data migration has been verified for correctness and completeness, the Fedora 3 instance will be turned off, and the ingest workflow will store data only in the Fedora 6 repository.