
Docuteam Dublin Core 1.0

Docuteam Dublin Core 1.0 is a package format that can be used for delivery to the Deposition API of docuteam feeder.


  • A Docuteam Dublin Core SIP is a .zip file containing a folder named sip, which is a bagit container.
  • The bagit must be created using at least sha256 checksums (other checksum algorithms supported by bagit are optional).
  • Inside the bagit container, a hierarchical folder structure contains data objects described using Dublin Core XML metadata.
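
As an illustration of the three rules above, the following Python sketch writes such a .zip using only the standard library. The function name build_sip, the payload folder it reads from, and the choice of BagIt-Version 1.0 are assumptions made for this example and are not part of the docuteam specification. In practice a bagit library would normally create the bag.

```python
import hashlib
import zipfile
from pathlib import Path


def build_sip(payload_dir: Path, zip_path: Path) -> None:
    """Zip payload_dir as a bagit container inside a top-level folder named 'sip'.

    Hypothetical helper: payload_dir is expected to already follow the
    dc.xml folder layout; its content becomes the bag's data folder.
    """
    files = sorted(p for p in payload_dir.rglob("*") if p.is_file())
    manifest_lines = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        # Bag declaration file (the version number is an assumption).
        zf.writestr(
            "sip/bagit.txt",
            "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n",
        )
        for path in files:
            rel = path.relative_to(payload_dir).as_posix()
            data = path.read_bytes()  # fine for a sketch; stream large files
            zf.writestr(f"sip/data/{rel}", data)
            digest = hashlib.sha256(data).hexdigest()
            manifest_lines.append(f"{digest}  data/{rel}")
        # Payload manifest with the mandatory sha256 checksums.
        zf.writestr("sip/manifest-sha256.txt", "\n".join(manifest_lines) + "\n")
```

Calling build_sip(Path("my_payload"), Path("package.zip")) would produce an archive whose entries all sit under sip/, with the payload under sip/data/.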


Bagit library:


Bagit container structure specification

Within the zipped bagit, the SIP is organized as follows:

  • bagit contains at least sha256 checksums
  • the root folder, corresponding to the root object within the SIP, is named data (this is handled automatically by bagit libraries)
  • subfolders may be named freely
  • subfolders may be organized recursively
  • in each folder (at all levels) there is a mandatory metadata file always named dc.xml
  • in addition, each folder (at all levels) may contain either (but not both!):
    • one or more subfolders
    • one datafile, which may be named freely (except dc.xml)

A more formal structure definition:

<rootfolder>    ::= <metadata file> <children>
<metadata file> ::= dc.xml
<children>      ::= <folder>* | <file>
<folder>        ::= <metadata file> <children>
<file>          ::= filename.ext
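
The structure rules above can be checked mechanically before a package is submitted. The following Python sketch walks an unpacked data folder and reports violations. The function name check_folder and the wording of the messages are hypothetical, and the checks docuteam feeder itself performs may differ.

```python
from pathlib import Path


def check_folder(folder: Path) -> list[str]:
    """Return the rule violations found in folder and all of its subfolders."""
    problems = []
    entries = list(folder.iterdir())
    subfolders = [e for e in entries if e.is_dir()]
    # Every file other than dc.xml counts as a data file.
    datafiles = [e for e in entries if e.is_file() and e.name != "dc.xml"]

    if not (folder / "dc.xml").is_file():
        problems.append(f"{folder}: missing dc.xml")
    if subfolders and datafiles:
        problems.append(f"{folder}: contains both subfolders and a data file")
    if len(datafiles) > 1:
        problems.append(f"{folder}: more than one data file")

    # Subfolders may be nested to any depth, so recurse.
    for sub in sorted(subfolders):
        problems.extend(check_folder(sub))
    return problems
```

An empty list means the folder tree satisfies the rules; for an unpacked SIP the starting point would be the data folder inside sip.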

Container structure examples

  • Example 1: container structure with only one file
├── dc.xml
└── filename1.ext
  • Example 2: container structure with several files
├── dc.xml
├── folder1
│   ├── dc.xml
│   └── fileA.ext
├── folder2
│   ├── dc.xml
│   └── fileB.ext
└── folder3
    ├── dc.xml
    └── fileC.ext
  • Example 3: complex structure with several files
├── dc.xml
├── folder1
│   ├── dc.xml
│   ├── folder2
│   │   ├── dc.xml
│   │   └── file3.ext
│   └── folder4
│       ├── dc.xml
│       └── folder5
│           ├── dc.xml
│           └── file5.ext
├── folder6
│   ├── dc.xml
│   └── file6.ext
└── folder7
    ├── dc.xml
    └── folder8
        ├── dc.xml
        └── folder9
            ├── dc.xml
            └── file8.ext

Metadata specification

Metadata is restricted to the Dublin Core Metadata Element Set, i.e. to its 15 elements (dc 1.1 terms).

In addition, the following constraints apply:

  • The Identifier field is mandatory at each level in dc.xml. It must contain:
    • At each level: the client application identifier of the object, prefixed with clientid: (e.g. clientid:1234567 or clientid:d4FTw3v6T)
    • At root level: in addition, a mandatory identifier with the customer namespace in the repository (this is often the ISIL code), prefixed with namespace: (e.g. namespace:CH-1234-1)
  • The Title field is mandatory at each level in the dc.xml file. It is not repeatable. The other 13 fields are optional and repeatable. They are:
    • Creator (e.g. the authors, one per field repetition, that can be persons or institutions)
    • Subject (typically keywords, one per field repetition)
    • Description (a textual description of the object or folder)
    • Publisher
    • Contributor
    • Date (use ISO 8601, e.g. 2018-11-30)
    • Type
    • Format
    • Source
    • Language
    • Relation
    • Coverage
    • Rights
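
The Identifier and Title constraints above can be expressed as a small check. The following Python sketch is an illustration only: the function name check_metadata and the wrapper element metadata in the sample are hypothetical, since the root element of dc.xml is not specified in this section. The check looks up elements by the Dublin Core 1.1 namespace, so it does not depend on the wrapper's name.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"


def check_metadata(xml_text: str, is_root: bool) -> list[str]:
    """Return violations of the Identifier and Title rules for one dc.xml."""
    root = ET.fromstring(xml_text)
    titles = [e.text or "" for e in root.iter(f"{{{DC_NS}}}title")]
    identifiers = [e.text or "" for e in root.iter(f"{{{DC_NS}}}identifier")]

    problems = []
    if len(titles) != 1:
        problems.append(f"expected exactly one dc:title, found {len(titles)}")
    if not any(i.startswith("clientid:") for i in identifiers):
        problems.append("missing dc:identifier with prefix clientid:")
    if is_root and not any(i.startswith("namespace:") for i in identifiers):
        problems.append("root level needs a dc:identifier with prefix namespace:")
    return problems


# Sample input; the <metadata> wrapper element is an assumption.
SAMPLE = (
    f'<metadata xmlns:dc="{DC_NS}">'
    "<dc:title>Minimalist Example</dc:title>"
    "<dc:identifier>clientid:1234567</dc:identifier>"
    "<dc:identifier>namespace:CH-1234-1</dc:identifier>"
    "</metadata>"
)
print(check_metadata(SAMPLE, is_root=True))  # prints []
```

Passing is_root=False skips the namespace: check, which applies only to the dc.xml in the root folder.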

Metadata examples

  • Example 1: minimal metadata at root level
<?xml version="1.0" encoding="UTF-8"?>

<dc:title>Minimalist Example</dc:title>
  • Example 2: full metadata at root level
<?xml version="1.0" encoding="UTF-8"?>

<dc:title>All fields are set</dc:title>
<dc:creator>Atreid, Leto</dc:creator>
<dc:description>Description of the docuteam dublin core package format, v. 1.0.</dc:description>
<dc:contributor>Smith, John</dc:contributor>
<dc:contributor>Jaquard, Paul</dc:contributor>
<dc:source>Dublin Core Package Structure ( 1lxqiqkmlNYVWlwJSsIe4b5DwJxN6DZqNvpo0MouAFIA/edit)</dc:source>
<dc:relation>docuteam bridge api for client applications</dc:relation>
<dc:rights>CreativeCommons CC-By</dc:rights>