API
API v1
docuteam feeder offers REST APIs for creating depositions and events and for accessing digital objects. These APIs usually return JSON responses and, for some operations, binary data.
Using the deposition API, external applications can submit depositions (packages with data and metadata) to docuteam feeder. Depositions are then picked up by docuteam feeder workflows, which usually process the deposition and store it in a Fedora repository.
After a successful ingest, the deposition API returns persistent identifiers (PIDs) for every object (file, folder) within the deposition. Using these PIDs, clients can access the deposited objects again.
In the Open Archival Information System (OAIS) terminology:
- docuteam feeder receives "Submission Information Packages" (SIP) on its deposition API.
- The SIPs are processed by docuteam feeder (quality assurance, optional initial preservation actions) and stored into a repository as "Archival Information Packages" (AIP).
- Client applications retrieve "Dissemination Information Packages" (DIP) of their originally submitted objects using the access API of docuteam feeder.
The technical API documentation is available as an OpenAPI specification from within the application itself under ./docs/v1.
Version 1 of this API offers the following operations:
Deposition API
This API can be used to create depositions. When creating a deposition, the HTTP response returns the deposition ID. Based on this ID, the status of the deposition and (after successful archiving) the PIDs of the archived objects can be queried using the API.
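The create-then-query flow described above can be sketched as a small polling loop. This is only an illustration: `fetch` is a hypothetical callable that performs GET /api/v1/depositions/:id and returns the decoded JSON document, the document is assumed to follow the layout shown in the Responses section, and only the two statuses named on this page (archived and error) are treated as final.

```python
import time
from typing import Callable

# Statuses mentioned on this page; an installation may use further values.
FINAL_STATUSES = {"archived", "error"}


def wait_for_deposition(fetch: Callable[[int], dict], deposition_id: int,
                        interval: float = 5.0, max_attempts: int = 60) -> dict:
    """Poll a deposition until it reaches a final status.

    `fetch` is a hypothetical helper that calls GET /api/v1/depositions/:id
    and returns the decoded JSON document (assumed layout: a "response" list
    with one deposition).
    """
    for attempt in range(max_attempts):
        document = fetch(deposition_id)
        deposition = document["response"][0]
        if deposition["status"] in FINAL_STATUSES:
            return deposition
        # Wait before the next poll, but not after the last attempt.
        if attempt < max_attempts - 1:
            time.sleep(interval)
    raise TimeoutError(
        f"deposition {deposition_id} not finished after {max_attempts} polls"
    )
```

Passing the HTTP call in as a function keeps the loop independent of any particular HTTP client.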
The following API routes are available:
- POST /api/v1/depositions: Create a new deposition
- GET /api/v1/depositions: List all depositions with details
- GET /api/v1/depositions/:id: Show deposition
- PUT /api/v1/depositions/:id: Update the deposition's response (parameter feeder_response)
- DELETE /api/v1/depositions/:id: Soft-delete a deposition (marking it as deleted and removing its actual payload)
When creating a deposition, the package format needs to be specified in the package_format parameter. If it is empty, the package format defaults to MatterhornMETSv1.0.
When listing depositions, the following optional parameters can be used to filter the list of depositions:
- status
- from
- until
- organization
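As an illustration of the filter parameters, the following sketch builds the URL for a filtered listing. The host name, the token value and the date format used for from are assumptions; only the four parameter names and the route come from this page.

```python
from typing import Optional
from urllib.parse import urlencode

# Filter names documented for GET /api/v1/depositions.
ALLOWED_FILTERS = {"status", "from", "until", "organization"}


def depositions_list_url(base_url: str, token: str,
                         filters: Optional[dict] = None) -> str:
    """Build the listing URL with optional filters and the token parameter."""
    filters = filters or {}
    unknown = set(filters) - ALLOWED_FILTERS
    if unknown:
        raise ValueError(f"unsupported filter(s): {sorted(unknown)}")
    params = {"token": token}
    params.update(filters)
    return f"{base_url.rstrip('/')}/api/v1/depositions?{urlencode(params)}"


# Hypothetical host and token; the date format is an assumption.
url = depositions_list_url(
    "https://feeder.example.org",
    "123456789012345",
    {"status": "archived", "from": "2018-11-01"},
)
```

The filters are passed as a dictionary because `from` is a reserved word in Python and cannot be used as a keyword argument.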
When updating a deposition, it is possible to update its feeder_response (by sending a well-formed, URL-encoded JSON string in the feeder_response parameter), provided the status of the deposition is not error. Note that it is not possible to change the status of a deposition using the API (only internal feeder processes can change the status).
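A minimal sketch of such an update request, using only the Python standard library. Sending feeder_response as a form field in the request body and the token in the query string are assumptions, as are the host name and the helper name; the request is built but not sent.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_update_request(base_url: str, deposition_id: int, token: str,
                         feeder_response: dict) -> Request:
    """Prepare a PUT request that carries feeder_response as URL-encoded JSON."""
    query = urlencode({"token": token})
    url = f"{base_url.rstrip('/')}/api/v1/depositions/{deposition_id}?{query}"
    # json.dumps produces the well-formed JSON string, urlencode escapes it.
    form = urlencode({"feeder_response": json.dumps(feeder_response)})
    return Request(url, data=form.encode("ascii"), method="PUT")
```

The prepared request could then be sent with `urllib.request.urlopen(request)`.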
Responses
Responses are given in JSON or (for the show method) as a binary. Each JSON response is a list of depositions with their details. The generic structure looks like this:
{ "api" :
{ "name": "docuteam bridge API",
"version": "1.4.0" },
"response" :
[
{ "id": 1234,
"uploaded_at": "2018-11-03T11:13:39.278026Z",
"queued_at": "2018-11-03T14:16:12.678056Z",
"processed_by_feeder_at": "2018-11-03T14:16:12.678016Z",
"archived_at": "2018-11-03T14:16:12.678016Z",
"deleted_at": null,
"status": "archived",
"feeder_response": { json-blackbox },
"organization": "museumplus",
"repository_key": "museumplus",
"package_format" : "DocuteamDublinCorev1.0",
"package_attached" : true,
"package_byte_size": 2716786
}
],
"request" :
{ "organization": "museumplus",
"role": "reader",
"requested_at": "2018-11-03T11:13:39.278026Z"}
}
Key elements include:
- id is the deposition identifier, i.e. the API-internal reference for depositions. It is used to access a specific deposition.
- status is the deposition's status, as described above.
- feeder_response contains feedback from the processing of the deposition in feeder. The content of this field is also formatted in JSON. It is a black box from the API's perspective.
Upon archival success, feeder will return a structure of the form:
{ "pids":
[
{ "clientId":"c1", "pid":"CH-1234565-7:1"},
{ "clientId":"c2", "pid":"CH-1234565-7:2"},
...
],
"feeder_version": "5.4.0"
}
It must be noted that:
- The clientId corresponds to the mandatory IDs submitted by the client application for each object within the SIP (for example, in a docuteam DublinCore SIP it is located in the dc.xml files and uses the following syntax: <dc:identifier>clientId:d4FTw3v6T</dc:identifier>).
- The pid is the persistent identifier allocated by the repository. It is of the form namespace:id, where "namespace" is generally the institutional or ISIL code (for example: CH-1234565-7:2) and "id" is the unique number for that namespace in the repository.
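To illustrate how a client might use this structure, the sketch below maps each clientId to its PID and splits a PID into namespace and id. The function names are hypothetical, and the code accepts feeder_response either as a nested object or as a JSON string, since the field is a black box from the API's perspective.

```python
import json


def pids_by_client_id(deposition: dict) -> dict:
    """Return a {clientId: pid} mapping from a deposition's feeder_response."""
    feeder_response = deposition.get("feeder_response") or {}
    if isinstance(feeder_response, str):
        # Assumption: the field may also arrive as a JSON-encoded string.
        feeder_response = json.loads(feeder_response)
    return {entry["clientId"]: entry["pid"]
            for entry in feeder_response.get("pids", [])}


def split_pid(pid: str) -> tuple:
    """Split 'namespace:id' at the last colon."""
    namespace, separator, local_id = pid.rpartition(":")
    if not separator or not namespace or not local_id:
        raise ValueError(f"not a namespace:id PID: {pid!r}")
    return namespace, local_id


sample = {
    "id": 1234,
    "status": "archived",
    "feeder_response": {
        "pids": [
            {"clientId": "c1", "pid": "CH-1234565-7:1"},
            {"clientId": "c2", "pid": "CH-1234565-7:2"},
        ],
        "feeder_version": "5.4.0",
    },
}
mapping = pids_by_client_id(sample)
```

Splitting at the last colon keeps any hyphens in the namespace intact, as in the ISIL-style example above.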
Event API
This API can be used to create events in docuteam feeder and to retrieve either a list of events or the details of a single event.
The following API routes are available:
- POST /api/v1/events: Create a new event
- GET /api/v1/events: List all events. The page (defaults to 1) and per_page (defaults to 50) parameters can be used to page through the results.
- GET /api/v1/events/:id: Show a specific event
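The page and per_page parameters lend themselves to a simple paging loop, sketched below. `fetch_page` is a hypothetical callable that performs GET /api/v1/events for one page and returns the list of events from the decoded response; stopping at the first page that is shorter than per_page is an assumption, since the listing's response envelope is not described here.

```python
from typing import Callable, Iterator


def iter_events(fetch_page: Callable[[int, int], list],
                per_page: int = 50) -> Iterator[dict]:
    """Yield events page by page until a short (or empty) page is returned."""
    page = 1  # documented default for the first page
    while True:
        events = fetch_page(page, per_page)
        yield from events
        if len(events) < per_page:
            break
        page += 1
```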
When creating an event, the request body should look as follows:
{
"event": {
"payload": {
"event_type": "File Placed",
"source": "Doc Observer"
}
}
}
In payload, you can enter whatever is relevant to the event you send. The following two fields are required:
- event_type: The type of your event
- source: The source of the event
The parameters are only checked for their existence, not for their content. We recommend declaring the application sending the event as the source and using a descriptive name for the event_type.
You are free to add more parameters to a payload. Have a look at the respective documentation page to see what kind of events could be created.
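The request described above could be prepared as follows. This is a sketch under assumptions: the host name is made up, the token is passed as a query parameter, and the JSON Content-Type header is a guess at what the server expects. The local check for the two required fields mirrors the existence check described above; the request is built but not sent.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

REQUIRED_FIELDS = ("event_type", "source")


def build_event_request(base_url: str, token: str, payload: dict) -> Request:
    """Prepare a POST /api/v1/events request for the given payload."""
    missing = [name for name in REQUIRED_FIELDS if not payload.get(name)]
    if missing:
        raise ValueError(f"payload is missing required field(s): {missing}")
    query = urlencode({"token": token})
    url = f"{base_url.rstrip('/')}/api/v1/events?{query}"
    body = json.dumps({"event": {"payload": payload}}).encode("utf-8")
    # Assumption: the API accepts a JSON body with this content type.
    return Request(url, data=body, method="POST",
                   headers={"Content-Type": "application/json"})


request = build_event_request(
    "https://feeder.example.org",
    "123456789012345",
    {"event_type": "File Placed", "source": "Doc Observer", "path": "/in/a.zip"},
)
```

The extra `path` entry only illustrates that additional payload parameters are allowed; it is not a documented field.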
If everything is fine, the event API will respond with a JSON representation of the event just created:
{
"id": 9,
"payload": {
"event_type": "File Placed",
"source": "Doc Observer"
},
"created_at": "2021-12-06T16:31:29.000+01:00"
}
If something is wrong with the provided data, the event API will respond with the HTTP code 400 and a detailed error message. For example, leaving out the payload results in the following JSON response:
[
"Event payload can't be blank",
"Event payload Source attribute is missing or empty in payload",
"Event payload Event type attribute is missing or empty in payload"
]
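A client can tell the two outcomes apart by the HTTP status code. The sketch below assumes the two body shapes shown on this page (an event object on success, a list of messages with code 400); the function name is hypothetical.

```python
import json


def interpret_event_response(status_code: int, body: str) -> tuple:
    """Return (event, errors): one of them is empty depending on the outcome."""
    data = json.loads(body)
    if status_code == 400:
        # On validation failure the body is a JSON list of messages.
        return None, list(data)
    return data, []


event, errors = interpret_event_response(
    400,
    '["Event payload can\'t be blank"]',
)
```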
Access API
The Access API of docuteam feeder can be used to access DIPs and repository objects from a Fedora 3-based repository. It acts as a proxy for docuteam rservices. With a Fedora 6-based repository stack, the Access API of feeder is not functional. Instead, the Access API of docuteam box should be used, and the Access API of feeder should be deactivated using the fedora3 feature flag.
The Access API is able to generate DIPs starting from any level of an archival package and assemble them recursively. Another notable feature is the on-the-fly generation of previews/thumbnails and, more generally, format migrations. This access method retrieves data from the repository and hence expects PIDs (not API-internal IDs, as is the case for the deposition API). The Access API is limited to synchronous requests, meaning that the requested object is prepared and returned at once. Async/callback requests are not yet possible.
The following API routes are available:
- GET /api/v1/access/sync_ead/:pid: Download EAD metadata
- GET /api/v1/access/sync_premis/:pid: Download PREMIS metadata
- GET /api/v1/access/sync_dip/:pid: Download DIP
- GET /api/v1/access/sync_original/:pid: Download binary file
- GET /api/v1/access/sync_preview/:pid: Download preview version of the binary file
When downloading DIPs, it is possible to recursively create a DIP containing both the node selected by the pid parameter and its children.
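The routes above differ only in their last path segment, so a small helper can build any of them from a PID. The host name and the short keys (ead, premis and so on) are made up for this sketch, and leaving the colon of the PID unescaped in the path is an assumption about what the server accepts.

```python
from urllib.parse import quote, urlencode

# Short, hypothetical keys mapped to the documented route segments.
ACCESS_ROUTES = {
    "ead": "sync_ead",
    "premis": "sync_premis",
    "dip": "sync_dip",
    "original": "sync_original",
    "preview": "sync_preview",
}


def access_url(base_url: str, kind: str, pid: str, token: str) -> str:
    """Build a synchronous Access API URL for the given PID."""
    if kind not in ACCESS_ROUTES:
        raise ValueError(f"unknown access kind: {kind!r}")
    # Assumption: the colon in 'namespace:id' may stay unescaped in the path.
    path = f"/api/v1/access/{ACCESS_ROUTES[kind]}/{quote(pid, safe=':')}"
    return f"{base_url.rstrip('/')}{path}?{urlencode({'token': token})}"


dip_url = access_url("https://feeder.example.org", "dip",
                     "CH-1234565-7:2", "123456789012345")
```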
Workflow executions API
The workflow executions API can be used to list feeder workflow executions, show the details of an execution and launch a workflow execution. The API returns XML data for one or multiple workflow executions.
The following API routes are available:
- GET /api/v1/workflow_executions: List all workflow executions (depending on the token, all executions or only executions of a feeder organisation are shown)
- GET /api/v1/workflow_executions/:id: Show data of a single workflow execution (based on a workflow execution ID)
- POST /api/v1/workflow_executions: Create a new workflow execution
When using the API to create a new execution, a JSON body containing the details has to be delivered:
{
"workflow_execution": {
"workflow_id": 248042309,
"sip": "sip",
"queue_name": "queue_name"
}
}
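A sketch of how this body might be assembled and attached to a request. The host name, the token placement in the query string and the Content-Type header are assumptions, and the example values for sip and queue_name are placeholders taken from the sample above; the request is built but not sent.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_workflow_execution_request(base_url: str, token: str,
                                     workflow_id: int, sip: str,
                                     queue_name: str) -> Request:
    """Prepare a POST /api/v1/workflow_executions request."""
    document = {
        "workflow_execution": {
            "workflow_id": workflow_id,
            "sip": sip,
            "queue_name": queue_name,
        }
    }
    query = urlencode({"token": token})
    url = f"{base_url.rstrip('/')}/api/v1/workflow_executions?{query}"
    # Assumption: the API accepts a JSON body with this content type.
    return Request(url, data=json.dumps(document).encode("utf-8"),
                   method="POST",
                   headers={"Content-Type": "application/json"})


execution_request = build_workflow_execution_request(
    "https://feeder.example.org", "123456789012345",
    248042309, "sip", "queue_name",
)
```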
Authentications
Access is restricted via tokens that must be transmitted with each request using the token parameter. This is required for all HTTP methods, i.e. GET, POST, PUT and DELETE. The docuteam feeder API relies solely on tokens for authentication and authorization. Tokens are bound to organizations and roles and restrict what can be done through the API accordingly. An authentication token must be at least 15 characters long. Tokens are passed using a "token" HTTP request parameter, e.g. https://server:port/api/method?token=123456789012345
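The token rule can be captured in a small helper that appends the token parameter to any API URL, sketched here with the Python standard library. The helper name is hypothetical; the length check reflects the 15-character minimum stated above and replaces any token already present in the URL.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

MIN_TOKEN_LENGTH = 15


def with_token(url: str, token: str) -> str:
    """Return the URL with its token query parameter set to the given token."""
    if len(token) < MIN_TOKEN_LENGTH:
        raise ValueError(f"token must be at least {MIN_TOKEN_LENGTH} characters long")
    parts = urlsplit(url)
    # Keep existing parameters, but drop any previous token value.
    query = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if key != "token"]
    query.append(("token", token))
    return urlunsplit(parts._replace(query=urlencode(query)))


signed = with_token("https://server:8443/api/v1/events?page=2", "123456789012345")
```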
With the menu entry "Authentications" in the "Admin" tab it is possible to create and manage authentications (tokens) for the feeder APIs.
When creating an authentication, the following fields are available:
- Name: Required field, can be defined at will
- Organization: Required field, dropdown menu for selecting the organization that can be accessed by the token
- Role: Role of the token
- Repository: Dropdown menu for selecting the repository that can be accessed by the token (only necessary for the Access API)
- Username: Username for accessing docuteam rservices
- Password: Password for accessing docuteam rservices
The actual token value is generated automatically when creating the entry.
The following four roles are available:
- read (limited to single organization)
- deliver (limited to single organization)
- manage (limited to single organization)
- feeder (global scope)
The following table gives an overview of the different roles and their authorizations. The three roles read, deliver and manage are limited to a single organization:
API endpoint | method | read token | deliver token | manage token | feeder token |
---|---|---|---|---|---|
access | all methods | yes | yes | yes | yes |
depositions | create | yes | yes | yes | |
depositions | index / show | yes (own depositions) | yes (depositions of organization) | yes | |
depositions | update | yes | |||
depositions | delete | yes (own depositions) | yes (depositions of organization) | yes | |
events | create | yes | yes | yes | |
events | index / show | yes (own events) | yes (events of organization) | yes | |
workflow_executions | create | yes | yes | yes | |
workflow_executions | index / show | yes (executions of organization) | yes |