Archiving and Presenting Journals with Rosetta

Matthias Groß, Bavarian State Library, Munich, Germany

10th IGeLU Conference, Budapest, September 2nd 2015

DRAG Dresden 2014

Short timeline (1) - DigiTool

BVB: Bavarian Library Network, a regional consortium of research libraries

Head Office: department of the Bavarian State Library

2004-2006: looking for powerful „multimedia“ software

2006-: implementing DigiTool, going live 2007/08

How to manage journals?

complex objects / collections / METS objects

BVB chooses METS-objects for journals

Short timeline (2) - Rosetta

2010-: implementing Rosetta at BSB

journals not included in pilot workflows

How to manage journals?

collections / METS-objects/…

2013/14 collection management gets better, but …

2014 … decision to follow own approach in parallel

2015 struggling with some problems, then:

Welcome, journals, to Rosetta!

Presenting journals with Rosetta

• BSB uses Rosetta as „light“ archive whenever reasonable

• A tree structure with several levels (unlimited depth) is powerful enough to handle most common journal structures and seems natural for end user presentation

• If the tree structure is represented by an „object“, this can correspond with catalogue entries / persistent identifier on the title level

WANTED:

(elsewhere)

Re-shaping our DigiTool concept for Rosetta

• In the „Manual Legal Deposit“ workflow, new issues are ingested as new IEs

• Testing collection management in Rosetta in 2014, we still saw some shortcomings (addressed in the Pressure Points document)

• Adding new components (issues) to METS-objects would create new versions and lead to a confusing situation, obfuscating genuine preservation actions

BVB wants something that acts like METS, but is not a METS-object

Starting at the end …

BVB developed its own METS viewer for DigiTool in 2012/13, which is basically independent of the system holding the objects; display uses jQuery/CSS. Only a few interfaces to the system are needed:

1. Table of contents: from StructMap/FileSec json (Precache): tree structure with DigiTool-PIDs of components as leaves

2. Bibliographic metadata: on-the-fly from original MARC/MODS/DC data (2-layer XSLT transformation to json)

3. Request for a child object: uses delivery URL for embedded mode (provides main title and stream)

4. Thumbnail preview: based on Table of contents using special Delivery Rule

Facial composite of the solution (1)

1. Table of contents as „near-METS“

• All components of a journal share the same bibliographic ID in dc:relation

• Store reference data (volume, issue, year) in dcterms:bibliographicCitation (trick: use OpenURL 1.0)

• Based on this information, a ToC can be created and stored in the file system as BibID.json with Rosetta‘s IE IDs as leaves.
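A minimal sketch of how such a ToC could be derived from the two Dublin Core fields. The key names rft.date, rft.volume and rft.issue come from the OpenURL 1.0 KEV journal format; the shape of the resulting JSON tree, the record dictionaries and the IE IDs are assumptions made for illustration, not the layout BVB actually uses.

```python
import json
from typing import Dict, List
from urllib.parse import parse_qs


def parse_citation(citation: str) -> Dict[str, str]:
    """Extract year, volume and issue from an OpenURL 1.0 KEV string."""
    fields = parse_qs(citation)
    return {
        "year": fields.get("rft.date", ["?"])[0][:4],
        "volume": fields.get("rft.volume", ["?"])[0],
        "issue": fields.get("rft.issue", ["?"])[0],
    }


def build_toc(bib_id: str, components: List[dict]) -> dict:
    """Group components as year -> volume -> issue, with IE IDs as leaves."""
    toc = {"bibId": bib_id, "years": {}}  # hypothetical JSON layout
    for comp in components:
        ref = parse_citation(comp["citation"])
        volumes = toc["years"].setdefault(ref["year"], {})
        issues = volumes.setdefault(ref["volume"], {})
        issues[ref["issue"]] = comp["ie_id"]
    return toc


# Hypothetical components: "relation" mirrors dc:relation,
# "citation" mirrors dcterms:bibliographicCitation.
components = [
    {"ie_id": "IE1001", "relation": "BV123456789",
     "citation": "ctx_ver=Z39.88-2004&rft.date=2015&rft.volume=2&rft.issue=2"},
    {"ie_id": "IE1002", "relation": "BV123456789",
     "citation": "ctx_ver=Z39.88-2004&rft.date=2015-06&rft.volume=2&rft.issue=3"},
    {"ie_id": "IE2001", "relation": "BV987654321",
     "citation": "ctx_ver=Z39.88-2004&rft.date=2014&rft.volume=1&rft.issue=1"},
]

bib_id = "BV123456789"
toc = build_toc(bib_id, [c for c in components if c["relation"] == bib_id])
print(json.dumps(toc, indent=2))
```

Storing the result as BV123456789.json would then be a single json.dump call on the same dictionary.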

Facial composite of the solution (1a)

Plan: Using MARC/MODS metadata instead; OpenURL trick is not so friendly for human editing

OpenURL as container

Facial composite of the solution (2)

2. Bibliographic metadata

BibID is known (from each component); for display fetch recent MARC-XML record via Aleph SRU interface

3. Request for child object

DeliveryRule „embedded“ in Rosetta

4. Thumbnail preview

DeliveryFunction „thumbnail“ in Rosetta
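Item 2 above, fetching the current MARC-XML record for a known BibID, might be sketched as follows. The request parameters are the standard SRU searchRetrieve parameters; the base URL, the index name rec.id and the sample record are assumptions, since the slides do not show the actual Aleph SRU configuration.

```python
import xml.etree.ElementTree as ET
from typing import Optional
from urllib.parse import urlencode

MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}


def sru_url(base_url: str, bib_id: str, index: str = "rec.id") -> str:
    """Build an SRU searchRetrieve URL for one bibliographic ID.

    The index name is a placeholder; the real one depends on the server.
    """
    params = {
        "version": "1.1",
        "operation": "searchRetrieve",
        "query": f"{index}={bib_id}",
        "recordSchema": "marcxml",
        "maximumRecords": "1",
    }
    return f"{base_url}?{urlencode(params)}"


def main_title(marcxml: str) -> Optional[str]:
    """Return MARC field 245 subfield a (main title), if present."""
    root = ET.fromstring(marcxml)
    node = root.find(
        ".//marc:datafield[@tag='245']/marc:subfield[@code='a']", MARC_NS
    )
    return node.text if node is not None else None


# Hand-written sample record, not real catalogue data.
SAMPLE_RECORD = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="245" ind1="0" ind2="0">
    <subfield code="a">Example Journal</subfield>
  </datafield>
</record>"""

url = sru_url("https://aleph.example.org/sru", "BV123456789")
title = main_title(SAMPLE_RECORD)
print(url)
print(title)
```

Fetching the record at display time rather than storing a copy means the viewer always shows the catalogue's current description of the journal.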

Proof of concept

Creation of near-METS industrialized

Our approach: Harvesting the OAI interface (good experience with DigiTool)

However, we had problems getting valid XML output from Rosetta. After some months it turned out that there is a config parameter ‚dublincore_additional_namespaces‘ (see Home > Advanced > Configuration > General > General Parameters) that should be defined as [blank], which was not the case in our installation.
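As a rough illustration of the harvesting approach, the sketch below builds an OAI-PMH ListRecords request and pulls the fields needed for the ToC out of a response. The verb and the parameters metadataPrefix, from and set are defined by OAI-PMH 2.0; the base URL, the set name and the sample response are invented for this example and do not reproduce Rosetta's actual output.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional, Tuple
from urllib.parse import urlencode

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}


def list_records_url(base_url: str, since: str,
                     set_spec: Optional[str] = None) -> str:
    """Build a ListRecords request for everything changed since a date."""
    params = {"verb": "ListRecords", "metadataPrefix": "oai_dc", "from": since}
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


def parse_records(xml_text: str) -> Tuple[List[dict], Optional[str]]:
    """Return the harvested records and the resumption token, if any."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        records.append({
            "identifier": rec.findtext("oai:header/oai:identifier",
                                       namespaces=NS),
            "relation": rec.findtext(".//dc:relation", namespaces=NS),
            "citation": rec.findtext(".//dcterms:bibliographicCitation",
                                     namespaces=NS),
        })
    token = root.findtext(".//oai:resumptionToken", namespaces=NS)
    return records, token or None


# Hand-written sample response for illustration only.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example:IE1001</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/"
                   xmlns:dcterms="http://purl.org/dc/terms/">
          <dc:relation>BV123456789</dc:relation>
          <dcterms:bibliographicCitation>rft.date=2015&amp;rft.volume=2&amp;rft.issue=3</dcterms:bibliographicCitation>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

url = list_records_url("https://rosetta.example.org/oai", "2015-09-01",
                       "journals")
records, token = parse_records(SAMPLE)
print(url)
print(records, token)
```

A strict XML parser such as this one fails on undeclared namespace prefixes, which is how an invalid response of the kind described above would typically surface.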

Data processing (simplified: without deletions)

• Rosetta OAI repository, filter by journal

• Harvest: What‘s new since …?

• Found new component? Example: BibID BV123456789, issue 3, vol. 2, year 2015

• Known journal: add to StructMap BV123456789.json

• New journal: create StructMap BV123456789.json

• get bibliographic MD from Aleph
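The known-journal / new-journal branch of this flow could be sketched as below. The component dictionary, the JSON layout and the callback are hypothetical; in particular, triggering the Aleph lookup only when a journal is first seen is an assumption of this sketch, not something the slide states.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Optional


def process_component(toc_dir: Path, comp: dict,
                      on_new_journal: Optional[Callable[[str], None]] = None
                      ) -> str:
    """Add one harvested component to its journal's StructMap file."""
    path = toc_dir / f"{comp['bib_id']}.json"
    if path.exists():
        # Known journal: extend the existing StructMap.
        toc = json.loads(path.read_text(encoding="utf-8"))
        status = "known journal"
    else:
        # New journal: create the StructMap (hypothetical layout).
        toc = {"bibId": comp["bib_id"], "years": {}}
        status = "new journal"
        if on_new_journal:
            on_new_journal(comp["bib_id"])  # e.g. fetch metadata from Aleph
    volumes = toc["years"].setdefault(comp["year"], {})
    issues = volumes.setdefault(comp["volume"], {})
    issues[comp["issue"]] = comp["ie_id"]
    path.write_text(json.dumps(toc, indent=2), encoding="utf-8")
    return status


with tempfile.TemporaryDirectory() as tmp:
    toc_dir = Path(tmp)
    fetched = []  # stands in for the bibliographic metadata lookup
    first = process_component(
        toc_dir,
        {"bib_id": "BV123456789", "year": "2015", "volume": "2",
         "issue": "2", "ie_id": "IE1001"},
        fetched.append,
    )
    second = process_component(
        toc_dir,
        {"bib_id": "BV123456789", "year": "2015", "volume": "2",
         "issue": "3", "ie_id": "IE1002"},
        fetched.append,
    )
    result = json.loads(
        (toc_dir / "BV123456789.json").read_text(encoding="utf-8")
    )

print(first, second, fetched)
print(result)
```

Because the StructMap lives in a plain file next to the repository, adding an issue never touches the IEs themselves and so creates no new versions in Rosetta.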

Following two tracks

Combining near-METS with Rosetta-Collections

1 collection equals 1 journal

Metadata on journal level

URN on journal level (PP: CM 2.2.2)

AssignCMS for journal level (metadata in Rosetta // URN, ArchiveURL in ALEPH) (Collection Support – WP, 2012)

Searching monographs and journals in parallel (IEs and collections, PP: CM 2.2.3)

Manual Legal Deposit: Issue goes to correct journal „automatically“

Easy administration of IEs in Rosetta

They are waiting:

Legal Deposit:
- in DigiTool: 450 journals, 15,000 issues
- in the backlog: 100+ journals, new titles constantly arriving

OA publications
- finalizing collection strategy for Bavarica and special subject fields

Licensed publications (E-journal backfiles):
- responsibility on national, regional and local levels
- for hosting and long-term preservation

Digitized material
- from ZEND / TSM

Thank you very much for your interest in the most fascinating format of scientific literature!
