Work Package 4: Structured data from economic and social history
Auke Rijpma
January 2016
Predecessors
• Economic and social historians often work with structured (tabular) data.
• Includes large data-projects…
• …and many small data sets.
Problems to solve
• Finding data in multiple repositories.
• Harmonisation.
• Linking datasets to answer new questions.
• Analysis of multilevel & big data sets.
• Isolated and unknown datasets.
• Reproducibility vs. disposable science.
What we propose
• Gather and curate important datasets and place them on the Clariah Structured Data Hub (CSDH).
• Use web-based linked-data technology to augment, harmonise, link, and query datasets.
• Provide tooling and incentives to upload new datasets.
• Uploading and describing your data gives you augmentation, harmonisation, and links to other micro and macro datasets.
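A minimal sketch of the linked-data idea above, assuming a toy setup: each table row becomes a set of (subject, predicate, object) triples, and because both datasets use the same predicate for a column, one pattern query spans them. The namespace `http://example.org/`, the function names, and all values are invented for illustration; this is not the hub's actual pipeline.

```python
BASE = "http://example.org/"  # hypothetical namespace, not a real hub URI


def rows_to_triples(dataset_id, rows):
    """Turn each table row into (subject, predicate, object) triples."""
    triples = []
    for index, row in enumerate(rows):
        subject = f"{BASE}{dataset_id}/row/{index}"
        for column, value in row.items():
            # The same column name maps to the same predicate in every dataset.
            triples.append((subject, f"{BASE}dimension/{column}", value))
    return triples


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Two toy datasets with invented values: one micro (persons), one macro (places).
micro = [{"municipality": "Town A", "year": 1829, "occupation": "weaver"}]
macro = [
    {"municipality": "Town A", "year": 1829, "population": 1000},
    {"municipality": "Town B", "year": 1829, "population": 2000},
]

graph = rows_to_triples("micro", micro) + rows_to_triples("macro", macro)

# One query finds rows about "Town A" in both datasets.
hits = match(graph, p=BASE + "dimension/municipality", o="Town A")
subjects = [s for s, _, _ in hits]
print(subjects)
```

In a real setting the triples would live in a triplestore and the pattern would be written as a SPARQL query; the list-based `match` only stands in for that step.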
Empower Individual Researchers
• Augment and link individual datasets according to community best practices, or against colleagues' datasets
• Share machine-interpretable code books with fellow researchers
• Align codes and identifiers across datasets
• Publish standards-compliant, reusable datasets
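As a rough illustration of a machine-interpretable code book, the sketch below maps local occupation labels to shared codes and reports the labels it could not align. The code book entries, the `occ:` codes, and the `harmonise` helper are hypothetical and do not reflect any particular coding standard.

```python
# Hypothetical code book: local (Dutch) occupation labels -> shared codes.
CODE_BOOK = {
    "boer": "occ:farmer",
    "landbouwer": "occ:farmer",
    "wever": "occ:weaver",
}


def harmonise(rows, column, code_book):
    """Add a '<column>_code' field to each row and collect unmapped labels."""
    harmonised = []
    unmapped = set()
    for row in rows:
        label = row[column]
        # Normalise case and whitespace before looking the label up.
        code = code_book.get(label.strip().lower())
        if code is None:
            unmapped.add(label)
        harmonised.append({**row, column + "_code": code})
    return harmonised, sorted(unmapped)


rows = [
    {"person": 1, "occupation": "Boer"},
    {"person": 2, "occupation": "landbouwer "},
    {"person": 3, "occupation": "smid"},
]

harmonised, unmapped = harmonise(rows, "occupation", CODE_BOOK)
codes = [r["occupation_code"] for r in harmonised]
print(codes)     # shared codes, None where no mapping exists
print(unmapped)  # labels that still need a code book entry
```

Because the code book is plain data, it can be shared with colleagues and applied to their datasets unchanged; the list of unmapped labels shows where it still needs extending.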
Grow a giant graph of interconnected datasets
Tools to explore, visualise, query, and analyse datasets.
Future CSDH
• Upload, describe, and store data.
• Augment, harmonise, and link data.
• Find, explore, query, visualise, and analyse data.
• Share data, queries, and results.
Today’s CSDH
• Prototype up and running.
• Loosely interconnected parts without a “hood”.
• QBer (intake, data description, harmonisation, linking).
• Dedicated data pipelines.
• Triplestore, data-API, queries.
• Grlc: Query-API.
• Come see our demos!
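The idea of a query API can be sketched as a set of named, stored queries that are filled in with parameters on request. The example below is a toy stand-in under that assumption, not grlc's actual implementation; the query name, the `http://example.org/` predicates, and `build_query` are all hypothetical.

```python
from string import Template

# Hypothetical stored queries, keyed by the name a client would request.
STORED_QUERIES = {
    "population_by_year": Template(
        "SELECT ?municipality ?population WHERE {\n"
        "  ?obs <http://example.org/dimension/year> $year ;\n"
        "       <http://example.org/dimension/municipality> ?municipality ;\n"
        "       <http://example.org/dimension/population> ?population .\n"
        "}"
    ),
}


def build_query(name, **params):
    """Fill a stored query template with integer parameters."""
    for key, value in params.items():
        # Accept only plain integers so no query text can be injected.
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError(f"parameter {key!r} must be an integer")
    # substitute() raises KeyError if a required parameter is missing.
    return STORED_QUERIES[name].substitute(**params)


query = build_query("population_by_year", year=1839)
print(query)
```

The resulting text would then be sent to the triplestore's query endpoint; that step is left out here because it depends on the deployment.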
[Figure: Utrecht 1829 and Utrecht 1839]
QBer
Triplestore, data-API, queries, queries-API