Open Context and Publishing to the Web of Data: Eric Kansa's LAWDI Presentation

Publishing to the “Web of Data” in Archaeology: Quality and Workflows Eric Kansa UC Berkeley / OpenContext.org Unless otherwise indicated, this work is licensed under a Creative Commons Attribution 3.0 License <http://creativecommons.org/licenses/by/3.0/>

Upload: ekansa

Post on 22-Jan-2015

DESCRIPTION

This presentation discusses how a model of “data sharing as publishing” can contribute to developing Linked Open Data resources in archaeology and the study of the ancient world. The paper gives examples from Open Context’s developing approach to data editing, documentation and quality improvement processes. The goal of these efforts is to better align the professional interests of individual researchers with the needs of the larger community to access and use high-quality data in Linked Data scenarios.

TRANSCRIPT

  • 1. Publishing to the Web of Data in Archaeology: Quality and Workflows. Eric Kansa, UC Berkeley / OpenContext.org. Unless otherwise indicated, this work is licensed under a Creative Commons Attribution 3.0 License

  • 2. Web of Data (2011). Main Contributors: Institutions (esp. government); Thematic collections / projects
  • 3. Thousand Flowers. Open access, open licensed data. Archiving by California Digital Library. Persistent Identifiers (DOIs, ARKs). Web services. NSF/NEH links for data management plans
  • 4. Thousand Flowers. Fills a Gap: Most data sources are institutional. Open Context publishes individual, small group contributions
  • 5. Thousand Flowers. Fills a Gap: Most data sources are institutional. Open Context publishes individual, small group contributions. Challenge: Diverse contributions, needing lots of work to clean up and link
  • 6. 3-year project, Oct 2010 to Sep 2013. Funded with a National Leadership Grant from the Institute for Museum and Library Services, LG-06-10-0140-10, Dissemination Information Packages for Information Reuse. Ixchel Faniel, PI & Elizabeth Yakel, Co-PI. http://www.dipir.org
  • 7. Open Context Interviewees. 22 Ph.D. or graduate students interviewed: 13 men, 9 women. Novices / Experts: 19 experts, 3 novices. Interviewees who were curators or professors also with a curatorial role = 6
  • 8. Open Context Interviewees
  • 9. Data Documentation Practices. I use an Excel spreadsheet ... which I inherited from my research advisers. ... my dissertation advisor was still recording data for each specimen on paper when I was in graduate school so that's what I started then quickly, I was like, "This is ridiculous." I just started using an Excel spreadsheet that has sort of slowly gotten bigger and bigger over time with more variables or columns ... I've added color coding ... I also use ... a very sort of primitive numerical coding system, again, that I inherited from my research advisers ... So, this little book that goes with me of codes which is sort of odd, but we all know that a 14 is a sheep. (CCU13)
  • 10. Data Documentation Practices. I use an Excel spreadsheet ... which I inherited from my research advisers. ... my dissertation advisor was still recording data for each specimen on paper when I was in graduate school so that's what I started then quickly, I was like, "This is ridiculous." I just started using an Excel spreadsheet that has sort of slowly gotten bigger and bigger over time with more variables or columns ... I've added color coding ... I also use ... a very sort of primitive numerical coding system, again, that I inherited from my research advisers ... So, this little book that goes with me of codes which is sort of odd, but we all know that a 14 is a sheep. (CCU13) A long way to go before we get Linked Data
  • 11. Sometimes data is better served cooked.
  • 12. Thousand Flowers. Clean-up and document contributed data. Map to ArchaeoML. Mint URIs to entities (potsherds, projects, contexts, people). Link to important vocabularies / collections (Pleiades, Encyclopedia of Life). Working on CLAROS-based CIDOC-CRM (RDF) representations (not straightforward)
  • 13. My Precious Data. Image Credit: Lord of the Rings (2003, New Line), All Rights Reserved Copyright
  • 14. Data sharing as publication
  • 15. Data Publishing
  • 16. Publishing: Data Quality and Standards Alignment. (1) Check consistency (2) Edit functions (3) Align to common standards (Linked Data if applicable) (4) Issue tracking, version control
  • 17. Publishing: Tools of the Trade. (1) Google Refine (check, edit, consistency) (2) Mantis (issue-tracker, coordinate edits, metadata creation)
  • 18. Publishing: Project Metadata, Column Descriptions
  • 19. Publishing: Entity Reconciliation. (1) With Google Refine (2) Implemented, EOL and Pleiades (3) Need more vocabularies! (4) Simple model, not complex ontology mapping
  • 20. CDL Archiving Service. How do DOIs, ARKs, etc. work with Web and Linked Data? Question of granularity and emphasis (archive objects)
  • 21. Summary. Outcomes of Publishing Data: (1) Communicate and set expectations about content and quality (2) Organize workflows to improve data quality and usability (3) Make datasets first class citizens in world of scholarly communications
  • 22. Final Thoughts. Publication needs to evolve! (1) Participating in Linked Data is a great goal, but far removed from most everyday practice (2) Researchers need help. (3) 19th century publication norms poorly suited to 21st century methods, research, public goals
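
The "check consistency" and "entity reconciliation" steps named on slides 16 and 19 can be illustrated with a minimal Python sketch. This is not Open Context's code or the Google Refine workflow described in the talk; it only shows the general idea under stated assumptions. The code book contains the single entry mentioned in the interview quote ("a 14 is a sheep"), the `example.org` URI is a hypothetical placeholder rather than a real Encyclopedia of Life identifier, and the function names are invented for illustration.

```python
from collections import Counter

# Hypothetical code book: only "14 = sheep" comes from the interview quote.
CODE_BOOK = {"14": "sheep"}

# Hypothetical placeholder URI; a real workflow would look identifiers up
# in a shared vocabulary such as Encyclopedia of Life or Pleiades.
VOCABULARY = {"sheep": "https://example.org/vocab/sheep"}


def check_consistency(values):
    """Return labels that appear in more than one spelling variant
    (variants differ only by case or surrounding whitespace)."""
    variants = {}
    for value in values:
        key = value.strip().lower()
        variants.setdefault(key, Counter())[value] += 1
    return {key: dict(seen) for key, seen in variants.items() if len(seen) > 1}


def reconcile(value):
    """Map a raw cell value to (normalized label, vocabulary URI or None)."""
    cleaned = value.strip()
    label = CODE_BOOK.get(cleaned, cleaned.lower())
    return label, VOCABULARY.get(label)


if __name__ == "__main__":
    column = ["14", "Sheep", "sheep ", "goat"]
    print(check_consistency(column))
    for cell in column:
        print(repr(cell), reconcile(cell))
```

Values with no match, such as "goat" here, come back with `None` instead of a URI, which corresponds to the "Need more vocabularies!" point on slide 19: unmatched terms are flagged for an editor rather than silently dropped.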