How to set up an OAI-PMH metadata harvester

Post on 25-Dec-2014


DESCRIPTION

Held in Bergen, Norway.

TRANSCRIPT

How to set up an OAI-PMH metadata harvester

(PKP Open Archives Harvester)

Magnus Enger
Library Laboratory workshop (Biblioteklaboratorie-workshop)

Bergen, 12-13 November 2007

System Requirements

● PHP >= 4.2.x (including PHP 5.x); Microsoft IIS requires PHP 5.x
● MySQL >= 3.23.23 (including MySQL 4.x/5.x) or PostgreSQL >= 7.1 (including PostgreSQL 8.x)
● Apache >= 1.3.2x or >= 2.0.4x or Microsoft IIS 6 (untested)
● Operating system: Any OS that supports the above software, including Linux, BSD, Solaris, Mac OS X, Windows

Support

With SSH (PuTTY)/Telnet

● Log on to the server
● Download the file
  $ wget http://pkp.sfu.ca/harvester2/download/harvester-2.0.1.tar.gz
● Unpack the file
  $ tar -xvf harvester-2.0.1.tar.gz
● Go into the directory
  $ cd harvester-2.0.1
● Move the contents to the desired location
  $ mv * ~/subdomener/harvester/

With FTP

● Download the file to your local machine
● Unpack the file
● Log on to the server with an FTP client
● Upload the unpacked files to the desired location on the server

Procedure

● Described in the file docs/README

Change file permissions

● Make the following files/directories writable:
  – config.inc.php (optional -- if not writable you will be prompted to manually overwrite this file during installation)
  – public
  – cache
  – cache/t_cache
  – cache/t_config
  – cache/t_compile
  – cache/_db

Create an upload directory

● Create a directory for storing uploaded files, preferably outside the server root
● Make this directory writable
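The permission steps above can be checked before running the installer. The following is a minimal sketch, not part of the Harvester distribution: the function name `not_writable` and the `extra_paths` parameter are made up for illustration, and the check runs as the current user, whereas it is the web server's user that actually needs write access.

```python
import os

# Directories listed on the slide, relative to the Harvester install directory.
REQUIRED_WRITABLE = [
    "public",
    "cache",
    "cache/t_cache",
    "cache/t_config",
    "cache/t_compile",
    "cache/_db",
]


def not_writable(install_dir, extra_paths=()):
    """Return the listed paths that are missing or not writable.

    extra_paths is a hypothetical hook for the upload directory; an
    absolute path there is used as-is by os.path.join.
    """
    problems = []
    for rel in list(REQUIRED_WRITABLE) + list(extra_paths):
        path = os.path.join(install_dir, rel)
        if not (os.path.exists(path) and os.access(path, os.W_OK)):
            problems.append(rel)
    return problems
```

An empty list means every listed path exists and is writable for the user running the check.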

Installation

● Via a web browser
  – http://yourdomain.com/path/to/harvester2/
● Or via the command line
  – php tools/install.php

Edit the file config.inc.php

; Use URL parameters instead of CGI PATH_INFO. This is useful for
; broken server setups that don't support the PATH_INFO environment
; variable.
disable_path_info = Off

Change the last line to:

disable_path_info = On

Problem

● «login»: http://harvester.collib.info/index.php/login
● Only shows the front page
● Click «HOME»: http://harvester.collib.info/index.php?page=index
● Go to the URL: http://harvester.collib.info/index.php?page=login
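The two URL styles in this workaround differ only in where the page name sits. As a rough illustration (a sketch written for this transcript, not code from the Harvester), the helper below rewrites a PATH_INFO-style URL into the query-parameter style. The name `to_query_style` is hypothetical, and only the single-segment case seen on the slide is handled; how deeper paths map to parameters is not covered here.

```python
from urllib.parse import urlsplit, urlunsplit


def to_query_style(url, script="index.php"):
    """Rewrite .../index.php/<page> into .../index.php?page=<page>."""
    parts = urlsplit(url)
    marker = "/" + script
    head, sep, tail = parts.path.partition(marker)
    if not sep or parts.query:
        # No index.php in the path, or already query-style: leave untouched.
        return url
    # Assumption: an empty path means the front page, as in the HOME link.
    page = tail.strip("/") or "index"
    if "/" in page:
        raise ValueError("only single-segment paths are covered by this sketch")
    return urlunsplit((parts.scheme, parts.netloc, head + marker, "page=" + page, ""))


login_url = to_query_style("http://harvester.collib.info/index.php/login")
```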

How to find harvestable archives?

● «Registered Data Providers» from OAI: http://www.openarchives.org/Register/BrowseSites

Example: DUO at UiO

Sets

● Document types
● Frequent occurrences of languages in the database
● Documents where online fulltext-versions are available
● Units at the university

Document types

● Master thesis
● Dissertation
● Student thesis
● Series titles
● Report
● Monography
● Article

Frequent occurrences of languages in the database

● Norwegian
● English
● French
● Norwegian Bokmål
● Norwegian Nynorsk
● Swedish
● German

Documents where online fulltext-versions are available

Units at the university

● Humanities (2437)
● Humanities\Media and Communication (358)
● Humanities\Media and Communication\Media studies (331)
● Humanities\Media and Communication\Journalism (23)
● Humanities\Archeology, Conservation and Historical Studies (459)
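Set listings like the ones above are what a repository returns for the OAI-PMH `ListSets` verb, and a harvester can then restrict `ListRecords` to one set. The sketch below only builds the request URLs, using the standard OAI-PMH arguments `verb`, `metadataPrefix`, `set` and `from`. The base URL and the set name are placeholders, not DUO's real values, and `oai_url` is a helper invented for this example.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; look up the real base URL in the OAI registry.
BASE_URL = "http://repository.example.org/oai"


def oai_url(base_url, verb, **params):
    """Build an OAI-PMH request URL; arguments set to None are left out."""
    query = {"verb": verb}
    query.update({k: v for k, v in params.items() if v is not None})
    return base_url + "?" + urlencode(query)


# Ask the repository which sets it offers.
list_sets = oai_url(BASE_URL, "ListSets")

# Harvest Dublin Core records from one (hypothetical) set.
list_records = oai_url(
    BASE_URL, "ListRecords", metadataPrefix="oai_dc", set="hypothetical_set"
)

# "from" is a Python keyword, so it has to be passed via a dict.
incremental = oai_url(
    BASE_URL, "ListRecords", metadataPrefix="oai_dc", **{"from": "2007-11-01"}
)
```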

«Update Metadata Index»

Use tools/harvest.php!

Useful «commands»

● php harvest.php
  – lists the available options
● php harvest.php list
  – lists the archives
● php harvest.php 1
  – Harvests metadata from one archive; the number corresponds to the number in the list above
● php harvest.php 1 verbose
  – Same as above, but with detailed progress messages

More useful «commands»

● php harvest.php all
  – Harvests data from all the archives
● php harvest.php all from=last
  – Harvests all new metadata since the last harvest
  – Run this regularly using cron!
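For a scheduled job it can help to wrap the command in a small script that cron calls. The following is a sketch under assumptions: the helper names are invented, and only the argument combinations shown on the slides (`all`, `all from=last`, `1`, `1 verbose`) are known to be valid. Other combinations that this function can produce are untested guesses.

```python
import subprocess


def harvest_argv(target="all", incremental=True, verbose=False):
    """Build the argument list for harvest.php as used on the slides.

    The defaults give "php harvest.php all from=last". Mixing from=last
    with a single archive number or with verbose is an assumption.
    """
    argv = ["php", "harvest.php", str(target)]
    if incremental:
        argv.append("from=last")
    if verbose:
        argv.append("verbose")
    return argv


def run_harvest(tools_dir, **options):
    """Run the harvester from its tools directory and return the exit code."""
    return subprocess.run(harvest_argv(**options), cwd=tools_dir).returncode
```

A cron job would then call a script that invokes `run_harvest` with the path to the Harvester's tools directory.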

Hacking the database

● archive_settings
● archives
● captchas
● crosswalk_fields
● crosswalks
● email_templates
● email_templates_data
● entries
● entry_attributes
● plugin_settings
● raw_fields
● records
● rt_contexts
● rt_searches
● rt_versions
● schema_plugins
● search_keyword_list
● search_object_keywords
● search_objects
● sessions
● site_settings
● versions

Table: records

mysql> describe records;
+------------------+--------------+------+-----+---------+----------------+
| Field            | Type         | Null | Key | Default | Extra          |
+------------------+--------------+------+-----+---------+----------------+
| record_id        | int(11)      |      | PRI | NULL    | auto_increment |
| archive_id       | int(11)      |      |     | 0       |                |
| schema_plugin_id | int(11)      |      |     | 0       |                |
| identifier       | varchar(255) | YES  |     | NULL    |                |
| datestamp        | datetime     | YES  |     | NULL    |                |
+------------------+--------------+------+-----+---------+----------------+

Table: entries

mysql> describe entries;
+--------------+---------+------+-----+---------+----------------+
| Field        | Type    | Null | Key | Default | Extra          |
+--------------+---------+------+-----+---------+----------------+
| entry_id     | int(11) |      | PRI | NULL    | auto_increment |
| record_id    | int(11) |      | MUL | 0       |                |
| raw_field_id | int(11) |      | MUL | 0       |                |
| value        | text    | YES  |     | NULL    |                |
+--------------+---------+------+-----+---------+----------------+

Table: raw_fields

mysql> describe raw_fields;
+------------------+-------------+------+-----+---------+----------------+
| Field            | Type        | Null | Key | Default | Extra          |
+------------------+-------------+------+-----+---------+----------------+
| raw_field_id     | int(11)     |      | PRI | NULL    | auto_increment |
| name             | varchar(60) |      | MUL |         |                |
| schema_plugin_id | int(11)     |      |     | 0       |                |
+------------------+-------------+------+-----+---------+----------------+

The structure of a «record»

● Record
  – Entry a
    ● raw_field_id = x
      – raw_fields name = 'Title'
    ● value = 'Kasus før og nå'
  – Entry b
    ● raw_field_id = y
      – raw_fields name = 'Author'
    ● value = 'Hansen, Hans'
  – Entry c
    ● raw_field_id = z
      – raw_fields name = 'Publisher'
    ● value = 'Universitetet i Bodø'
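The structure above amounts to a join: each entry points to its record and to a row in raw_fields that names the field. The sketch below reproduces that with an in-memory SQLite database so it can be run anywhere. The table and column names come from the slides, but the schema is cut down to the columns used here, SQLite stands in for MySQL, and the record identifier and the helper `load_record` are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE records (record_id INTEGER PRIMARY KEY, archive_id INTEGER, identifier TEXT);
CREATE TABLE raw_fields (raw_field_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE entries (entry_id INTEGER PRIMARY KEY, record_id INTEGER,
                      raw_field_id INTEGER, value TEXT);
-- Sample data taken from the slide; the identifier is hypothetical.
INSERT INTO records VALUES (1, 1, 'oai:example.org:1');
INSERT INTO raw_fields VALUES (1, 'Title'), (2, 'Author'), (3, 'Publisher');
INSERT INTO entries (record_id, raw_field_id, value) VALUES
  (1, 1, 'Kasus før og nå'),
  (1, 2, 'Hansen, Hans'),
  (1, 3, 'Universitetet i Bodø');
""")

# Resolve each entry's raw_field_id to a field name for one record.
QUERY = """
SELECT f.name, e.value
FROM entries AS e
JOIN raw_fields AS f ON f.raw_field_id = e.raw_field_id
WHERE e.record_id = ?
ORDER BY e.entry_id
"""


def load_record(conn, record_id):
    """Return the record's entries as (field name, value) pairs."""
    return conn.execute(QUERY, (record_id,)).fetchall()


fields = load_record(conn, 1)
```

The same SELECT should carry over to the MySQL tables described earlier, since it only uses the columns shown in the describe output.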

Hacking the system

● Interesting directories
  – classes
  – locale – languages
  – pages
  – plugins
  – styles – CSS
  – templates – uses Smarty

Learn more

● README
  – http://pkp.sfu.ca/harvester2/README
● Administrator's Guide (17 pp.)
  – http://pkp.sfu.ca/harvester2/AdminGuide.pdf
● Technical Reference (50 pp.)
  – http://pkp.sfu.ca/harvester2/TechnicalReference.pdf

Questions

● Use the BibLab wiki (Allmenningen)!
● or
● magnus@enger.priv.no
