volltextsuche mit elasticsearch im geodaten- münchen hochschule für angewandte wissenschaften...

Download Volltextsuche mit Elasticsearch im Geodaten- München Hochschule für angewandte Wissenschaften Fakultät

Post on 17-Aug-2019

212 views

Category:

Documents

0 download

Embed Size (px)

TRANSCRIPT

  • Hochschule München Hochschule für angewandte Wissenschaften Fakultät für Geoinformation

    Bachelorarbeit zur Erlangung des akademischen Grades Bachelor of Engineering (B.Eng.) im

    Studiengang Geoinformatik und Satellitenpositionierung

    Volltextsuche mit Elasticsearch im Geodaten-Umfeld

    vorgelegt von

    Mosaab Asli Matrikelnummer: 61825514

    eingereicht am 09. April 2018

    Betreuer: Prof. Dr. Gerhard Joos

    Dipl.-Ing. Florian Mückl

  • Inhaltsverzeichnis Listingverzeichnis...............................................................................................................................III Abbildungsverzeichnis.......................................................................................................................IV Tabellenverzeichnis............................................................................................................................IV Abkürzungsverzeichnis........................................................................................................................V Vorwort...............................................................................................................................................VI Abstrakt.............................................................................................................................................VII 1 Einleitung...........................................................................................................................................1

    1.1 Problemstellung............................................................................................................................... 1 1.2 Zielsetzung...................................................................................................................................... 1 1.3 Aufbau der Arbeit............................................................................................................................ 2

    2 Entwicklung der Datenbanksysteme.................................................................................................3 3 Elasticsearch......................................................................................................................................4

    3.1 Hauptmerkmale von Elasticsearch................................................................................................... 4 3.2 Grundlegendes Konzept.................................................................................................................. 5

    3.2.1 Logisches Layout...........................................................................................................................5 3.2.2 Physisches Layout.........................................................................................................................5

    3.3 Indexstruktur.................................................................................................................................... 6 3.3.1 Index Settings................................................................................................................................6 3.3.2 Mapping Type................................................................................................................................8 3.3.3 Aliases...........................................................................................................................................8

    3.4 Arbeitsmechanismus........................................................................................................................ 9 3.5 Suche............................................................................................................................................. 11

    3.5.1 Match_All-Abfrage......................................................................................................................12 3.5.2 Volltextsuche................................................................................................................................13 3.5.3 Term-Level-Abfrage....................................................................................................................15

    3.6 Highlighting................................................................................................................................... 16 3.7 Aggregation................................................................................................................................... 16 3.8 Kommunikation mit Elasticsearch................................................................................................. 18

    3.8.1 HTTP-Kommunikation................................................................................................................18 3.8.2 Native Kommunikation................................................................................................................19 3.8.3 cURL...........................................................................................................................................20 3.8.4 Kibana Dev-Tools........................................................................................................................20

    4 Elasticsearch im Vergleich...............................................................................................................21 4.1 Vergleich mit ausgewählten DBMS............................................................................................... 21 4.2 Leistungsvergleich mit PostgreSQL.............................................................................................. 22

    5 Programmierung und Implementierung...........................................................................................24 5.1 Getrennte Datenhaltung................................................................................................................. 24 5.2 Docker........................................................................................................................................... 25 5.3 Vorgehensweise............................................................................................................................. 26

    5.3.1 View erstellen und anpassen........................................................................................................26 5.3.2 Von PostgreSQL zu Elasticsearch................................................................................................30 5.3.3 Java-Applikation generalisieren...................................................................................................31 5.3.4 Mapping und Settings festlegen...................................................................................................31 5.3.5 Daten-Import...............................................................................................................................34 5.3.6 Bash-Skripte................................................................................................................................35

    6 REST-Fassade..................................................................................................................................36 6.1 Vorbereiten der Suchabfrage.......................................................................................................... 37 6.2 ElasticAdress-API......................................................................................................................... 39 6.3 REST-Endpoint.............................................................................................................................. 44 6.4 Weitere Bestandteile der Softwareentwicklung.............................................................................44

    I

  • 6.4.1 Logging.......................................................................................................................................44 6.4.2 Unit-Tests....................................................................................................................................45 6.4.3 Upgrade auf neuere Version.........................................................................................................46

    7 Fazit und Ausblick...........................................................................................................................47 Literaturverzeichnis............................................................................................................................48 Anhänge..............................................................................................................................................52

    II

  • Listingverzeichnis Listing 1: Elasticsearch Index Settings.................................................................................................................................6 Listing 2: Tokenizer..............................................................................................................................................................7 Listing 3: HTML-Strip-Char-Filter......................................................................................................................................7 Listing 4: Index Aliases........................................................................................................................................................9 Listing 5: Einstellung von Shards und Replicas...................................................................................................................9 Listing 6: Match_All-Abfrage............................................................................................................................................12 Listing 7: Termbasierte Query........................................................