Use Cases for Elasticsearch (Anwendungsfälle für Elasticsearch)

Post on 11-Aug-2014

Category:

Data & Analytics

DESCRIPTION

German slides for different use cases for Elasticsearch: Document Store, full text search, flexible query cache, geospatial search, logfile analytics, analytics.

TRANSCRIPT

  • Use Cases for Elasticsearch. Florian Hopf, @fhopf, http://www.florian-hopf.de, 15.07.2014
  • Agenda
  • Preparation
  • Installation
    # download archive
    wget https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-1.2.1.zip
    # zip is for Windows and Linux
    unzip elasticsearch-1.2.1.zip
    # on Windows: elasticsearch.bat
    elasticsearch-1.2.1/bin/elasticsearch
  • Access
    curl -XGET http://localhost:9200
    {
      "status" : 200,
      "name" : "Hawkeye",
      "version" : {
        "number" : "1.2.1",
        "build_hash" : "6c95b759f9e7ef0f8e17f77d850da43ce8a4b364",
        "build_timestamp" : "2014-06-03T15:02:52Z",
        "build_snapshot" : false,
        "lucene_version" : "4.8"
      },
      "tagline" : "You Know, for Search"
    }
  • Document Store
  • Document
    {
      "title" : "Anwendungsfälle für Elasticsearch",
      "speaker" : "Florian Hopf",
      "date" : "2014-07-15T16:30:00.000Z",
      "tags" : ["Java", "Lucene"],
      "conference" : { "name" : "Developer Week", "city" : "Nürnberg" }
    }
  • Storing
    curl -XPOST http://localhost:9200/conferences/talk/ --data-binary @talk-example.json
    { "_index":"conferences", "_type":"talk", "_id":"GqjY7l8sTxa3jLaFx67_aw", "_version":1, "created":true }
    In the URL, conferences is the index and talk is the type.
  • Reading
    curl -XGET http://localhost:9200/conferences/talk/GqjY7l8sTxa3jLaFx67_aw?pretty=true
    {
      "_index" : "conferences",
      [...]
      "_source":{
        "title" : "Anwendungsfälle für Elasticsearch",
        "speaker" : "Florian Hopf",
        "date" : "2014-07-15T16:30:00.000Z",
        "tags" : ["Java", "Lucene"],
        "conference" : { "name" : "Developer Week", "city" : "Nürnberg" }
      }
    }
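
The store and read calls on the two slides above can also be issued from a script. The following is a minimal Python sketch, assuming a local node at http://localhost:9200 as on the slides. The helper names index_request and get_request are hypothetical, and the requests are only constructed here, not sent.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # assumption: local node as on the slides


def index_request(index, doc_type, document):
    """Build a POST that stores `document`; Elasticsearch generates the ID."""
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/{index}/{doc_type}/",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def get_request(index, doc_type, doc_id):
    """Build a GET that reads a single document back by its ID."""
    return urllib.request.Request(
        f"{BASE_URL}/{index}/{doc_type}/{doc_id}?pretty=true", method="GET"
    )


talk = {"title": "Anwendungsfälle für Elasticsearch", "speaker": "Florian Hopf"}
store = index_request("conferences", "talk", talk)
read = get_request("conferences", "talk", "GqjY7l8sTxa3jLaFx67_aw")

# With a running node, the requests could be sent like this:
# print(urllib.request.urlopen(store).read().decode("utf-8"))
```

The ID in the read request is the one returned on the slide; a real run would return a different generated ID.
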
  • Sharding
    Splitting an index into several parts
    Default: 5 shards per Elasticsearch index
    Multiple Elasticsearch instances can form a cluster
    Shards are distributed automatically across the nodes in the cluster
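
How a document ends up on one of the shards can be sketched in a few lines of Python. The sketch assumes the commonly documented routing rule, hash of the routing value (by default the document ID) modulo the number of primary shards. CRC32 is only a stand-in for Elasticsearch's internal hash function, and the document IDs are hypothetical.

```python
import zlib


def shard_for(doc_id, number_of_shards=5):
    """Pick a shard as hash(routing value) modulo the number of primary shards."""
    # CRC32 is a stand-in, chosen because it is deterministic across runs;
    # Elasticsearch uses its own hash function internally.
    return zlib.crc32(doc_id.encode("utf-8")) % number_of_shards


doc_ids = [f"talk-{n}" for n in range(1000)]  # hypothetical document IDs
counts = [0] * 5
for doc_id in doc_ids:
    counts[shard_for(doc_id)] += 1

print(counts)  # number of documents per shard
```

Because the shard number depends on the shard count, a rule like this also suggests why the number of primary shards is fixed when the index is created.
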
  • Recap
    Simple storage of JSON documents
    Index and type
    Sharding for large data volumes
    Distribution is a first-class citizen
  • Users
    HipChat: http://highscalability.com/blog/2014/1/6/how-hipchat-stores-and-indexes-billions-of-messages-using-el.html
    Engagor: http://www.jurriaanpersyn.com/archives/2013/11/18/introduction-to-elasticsearch/ and http://www.elasticsearch.org/case-study/engagor/
  • Full-Text Search
  • Search via URL parameter
    curl -XGET "http://localhost:9200/conferences/talk/_search?q=elasticsearch&pretty=true"
    {
      "took" : 73, [...]
      "hits" : { [...]
        "hits" : [ { [...]
          "_score" : 0.076713204,
          "_source":{
            "title" : "Anwendungsfälle für Elasticsearch",
            "tags" : ["Java", "Lucene"], [...]
          }
        } ]
      }
    }
  • Query DSL
    curl -XPOST "http://localhost:9200/conferences/_search" -d'
    {
      "query": {
        "match": {
          "title" : { "query": "elasticsaerch", "fuzziness": 2 }
        }
      },
      "filter": {
        "term": { "conference.city": "nürnberg" }
      }
    }'
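
The misspelled query term "elasticsaerch" still matches because fuzziness 2 tolerates up to two edits. The following Python sketch illustrates the idea with plain Levenshtein distance. That choice is an assumption for illustration: the engine's own matching may, for example, count a swap of adjacent letters as a single edit. The function names are hypothetical.

```python
def levenshtein(a, b):
    """Number of single-character insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


def fuzzy_match(query_term, indexed_term, fuzziness=2):
    """True if the indexed term is within `fuzziness` edits of the query term."""
    return levenshtein(query_term, indexed_term) <= fuzziness


print(levenshtein("elasticsaerch", "elasticsearch"))
print(fuzzy_match("elasticsaerch", "elasticsearch"))
```
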
  • Language
    curl -XGET "http://localhost:9200/conferences/talk/_search?q=title:anwendungsfall&pretty=true"
    {
      "took" : 2,
      "timed_out" : false,
      "_shards" : { "total" : 5, "successful" : 5, "failed" : 0 },
      "hits" : { "total" : 0, "max_score" : null, "hits" : [ ] }
    }
  • Analyzing
    Input titles: (1) "Anwendungsfälle für Elasticsearch", (2) "Verteiltes Suchen mit Elasticsearch"
    Steps: 1. Tokenization 2. Lowercasing 3. Stemming
    Resulting index (term: document IDs): anwendungsfall: 1; elasticsearch: 1, 2; fur: 1; mit: 2; such: 2; verteilt: 2
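
The three analysis steps can be imitated in Python to see how titles turn into index terms. This is a toy sketch: the umlaut folding and the three-suffix "stemmer" are assumptions chosen to cover the two example titles and are far simpler than a real German analyzer.

```python
import re
from collections import defaultdict

UMLAUTS = str.maketrans({"ä": "a", "ö": "o", "ü": "u", "ß": "ss"})
SUFFIXES = ("es", "en", "e")  # toy suffix list, not a real German stemmer


def analyze(text):
    """Tokenize, lowercase and crudely stem a text into index terms."""
    terms = []
    for token in re.findall(r"\w+", text):        # 1. tokenization
        term = token.lower().translate(UMLAUTS)   # 2. lowercasing (plus folding)
        for suffix in SUFFIXES:                   # 3. stemming
            if term.endswith(suffix) and len(term) - len(suffix) >= 3:
                term = term[: -len(suffix)]
                break
        terms.append(term)
    return terms


def build_index(documents):
    """Map every term to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in analyze(text):
            index[term].add(doc_id)
    return index


index = build_index({
    1: "Anwendungsfälle für Elasticsearch",
    2: "Verteiltes Suchen mit Elasticsearch",
})
for term in sorted(index):
    print(term, sorted(index[term]))
```

A search term has to pass through the same analysis before the lookup. If the field is indexed without stemming, the index holds the plural form and a query for the singular finds nothing.
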
  • Mapping
    curl -XDELETE "http://localhost:9200/conferences/"
    curl -XPUT "http://localhost:9200/conferences/"
    curl -XPUT "http://localhost:9200/conferences/talk/_mapping" -d'
    {
      "properties": {
        "tags": { "type": "string", "index": "not_analyzed" },
        "title": { "type": "string", "analyzer": "german" }
      }
    }'
  • Language
    curl -XGET "http://localhost:9200/conferences/talk/_search?q=title:anwendungsfall&pretty=true"
    {
      "took" : 2,
      "timed_out" : false,
      "_shards" : { "total" : 5, "successful" : 5, "failed" : 0 },
      "hits" : { "total" : 1, [...] }
    }
  • What else?
    Faceting/aggregations
    Suggestions
    Highlighting
    Sorting
    Pagination
    ...
  • Recap
    Expressive searches via the Query DSL
    Analysis as core functionality
    All Lucene goodies available
  • Users
    GitHub: http://exploringelasticsearch.com/github_interview.html and http://www.elasticsearch.org/case-study/github/
    StackOverflow: http://meta.stackexchange.com/questions/160100/a-new-search-engine-for-stack-exchange and http://nickcraver.com/blog/2013/11/22/what-it-takes-to-run-stack-overflow/
    SoundCloud: http://developers.soundcloud.com/blog/architecture-behind-our-new-search-and-explore-experience and http://www.elasticsearch.org/case-study/soundcloud/
    XING: http://www.elasticsearch.org/case-study/xing/
  • Flexible Cache
  • Setup (diagram): application, DB, search
  • Only search?
  • Queries (diagram): application, DB
  • Listing
    curl -XPOST "http://localhost:9200/conferences/_search" -d'
    {
      "filter": {
        "term": { "conference.city": "nürnberg" }
      }
    }'
  • Geo Search
  • Structured search
    Not just full text
    Structured data: geo and numeric data, date values
    geo_point as a data type
    Sorting
    Filtering
  • Applications
    Show the nearest branch
    Branch search (sorting)
    Classified ads (sorting)
    Locations (filtering by proximity)
    Social media analyses
  • Document
    {
      "title" : "Anwendungsfälle für Elasticsearch",
      "speaker" : "Florian Hopf",
      "date" : "2014-07-15T16:30:00.000Z",
      "tags" : ["Java", "Lucene"],
      "conference" : {
        "name" : "Developer Week",
        "city" : "Nürnberg",
        "coordinates": { "lon": "11.115358", "lat": "49.417175" }
      }
    }
  • Mapping
    curl -XPUT "http://localhost:9200/conferences/talk/_mapping" -d'
    {
      "properties": {
        [...],
        "conference": {
          "type": "object",
          "properties": {
            "coordinates": { "type": "geo_point" }
          }
        }
      }
    }'
  • Sorting
    curl -XPOST "http://localhost:9200/conferences/_search" -d'
    {
      "sort" : [ {
        "_geo_distance" : {
          "conference.coordinates" : { "lon": 8.403697, "lat": 49.006616 },
          "order" : "asc",
          "unit" : "km"
        }
      } ]
    }'
  • Filtering
    curl -XPOST "http://localhost:9200/conferences/_search" -d'
    {
      "filter": {
        "geo_distance": {
          "conference.coordinates": { "lon": 8.403697, "lat": 49.006616 },
          "distance": "200km",
          "distance_type": "arc"
        }
      }
    }'
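
What the distance sort and the distance filter compute can be approximated client-side. The Python sketch below uses the haversine great-circle formula with a mean Earth radius of 6371 km, which is assumed here to be close to what distance_type "arc" calculates. The second talk and the helper names are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumption for this sketch)


def distance_km(a, b):
    """Great-circle distance between two {"lat", "lon"} points (haversine)."""
    lat1, lon1, lat2, lon2 = map(radians, (a["lat"], a["lon"], b["lat"], b["lon"]))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def sort_by_distance(origin, talks):
    """Equivalent of the _geo_distance sort with order asc."""
    return sorted(talks, key=lambda talk: distance_km(origin, talk["coordinates"]))


def within(origin, talks, max_km):
    """Equivalent of the geo_distance filter."""
    return [talk for talk in talks if distance_km(origin, talk["coordinates"]) <= max_km]


origin = {"lon": 8.403697, "lat": 49.006616}  # query point from the slides
talks = [
    {"title": "Anwendungsfälle für Elasticsearch",
     "coordinates": {"lon": 11.115358, "lat": 49.417175}},
    {"title": "Hypothetical talk at the query point",
     "coordinates": dict(origin)},
]

for talk in sort_by_distance(origin, talks):
    print(talk["title"], round(distance_km(origin, talk["coordinates"]), 1), "km")
print([talk["title"] for talk in within(origin, talks, 200)])
```

Running the sketch shows how far the example document lies from the query point, which helps when picking a sensible value for "distance".
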