Elasticsearch for Pharo Smalltalk

Elasticsearch for Pharo Smalltalk: full-text search in Smalltalk. Sho Yoshida / @newapplesho, SORABITO Inc. 2016/01/29, 84th Smalltalk Study Meeting

Upload: sho-yoshida

Post on 13-Apr-2017


Page 1: Elasticsearch for Pharo Smalltalk

Elasticsearch for Pharo Smalltalk

Full-text search in Smalltalk

Sho Yoshida / @newapplesho, SORABITO Inc.

2016/01/29, 84th Smalltalk Study Meeting

Page 2: Elasticsearch for Pharo Smalltalk

About

• Sho Yoshida

• I work at SORABITO Inc.

• We build ALLSTOCKER (https://allstocker.com), an international online marketplace for working machinery

Page 3: Elasticsearch for Pharo Smalltalk

Thanks to our users, the site now receives visits from more than 160 countries

This is the kind of equipment we handle

Page 4: Elasticsearch for Pharo Smalltalk

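Looking back at the 75th Smalltalk Study Meeting

• At the 75th Smalltalk Study Meeting, I listed "full-text search" as one of the things I wanted to do next

• http://www.smalltalk-users.jp/Home/gao-zhi/dai75kaismalltalkbenkyoukai

• We use RDS PostgreSQL, so Japanese full-text search is not available to us

• We need to support full-text search in both Japanese and English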

Page 5: Elasticsearch for Pharo Smalltalk

https://www.elastic.co/products/elasticsearch

Page 6: Elasticsearch for Pharo Smalltalk

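What is Elasticsearch?

• A full-text search engine and analytics server based on Apache Lucene

• Schemaless and document-oriented

• Operated through a REST API

• Designed with clustering in mind, so basic setup is (probably) easy

• Licensed under the Apache License v2

• Implemented in Java

• Supports facets and highlighted search results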

Page 7: Elasticsearch for Pharo Smalltalk

Who uses it

• GitHub

• foursquare

• SoundCloud

• stackoverflow

• ALLSTOCKER

Page 8: Elasticsearch for Pharo Smalltalk

Trying it out on GitHub

https://github.com

Page 9: Elasticsearch for Pharo Smalltalk

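Full-text search and Elasticsearch

Elasticsearch uses an inverted index: a dictionary that maps each term to the IDs of the documents containing it. A search looks terms up in that dictionary.

For details, see:

https://speakerdeck.com/johtani/introduction-elasticsearch-and-elk-elasticsearchmian-qiang-hui-in-nagoya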
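
As a rough illustration of the inverted index idea (not how Elasticsearch or Lucene actually implement it), the following Pharo sketch maps each term to the set of document IDs that contain it. The sample sentences, the whitespace-only tokenizer and all variable names are assumptions made for this example.

```smalltalk
"Toy inverted index: term -> Set of document IDs. Illustration only."
| docs index hits |
docs := Dictionary new.
docs at: 1 put: 'Red tuna is lean'.
docs at: 2 put: 'Fatty tuna melts in your mouth'.
docs at: 3 put: 'Aji is pink and shiny'.

index := Dictionary new.
docs keysAndValuesDo: [ :id :text |
	"Naive analysis step: lowercase, then split on spaces."
	(text asLowercase substrings: ' ') do: [ :term |
		(index at: term ifAbsentPut: [ Set new ]) add: id ] ].

"A search is a dictionary lookup rather than a scan over every document."
hits := (index at: 'tuna' ifAbsent: [ Set new ]) asSortedCollection asArray.
hits inspect.
```

Here the lookup for 'tuna' answers document IDs 1 and 2. A real analyzer such as Kuromoji replaces the naive split step, which is why the tokenizer matters so much for Japanese text.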

Page 10: Elasticsearch for Pharo Smalltalk

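Data structure

RDB = Database -> Tables -> Rows -> Columns

Elasticsearch = Index -> Types -> Documents -> Fields

The two are fundamentally different, so comparing them directly is a little odd, but the mapping helps as a first approximation.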

Page 11: Elasticsearch for Pharo Smalltalk

Elasticsearch for Pharo Smalltalk

• A fork of the Elasticsearch project created by Paul DeBruicker

• http://ss3.gemtalksystems.com/ss/Elasticsearch.html

• Authors: @umejava, @newapplesho

• The latest repository is on GitHub

• https://github.com/newapplesho/elasticsearch-smalltalk

• Supports Elasticsearch 1.x; 2.x is not yet supported

Page 12: Elasticsearch for Pharo Smalltalk

Elasticsearch for Pharo Smalltalk

• Fixed the original design, which was hard to extend

• Reworked the Aggregation and Query classes from scratch

Page 13: Elasticsearch for Pharo Smalltalk

Installing Elasticsearch

1. wget https://download.elasticsearch.org/….

2. tar -xf elasticsearch-1.7.2.tar.gz

3. ./elasticsearch-1.7.2/bin/elasticsearch

Page 14: Elasticsearch for Pharo Smalltalk

Installing Kuromoji

Kuromoji is a morphological analysis engine for Japanese.

bin/plugin install elasticsearch/elasticsearch-analysis-kuromoji/2.5.0

Page 15: Elasticsearch for Pharo Smalltalk

Installing the Elasticsearch-inquisitor plugin

Elasticsearch-inquisitor is a plugin for running queries from a GUI.

bin/plugin -install polyfractal/elasticsearch-inquisitor

http://localhost:9200/_plugin/inquisitor/#/

Page 16: Elasticsearch for Pharo Smalltalk

Checking that the node is running

    $ curl localhost:9200

    {
      "status" : 200,
      "name" : "Lilandra Neramani",
      "cluster_name" : "elasticsearch",
      "version" : {
        "number" : "1.7.2",
        "build_hash" : "e43676b1385b8125d647f593f7202acbd816e8ec",
        "build_timestamp" : "2015-09-14T09:49:53Z",
        "build_snapshot" : false,
        "lucene_version" : "4.10.4"
      },
      "tagline" : "You Know, for Search"
    }

Page 17: Elasticsearch for Pharo Smalltalk

Installing Elasticsearch for Pharo Smalltalk

Load it with Gofer:

    Gofer new
        url: 'http://ss3.gemtalksystems.com/ss/Elasticsearch';
        package: 'ConfigurationOfElasticsearch';
        load.
    (Smalltalk at: #ConfigurationOfElasticsearch) load.

or with Metacello:

    Metacello new
        baseline: 'Elasticsearch';
        repository: 'github://newapplesho/elasticsearch-smalltalk:v1.1.3/pharo-repository';
        load.

Page 18: Elasticsearch for Pharo Smalltalk

Creating the index and configuring the mapping

curl -XPUT 'localhost:9200/st_study' -d @sushi.json
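
The slides do not show the contents of sushi.json. The sketch below is one hypothetical version, written as a Pharo string so it can be checked in a Playground. It assumes a custom analyzer named analyzer (the name used in the _analyze call on the next slide) built on the Kuromoji plugin's kuromoji_tokenizer, a document type named store, and the three fields from the Sample Data slide. STONJSON is assumed to be present in the image, and the real file may well differ.

```smalltalk
"Hypothetical contents for sushi.json; the real file is not shown in the slides."
| json parsed |
json := '{
  "settings": {
    "analysis": {
      "analyzer": {
        "analyzer": { "type": "custom", "tokenizer": "kuromoji_tokenizer" }
      }
    }
  },
  "mappings": {
    "store": {
      "properties": {
        "title":       { "type": "string", "store": "yes", "index": "not_analyzed" },
        "description": { "type": "string", "store": "yes", "index": "analyzed", "analyzer": "analyzer" },
        "price":       { "type": "integer", "store": "yes" }
      }
    }
  }
}'.

"Parse it back to confirm the text is well-formed JSON."
parsed := STONJSON fromString: json.
parsed inspect.
```

Saved as sushi.json, text like this could be sent with the curl command above.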

Page 19: Elasticsearch for Pharo Smalltalk

Verifying that Kuromoji works

curl -XPOST 'http://localhost:9200/st_study/_analyze?analyzer=analyzer&pretty=true' -d '油圧ショベルは建設機械'

Page 20: Elasticsearch for Pharo Smalltalk

Result of the Kuromoji check

    {
      "tokens" : [
        { "token" : "油圧", "start_offset" : 0, "end_offset" : 2, "type" : "word", "position" : 1 },
        { "token" : "ショベル", "start_offset" : 2, "end_offset" : 6, "type" : "word", "position" : 2 },
        { "token" : "は", "start_offset" : 6, "end_offset" : 7, "type" : "word", "position" : 3 },
        { "token" : "建設", "start_offset" : 7, "end_offset" : 9, "type" : "word", "position" : 4 },
        { "token" : "機械", "start_offset" : 9, "end_offset" : 11, "type" : "word", "position" : 5 }
      ]
    }

Page 21: Elasticsearch for Pharo Smalltalk

Sample Data

    "properties": {
      "title":       { "type": "string", "store": "yes", "index": "not_analyzed" },
      "description": { "type": "string", "store": "yes", "index": "analyzed" },
      "price":       { "type": "integer", "store": "yes" }
    }

Page 22: Elasticsearch for Pharo Smalltalk

Sample (Seaside Example Sushi Store)

#('Akami Maguro' 'Red Tuna' 'The lean meat near the spine of the tuna fish. It comes in various shades of red--with the lighter, shinier varieties being the best. For dieters, however, the redder the better. Easy on the palatte. The least expensive of the three types of maguro.' 150)

Page 23: Elasticsearch for Pharo Smalltalk

Adding a document

    "Assumes index was obtained earlier, e.g. ESIndex indexNamed: 'st_study'"
    neta := JsonObject new.
    neta
        title: 'Aji';
        description: 'This fish is pink-grey and shiny. When it''s fresh, the flesh is almost transparent. The texture is slippery and easy on the tongue--it should melt in your mouth. Aji is often eaten with soy sauce containing onion, ginger and garlic.'.

    esDocument := ESDocument new
        type: 'store';
        content: neta;
        yourself.
    index addDocument: esDocument.

Page 24: Elasticsearch for Pharo Smalltalk

Deleting a document

    "The id is the one Elasticsearch assigned when the document was added"
    esDocument := ESDocument new
        id: 'AVKMOVs3-FeOW1ziNoOb';
        type: 'store';
        content: neta;
        yourself.
    esDocument deleteFromIndex: index.

Page 25: Elasticsearch for Pharo Smalltalk

Deleting the index

index delete.

Page 26: Elasticsearch for Pharo Smalltalk

Searching all documents

    "Match All"
    index := ESIndex indexNamed: 'st_study'.
    search := ESSearch new index: index.
    query := ESMatchAllQuery new.
    search query: query.
    results := search search.
    results explore.

Page 27: Elasticsearch for Pharo Smalltalk

Searching all documents (with paging)

    "Match All, returning 2 hits starting at offset 0"
    index := ESIndex indexNamed: 'st_study'.
    search := ESSearch new index: index.
    query := ESMatchAllQuery new.
    search query: query.
    results := search searchFrom: 0 size: 2.
    results explore.

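The paging call on this slide takes an offset and a page size. A small sketch of the arithmetic for turning a 1-based page number into that offset follows; the names pageSize, offsetFor and offsets are hypothetical and not part of the library.

```smalltalk
| pageSize offsetFor offsets |
pageSize := 2.
"Offset of the first hit on a 1-based page number."
offsetFor := [ :page | (page - 1) * pageSize ].
offsets := (1 to: 3) collect: [ :page | offsetFor value: page ].
"With the search object from the slide, page 3 would then be requested as
 search searchFrom: (offsetFor value: 3) size: pageSize."
offsets inspect.
```

For a page size of 2, pages 1 to 3 start at offsets 0, 2 and 4.
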
Page 28: Elasticsearch for Pharo Smalltalk

Match Query

    "Match"
    index := ESIndex indexNamed: 'st_study'.
    search := ESSearch new index: index.
    query := ESMatchQuery new.
    query query: 'aji'.
    search query: query.
    results := search search.
    results explore.

Page 29: Elasticsearch for Pharo Smalltalk

Term Query

    "ESTermQuery"
    index := ESIndex indexNamed: 'st_study'.
    search := ESSearch new index: index.
    query := ESTermQuery new
        field: 'title';
        query: 'Aji';
        yourself.
    search query: query.
    results := search search.
    results explore.

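A term query looks up the exact term in the inverted index without analyzing it first. Because title is mapped as not_analyzed on the Sample Data slide, the value must match the stored title exactly, including case. The slides do not show what the library sends over the wire, so the sketch below only builds the standard Query DSL body for an equivalent term query. It assumes STONJSON is available in the image, and the variable names are illustrative.

```smalltalk
| body json |
"Standard Query DSL shape for a term query on the title field."
body := Dictionary new.
body
	at: 'query'
	put: (Dictionary new
		at: 'term' put: (Dictionary new at: 'title' put: 'Aji'; yourself);
		yourself).

"Serialize to JSON text."
json := STONJSON toString: body.
json inspect.
```

The resulting text could be POSTed to the st_study/_search endpoint, for example with curl -d as on the earlier slides.
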
Page 30: Elasticsearch for Pharo Smalltalk

Sorting

    "sort"
    index := ESIndex indexNamed: 'st_study'.
    search := ESSearch new index: index.
    query := ESTermQuery new
        field: 'title';
        query: 'Aji';
        yourself.
    sort := ESSortCriteria new
        fieldName: 'title';
        sortDescending;
        yourself.
    search query: query.
    search addSortCriteria: sort.
    results := search search.
    results explore.

Page 31: Elasticsearch for Pharo Smalltalk

Aggregations

    "min Aggregations"
    index := ESIndex indexNamed: 'st_study'.
    search := ESSearch new index: index.
    query := ESMatchAllQuery new.
    aggregation := ESMinAggregation new field: 'price'.
    search query: query.
    search addAggregation: aggregation.
    result := search aggregate.

Page 32: Elasticsearch for Pharo Smalltalk

Aggregations

    "max Aggregations"
    index := ESIndex indexNamed: 'st_study'.
    search := ESSearch new index: index.
    query := ESMatchAllQuery new.
    aggregation := ESMaxAggregation new field: 'price'.
    search query: query.
    search addAggregation: aggregation.
    result := search aggregate.

Page 33: Elasticsearch for Pharo Smalltalk

Aggregations

    "avg Aggregations"
    index := ESIndex indexNamed: 'st_study'.
    search := ESSearch new index: index.
    query := ESMatchAllQuery new.
    aggregation := ESAvgAggregation new field: 'price'.
    search query: query.
    search addAggregation: aggregation.
    result := search aggregate.

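The three aggregation slides differ only in the aggregation class. In the REST API, min, max and avg over the same field can be requested together in one search body. The sketch below builds such a body with plain Dictionaries and does not use the library's classes. The aggregation names min_price, max_price and avg_price are made up for this example, and STONJSON is assumed to be available in the image.

```smalltalk
| aggs body json |
aggs := Dictionary new.
#('min' 'max' 'avg') do: [ :kind |
	"e.g. min_price -> { min: { field: price } }"
	aggs
		at: kind , '_price'
		put: (Dictionary new
			at: kind put: (Dictionary new at: 'field' put: 'price'; yourself);
			yourself) ].

body := Dictionary new.
body at: 'size' put: 0.  "no hits needed, only the aggregation results"
body at: 'aggs' put: aggs.

json := STONJSON toString: body.
json inspect.
```

Whether the library can attach several aggregations to one ESSearch is not stated in the slides, so treat this as a view of the underlying request rather than of the Smalltalk API.
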
Page 34: Elasticsearch for Pharo Smalltalk

DEMO

Page 35: Elasticsearch for Pharo Smalltalk

Roadmap

• Support for Elasticsearch 2.0 is planned

Page 36: Elasticsearch for Pharo Smalltalk

Everything is ready. Now, let's write some Smalltalk.

paul bica https://www.flickr.com/photos/dexxus/5820866907/