elasticsearch and ruby [rupy2012]
Post on 26-Jan-2015
Elasticsearch and Ruby
{elasticsearch in a nutshell}
‣ Built on top of Apache Lucene
‣ Searching and analyzing big data
‣ Scalability
‣ REST API, JSON DSL
Great fit for dynamic languages and web-oriented workflows / architectures
http://www.elasticsearch.org
It all started in this gist… (< 200 LOC)
class Results
  include Enumerable
  attr_reader :query, :curl, :time, :total, :results, :facets

  def initialize(search)
    response = JSON.parse(
      Slingshot.http.post("http://localhost:9200/#{search.indices}/_search", search.to_json)
    )
    @query   = search.to_json
    @curl    = %Q|curl -X POST "http://localhost:9200/#{search.indices}/_search?pretty" -d '#{@query}'|
    @time    = response['took']
    @total   = response['hits']['total']
    @results = response['hits']['hits']
    @facets  = response['facets']
  end

  def each(&block)
    @results.each(&block)
  end
end
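The pattern in that class (include Enumerable, define each) can be tried without a running cluster. The sketch below is not the gist's code: CannedResults and the response values are made up for illustration, and only the keys the original class reads ('took', 'hits', 'total') are assumed.

```ruby
require 'json'

# Canned response with the keys the Results class reads.
# The values are invented for this example.
RESPONSE = JSON.parse(<<~BODY)
  {
    "took": 3,
    "hits": {
      "total": 2,
      "hits": [
        { "_id": "1", "_source": { "title": "One" } },
        { "_id": "2", "_source": { "title": "Two" } }
      ]
    }
  }
BODY

# Hypothetical stand-in for the Results class, minus the HTTP call.
class CannedResults
  include Enumerable
  attr_reader :time, :total

  def initialize(response)
    @time    = response['took']
    @total   = response['hits']['total']
    @results = response['hits']['hits']
  end

  # Enumerable builds map, select, first, etc. on top of each.
  def each(&block)
    @results.each(&block)
  end
end

results = CannedResults.new(RESPONSE)
titles  = results.map { |hit| hit['_source']['title'] }
puts titles.inspect
```

Defining each once is enough for the whole Enumerable API to work on the hits.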
Elasticsearch plays nicely with Ruby…
curl -X POST "http://localhost:9200/articles/_search?pretty=true" -d '
{
  "query" : {
    "filtered" : {
      "filter" : {
        "range" : { "date" : { "from" : "2012-01-01", "to" : "2012-12-31" } }
      },
      "query" : {
        "bool" : {
          "must" : [
            { "terms" : { "tags" : [ "ruby", "python" ] } },
            { "match" : { "title" : { "query" : "conference", "boost" : 10.0 } } }
          ]
        }
      }
    }
  }
}'
elasticsearch’s Query DSL
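One way to see why the JSON DSL suits Ruby: the same request body can be written as a plain Hash and serialized with the standard library. This is a minimal sketch, not Tire code; the field names and values come from the curl example, and the two must clauses are given as an array because a JSON object cannot reliably carry the same key twice.

```ruby
require 'json'

# The "filtered" query from the curl example, as a plain Ruby Hash.
query = {
  query: {
    filtered: {
      filter: {
        range: { date: { from: '2012-01-01', to: '2012-12-31' } }
      },
      query: {
        bool: {
          must: [
            { terms: { tags: %w[ruby python] } },
            { match: { title: { query: 'conference', boost: 10.0 } } }
          ]
        }
      }
    }
  }
}

# Serialize to the request body that would be POSTed to /articles/_search.
payload = JSON.generate(query)

# Round-trip to confirm the payload is valid JSON with both clauses intact.
parsed = JSON.parse(payload)
puts parsed['query']['filtered']['query']['bool']['must'].size
```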
Tire.search('articles') do
  query do
    boolean do
      must { terms :tags, ['ruby', 'python'] }
      must { string 'published_on:[2011-01-01 TO 2011-01-02]' }
    end
  end
end
tags_query = lambda do |boolean|
  boolean.must { terms :tags, ['ruby', 'python'] }
end

published_on_query = lambda do |boolean|
  boolean.must { string 'published_on:[2011-01-01 TO 2011-01-02]' }
end

Tire.search('articles') do
  query { boolean &tags_query }
end

Tire.search('articles') do
  query do
    boolean &tags_query
    boolean &published_on_query
  end
end
search = Tire.search 'articles' do
  query do
    string 'title:T*'
  end
  filter :terms, tags: ['ruby']
  facet('tags') { terms :tags }
  sort { by :title, 'desc' }
end

search = Tire::Search::Search.new('articles')
search.query { string('title:T*') }
search.filter :terms, :tags => ['ruby']
search.facet('tags') { terms :tags }
search.sort { by :title, 'desc' }
TEH PROBLEM
Designing the Tire library as a domain-specific language, from the higher level down, and consequently making a lot of mistakes in the lower levels.
‣ Class level settings (Tire.configure); cannot connect to two elasticsearch clusters in one codebase
‣ Inconsistent access (methods vs Hashes)
‣ Not enough abstraction and separation of concerns
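The first point can be illustrated with a sketch of the alternative: configuration held by each client instance rather than by the library module. SearchClient, search_url and the second port are hypothetical names chosen for this example; none of them are part of Tire.

```ruby
# Hypothetical sketch: configuration lives on each client instance
# instead of on the library module, so two clusters can coexist.
class SearchClient
  attr_reader :url

  def initialize(url:)
    @url = url.chomp('/')
  end

  # Builds the endpoint for a search against one index.
  def search_url(index)
    "#{url}/#{index}/_search"
  end
end

# Two clients in one codebase, each pointing at its own cluster.
# Port 9201 is an assumption made for the example.
primary   = SearchClient.new(url: 'http://localhost:9200')
secondary = SearchClient.new(url: 'http://localhost:9201/')

puts primary.search_url('articles')
puts secondary.search_url('articles')
```

With class-level settings there is only one URL per process; with instance-level settings the choice of cluster travels with the object.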
“Blocks with arguments” (alternative DSL syntax)
Tire.search do
  query do
    text :name, params[:q]
  end
end

Tire.search do |search|
  search.query do |query|
    query.text :name, params[:q]
  end
end
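A DSL can support both syntaxes by checking the block's arity. The sketch below is a hypothetical mini-DSL, not Tire's source: QueryBuilder and FakeController are invented names. A block evaluated with instance_eval loses access to the caller's methods (such as a controller's params), which is where the argument form helps.

```ruby
# Hypothetical mini-DSL: a block without arguments is evaluated in the
# builder's context; a block with an argument receives the builder and
# keeps the caller's `self`.
class QueryBuilder
  attr_reader :clauses

  def initialize(&block)
    @clauses = []
    return unless block

    block.arity < 1 ? instance_eval(&block) : block.call(self)
  end

  def text(field, value)
    @clauses << { text: { field => value } }
  end
end

# Stand-in for a Rails controller, where `params` is a method on self.
class FakeController
  def params
    { q: 'ruby' }
  end

  # The argument form keeps `self`, so `params` still resolves here.
  def search
    QueryBuilder.new { |query| query.text :name, params[:q] }
  end
end

# The implicit form works with local variables, which the block closes over.
term     = 'ruby'
implicit = QueryBuilder.new { text :name, term }
explicit = FakeController.new.search

puts implicit.clauses == explicit.clauses
```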
The Git(Hub) (r)evolution
‣ Lots of contributions... but less feedback
‣ Many contributions focus on specific use case
‣ Many contributions don’t take the bigger picture and codebase conventions into account
‣ Almost every patch needs to be processed, polished, amended
‣ Maintainer: lots of curation, less development — even on this small scale (2K LOC, 7K LOT)
‣ Contributors very eager to code, but a bit afraid to talk
Tire’s Ruby on Rails integration
$ rails new myapp \
    -m "https://raw.github.com/karmi/tire/master/examples/rails-application-template.rb"
‣ Generate a fully working Rails application with a single command
‣ Downloads elasticsearch if not running, creates the application, commits every step, seeds the example data, launches the application on a free port, …
‣ Tire::Results::Item fully compatible with Rails view / URL helpers
‣ Any ActiveModel compatible OxM supported
‣ Rake task for importing data (using pagination libraries)
Rails integration baked in
‣ No proper separation of concerns / layers
‣ People expect everything to be as easy as that
‣ Tire::Results::Item baked in, not opt-in, masquerades as models
‣ People consider ActiveRecord the only OxM in the world
Base library (HTTP, JSON, API)
The Ruby DSL
ActiveModel integration
ActiveRecord extensions
Rails extensions
Persistence extension
…
https://rubygems.org
https://github.com/rubygems/rubygems.org/pull/455
class Rubygem < ActiveRecord::Base
  # ...

  def self.search(query)
    conditions = <<-SQL
      versions.indexed and
        (upper(name) like upper(:query) or
         upper(translate(name, '#{SPECIAL_CHARACTERS}', '#{' ' * SPECIAL_CHARACTERS.length}')) like upper(:query))
    SQL

    where(conditions, {:query => "%#{query.strip}%"}).
      includes(:versions).
      by_downloads
  end
end
https://github.com/rubygems/rubygems.org/blob/master/app/models/rubygem.rb
“Search”
Adding search to an existing application
https://github.com/karmi/rubygems.org/compare/search-steps
“Hello Cloud” with Chef Server
http://git.io/chef-hello-cloud
‣ Deploy Rubygems.org on EC2 (or locally with Vagrant) from a “zero state”
‣ 1 load balancer (HAproxy), 3 application servers (Thin+Nginx)
‣ 1 database node (PostgreSQL, Redis)
‣ 2 elasticsearch nodes
‣ Install Ruby 1.9.3 via RVM
‣ Clone the application from GitHub repository
‣ init.d scripts and full configuration for every component
‣ Restore data from backup (database dump) and import into search index
‣ Monitor every part of the stack