AWS Black Belt Tech Series: Amazon CloudSearch

Amazon CloudSearch. AWS Black Belt Tech Webinar 2014 (formerly the Meister Series). Amazon Data Services Japan K.K., Solutions Architect, Eiji Shinohara (篠原 英治)

Upload: amazon-web-services-japan

Post on 26-Jan-2015


Category:

Technology



DESCRIPTION

AWS Black Belt Tech Webinar 2014 (formerly the Meister Series): Amazon CloudSearch

TRANSCRIPT

  • 1. Amazon CloudSearch: AWS Black Belt Tech Webinar 2014 (formerly the Meister Series)

  • 2. Agenda: Amazon CloudSearch; March 2014 CloudSearch Launch; CloudSearch; Japanese Text Processing; Ranking and Relevance; Migrating from the old API version (2011-02-01); Q&A
  • 3. Agenda
  • 4. Amazon CloudSearch (May 2014): support for 33 languages; suggestions (AutoComplete)
  • 5. CloudSearch
  • 6. CloudSearch
  • 7. CloudSearch
  • 8. CloudSearch JR
  • 9. CloudSearch or
  • 10. Search Engine: RDBMS (specific), 1001; Search Engine (arbitrary); Amazon RDS, Amazon CloudSearch
  • 11. Search Engine grep11
  • 12. Amazon RDS
  • 13. DynamoDB
  • 14. Amazon S3
  • 15. Example document (Field: Value)
        id: tt0371746
        title: Iron Man
        description: When wealthy industrialist Tony Stark is forced to build an armored suit after a life-threatening incident, he ultimately decides to use its technology to fight against evil.
        director: Jon Favreau
        actors: Robert Downey Jr., Gwyneth Paltrow, Terrence Howard ...
        rating: 7.9
        release_date: 2008-05-02T00:00:00Z
        DynamoDB, RDS, CloudSearch (S3), CloudSearch
  • 16. Term: Documents (Posting List)
        Iron: The Man in the Iron Mask; Iron Man; The Iron Giant; ...
        Man: Rain Man; The Man in the Moon; The Third Man; Iron Man; ...
  • 17. Automatic Scaling
  • 18. Agenda
  • 19. March 2014 CloudSearch Launch: language support (33); Algorithmic Stemming; suggestions (AutoComplete); Term boosting; Multi-AZ; IAM Integration; regions (Tokyo, Sydney, Sao Paulo)
  • 20. March 2014 CloudSearch Launch: Support for 33 languages
        Arabic, Armenian, Basque, Bulgarian, Catalan, Simplified Chinese, Traditional Chinese, Czech, Danish, Dutch, English, Finnish, French, Galician, German, Greek, Hindi, Hungarian, Indonesian, Irish, Italian, Japanese, Korean, Latvian, Norwegian, Persian, Portuguese, Romanian, Russian, Spanish, Swedish, Thai, Turkish
  • 21. March 2014 CloudSearch Launch: Per-Field Language Control; Multiple Languages (Language Detection)
  • 22. March 2014 CloudSearch Launch: Highlighting
        /search?q=iron+man&highlight.plot={"format":"text"}
        "hit": [{
          "id": "tt1228705",
          "fields": { "title": "Iron Man 2" },
          "highlights": { "plot": "Tony Stark has declared himself *Iron* *Man* and installed world..." }
        },...
  • 23. March 2014 CloudSearch Launch: Suggestions
        /suggest?q=ir&suggester=title_sug
        "suggest": {"query": "iro", "found": 5,
          "suggestions": [
            {"suggestion": "Iron Man", "id": "tt0371746"},
            {"suggestion": "Iron Man 2", "id": "tt1228705"},
            ...
  • 24. March 2014 CloudSearch Launch: IAM Integration. Access Policy: Search, Document
        {
          "Version": "2012-10-17",
          "Statement": [
            { "Effect": "Allow",
              "Action": ["cloudsearch:*"],
              "Resource": "arn:aws:cloudsearch:us-east-1:111122223333:domain/imdb-movies" },
            { "Effect": "Deny",
              "Action": ["cloudsearch:DeleteDomain"],
              "Resource": "arn:aws:cloudsearch:us-east-1:111122223333:domain/imdb-movies" }
          ]
        }
  • 25. March 2014 CloudSearch Launch: Geo-Spatial support; Latitude-Longitude data types; Distance sort (haversin); "near me"
  • 26. March 2014 CloudSearch Launch: Enhanced Availability; Multi Availability-Zone
  • 27. March 2014 CloudSearch Launch: Enhanced Availability; Instance Size; Scaling Options
  • 28. March 2014 CloudSearch Launch: Term Boosting
        "When a Man Loves a Woman", "Wonder Woman", "The Woman in Black", ...
        "The Lawnmower Man", "Dead Man", "Repo Man", ...
        (or 'man' 'woman')&q.parser=structured
        (or (term boost=5 'man') 'woman')&q.parser=structured
  • 29. Agenda
  • 30. CloudSearch: Double; Date; Signed Integer; Text; Literal
  • 31. CloudSearch Field Types (Type: description)
        date: timestamp in yyyy-mm-ddT00:00:00Z format
        date-array: array of date
        double: double-precision 64-bit
        double-array: array of double
        int: 64-bit
        int-array: array of int
        latlon: latitude, longitude
        literal
        literal-array: array of literal
        text
        text-array: array of text
  • 32. CloudSearch (diagram labels): Source System; Processing Script; Queuing; Batching; Amazon EC2; Amazon SQS; Search Data Format (SDF); Amazon CloudSearch
  • 33. CloudSearch SDF (Search Data Format): Comma Separated Value (.csv); Adobe Portable Document Format (.pdf); HTML (.htm, .html); Microsoft Excel (.xls, .xlsx); Microsoft PowerPoint (.ppt, .pptx); Microsoft Word (.doc, .docx); Text Documents (.txt)
  • 34. CloudSearch: S3, DynamoDB, Upload
  • 35. CloudSearch (CSV), SDF
  • 36. CloudSearch (HTTP)
        http(s)://<document service endpoint>/2013-01-01/documents/batch
        Accept: application/json
        Content-Type: application/json
        Host: doc-yamanote-xxx.ap-northeast-1.cloudsearch.amazonaws.com
        {"type":"add","id":"yamanote_4","fields":{"id":"003","name":""}},
        {"type":"add","id":"yamanote_5","fields":{"id":"004","name":""}},
        {"type":"delete","id":"yamanote_20"}
  • 37. CloudSearch (Reference Architecture)
  • 38. Agenda
  • 39. Japanese Text Processing: Morphological Analysis (http://ja.wikipedia.org/wiki/)
  • 40. Japanese Text Processing: Normalize; CloudSearch
        NFD (Canonical Decomposition): Normalization Form D
        NFC (Canonical Composition): Normalization Form C
        NFKD (Compatibility Decomposition): Normalization Form KD
        NFKC (Compatibility Composition): Normalization Form KC
  • 41. Japanese Text Processing: Stemming (baseForm) (API/SDK)
  • 42. Japanese Text Processing: Stopword Removal; Stopword (API/SDK)
  • 43. Japanese Text Processing: Synonym Addition; Synonym = Stopwords, Stemming
  • 44. Japanese Text Processing: Synonym Addition (API/SDK)
        Alias: pupil, student; student, pupil
        Group: 1st, first, one; 1st, first, one
  • 45. Agenda
  • 46. Ranking and Relevance: TF-IDF; uniqueness and presence; TF: term frequency; IDF: inverse document frequency
  • 47. Ranking and Relevance: (_score)
  • 48. Ranking and Relevance: A/B
  • 49. Ranking and Relevance: (_score); text/literal; int/double; date
  • 50. Ranking and Relevance: Expressions
        Arithmetic: + - * / %
        Bitwise: | & ^ ~ << >> >>>
        Boolean: && || ! ?:
        Comparison: < <= == >= >
        Mathematical functions: abs ceil exp floor ln log2 log10 logn max min pow sqrt
        Trigonometric functions: acos acosh asin asinh atan atan2 atanh cos cosh sin sinh tanh tan
        haversin distance function: haversin(38.958687,-77.343149,latitude,longitude)
  • 51. Ranking and Relevance: Query Parser
        Simple Query Parser
        Structured Query Parser: boolean
        Lucene Query Parser: CloudSearch
        Dismax Query Parser: Lucene Query Parser, Solr
  • 52. Agenda
  • 53. Migrating from the old API version (2011-02-01): 2013-01-01; Tokyo; CloudSearch
  • 54. Migrating from the old API version (2011-02-01): 2013-01-01; SDK, API; AWS SDK for Ruby; AWS CLI: AWS Command Line Interface 1.3.6, CloudSearch, https://aws.amazon.com/releasenotes/CLI/8906204440930658
  • 55. Migrating from the old API version (2011-02-01): http://docs.aws.amazon.com/cloudsearch/latest/developerguide/migrating.html
  • 56. Migrating from the old API version (2011-02-01): Metrics; Total Searches; Searches with No Results; 0
  • 57. Agenda
  • 58. Amazon CloudSearch: AWS (S3, DynamoDB); (SDK, CLI); (Auto Scaling); (Multi AZ); March 2014 Launch; Tokyo; Simple Monthly Calculator (http://calculator.s3.amazonaws.com/)
  • 59. Amazon CloudSearch: http://aws.amazon.com/jp/cloudsearch/testimonials/; Go Global in minutes
  • 60. Amazon CloudSearch Developer Guide: http://docs.aws.amazon.com/cloudsearch/latest/developerguide/what-is-cloudsearch.html; Amazon CloudSearch FAQ: https://aws.amazon.com/jp/cloudsearch/faqs/; Amazon CloudSearch pricing: https://aws.amazon.com/jp/cloudsearch/pricing/
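The batch upload request on slide 36 posts a JSON body of add and delete operations to the `/2013-01-01/documents/batch` endpoint. As a rough sketch of how such a body might be assembled in Python: the helper name `build_batch` and the `name` values are hypothetical (the original field values did not survive in the transcript), and the sketch only builds the payload rather than sending it.

```python
import json


def build_batch(adds, deletes):
    """Build a document batch body: a JSON array of add/delete operations.

    `adds` is a list of (document_id, fields_dict) pairs and `deletes` is a
    list of document ids. This helper is illustrative, not part of any SDK.
    """
    operations = [
        {"type": "add", "id": doc_id, "fields": fields}
        for doc_id, fields in adds
    ]
    operations += [{"type": "delete", "id": doc_id} for doc_id in deletes]
    # ensure_ascii=False keeps non-ASCII text (e.g. Japanese) readable as UTF-8.
    return json.dumps(operations, ensure_ascii=False)


# Document ids follow slide 36; the "name" values are placeholders.
batch = build_batch(
    adds=[
        ("yamanote_4", {"id": "003", "name": "placeholder name A"}),
        ("yamanote_5", {"id": "004", "name": "placeholder name B"}),
    ],
    deletes=["yamanote_20"],
)
print(batch)
```

The resulting string would be sent as the request body with `Content-Type: application/json` to the domain's document service endpoint, as shown in the slide's headers.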