Cominvent AS company presentation

Upload: cominvent-as

Posted on 25-May-2015

Category: Technology


DESCRIPTION

Presentation of the Norway-based search consulting company Cominvent AS, which focuses on Apache Solr/Lucene, ElasticSearch, and other enterprise search and big data technologies.

TRANSCRIPT

1. Jan Høydahl
    - enterprise search specialists

2. Cominvent AS
    - Business critical search
    - Lucene/Solr/ElasticSearch and FAST
    - Domain knowledge, best practices
    - Consulting
    - Training
    - Support

3. Jan Høydahl
    - 1995: Developer, telecom
    - 1998: Java developer
    - 2000: Search - FAST
    - 2006: Lucene
    - 2007: new Cominvent()
    - 2009: Lucene/Solr
    - 2011: Lucene committer
    - 2012: Lucene PMC
    - > 100 projects

4. Consulting
    - Cominvent delivers independent search consulting
    - Technology advice/evaluation
    - Solution architecture, design & implementation
    - Staff coaching
    - "For us, he gave a very valuable contribution to our research on migration from commercial to open source-based search technologies, providing first-hand expert knowledge on the subject." Kent Vilhelmsen, 1881.no

5. Training
    - Cominvent delivers training, public and on-site
    - Certified Solr and FAST instructor
    - www.solrtraining.com
    - www.lucenetraining.com
    - "Jan Høydahl's teaching had a good mix of instruction and hands-on exercises. In particular we would like to emphasize Jan Høydahl's ability to include our own issues in the course. The training session has been very beneficial." Allan Forsberg, DBC

6. Commercial Support
    - Professional support agreement for Lucene/Solr/ES
    - Delivered in cooperation with OpenESN partners
    - Read more: http://www.cominvent.com/support/

7. Case study: e-Commerce
    - Norwegian online book-store
    - Microsoft based (EPI/SQL..)
    - Selected Cominvent + Solr
    - Designed new relevant search:
        - Balanced relevancy between author, title, metadata
        - Best sellers and new-comers compete for top-10
        - Advanced auto-complete avoids misspellings, shortens path to target
    - Helped increase conversion rate

8. Case study: Archive search (FAST-Solr)
    - Archiving product enabling search in document archive
    - Many huge installations (100m+)
    - Used Microsoft FAST InStream as OEM component on Linux
    - Came to Cominvent when FAST discontinued Linux support
    - Helped them with pre-study, architecture spec for migrating to Solr, including:
        - Multi lingual, adv. linguistics
        - Parsing 100s of doc. formats
        - FIXML->Solr without re-feed
        - Much smaller index footprint!

9. Case study: Online newspapers
    - NHST Media Group (www.dn.no)
    - News search for Norway's leading financial newspaper DN and seven other publications.
    - "Jan helped us migrate our FAST based news search to Solr, including solution architecture and adapting our Java based search middleware. The new Solr powered search performs better on less hardware and has proven to be a stable and robust solution." Petter I. Gustafson

10. Case study: Intranet & web-crawl
    - University of Oslo (www.uio.no)
    - 26 web sites crawled
    - 48 web sites pushed from CMS
    - PageRank and AnchorTexts
    - Security filtered CMS results
    - All common Office doc formats
    - 100% open source
    - Joint project with Atilika Inc.

11. Case study: Intranet
    - Intranet search
    - Apache Solr + OpenPipeline
    - Sources: File system, Intranet, Wiki, WebSak case handling
    - Security filtering with AD & WebSak
    - Search frontend - JellyFish
    - Joint project with

12. Short introduction to Apache & Lucene/Solr

13. The Apache Software Foundation (ASF)
    - Founded 1999 based on Apache HTTP server
    - ASF provides support for the Apache community of open-source software projects
    - The Apache projects are characterized by a collaborative, consensus based development process, an open and pragmatic software license, and a desire to create high quality software that leads the way in its field.
    - About 60 top-level projects
    - Completely de-centralized, no offices.
    - 50% of the Top 10 downloaded Open Source products are Apache projects!

14. [image-only slide]

15. The Apache 2.0 license
    - It allows you to:
        - Download, use, modify, distribute, charge money...
    - It forbids you to:
        - Violate Apache trademarks or pretend you wrote the sw
    - It requires you to:
        - Include a copy of the license with your software
        - Provide attribution to ASF when distributing sw
    - It does not require you to:
        - Include the source or your own modifications
        - Submit changes back to Apache

16. Lucene
    - Lucene is a search engine library (not a server)
    - Need to program in Java to use it
    - Created by Doug Cutting, who also created Nutch and Hadoop
    - Apache in 2001, TLP in 2005
    - Lucene is the search core of Solr

17. Apache Solr
    - Began at cnet.com review site
    - Originally developed by Yonik Seeley
    - Donated to ASF in 2006
    - Quickly became popular, mainly due to its faceted search

18. Apache Solr
    - Search server
    - (Commercially friendly)

19. Apache Solr - characteristics
    - Modular
    - Community
    - Contributions & patches
    - Light weight

20. Areas of usage

21. Solr community mailing list growth
    - [Chart: Lucene mailing list volume per month, Sep 2001 to Sep 2011; series: lucene-dev, solr-dev, solr-user, TOTAL; y-axis 0 to 4500]

22. More Solr/Lucene deployments
    - More: http://wiki.apache.org/solr/PublicServers
    - Thanks to Lucid Imagination for logo collection

23. Differences Lucene & Solr?
    - Lucene:
        - Programming library
        - Needs Java competence
        - No built-in scaling/admin etc
        - Ideal for embedding in desktop software (used by e.g. iTunes)
        - Ports in .NET, C++ etc
    - Solr:
        - HTTP based search server
        - No programming needed
        - Built-in scaling, admin, GUIs
        - Ideal for server use over HTTP
        - No need for a port since it will run by itself on any OS

24. Other related ASF projects
    - Lucene Java library
    - Rich document extraction
    - Large-scale web crawling
    - Connectors & security
    - Machine learning
        - Classification/clustering
        - Collaborative filtering...

25. Short introduction to ElasticSearch
    - Search server built on Lucene
    - Written by Shay Banon
    - ASL 2.0 licensed
    - Source code @ GitHub
    - Built for scaling in the cloud
    - Schema-less & "zero-config"
    - Built-in fault tolerance
    - Strictly JSON/REST API
    - Solution for nested documents
    - Built-in saved searches/alerting
    - www.elasticsearch.org

26. When to choose Solr vs ElasticSearch
    - Solr:
        - The "safe" bet (official Apache product)
        - Largest community
        - Integrated with many systems already
        - Simple query model, everything is HTTP params
        - More consultants available
    - ES:
        - Ease of scaling large is key
        - Very dynamic data, need for schema-less server
        - Multi tenancy, let customers have their own index and put anything in it
        - Alerting/saved searches need, don't want to write your own
        - Need for hierarchical nested documents (without custom code)

27. Thank You
    - [email protected]/cominvent
    - linkedin.com/in/janhoy
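
Slide 26 contrasts Solr's query model ("everything is HTTP params") with ElasticSearch's strictly JSON/REST API. The following is a minimal sketch of that difference, not taken from the presentation. It also borrows the book-store case study's idea of balancing relevancy between title, author and metadata. The host and port, the `books` collection name, the field names and the boost values are all assumptions chosen for illustration; the code only builds the requests and sends nothing.

```python
import json
from urllib.parse import urlencode

# Hypothetical field weights: title counts most, then author, then metadata.
FIELD_BOOSTS = {"title": 3, "author": 2, "metadata": 1}


def boosted_fields(boosts):
    """Render weights in the field^boost notation that both engines accept."""
    return [f"{field}^{weight}" for field, weight in boosts.items()]


def solr_query_url(user_query, base="http://localhost:8983/solr/books/select"):
    """Solr style: the whole query is expressed as HTTP parameters.

    The base URL (host, port, 'books' collection) is an assumed example.
    """
    params = {
        "q": user_query,
        "defType": "edismax",  # Solr's Extended DisMax query parser
        "qf": " ".join(boosted_fields(FIELD_BOOSTS)),  # query fields with boosts
        "rows": 10,
        "wt": "json",
    }
    return f"{base}?{urlencode(params)}"


def es_search_body(user_query):
    """ElasticSearch style: the query is a JSON document for the _search endpoint."""
    return {
        "size": 10,
        "query": {
            "multi_match": {
                "query": user_query,
                "fields": boosted_fields(FIELD_BOOSTS),
            }
        },
    }


if __name__ == "__main__":
    print(solr_query_url("peer gynt"))
    print(json.dumps(es_search_body("peer gynt"), indent=2))
```

In this sketch the same weighting ends up in a single `qf` URL parameter for Solr and in a nested JSON structure for ElasticSearch, which is the practical meaning of the two bullets on the slide.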