BitYota Data Warehouse Podcast

© 2014 An overview BitYota – data warehouse service Dev Patel, CEO

Upload: insidehpc

Post on 08-May-2015


Category: Technology



DESCRIPTION

In this slidecast, Dev Patel and Poulomi Damany from BitYota describe the company's Data Warehouse Service. "Our vision is to make data and analytics accessible to all. There's a revolution underway and we're taking sides. We want to create a data platform that enables everyone -- from data scientists and engineers to SQL savvy analysts, business and product users, to understand their data, to build better products/services, and create new avenues of growth and productivity." Watch the video presentation: http://wp.me/p3RLEV-1SS

TRANSCRIPT

Page 1: BitYota Data Warehouse Podcast

© 2014

An overview

BitYota – data warehouse service

Dev Patel, CEO

Page 2: BitYota Data Warehouse Podcast

BitYota: Who we are

Problem

Today’s big data analytics is either a ‘Big Cost’ or a ‘Big Headache’ or both for companies of all sizes. Users have to learn new skills and CEOs need to buy uniquely engineered, prohibitively expensive systems.

Solution

BitYota offers a better alternative: A Data Warehouse Service for Big Data analytics. This PaaS offering takes away both big cost and big headache, making analytics accessible to everyone at scale, with no compromise on functionality or service levels.

Customers

Mobile apps, E-commerce, Advertising/Marketing, Games

Background

Founded Sep 2011 by data experts: Dev Patel, Harmeek Bedi and Soren Riise.

Company has raised $12M through Seed & Series A from Globespan Capital, Social+Capital Partnership, Dawn Capital, Andreessen Horowitz, Crosslink Capital, Morado Ventures, & individual investors Maynard Webb, Graham Summers, Jerry Yang and Sharmila Mulligan.

Opportunity

Companies are increasingly looking to gain insights from their data via analytics. Analytics for big data in the cloud is BitYota’s opportunity.

Team

Management: Dev Patel, CEO; Harmeek Singh Bedi, CTO; Soren Riise, Chief Cloud Service; Poulomi Damany, VP Product.

Its core team has 35+ years of big data experience at Yahoo!, Oracle, Veritas/Symantec, Informix, BMC, Kabira/Tibco, and Microsoft.

Page 3: BitYota Data Warehouse Podcast

Does This Sound Like You?

Today’s Big Data = Big Cost/Big Headache

You’re a company that just launched ...

... And you need some critical insights for what’s next

OR

Your data infrastructure can’t scale

... And you can’t spare any more engineers or money to maintain it

OR

You have lots of data in multiple silos

... And it takes too long for your analysts to get answers

Page 4: BitYota Data Warehouse Podcast

What questions do you want answered?

How can I combine social profiles, in-app purchases, and event stream data?

What’s my ROI on my marketing spend? Where should I be spending more/less $$$?

How do I increase engagement?

Why is the new app version crashing?

Who are my best users?

Access patterns by OS/ device?


Page 5: BitYota Data Warehouse Podcast

BitYota: Data warehouse for next gen data

Velocity, “fast” analytics

Velocity, analytics on fresh data

Managed Service & Pay per Use

MPP architecture – scale with Compute

Continuous extract of changing data from MongoDB

Integration into SQL/BI ecosystem

Elastic scale up/down

Cloud, easy set up with Burst capacity

Agility & Time-to-Market

No CAPEX & low OPEX

Variety, semi-structured data

No translation to structure & No data modeling

Page 6: BitYota Data Warehouse Podcast

BitYota is focused on use cases where …

Customers want:

1.  Analytics over data from multiple sources

2.  Migrate analytics from on-premise to Cloud

3.  Analytics on data from single source NoSQL or relational transactional systems

4.  Analytics on “fresh data”

Page 7: BitYota Data Warehouse Podcast

Markets for BitYota

Companies in

•  Advertising/Marketing

•  Social Media

•  SaaS

•  Games & Entertainment

•  E-commerce

•  Communication & Productivity

Page 8: BitYota Data Warehouse Podcast

BitYota is focused on new Big Data Analytics

•  Cost effective, elastic capacity
   - Deploy in a heterogeneous environment; scale out; scale storage & compute independently

•  Flexible Storage
   - Semi-structured data types (JSON, XML), Data types for new applications – timestamp, IP, location, etc.
   - Table Layout – row and column, on disk, memory, external tables

•  Fast time to analytics
   - Load and explore directly, not dependent on slow & fragile ETL

•  Interactive analytics
   - Use ANSI SQL directly on new data types. Leverage existing BI tools

•  Data from Multiple sources in Multiple formats
   - User profiles, Social data, Server Logs, Inventory, Sales Orders/Returns, Website Views & Clicks

Volume, Variety, Velocity

Page 9: BitYota Data Warehouse Podcast

Business Analytics on data from MongoDB

[Diagram labels: Mobile/Web Apps; Primary shard; Secondary shards; Oplog Tail; Mongo dump; Load; BitYota Extract Tool (BSON, JSON); BitYota Cluster with Data nodes and Compute nodes]

Extract, Load, Transform & Analyze

•  Schedule incremental extract and load

•  MongoDB extract format (BSON)

•  Joins across collections

•  SQL over JSON, UDFs

•  Transforms into Cols for performance

•  Views for BI tool

•  SQL over JSON, and access from BI tools
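The "schedule incremental extract and load" idea on this slide can be sketched roughly as follows. This is an illustration only, not the BitYota Extract Tool: the entry shape (`ts`, `op`, `_id`, `doc`) is a simplified stand-in for real MongoDB oplog entries, and the names `extract_since`, `apply_entries`, and `run_incremental_load` are hypothetical. The point is the checkpoint: each scheduled run picks up only the changes newer than the last one it processed.

```python
# Simplified, hypothetical change log: "i" = insert, "u" = update, "d" = delete.
OPLOG = [
    {"ts": 1, "op": "i", "_id": "a", "doc": {"u": "A", "t": 10}},
    {"ts": 2, "op": "i", "_id": "b", "doc": {"u": "B", "t": 20}},
    {"ts": 3, "op": "u", "_id": "a", "doc": {"t": 15}},
    {"ts": 4, "op": "d", "_id": "b", "doc": None},
]


def extract_since(oplog, checkpoint):
    """Return entries newer than the checkpoint, oldest first."""
    return sorted((e for e in oplog if e["ts"] > checkpoint), key=lambda e: e["ts"])


def apply_entries(target, entries):
    """Apply insert/update/delete entries to a dict keyed by document id."""
    for e in entries:
        if e["op"] == "i":
            target[e["_id"]] = dict(e["doc"])
        elif e["op"] == "u":
            target.setdefault(e["_id"], {}).update(e["doc"])
        elif e["op"] == "d":
            target.pop(e["_id"], None)
    return target


def run_incremental_load(oplog, target, checkpoint):
    """One scheduled run: extract new entries, load them, return new checkpoint."""
    entries = extract_since(oplog, checkpoint)
    apply_entries(target, entries)
    return entries[-1]["ts"] if entries else checkpoint


warehouse = {}
# First run sees only the first two entries; the second run sees the full log
# but applies just the entries after the saved checkpoint.
checkpoint = run_incremental_load(OPLOG[:2], warehouse, 0)
checkpoint = run_incremental_load(OPLOG, warehouse, checkpoint)
print(checkpoint, warehouse)
```

Keeping the checkpoint outside the load function means a rerun with no new entries changes nothing, which is what makes a scheduled job safe to repeat.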


Page 10: BitYota Data Warehouse Podcast

Process to Load Data into BitYota

Change MongoDB JSON doc structure anytime = NO extra downstream effort needed

SOURCE JSON DATA

    "session":[{
        "u":"8927ABBCD2873CCD",
        "v":"1.0",
        "uid":"TheTestUser1",
        "dv":"Apple iPhone 3GS",
        "t":200
    }]

LOAD DATA

1-time setup

•  Scheduled Load

•  Schema auto-discovered

•  Table auto-created

    CREATE TABLE session(
        jdoc JSON
    );

ANALYZE DATA

    SELECT jdoc->'u', jdoc->'t' FROM session;

OPTIMIZE DESIGN

    CREATE TABLE session_cols (
        u        TEXT,
        t        INT,
        origjdoc JSON
    )
    PARTITION BY RANGE (t)
    (PARTITION VALUES ('0'), PARTITION VALUES ('50'), PARTITION VALUES ('200'))
    COLUMNSTORE STORAGE (SEGMENTSIZE 13102 TABLESIZE 200000);

    INSERT INTO session_cols
        SELECT jdoc->'u', (jdoc->'t')::int8, jdoc
        FROM session;
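The load, analyze, and optimize steps on this slide can be mimicked outside the warehouse. The Python sketch below is only an illustration, not BitYota's implementation: it stores each raw document whole (like the single-column `session` table), reads fields by key at query time (like `jdoc->'u'`), and then copies selected fields into typed columns while keeping the original document (like `session_cols`). The helper names `load_raw`, `select_fields`, and `to_columns`, and the second sample document, are hypothetical.

```python
import json

# Raw documents as they might arrive from the source. The second one is made
# up: it carries an extra field ("os") and a string-typed "t" to show that a
# changed document structure needs no change to the load step.
RAW_DOCS = [
    '{"u": "8927ABBCD2873CCD", "v": "1.0", "uid": "TheTestUser1", '
    '"dv": "Apple iPhone 3GS", "t": 200}',
    '{"u": "0000000000000001", "v": "1.1", "t": "50", "os": "hypothetical"}',
]


def load_raw(lines):
    """Store each document whole, like a table with one JSON column."""
    return [{"jdoc": json.loads(line)} for line in lines]


def select_fields(table, *keys):
    """Read fields by key at query time; missing keys come back as None."""
    return [tuple(row["jdoc"].get(k) for k in keys) for row in table]


def to_columns(table):
    """Copy frequently used fields into typed columns, keep the original doc."""
    return [
        {
            "u": str(row["jdoc"]["u"]),
            "t": int(row["jdoc"]["t"]),  # cast, like (jdoc->'t')::int8
            "origjdoc": row["jdoc"],
        }
        for row in table
    ]


session = load_raw(RAW_DOCS)
session_cols = to_columns(session)
print(select_fields(session, "u", "t"))
print([(row["u"], row["t"]) for row in session_cols])
```

Keeping `origjdoc` alongside the typed columns mirrors the slide's design: queries on the extracted fields avoid parsing JSON each time, while fields that were not extracted remain reachable from the original document.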


Page 11: BitYota Data Warehouse Podcast

As a Service

•  Launch cluster in minutes

•  Removes the ‘headache’ of database management

•  No hardware, no software installation & upgrades; no licenses

•  Available on AWS & Rackspace

Page 12: BitYota Data Warehouse Podcast

Recap

•  BitYota is a Cloud based Data Warehouse Service for Big Data Analytics.

•  Its core attributes are:
   - 100% Service oriented
   - Analytics on data from multiple sources/formats
   - Analytics on “fresh” data

•  Customers are gaining deep insights on their business operations

•  Customers in Games, Mobile apps, advertising/marketing, e-commerce