BitYota data warehouse podcast
DESCRIPTION
In this slidecast, Dev Patel and Poulomi Damany from BitYota describe the company's Data Warehouse Service. "Our vision is to make data and analytics accessible to all. There's a revolution underway and we're taking sides. We want to create a data platform that enables everyone -- from data scientists and engineers to SQL savvy analysts, business and product users, to understand their data, to build better products/services, and create new avenues of growth and productivity." Watch the video presentation: http://wp.me/p3RLEV-1SS
TRANSCRIPT
© 2014
An overview
BitYota – data warehouse service
Dev Patel, CEO
BitYota: Who we are
Problem
Today’s big data analytics is either a ‘Big Cost’ or a ‘Big Headache’ or both for companies of all sizes. Users have to learn new skills and CEOs need to buy uniquely engineered, prohibitively expensive systems.
Solution
BitYota offers a better alternative: A Data Warehouse Service for Big Data analytics. This PaaS offering takes away both big cost and big headache, making analytics accessible to everyone at scale, with no compromise on functionality or service levels.
Customers
Mobile apps, E-commerce, Advertising/Marketing, Games
Background
Founded Sep 2011 by data experts: Dev Patel, Harmeek Bedi and Soren Riise.
Company has raised $12M through Seed & Series A from Globespan Capital, Social+Capital Partnership, Dawn Capital, Andreessen Horowitz, Crosslink Capital, Morado Ventures, & individual investors Maynard Webb, Graham Summers, Jerry Yang and Sharmila Mulligan.
Opportunity
Companies are increasingly looking to gain insights from their data via analytics. Analytics for big data in the cloud is BitYota’s opportunity.
Team
Management: Dev Patel, CEO; Harmeek Singh Bedi, CTO; Soren Riise, Chief Cloud Service; Poulomi Damany, VP Product.
Its core team has 35+ years of big data experience at Yahoo!, Oracle, Veritas/Symantec, Informix, BMC, Kabira/Tibco, and Microsoft.
Does This Sound Like You?
Today’s Big Data = Big Cost/Big Headache
You’re a company that just launched …
.. And you need some critical insights for what’s next
OR
Your data infrastructure can’t scale
.. And you can’t spare any more engineers or money to maintain it
OR
You have lots of data in multiple silos
.. And it takes too long for your analysts to get answers
What questions do you want answered?
How can I combine social profiles, in-app purchases, and event stream data?
What’s my ROI on my marketing spend? Where should I be spending more/less $$$?
How do I increase engagement?
Why is the new app version crashing?
Who are my best users?
Access patterns by OS/device?
BitYota: Data warehouse for next gen data
Velocity, “fast” analytics
Velocity, analytics on fresh data
Managed Service & Pay per Use
MPP architecture – scale with Compute
Continuous extract of changing data from MongoDB
Integration into SQL/BI ecosystem
Elastic scale up/down
Cloud, easy set up with Burst capacity
Agility & Time-to-Market
No CAPEX & low OPEX
Variety, semi-structured data
No translation to structure & No data modeling
BitYota is focused on use cases where …
Customers want:
1. Analytics over data from multiple sources
2. Migrate analytics from on-premise to Cloud
3. Analytics on data from single source NoSQL or relational transactional systems
4. Analytics on “fresh data”
Markets for BitYota
Companies in
• Advertising/Marketing
• Social Media
• SaaS
• Games & Entertainment
• E-commerce
• Communication & Productivity
BitYota is focused on new Big Data Analytics
• Cost effective, elastic capacity
  - Deploy in a heterogeneous environment; scale out; scale storage & compute independently
• Flexible Storage
  - Semi-structured data types (JSON, XML), Data types for new applications – timestamp, IP, location, etc.
  - Table Layout – row and column, on disk, memory, external tables
• Fast time to analytics
  - Load and explore directly, not dependent on slow & fragile ETL
• Interactive analytics
  - Use ANSI SQL directly on new data types. Leverage existing BI tools
[Figure: data sources (user profiles, social data, server logs, inventory, sales orders/returns, website views & clicks), labeled Volume, Variety, Velocity]
• Data from Multiple sources in Multiple formats
Business Analytics on data from MongoDB
[Diagram: Mobile/Web Apps; MongoDB primary shard and secondary shards; BitYota Extract Tool (oplog tail, mongodump; BSON, JSON); BitYota Cluster (data nodes, compute nodes). Stages: Extract, Load, Transform & Analyze]
• Schedule incremental extract and load; MongoDB extract format (BSON)
• Joins across collections
• SQL over JSON, UDFs
• Transforms into Cols for performance
• Views for BI tool
• SQL over JSON, and access from BI tools
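The scheduled incremental extract named on this slide can be pictured as checkpointed reads of a change log. The following is a minimal Python sketch under stated assumptions, not BitYota's implementation: the oplog is simulated as an in-memory list, and the names `OPLOG`, `incremental_extract`, `ts`, and `doc` are hypothetical.

```python
# Simulated change log. Each entry carries an increasing timestamp ("ts")
# and the changed document ("doc"). A real MongoDB oplog has a richer
# format; this structure is an assumption made for illustration only.
OPLOG = [
    {"ts": 1, "doc": {"u": "A1", "t": 10}},
    {"ts": 2, "doc": {"u": "B2", "t": 50}},
    {"ts": 3, "doc": {"u": "C3", "t": 200}},
]


def incremental_extract(oplog, last_ts):
    """Return the documents changed after last_ts, plus the new checkpoint."""
    fresh = [entry for entry in oplog if entry["ts"] > last_ts]
    batch = [entry["doc"] for entry in fresh]
    # If nothing is new, the checkpoint stays where it was.
    new_checkpoint = max((entry["ts"] for entry in fresh), default=last_ts)
    return batch, new_checkpoint


# First scheduled run: everything is new.
first_batch, checkpoint = incremental_extract(OPLOG, last_ts=0)

# The application writes another document between runs.
OPLOG.append({"ts": 4, "doc": {"u": "D4", "t": 75}})

# Second scheduled run: only the change since the checkpoint is extracted.
second_batch, checkpoint = incremental_extract(OPLOG, last_ts=checkpoint)
```

Keeping only a checkpoint between runs means each run moves just the changed documents, which is the property that makes frequent extracts cheap enough for "fresh data" analytics.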
Change MongoDB JSON doc structure anytime = NO extra downstream effort needed
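One way to see why a changed document structure need not break downstream queries: when fields are looked up by key at query time, a document that lacks a key simply yields a null for it. The Python sketch below illustrates that idea only; it is not BitYota's engine. The `select` helper, the second document, and its `os` field are hypothetical.

```python
import json

raw_docs = [
    '{"u": "8927ABBCD2873CCD", "v": "1.0", "t": 200}',
    # Hypothetical later app version: adds "os" and drops "v".
    '{"u": "11AA22BB33CC44DD", "t": 50, "os": "iOS"}',
]


def select(docs, *keys):
    """Emulate key lookup per document; a missing key yields None."""
    return [tuple(doc.get(key) for key in keys) for doc in docs]


session = [json.loads(text) for text in raw_docs]

# The existing query keeps working across both document shapes.
rows = select(session, "u", "t")

# The new field is queryable immediately; older documents return None.
os_rows = select(session, "os")
```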
Process to Load Data into BitYota
SOURCE JSON DATA
"session":[{ "u":"8927ABBCD2873CCD", "v":"1.0", "uid":"TheTestUser1", "dv":"Apple iPhone 3GS", "t":200 }]
LOAD DATA
1-time setup • Scheduled Load • Schema auto-discovered • Table auto-created
CREATE TABLE session ( jdoc JSON );
ANALYZE DATA
SELECT jdoc->'u', jdoc->'t' FROM session;
OPTIMIZE DESIGN
CREATE TABLE session_cols ( u TEXT, t INT, origjdoc JSON )
PARTITION BY RANGE (t) ( PARTITION VALUES ('0'), PARTITION VALUES ('50'), PARTITION VALUES ('200') )
COLUMNSTORE STORAGE (SEGMENTSIZE 13102 TABLESIZE 200000);
INSERT INTO session_cols SELECT jdoc->'u', (jdoc->'t')::int8, jdoc FROM session;
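The optimize step above projects two JSON fields into typed columns and range-partitions on `t`. The Python sketch below mirrors that projection for a single document. It assumes that each `PARTITION VALUES` entry is the inclusive lower bound of a range, which the slide does not state, and the helper names `to_columns` and `partition_index` are hypothetical.

```python
import bisect
import json

# Assumption: the values '0', '50', '200' are inclusive lower bounds,
# giving the ranges [0, 50), [50, 200), and [200, ...).
PARTITION_BOUNDS = [0, 50, 200]


def to_columns(jdoc_text):
    """Project a JSON document into the (u, t, origjdoc) column layout."""
    jdoc = json.loads(jdoc_text)
    return {"u": str(jdoc["u"]), "t": int(jdoc["t"]), "origjdoc": jdoc_text}


def partition_index(t):
    """Return the index of the range partition that holds value t."""
    # bisect_right counts the bounds that are <= t; subtract 1 for the index.
    index = bisect.bisect_right(PARTITION_BOUNDS, t) - 1
    if index < 0:
        raise ValueError(f"t={t} is below the lowest partition bound")
    return index


source = (
    '{"u": "8927ABBCD2873CCD", "v": "1.0", "uid": "TheTestUser1", '
    '"dv": "Apple iPhone 3GS", "t": 200}'
)
row = to_columns(source)
target_partition = partition_index(row["t"])
```

Keeping the original document in `origjdoc` alongside the extracted columns means fields that were not projected remain available for later queries.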
As a Service
• Launch cluster in minutes
• Removes the ‘headache’ of database management
• No hardware, no software installation & upgrades; no licenses
• Available on AWS & Rackspace
Recap
• BitYota is a Cloud based Data Warehouse Service for Big Data Analytics.
• Its core attributes are:
  • 100% Service oriented
  • Analytics on data from multiple sources/formats
  • Analytics on “fresh” data
• Customers are gaining deep insights on their business operations
• Customers in Games, Mobile apps, advertising/marketing, e-commerce