From In-Memory Technologies to Hadoop: Big Data Management with SAP / SAP


Upload: sap-turkiye

Post on 10-Feb-2017


SAP FORUM İSTANBUL

Discover Simple

From In-Memory Technologies to Hadoop: Big Data Management with SAP

Speaker: Mustafa Kutlu

Company: SAP

© 2015 SAP AG or an SAP affiliate company. All rights reserved. 2

Agenda

SAP HANA Platform Overview

Big Data and Apache Hadoop

HANA – Hadoop Integration

SAP HANA Vora

SAP HANA Platform

The problem with the current IT landscape

Silos, delays, and complexity hinder business agility and innovation

Application silos

Multiple data copies

Batch processing

Partial business view

No real-time insights

Limited ability to innovate

[Diagram: eight application silos (Manufacturing, Finance, Sales, Service, Streaming, Predictive, Analytics, Spatial), each with its own App, Logic, and Data layers, linked by ETL & Staging data stores.]

The solution: Make all data readily available to all applications

Reduce data movement and data latency; improve business agility and innovation

Unified application workloads

Unified data: single copy

Real-time processes

Complete business view

Ability to react in real-time

Ability to innovate

[Diagram: ONE Platform for ALL applications. The Manufacturing, Finance, Sales, Service, Streaming, Predictive, Analytics, and Spatial apps are shown with their Logic and Data layers.]

The solution is only possible with in-memory data management

Only in-memory data enables all applications to become real-time

Speed, Simplicity, Innovation

No waiting for data access and processing

All application logic (OLTP & OLAP) processed in one system

All data types processed in one system

[Diagram: ONE In-Memory Platform for ALL applications. The Manufacturing, Finance, Sales, Service, Streaming, Predictive, Analytics, and Spatial apps, each with its own Logic, sit on a single Data layer.]

SAP HANA Platform

The data management and application platform for all applications

SAP, ISV and Custom Applications

All Devices

SAP HANA Platform: Application Services | Database Services | Integration Services

OLTP + OLAP | ONE open platform | ONE copy of the data

All Data

SAP HANA Platform

Comprehensive services to make information available to any application

Interfaces: SQL | J/ODBC | ADO.NET | OData | JSON | HTML5 | MDX | XML/A

Application Services: Web Server | JavaScript | Fiori UX | Application Lifecycle Management

Graph | Planning | Text Analytics | Search | Spatial | Predictive | Text Mining | Series Data | Functional Libraries

Integration Services: Smart Data Integration | Smart Data Streaming | Smart Data Access | Smart Data Quality | Remote Data Sync | Hadoop Integration

Database Services: In-Memory Columnar | Parallelization | Compression | Multitenant Database Containers | Dynamic Tiering

Deployment: On-Premise | Cloud | Hybrid

Big Data and Apache Hadoop

Open Source Community Gift: Apache Hadoop

2.8 ZB in 2012

40 ZB by 2020

85% from New Data Types

15x Machine Data by 2020

New Sources (Sentiment, Clickstream, Sensor)

Urgent Need:

Predictive, including data mining and machine learning

‘Unstructured’ Analysis, including text, media and spatial

Scalable Storage

Streaming Data, including sensors, social and media

What is Hadoop?

Why Hadoop?

HANA – Hadoop Integration

Our Journey

Hadoop components: HDFS | YARN/MR | HBase | Hive | Spark | Pig | Mahout | Ambari

SP06: Hive added as a remote source; ODBC-based communication

SP07: Query optimization such as remote caching and join relocation

SP09: Reading HDFS directly; MapReduce job execution

SP10: Spark SQL added as a new remote source; Ambari launcher tile in HANA Cockpit

The journey thus far

HANA & Hadoop Integration:

SQL on Hadoop via SDA (virtual tables): Hive (SPS06)

Remote caching with Hive (SPS07)

Connectivity to Apache Spark using ODBC

Execution of MapReduce jobs via HANA (Virtual Functions) and direct access to HDFS (SPS09)

Spark SQL adapter via SDA (SPS10)

Join relocation to Hadoop through Spark RDD

Unified administration through Ambari integration for Hortonworks

Key Benefits:

Deep integration for storage and processing

Optimized data access between HANA and Hadoop

Data tiering to Hadoop for cold storage
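The virtual-table idea behind SDA can be illustrated with a small toy model. The sketch below is plain Python, not HANA or Hive code: `RemoteSource`, `VirtualTable`, and the sample rows are hypothetical names invented for illustration. It only shows the principle that a virtual table holds no data of its own and pushes the filter to the remote system, so that only matching rows travel back.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, object]


@dataclass
class RemoteSource:
    """Hypothetical stand-in for a remote system such as Hive."""
    tables: Dict[str, List[Row]]
    rows_shipped: int = 0  # counts rows sent back to the caller

    def scan(self, table: str, predicate: Callable[[Row], bool]) -> List[Row]:
        # The filter runs on the remote side; only matches are shipped.
        result = [row for row in self.tables[table] if predicate(row)]
        self.rows_shipped += len(result)
        return result


@dataclass
class VirtualTable:
    """Holds no data of its own, only a pointer to a remote table."""
    source: RemoteSource
    remote_name: str

    def select(self, predicate: Callable[[Row], bool]) -> List[Row]:
        # Delegate the query instead of copying the whole table locally.
        return self.source.scan(self.remote_name, predicate)


# Illustrative data only; not taken from the presentation.
hive = RemoteSource(tables={
    "clicks": [
        {"user": "a", "country": "TR", "hits": 12},
        {"user": "b", "country": "DE", "hits": 7},
        {"user": "c", "country": "TR", "hits": 3},
    ]
})
clicks = VirtualTable(source=hive, remote_name="clicks")
turkish_clicks = clicks.select(lambda row: row["country"] == "TR")
print(len(turkish_clicks), hive.rows_shipped)
```

In this model the `rows_shipped` counter makes the benefit visible: the third row of the table never leaves the remote side.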


SAP HANA Platform for Big Data

Virtualized Data Access for SAP HANA


Rapid Data Provisioning with Data Virtualization


SAP HANA Platform for Big Data

Platform for Big Data Reference Architectures

Data Lifecycle Manager (DLM) for Hadoop as a tier

Define a data aging strategy with DLM (available in DWF 1.0 SP01)

Leverage SAP HANA Dynamic Tiering (warm store), Hadoop, or SAP Sybase IQ in SAP HANA native use cases. A tool-based approach lets you model aging rules on tables that displace ‘aged’ data to HANA extended tables, optimizing the memory footprint of data in SAP HANA.

[Diagram: inside SAP HANA, the Data Lifecycle Manager controls DATA MOVEMENT * from the HOT-STORE (Column Table) to the WARM-STORE (Extended Table).]
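The aging rule described on this slide can be pictured as a date cutoff applied to a table. The following is a minimal sketch in plain Python, not the DLM tool or any HANA API: `apply_aging_rule`, the `created` column, and the 365-day threshold are assumptions chosen for illustration.

```python
from datetime import date, timedelta
from typing import Dict, List, Tuple

Row = Dict[str, object]


def apply_aging_rule(
    hot_store: List[Row],
    date_column: str,
    max_age_days: int,
    today: date,
) -> Tuple[List[Row], List[Row]]:
    """Split rows into (rows that stay hot, rows displaced to warm)."""
    cutoff = today - timedelta(days=max_age_days)
    stay_hot = [row for row in hot_store if row[date_column] >= cutoff]
    move_to_warm = [row for row in hot_store if row[date_column] < cutoff]
    return stay_hot, move_to_warm


# Illustrative rows only; not taken from the presentation.
orders = [
    {"id": 1, "created": date(2015, 9, 1)},
    {"id": 2, "created": date(2014, 1, 15)},
    {"id": 3, "created": date(2015, 6, 30)},
]
hot, warm = apply_aging_rule(
    orders, "created", max_age_days=365, today=date(2015, 10, 1)
)
print([row["id"] for row in hot], [row["id"] for row in warm])
```

Only the rows older than the cutoff leave the hot list, which is the sense in which aging reduces the in-memory footprint.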

SAP HANA VORA

Why SAP HANA Vora?

Bridging the Digital Divide for Analysts, Developers, DBAs, and Data Scientists

Simplify Big Data Ownership

Democratize Data Access: for data science discovery

Precision Decision Making: in enterprise apps + analytics

Business coherence

On-demand correlation

New insights from aggregated data

Interactive data

Enrich the candidate data sets

Simplified landscape

Improved correlation with historical data

SAP HANA Vora

What’s Inside and What Does It Do?

SAP HANA Vora is an in-memory query engine which leverages and extends the Apache Spark execution framework to provide enriched interactive analytics on Hadoop.

Democratize Data Access: Mashup API Enhancements | Open Programming

Make Precision Decisions: Drill Downs on HDFS | Compiled Queries | HANA-Spark Adapter

Simplify Big Data Ownership: Unified Landscape | Any Hadoop Clusters

Enable Precision Decisions with Contextual Insights in Enterprise Systems

Gain business coherence with business data and big data

Drill Downs: familiar OLAP experience on Hadoop to derive business insights from big data, such as drill-down into HDFS data

Compiled Queries: enable applications and data analysis to work more efficiently across nodes

HANA-Spark Adapter: improved performance between distributed systems

[Diagram: the SAP HANA in-memory platform (Application, Database, Integration, and Processing Services, with its In-Memory Store) connects through the HANA-Spark Adapter to Vora and Spark instances running on YARN over HDFS files; other apps also access the cluster.]
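The drill-down mentioned on this slide is the usual OLAP operation of moving from a coarse aggregate to a finer one. The sketch below shows that operation in plain Python, with no Vora or Spark calls: `aggregate`, the `region` and `country` dimensions, and the sample figures are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Record = Dict[str, object]


def aggregate(
    records: List[Record], dimensions: Sequence[str], measure: str
) -> Dict[Tuple[object, ...], float]:
    """Sum `measure` for each combination of the given dimensions."""
    totals: Dict[Tuple[object, ...], float] = defaultdict(float)
    for record in records:
        key = tuple(record[dim] for dim in dimensions)
        totals[key] += record[measure]
    return dict(totals)


# Illustrative records only; not taken from the presentation.
sales = [
    {"region": "EMEA", "country": "TR", "amount": 100.0},
    {"region": "EMEA", "country": "DE", "amount": 250.0},
    {"region": "APJ", "country": "JP", "amount": 80.0},
    {"region": "EMEA", "country": "TR", "amount": 50.0},
]

# Top level: totals per region.
by_region = aggregate(sales, ["region"], "amount")

# Drill down: restrict to one region and add the next dimension.
emea_only = [record for record in sales if record["region"] == "EMEA"]
emea_by_country = aggregate(emea_only, ["region", "country"], "amount")
print(by_region, emea_by_country)
```

The country-level totals for a region add up to that region's top-level total, which is the consistency an analyst expects when drilling down.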

Democratize Data Access for Data Science Discovery

Pursue new inquiries without compromise on data and easily integrate these insights with all data

Open Programming: extensive programming support for Scala, Python, C, C++, R, and Java allows data scientists to use their tool of choice

Mashup Enhancements (Spark data-source API enhancement): enable data scientists and developers who prefer Spark R and Spark ML to mash up corporate data with Hadoop/Spark data easily

Optional use of SAP HANA for delegated, multi-engine pre-processing: optionally, leverage HANA’s multiple data processing engines for developing new insights from business and contextual data

[Diagram: the SAP HANA Platform (Application, Database, Integration, and Processing Services, with its In-Memory Store) connects through HANA Smart Data Access, UDFs, and others to Vora and Spark instances running on YARN over HDFS files.]

Thank you

Mustafa Kutlu

Solution Manager

[email protected]