Security and Governance on Hadoop with Apache Atlas and Apache Ranger by Srikanth Venkat


Upload: artem-ervits

Post on 16-Apr-2017


Page 1: Enterprise Ready Security & Governance with Hortonworks Data Platform

Srikanth Venkat, Senior Director, Product Management

© Hortonworks Inc. 2011 – 2016. All Rights Reserved

Page 2: Protecting the Elephant in the Castle

• Kerberos
• Wire Encryption
• HDFS Encryption
• Apache Ranger
• Network Segmentation, Firewalls
• LDAP/AD
• Apache Knox

Page 3: Apache Ranger

Authorization
• Centralized platform to define, administer and manage security policies consistently across Hadoop components
– HDFS, Hive, HBase, YARN, Kafka, Solr, Storm, Knox, NiFi
• Extensible architecture
– Custom policy conditions, user context enrichers
– Easy to add new component types for authorization

Auditing
• Central audit location for all access requests
• Supports multiple audit destinations (HDFS, Solr, etc.)
• Real-time visual query interface

Ranger KMS
• Store and manage encryption keys
• Supports HDFS Transparent Data Encryption
• Integration with HSM
– Safenet LUNA

Page 4: Ranger Architecture

[Diagram labels: Enterprise Users; Hadoop Components (HDFS, HBase, Hive Server2, Knox, NiFi, Solr, Kafka, YARN, Storm, Atlas), each paired with a Ranger Plugin; Ranger Administration Portal; Ranger Policy Server; Ranger Audit Server; Integration API; Legacy Tools and Data Governance. HDFS and Solr each appear a second time as standalone boxes.]

Page 5: Enterprise Data Governance: Apache Atlas

Data Management
along the entire data lifecycle with integrated provenance and lineage capability
• Cross component lineage

Modeling with Metadata
enables comprehensive business metadata vocabulary with enhanced tagging and attribute capabilities
• Common Business Language
• Hierarchically organized – No dupes!

Interoperable Solutions
across the Hadoop ecosystem, through a common metadata store
• Combine and Exchange Metadata

[Diagram labels: STRUCTURED, UNSTRUCTURED, STREAMING, TRADITIONAL RDBMS, MPP APPLIANCES, METADATA, ATLAS METADATA, Kafka, Storm, Sqoop, Hive, Falcon, RANGER, Custom, Partners]

Page 6: High Level Architecture: 4 Key Points

[Diagram labels: Type System, Repository, Search DSL, Bridge, REST API, Graph DB, Search, Connectors, Messaging Framework, Hive, Storm, Falcon, Sqoop, Kafka, Custom]

1 Data Lineage: Only product that captures lineage across Hadoop components at platform level.
2 Agile Data Modeling: Type system allows custom metadata structures in a hierarchical taxonomy.
3 REST API: Modern, flexible access to Atlas services, HDP components, UI & external tools.
4 Exchange: Leverage existing metadata / models by importing them from current tools. Export metadata to downstream systems.

Page 7: Apache Atlas Component Integration

• Cross-component dataset lineage. Centralized location for all metadata inside HDP
• Single interface point for metadata exchange with platforms outside of HDP

[Diagram: Apache Atlas surrounded by Hive, Ranger, Falcon, Sqoop, Storm, Kafka, Spark, NiFi, HBase; legend: HDP 2.3, HDP 2.5, Beyond HDP 2.5]

Page 8: Next Generation Security & Governance for Hadoop (NEW)

Page 9: Demo Scenario

• HortoniaBank: mid-size financial services company (bank + health insurance services) expanding from US to international markets
• Employees in EU and US
• Multiple business units need access to customer data: Analysts, Compliance Admins, HR
• Customer data is co-mingled as well as isolated
• Leases data from external data brokers
• Needs to have rational security policies to provide the right level of access control to customer data across geographies, business functions, and to comply with external regulations (PII, HIPAA, EU Privacy etc.)

all user passwords: hadoop

Page 10: Demo Data

Customer data in hortoniabank DB
• 2 customer tables: 50K customer records each with 38 fields (PII, PHI, PCI & non-sensitive data)
– us_customers: USA person data only
– ww_customers: multi-language, multi-country, localized person data across the world
• 1 reference table: eu_countries (reference table for looking up EU country codes to country mappings, with Brexit etc.)

Finance DB: 1 data set leased from a data broker
– tax_2015: Data lease expired already (on Dec 31st 2015)

Page 11: Ranger Policies Setup for Demo

• Only US employees can see data in us_customers table and only from locations within the US (access_us_customers)
• Only US employees can see data rows of US persons in ww_customers table (filter_ww_customers_table + access_ww_customers)
• Only EU employees can see rows with EU person data in ww_customers table (filter_ww_customers_table + access_ww_customers)
• US HR team members can see all original unmasked data (PCI, PII, …)
• Analysts can view masked versions of sensitive data from WW customers table but are prohibited from viewing PII data in US tables (all masking policies are under the Masking tab of Resource Based Policies)
• No combination of zip code, MRN, and blood group data may be joined in any query (prohibition policy)
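The row-level rules above (filter_ww_customers_table) could be expressed as a Ranger row-filter policy. The following is a minimal sketch that only builds the JSON document, assuming the field names of Ranger's public v2 policy model (`policyType`, `rowFilterPolicyItems`, `rowFilterInfo.filterExpr`). The service name `hortoniabank_hive`, the column name `country`, and the two-country EU list are hypothetical placeholders that do not come from the slides; the demo itself may define the filters differently.

```python
import json

# In Ranger's policy model, policyType 2 denotes a row-filter policy
# (0 = access, 1 = data masking).
POLICY_TYPE_ROW_FILTER = 2


def row_filter_item(groups, filter_expr):
    """One row-filter item: the groups it applies to and the WHERE-style filter."""
    return {
        "groups": list(groups),
        "accesses": [{"type": "select", "isAllowed": True}],
        "rowFilterInfo": {"filterExpr": filter_expr},
    }


def build_row_filter_policy(service, name, database, table, items):
    """Assemble a row-filter policy document for one Hive table."""
    return {
        "service": service,
        "name": name,
        "policyType": POLICY_TYPE_ROW_FILTER,
        "isEnabled": True,
        "resources": {
            "database": {"values": [database]},
            "table": {"values": [table]},
        },
        "rowFilterPolicyItems": list(items),
    }


policy = build_row_filter_policy(
    service="hortoniabank_hive",       # hypothetical Ranger service name
    name="filter_ww_customers_table",  # policy name from the slide
    database="hortoniabank",
    table="ww_customers",
    items=[
        # 'country' is a hypothetical column name used for illustration only.
        row_filter_item(["us_employees"], "country = 'US'"),
        # Illustrative subset; the demo has an eu_countries reference table.
        row_filter_item(["eu_employees"], "country IN ('DE', 'FR')"),
    ],
)

print(json.dumps(policy, indent=2))
```

A document of this shape would normally be submitted to the Ranger Admin policy REST endpoint or entered through the Row Level Filter tab in the UI; the sketch deliberately stops short of any network call.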

Page 12: Personas Setup for Demo

User | Group | Access Privileges
joe-analyst | us_employees, analyst | US Data Only, non-sensitive data only, rest masked or forbidden depending on sensitivity
kate-hr | us_employees, hr | US Data Only, All sensitive data (PCI, PII, PHI)
ivana-eu-hr | eu_employees, hr | EU Data Only, All sensitive data
compliance-admin | compliance, us_employees | Compliance with licensing, can only see leased data sets

Page 13: Data Masking Policies setup for us_customers data for analyst group

Data Column | Data Column Description | Masking Type | Sample Output | Ranger Masking Policy
password | Password | Hash | 237672b21819462ff39fcea7d990c3e5 | mask_password_hash
nationalid | National ID | Show Last 4 | xx-xx-9324 | mask_nationalid_last4
ccnumber | Credit Card Number | Show First 4 | 4532xxxxxxxxxxxx | mask_ccnumber_first4
streetaddress | Street Address | Redact | nnn Xxxxxx Xxxxx | mask_streetaddress_redact
MRN | MRN | Nullify | null | mask_mrn_nullify
age | Age | CUSTOM | (Adds a random number below 20 to actual age) | mask_age_custom
birthday | Date of Birth | CUSTOM | 01-01-1987 (Keep year of birth and make date & month 01-01) | mask_dob_custom
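To make the masking types in the table concrete, here is a rough Python emulation of what each one does to a value. This is an illustrative sketch, not Ranger's or Hive's implementation: the function names are invented for this example, the input values are made up, and MD5 is assumed for the hash only because the sample output on the slide is 32 hex characters long.

```python
import hashlib
import random
import re
from datetime import date


def mask_hash(value):
    """Hash: replace the value with a hex digest (MD5 is an assumption)."""
    return hashlib.md5(value.encode("utf-8")).hexdigest()


def mask_show_last4(value):
    """Show Last 4: mask letters and digits with 'x' except the last four characters."""
    return re.sub(r"[A-Za-z0-9]", "x", value[:-4]) + value[-4:]


def mask_show_first4(value):
    """Show First 4: keep the first four characters, mask the remaining letters and digits."""
    return value[:4] + re.sub(r"[A-Za-z0-9]", "x", value[4:])


def mask_redact(value):
    """Redact: digits become 'n', uppercase 'X', lowercase 'x'; other characters are kept."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append("n")
        elif ch.isupper():
            out.append("X")
        elif ch.islower():
            out.append("x")
        else:
            out.append(ch)
    return "".join(out)


def mask_nullify(value):
    """Nullify: always return no value."""
    return None


def mask_age_custom(age, rng=random):
    """Custom: add a random number below 20 (0 to 19) to the actual age."""
    return age + rng.randrange(20)


def mask_dob_custom(birthday):
    """Custom: keep the year of birth, set day and month to 01-01."""
    return f"01-01-{birthday.year}"


# Made-up inputs chosen so the outputs have the same shape as the slide's samples.
print(mask_show_last4("12-34-9324"))
print(mask_show_first4("4532123456789012"))
print(mask_redact("123 Walnut Drive"))
print(mask_dob_custom(date(1987, 6, 15)))
```

In Ranger itself these behaviors are selected per column through a masking policy's mask type (for example a custom expression for the age and birthday columns) rather than written as application code.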

Page 14: Tag Based Policy for Leased data

Tagging Leased Data set in Atlas
• tax_2015 table tagged with EXPIRES_ON with expiry_date:2015-12-31

Tag Based Policy in Ranger for leased dataset (Policy name: tag_EXPIRES_ON)

Group | Access Privileges
public | No Access after data lease expiration date (denied)
compliance | Compliance team allowed to access data after expiration date
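The decision logic of the tag_EXPIRES_ON policy can be sketched in a few lines of Python. This only models the rule stated on the slide (deny everyone after the expiry date, except the compliance group); it is not how Ranger evaluates policies internally, and the function and variable names are hypothetical.

```python
from datetime import date

# Tag attributes as they would be attached in Atlas: table -> EXPIRES_ON expiry_date.
EXPIRES_ON_TAGS = {"tax_2015": date(2015, 12, 31)}


def expiry_policy_denies(table, user_groups, today, tags=EXPIRES_ON_TAGS):
    """Return True if the EXPIRES_ON rule denies access to `table`.

    The rule applies to every user (Ranger's 'public' group) once the expiry
    date has passed, with an exception for members of 'compliance'. Tables
    without the tag, or not yet expired, are not denied by this rule; other
    policies still decide whether access is actually granted.
    """
    expiry = tags.get(table)
    if expiry is None or today <= expiry:
        return False
    return "compliance" not in user_groups


# joe-analyst is denied after the lease expired; compliance-admin is not.
print(expiry_policy_denies("tax_2015", {"us_employees", "analyst"}, date(2016, 6, 1)))
print(expiry_policy_denies("tax_2015", {"compliance", "us_employees"}, date(2016, 6, 1)))
```

Because the rule is keyed on the tag rather than on the table name, any other data set tagged EXPIRES_ON in Atlas would fall under the same policy without a new Ranger policy being written.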

Page 15: HDP Security Benefits

Comprehensive Security
through a platform approach. Providing administrators with complete visibility into the security administration process

Data Protection
Encryption of data at rest and in motion, Dynamic Masking & Row Filtering

Centralized Administration
of security policies and user authentication. Consistently define, administer and manage security policies. Define a policy once and apply it to all the applicable components across the stack

Fine-Grain Authorization
for data access control for Database, Table, Column, LDAP Groups & Specific Users. Dynamic tag based policies

Integrated with Data Governance via Apache Atlas

[Diagram labels: YARN: Data Operating System; Storage; Operations; Security; Governance; Batch, Interactive, Streaming, Search, Machine Learning]