Big Data Discovery in the Telecom Industry: Raising Enterprise Intelligence

Emily Wu, Endeca Solution Specialist, APAC, Dec 2014

Copyright © 2014 Oracle and/or its affiliates. All rights reserved.


Page 1

Big Data Discovery in the Telecom Industry: Raising Enterprise Intelligence

Emily Wu, Endeca Solution Specialist, APAC, Dec 2014

Page 2

Agenda

1. The value of data discovery
2. Technical architecture
3. Use case discussion
4. Q&A

Oracle Confidential – Internal

Page 3

“There are a lot of commonly held beliefs about big data that need to be challenged, with the first being that you simply adopt Hadoop and are good to go. The problem is that Hadoop is a technology, and big data isn't about technology. Big data is about business needs. In reality, big data should include Hadoop and relational [databases] and any other technology that is suitable for the task at hand.”

– Ken Rudin, Head of Analytics, Facebook

Big Data = Hadoop + NoSQL + Relational


Page 4

The telecom industry has a complete foundation of big data capabilities

CT, ICT, DT

Information architectures today: decisions based on database data

Big data: decisions based on all your data

Data sources shown in the diagram (axis labels: unstructured, structured):
• Video and Images
• Machine-Generated Data
• Social Data
• Documents
• Session Data Records (SDRs)
• Call Data Records (CDRs)
• DTV / Set Top Box Data
• Signalling Data
• Network Logs
• Deep Packet Inspection (DPI)
• Weblog Data
• SMS Data
• Network Fault/Alarm Logs
• Network Perf Counters
• Telemetry/Sensor Data
• Location Data
• Sales Data
• CRM Data
• Order & Fulfillment Data
• Trouble Ticket Data
• Churn Data
• Call Center Logs

Page 5

Main obstacles to developing big data analytics projects

Page 6

Data generation

Collection and storage

Analysis and computation

Decisions and sharing

Problem focus

Page 7

Realizing data value

Known needs and questions

Unknown needs and questions

Actionable information

Known data resources

Known views

Multiple complex data sources; diverse presentation views

Page 8

Start with exploration to unlock the long-tail value of big data

Value

Investment

Data exploration and discovery

Data-driven decisions and actions

Explore unknown questions; build analytic applications; drive decisions and actions

Page 9

Big Data: A New Approach

Traditional analytics projects vs. big data analytics projects

Traditionally, 80% of the work goes into data preparation

The complexity of exploding data volume, variety, and velocity

Unsustainable!

Page 10

Value is hard to find quickly

80% of the effort is spent on evaluation and data preparation

The data is uncertain
• Unfamiliar and hard to master
• Potential value is not obvious
• Requires a large amount of manipulation

Over-reliance on scarce, highly skilled technical resources

The tools are complex
• Early Hadoop tools were built for technical experts
• Existing BI tools were not designed for Hadoop
• Integrated solutions are hard to roll out broadly

Page 11

Big data analytics needs a fundamentally new approach

Fast processing and text enrichment make the data easier to understand

Let everyone discover and share the value of big data

One intuitive user interface ...

Find and explore big data to uncover its potential value

Find, Explore, Transform, Discover, Share

Page 12

Big data exploration and discovery

Find, Explore, Transform, Discover, Share

Page 13

Easily Find Relevant Data Sets

• Navigate a rich catalog of all data in the Hadoop cluster
• Familiar search and guided navigation for ease of use
• Access data set summaries, annotation and recommendations
• Provision your own data through self-service upload
• Data is automatically enriched with extracted locations, terms, sentiment
• Browse personal big data projects and those shared by the community

Page 14

Explore the Data and Understand Potential

• Understand the shape of the data. Visualize attributes by type
• Entropy-based sorting by information potential
• View attribute statistics, data quality and outliers
• Use the scratch pad to see statistical correlations between attribute combinations
• Evaluate whether a data set is worthy of further investment
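The "entropy-based sorting" bullet can be illustrated with a small sketch. It assumes that "information potential" means the Shannon entropy of each attribute's observed values; the slide does not say how the product actually scores attributes, and the sample records and function names below are hypothetical.

```python
import math
from collections import Counter

def shannon_entropy(values):
    """Shannon entropy (in bits) of the observed value distribution."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def rank_attributes(records):
    """Return attribute names sorted from highest to lowest entropy, plus the scores."""
    names = sorted({name for rec in records for name in rec})
    scores = {n: shannon_entropy([rec.get(n) for rec in records]) for n in names}
    return sorted(names, key=lambda n: scores[n], reverse=True), scores

# Hypothetical sample data (not from the slides).
records = [
    {"region": "north", "plan": "A", "active": "yes"},
    {"region": "south", "plan": "A", "active": "yes"},
    {"region": "east", "plan": "B", "active": "yes"},
    {"region": "west", "plan": "B", "active": "yes"},
]
ranking, scores = rank_attributes(records)
print(ranking)
```

An attribute whose values never vary (here `active`) scores zero and sinks to the bottom, which is one plausible reading of why such a sort helps decide where to look first.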

Page 15

Transform and Enrich Data to Make it Ready

• Intuitive, user-driven data wrangling
• Library of data transformations to replace values, convert types, collapse, reshape, pivot, group, custom tag, merge and much more
• Data enrichments for inferring location and language. Theme, entity and sentiment enrichments for text
• Preview results, undo, commit and replay transforms
• Run on sample data in memory or on the full data set in Hadoop

Page 16

Analyze the Data to Discover New Insights

• Mash up different data sets for deeper perspectives
• Drag and drop from a rich library of interactive visualizations to compose discovery dashboards
• Filter through data with powerful search and intuitive guided navigation
• Publish blended data sets back to Hadoop
• Share projects, bookmarks and snapshots with team members for collaboration

Page 17

Share Results and Publish for Enterprise Leverage

• Share and collaborate with the team

– Share projects, bookmarks and snapshots then collaborate and iterate

• Publish back to Hadoop

– Transforms and enrichments may be applied to original data sets in Hadoop

– Publish blended data sets back to HDFS

• Leverage results in other tools

– Publish data to Hadoop in format optimized for advanced analytic tools (e.g. ORAAH)

– Hadoop compliant BI tools (e.g. OBIFS) can burst out to the masses

– Leverage any native Hadoop tooling (e.g. Pig, Hive, Impala, Python, etc)

– Integrate BDD data sets with DWH to secure, govern and optimize for query performance (e.g. Oracle Big Data SQL)

Oracle Big Data Discovery plays well with the big data ecosystem

[Diagram: raw data and transformed data sit in the data reservoir (HDFS). The stages are Find, Explore, Transform, Discover, and Share & Collaborate. Results are published and then leveraged by the data warehouse, business intelligence, advanced analytics, and other Hadoop tools.]

Page 18

Agenda

1. The value of data discovery
2. Technical architecture
3. Use case discussion
4. Q&A

Page 19

Data model

Guided navigation

Full search

Unstructured content

Iterative deployment

Page 20

Discovery and insight: an adaptive, flexible data analytics architecture

Page 21

A Faceted, Flexible Model Enables Iterative Data Growth

[Diagram: an Endeca index that brings together Production Record, Claim Record, Sales Record, Component Details Records, and NHTSA Record. Attribute labels shown: Make, Model, Model Yr., Color, VIN, Problem Code, Causal Service P/N, Production P/N, Repair Type, Test Codes, Date, Manufacturing Quality, Region, Date Sold, Dealer Location, Customer Arrival Date, PPAP Status, Status, Rev Level, Component Description, Plant Name, Complaint Terms. Example values shown: Gear Box, Loss of Power, Injured.]

Page 22

Flexible Faceted Model Providing Discoverable Relationships

Page 23

Hybrid Analytic Search Database

Structured data: Sales Transactions

TxnID   ProductID   Category        Amount
12324   506         Mountain Bike   $499
12325   507         Road Bike       $1399

• TxnID = 12324 • ProductID = 506 • Category = Mountain Bike • Amount = $499.99
• TxnID = 12325 • ProductID = 507 • Category = Road Bike • Amount = $1399.49

Semi-structured data: Product Catalog XML

• Suspension = Fox 32 F-Series • FrameType = Aluminium • Saddle = Bontrager SSR • Mountain Accessories = Fork sag meter • Mountain Accessories = Water Bottle
• Weight = 20lb. • FrameType = Composite • Saddle = Bontrager Race

Unstructured data: Product Reviews

Review: 1301 Product: 506 A great bike for off road. Smooth ride over the bumps.
Review: 1327 Product: 507 Disappointing for the price. The frame feels heavier than I expected.

• Review = A great bike for off road. Smooth ride over the bumps.
• Review = Disappointing for the price. The frame feels heavier than I expected.

Text Enrichment: Terms and Sentiment

• ReviewSentiment = Positive • ReviewTerm = Great • ReviewTerm = Off Road • ReviewTerm = Smooth • ReviewTerm = Bumps
• ReviewSentiment = Negative • ReviewTerm = Disappointing • ReviewTerm = Price • ReviewTerm = Heavy
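As a rough illustration of the record model on this slide, the sketch below stores one product's structured, semi-structured, and enriched text attributes as a single schema-less record in which an attribute may hold several values. The `Record` class and `matches` helper are hypothetical names for illustration; this is not the Endeca API.

```python
from collections import defaultdict

class Record:
    """A schema-less record: every attribute maps to a list of assigned values."""

    def __init__(self):
        self.attrs = defaultdict(list)

    def assign(self, name, value):
        # Repeated assignment adds another value instead of overwriting.
        self.attrs[name].append(value)
        return self

def matches(record, name, value):
    """True if the record carries this attribute value."""
    return value in record.attrs.get(name, [])

bike = Record()
# Structured attributes from the sales transaction.
bike.assign("TxnID", 12324).assign("ProductID", 506)
bike.assign("Category", "Mountain Bike").assign("Amount", 499.99)
# Semi-structured attributes from the product catalog (multi-valued).
bike.assign("Mountain Accessories", "Fork sag meter")
bike.assign("Mountain Accessories", "Water Bottle")
# Unstructured review text plus its enrichment output.
bike.assign("Review", "A great bike for off road. Smooth ride over the bumps.")
for term in ["Great", "Off Road", "Smooth", "Bumps"]:
    bike.assign("ReviewTerm", term)
bike.assign("ReviewSentiment", "Positive")

print(matches(bike, "Mountain Accessories", "Water Bottle"))
```

Because no attribute has to be declared up front, a new source can add attributes to existing records without a schema change, which is the point the slide makes about a flexible model.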

Page 24

Hybrid Analytic Search Database

Sources: Data Warehouses; Data Appliances; Departmental Databases (Quality, Research, Call Center); Enterprise Applications; External and Third Party; Social Media

Diverse sources: structured data (DW/DM), semi-structured data, and unstructured text

Integration of all kinds of data

High-value discovery applications

Page 25

Data model

Guided navigation

Full search

Unstructured content

Iterative deployment

Page 26

Guided Navigation

Product ID: 506 (99), 507 (78), 357 (43), 768 (23)

Category: Mountain Bike (112), Road Bike (76), Hybrid (23)

Price

Frame

Accessories

(Sample records: the same sales transactions, product catalog attributes, product reviews, and text enrichment output shown on Page 23.)

Page 27

Data filtering

Every value in Endeca becomes an "available refinement" that can be selected as a filter

Discrete values become a list, with the number of records shown for each value

Continuous values become a range selector; changing the range shows the matching result count

Discrete values can be fuzzy-searched on their own, and selected values can be excluded from the result set

Data can be filtered step by step, and each applied filter is shown in the navigation bar
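The filtering behaviour described above (value lists with counts, range selection, exclusion, and a trail of applied filters) can be sketched in a few lines. This is a minimal illustration over an in-memory list, not how Endeca implements it; the function names are hypothetical and the third and fourth sample rows are made up.

```python
from collections import Counter

def refinements(records, attribute):
    """Available refinements: each value with the number of current records carrying it."""
    return Counter(r[attribute] for r in records if attribute in r)

def select(records, attribute, value):
    """Keep only records with this value."""
    return [r for r in records if r.get(attribute) == value]

def exclude(records, attribute, value):
    """Drop records with this value from the result set."""
    return [r for r in records if r.get(attribute) != value]

def in_range(records, attribute, low, high):
    """Range filter for continuous values."""
    return [r for r in records if attribute in r and low <= r[attribute] <= high]

sales = [
    {"ProductID": 506, "Category": "Mountain Bike", "Amount": 499.99},
    {"ProductID": 507, "Category": "Road Bike", "Amount": 1399.49},
    {"ProductID": 506, "Category": "Mountain Bike", "Amount": 479.00},  # hypothetical row
    {"ProductID": 357, "Category": "Hybrid", "Amount": 650.00},         # hypothetical row
]

breadcrumbs = []  # applied filters, as a navigation bar would list them
current = select(sales, "Category", "Mountain Bike")
breadcrumbs.append(("Category", "Mountain Bike"))

print(refinements(sales, "Category"))
print(refinements(current, "ProductID"))
```

Recomputing `refinements` on the filtered list is what keeps the counts next to each value in step with the filters already applied.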

Page 28

Data model

Guided navigation

Full search

Unstructured content

Iterative deployment

Page 29

Data search

Search can run across all data or be limited to a specific attribute

Navigation label for a search across all data

Typing part of a keyword triggers automatic matching; then choose whether to search all data or a single attribute

Navigation label for a search within a specific attribute

Page 30

Hit Highlighting and Relevance Ranking

Hit highlighting; configurable relevance ranking

Page 31

Snippeting, Stemming, Etc.

Snippeting

Stemming

"Did you mean?"

Spell Correction

Synonyms
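Two of the search aids named on this slide, "did you mean?" suggestions and synonym expansion, can be approximated with the Python standard library. The sketch below uses `difflib.get_close_matches` as a stand-in for spell correction; the vocabulary, synonym table, and cutoff are assumptions, and the slides do not describe the actual matching algorithm.

```python
import difflib

# Hypothetical indexed vocabulary and synonym table.
VOCABULARY = ["mountain", "road", "hybrid", "saddle", "suspension", "frame"]
SYNONYMS = {"bike": {"bicycle", "cycle"}}

def did_you_mean(term, vocabulary=VOCABULARY, cutoff=0.75):
    """Suggest the closest indexed term for a misspelling, or None."""
    term = term.lower()
    if term in vocabulary:
        return None  # spelled correctly, nothing to suggest
    close = difflib.get_close_matches(term, vocabulary, n=1, cutoff=cutoff)
    return close[0] if close else None

def expand(term):
    """Expand a query term with its configured synonyms."""
    return {term} | SYNONYMS.get(term, set())

print(did_you_mean("mountian"))
print(expand("bike"))
```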

Page 32

Data model

Guided navigation

Full search

Unstructured content

Iterative deployment

Page 33

Supported unstructured content

• Call Center Verbatim

• Quality Record

• Patient Research

Data Warehouses Data Appliances

Applications

Content Management Systems

Standard Database Queries

CMS Connectors

Web Services Website APIs

Social Media Web Services Website APIs

Internal Content

Web Data

Web Content

Reviews

Forums

Social Media

Page 34

Parsing unstructured content

Metadata • Date • Source • Patient • Department

Procedures • Biopsy • Urinalysis • Catheterization • Dilation

Medications • Linopril • Midazolam • Atenolol

Doctors • S. Ryan • W. Summer • K. Hughes

Themes • Stent failure • Allergy to penicillin • No obvious rashes

Sentiment • Positive • Negative • Neutral

Facilities • Norwood Memorial • Quincy Medical

External

Industry Ontology

Internal

Data Records

NLP Technology

Tagging & Enrichment

Page 35

Unstructured data: tags

Tags

Each tag with the number of records it applies to

Tags shown as a list

Page 36

Unstructured data

Search summary: the total number of records matching the search, the number of records with positive sentiment, and the number with negative sentiment. Endeca can perform sentiment analysis on unstructured data.

On top of the automatically extracted theme elements, you can tag related elements using your own lexicon, for example "Logistics", and so surface the feedback related to that tag. The number that follows each tag is the count of feedback records. Every term can be clicked to drill down, with no drill path to configure: when the user clicks a term, the system adds it to the conditions of a new search.

These are the elements tagged "Product".

These are the theme elements extracted automatically by the system. They show the topics users care about most, and entities such as people, places, organizations, and products can also be extracted. Lexicons and themes can be supplied iteratively to normalize the data. These terms are clickable for drill-down in the same way.

In the search box you can type a keyword; a partial keyword is matched automatically.
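The lexicon-based tagging and the search summary described on this slide can be sketched as follows. The lexicon contents, the sample feedback, and the function name are hypothetical, and the sentiment labels are taken as given rather than computed, since the slides do not describe the sentiment model.

```python
from collections import Counter

# Hypothetical user-defined lexicon: tag -> trigger terms.
LEXICON = {
    "Logistics": {"delivery", "shipping", "courier"},
    "Product": {"frame", "saddle", "bike"},
}

def tag_record(text, lexicon=LEXICON):
    """Return the tags whose trigger terms appear in the text."""
    words = {w.strip(".,!?").lower() for w in text.split()}
    return sorted(tag for tag, terms in lexicon.items() if words & terms)

# Hypothetical feedback records; sentiment is assumed to be already assigned.
feedback = [
    {"text": "A great bike for off road.", "sentiment": "Positive"},
    {"text": "Delivery was late and the frame feels heavy.", "sentiment": "Negative"},
    {"text": "Shipping took two weeks.", "sentiment": "Negative"},
]
for rec in feedback:
    rec["tags"] = tag_record(rec["text"])

# Tag followed by its record count, as in the tag list on the slide.
tag_counts = Counter(tag for rec in feedback for tag in rec["tags"])
# Search summary: total, positive, and negative record counts.
summary = Counter(rec["sentiment"] for rec in feedback)

print(tag_counts, len(feedback), summary)
```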

Page 37

Data model

Guided navigation

Full search

Unstructured content

Iterative deployment

Page 38

How discovery applications differ from the traditional development process

Structured, Semi-Structured, Unstructured

Acquire information from many different sources, link it together, and parse the unstructured information

Data is managed automatically, with no need to build a model in advance

Freely drag and drop components in Studio

Iterate

Interactive search, navigation, and data visualization

Page 39

Add more data incrementally over time to create different data mashups for exploration, and explore the underlying relationships across multiple structured and unstructured data sets

Search, filter, and click on charts and reports. See the charts and reports for the filtered data set. Add or adopt new data sets to explore the relationships between data sets and find the cause of a problem.

Oracle BI Server models

Cloud and social services

Personal Excel files

Enterprise data provided by IT

Add new whitelists, text, and tagged records to normalize the data

Output new key values and join them to the existing data by value matching

Page 40

Map visualization

Display styles: numbered points, location points, heat map

A map can hold multiple layers with different display styles, and each layer can be based on its own data

The map can be searched

Range search is supported

Page 41

Data model

Guided navigation

Full search

Unstructured content

Iterative deployment

Page 42

Agenda

1. The value of data discovery
2. Technical architecture
3. Use case discussion
4. Q&A

Page 43

What can DPI data tell us?

Billing data
• Attributes: 54
• Data: 354KB

Itemized call detail data
• Attributes: 48
• Data: 413MB

DPI data
• Records: 15.43 million
• Attributes: 16
• Data: 3.56GB

Page 44

Page 45

Page 46

Page 47

Broadband internet usage data

The tag cloud highlights which websites are visited most often

Type the keyword "facebook" in the search box to search

Page 48

Target group: customers who frequently use Facebook. Analysis dimensions: the customers' data plan and place of residence.

Customers in Tainan City can be offered plan bundles designed for Facebook use

Use the search box to isolate the customers who frequently use Facebook

Customers on data plan DO449 can be offered plan bundles designed for Facebook use

Customers using Facebook in this time period are aged 20 to 50

Page 49

Target group: customers who frequently use Facebook. Analysis dimensions: the customers' age band and voice plan.

Switch to age ranges to analyze the share of each age band

See which voice plans this customer group prefers

Page 50

Pointing at the 30-40 age band shows directly that it accounts for 92.57% of users, and you can drill straight down to the detail records

Page 51

The 30-40 age band can be added to the filter conditions

See the share of each plan bundle

Gender distribution of these customers

Page 52

The detail list of frequent FB customers is available directly and can be exported

Page 53

Results of the Endeca FB customer analysis

• Knowing where frequent FB customers live and which age bands they fall into, the marketing department can offer them dedicated plans or special data bundles.

• How much time do frequent FB customers spend on FB, and at which times of day?

• What voice plans, data plans, promotions, and gender do these customers have?

• The marketing department can provide personalized services to these heavy users on that basis.
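The segmentation walked through on Pages 47 to 53 (isolate frequent Facebook users by keyword, then break them down by city or data plan) can be sketched as below. The rows, the `DO299` plan code, the Taipei entries, and the two-visit threshold are all assumptions for illustration; only Tainan and `DO449` come from the slides.

```python
from collections import Counter

# Hypothetical browsing rows: (customer, site, city, data_plan).
ROWS = [
    ("C001", "facebook.com", "Tainan", "DO449"),
    ("C001", "facebook.com", "Tainan", "DO449"),
    ("C002", "facebook.com", "Tainan", "DO299"),
    ("C002", "m.facebook.com", "Tainan", "DO299"),
    ("C003", "facebook.com", "Taipei", "DO449"),
    ("C003", "facebook.com", "Taipei", "DO449"),
    ("C004", "news.example.com", "Taipei", "DO299"),
]
CITY, PLAN = 2, 3  # column positions within a row

def frequent_users(rows, keyword, min_visits=2):
    """Customers with at least min_visits visits to sites matching the keyword."""
    visits = Counter(cust for cust, site, _, _ in rows if keyword in site)
    return {cust for cust, n in visits.items() if n >= min_visits}

def share_by(rows, customers, column):
    """Percentage of the selected customers per value of one column."""
    per_customer = {row[0]: row[column] for row in rows if row[0] in customers}
    counts = Counter(per_customer.values())
    return {value: round(100 * n / len(per_customer), 2) for value, n in counts.items()}

frequent = frequent_users(ROWS, "facebook")
by_city = share_by(ROWS, frequent, CITY)
by_plan = share_by(ROWS, frequent, PLAN)
print(sorted(frequent), by_city, by_plan)
```

The keyword filter plays the role of the search box, and each `share_by` call corresponds to switching the analysis dimension on the dashboard.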

Page 54

How do we know whether a marketing campaign is effective?

Page 55

Customer complaint management: analysis of three months of complaints from the mobile and online service channels

The chart shows that complaints from the online channel rose noticeably from January 30 to February 5. Those days coincided with the Spring Festival holiday, which may explain the increase.

Page 56

Customer complaint management: click January 31 to see the details below. There were 347 complaints that day, and the day's complaint hot topics and full text are shown as well.

Page 57

Customer complaint management: complaints can be analyzed by region. By complaint volume, Guangzhou ranks first, which is easy to understand, but Ningxia surprisingly ranks second. Drilling down into the cities of Ningxia shows that Wuzhong ranks first, together with the hot terms and the full text of the complaints from Wuzhong.

Click "signal" to see the signal-related complaints and quickly find out what the signal problems are

Page 58

Demo: Unicom customer complaint management. The search function helps you quickly locate the information you need. For example, to see complaints related to data traffic, type "data traffic" in the search bar, and the predictive query feature quickly lists the related phrases and terms.

Select "network traffic", and the related statistics and content appear quickly.

Page 59

What can customer trouble records reveal?

Process and load: Link the data together using a simple drag & drop tool

Explore and discover: Data is ready for business users to explore & discover

Siebel database: Trouble Tickets, Customers, Assets (Equipment), Service Addresses, Products, Orders, Order Activity, Trouble Ticket Activity

External data: Forum Posts, Twitter Feed

Network Data: SFM, EMS, CCS

Page 60

Finding #1: 38.8% of problems can be resolved by simply rebooting the equipment

Discovery 1

Total Trouble Tickets: 389,867
Customers who raised TT: 190,751 (30.06% of customers)
Customers who did not raise TT: 443,754 (70%)
Total Customers: 634,505

389k TT by Resolution Type
• 38.8% Advise Customer (98k customers)
• 50.7% Others
• 6.0% VDSL Replaced (21k customers)
• 4.5% PG Replaced

• Insight #1: Providing customer advice resolves the TT of ~98k customers (38.8% of Total Trouble Tickets)
• Insight #2: CSRs advised customers to reboot their equipment for 25% of the trouble tickets
• Insight #2: 6% of TT were closed by advising customers to call back if problems persisted after self-troubleshooting (~22k TT)
• Note: Almost all of these customers called back within the same day to raise another trouble ticket

Recommendations and Next Steps
• Send customers a list of fixes for simple problems
• Use first call resolution as an indicator of good customer experience
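The headline figures on this slide can be cross-checked with simple arithmetic. The sketch below uses only numbers printed on the slide; the variable names are made up for illustration.

```python
# Figures as printed on the slide.
total_customers = 634_505
raised_tt = 190_751
total_tickets = 389_867
resolution_share = {
    "Advise Customer": 38.8,
    "Others": 50.7,
    "VDSL Replaced": 6.0,
    "PG Replaced": 4.5,
}

# Customers who did not raise a trouble ticket.
did_not_raise = total_customers - raised_tt

# Share of customers who raised at least one trouble ticket, in percent.
share_raised = round(100 * raised_tt / total_customers, 2)

# The resolution-type shares should account for all tickets.
share_total = sum(resolution_share.values())

print(did_not_raise, share_raised, round(share_total, 1))
```

The difference reproduces the 443,754 customers and the ratio reproduces the 30.06% shown on the slide, and the four resolution shares add up to 100%.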

Page 61

Finding #2: Resolving faults at the customer's premises by replacing equipment

Discovery 2

• Insight #3: Lowyat forum complaints about connectivity issues with the DIR-615 modem model. There are 16,866 trouble tickets in which the PG was replaced.
• Insight #4: VDSL modem replacement resolves ~23k TT (~21k customers)
• Insight #5: The Huawei VDSL modem replacement rate (13.91%) is higher than ZTE's (11.65%). Huawei: ~91k customers installed; ZTE: ~69k customers installed.
• Insight #6: 8.1% of VDSL customers needed a modem replacement within the 12-month warranty period. Huawei: 8.6%; ZTE: 7.6%.

[Chart: % of VDSL modems replaced by time since installation (0m - 6m, 7m - 12m, 13m - 18m, > 18m), Huawei vs. ZTE, y-axis 0% to 6%]

Recommendations and Next Steps
• Sign an SLA between ITM and the device distributors. Note that at a cost of RM300 per truck roll, ~12k truck rolls would cost TM up to RM3.6 million to resolve the trouble tickets.

Note: Beyond the one-year warranty period, TM charges the customer a replacement fee of RM50.
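The cost note and the replacement figures on this slide can be reproduced with a short calculation. It uses the rounded values printed on the slide (RM300 per truck roll, ~12k truck rolls, ~91k and ~69k installed modems), so the results are approximations; the variable names are hypothetical.

```python
# Truck-roll cost, using the slide's rounded inputs.
COST_PER_TRUCK_ROLL_RM = 300
truck_rolls = 12_000
truck_roll_cost_rm = COST_PER_TRUCK_ROLL_RM * truck_rolls

# Approximate replaced modems per vendor: installed base times replacement rate.
installed = {"Huawei": 91_000, "ZTE": 69_000}
replacement_rate = {"Huawei": 0.1391, "ZTE": 0.1165}
replaced = {v: installed[v] * replacement_rate[v] for v in installed}
total_replaced = sum(replaced.values())

print(f"Truck roll cost: RM{truck_roll_cost_rm:,}")
print(f"Estimated modems replaced: {total_replaced:,.0f}")
```

The product gives the RM3.6 million quoted on the slide, and the estimated replacements come out near 21k, in line with the "~21k customers" figure for VDSL replacement.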

Page 62

Finding #3: Problems related to "major outage"

Discovery 3

Twitter hashtags: #TTDI #Connection #bad #NotWorking #down #outage

• Insight #7: Based on Twitter feeds, Unifi services were severely interrupted in the TTDI area from 14-Feb-13 to 17-Feb-13
• Insight #8: Major outage is one of the major root causes of TDI exchange TT, with a spike in Apr'12 (858 TT, 5% of all TT from TDI: 858 out of 17,000)
• Insight #9: Major outages across all exchanges peaked in June'12 (3,806 TT) instead of April'12
• Insight #1 25% (4,621 TTs) of Major Outage CTT are not tied to an NTT (TT closed using the Resolution Code "NTT Resolved")

[Chart labels: Major Outage; TDI Exchange (1,274 TT); All Exchanges (24,880 TT)]

Recommendations and Next Steps
• Identify steps to ensure that every CTT with the CTT Resolution Code "NTT_Resolved" is tied to an NTT. This ensures proper reporting on the impact of NTT.

Page 63

The capability to monetize big data assets

Data assets

Big data integration; big data management; big data analytics; big data applications

Connect, load, and govern any data source

Simplify access to all data

Explore, discover, analyze, and predict

Accelerate data-driven decisions and actions

Page 64

From exploration to action: make big-data-driven operations business as usual

[Diagram labels: operational process data; telecom network, service network, IoT, and social data; events & data; real-time data streams; event engine; decision engine; actionable events; data reservoir; data factory; data warehouse; BI and reporting; discovery lab; discovered information; output & sharing; valuable information; actionable insights; decisions and actions; exploration and innovation]

Page 65

Agenda

1. The value of data discovery
2. Technical architecture
3. Use case discussion
4. Q&A