
A Case Study in Implementing Large-Scale, Real-Time Multidimensional Modeling on OLTP

(Using Cubes in the OLTP Domain)

㈜인브레인 우철웅

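About the Speaker

Career
• [Current] Technical Director, BI Division, 인브레인
• Instructor, 다우데이터
• IT department, 이천일아울렛

Main area: DW/BI Modeling & Architecture

Publications
• ADO & MTS (co-author)
• SQL Server 2000 Programming
• 실무를 고려한 SQL Server (SQL Server for Real-World Practice)
• 프로그래머 그들만의 이야기 (Programmers: Their Own Stories) (co-author)

Hobby: woodworking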

Lesson 1 | OLTP vs OLAP

OLAP, Cubes and Multidimensional Analysis

OLTP vs. OLAP

Why use OLAP in OLTP

OLAP, Cubes & Multidimensional Analysis

OLAP is an awkward name; the author of The OLAP Report calls the same thing FASMI.

• Fast - 90% of queries return in under 10 secs, and no query takes longer than 30 secs.

• Analysis - Drill-down, multiple aggregation techniques, sophisticated graphics, and trends all form part of this.

• Shareable - Good security at the back end and availability to a wide community of users, plus multi-currency and multilingual support to cope with the global economy.

• Multi-Dimensional - Like Excel pivot tables, but more so: the ability to place multiple dimensions of information on each axis of a cross-tab, with other dimensions used to further filter the results returned.

• Information - Real-world KPIs rather than raw numbers.

※ Fast Analysis of Shared Multidimensional Information (FASMI) is an alternative term for OLAP. The term was coined by Nigel Pendse of The OLAP Report (now known as The BI Verdict), because ...

OLTP vs. OLAP

Item | OLTP: Online Transaction Processing (Operational System) | OLAP: Online Analytical Processing (Data Warehouse)
Source of data | Operational data | Consolidation data
Purpose of data | To control business tasks | To help with decision support, problem solving, planning
What the data shows | Ongoing business processes | Multi-dimensional views
Inserts & updates | Short, fast inserts/updates | Periodic long-running batch jobs
Queries | Simple queries, few records | Complex queries, aggregations
Processing speed | Very fast | Can take many hours
Space requirements | Small if historical data is archived | Larger
Database design | Normalized, many tables | De-normalized, fewer tables; star or snowflake schemas

Source: http://www.rainmakerworks.com/


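Why use OLAP in OLTP

Query performance on large data volumes
• Matrix mass-data inquiry
• Multi-relation measure groups
• Multi-dimension cross

Simple bulk write-back of large data volumes
• Write-back is possible at an intermediate level.
• A single UPDATE CUBE statement updates every leaf cell beneath the tuple.
• Four disaggregation methods are provided.
• The reference measure used for the allocation ratio can be selected.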

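Query performance on large data volumes (matrix mass-data inquiry, multi-relation measure groups, multi-dimension cross)

Difficulties in an RDB
• Join performance between two big tables (Big Table vs. Big Table)
• Producing the matrix format

Example data: actual sales, inventory, production/release/shipping/receiving plans, various sales targets, and so on.

• Account (5,000) * Product (10,000) * 24 weeks * 5 measures = 6 billion cells
• Account (5) * Product (1,000) * 24 weeks * 5 measures = 600,000 cells
• The index key is large: PlanWeekID (int), AccountID (int), ProductID (varchar 30), WeekID (int) = 42 bytes
• Creating, updating, and joining on such an index consumes a lot of resources.
• The work can be done in the database, but it is difficult; doing it in the middle tier or the front end is costly.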
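The cell counts and the key width quoted on this slide can be checked with a few lines of arithmetic. The sketch below is only a sanity check: the variable names are illustrative, and the index-size estimate assumes one fact row per account/product/week combination with the five measures stored as columns, which the slide does not state.

```python
from math import prod

# Dimension sizes quoted on the slide: accounts * products * weeks * measures.
full_model_cells = prod([5_000, 10_000, 24, 5])
small_slice_cells = prod([5, 1_000, 24, 5])

# Composite key width: int = 4 bytes, varchar(30) counted at its 30-byte maximum.
key_bytes = 4 + 4 + 30 + 4  # PlanWeekID, AccountID, ProductID, WeekID

# Assumption (not from the source): a fully populated grid with one row per
# account/product/week and the measures stored as columns of that row.
assumed_rows = prod([5_000, 10_000, 24])
key_data_gb = assumed_rows * key_bytes / 1e9

print(f"{full_model_cells:,} cells in the full model")
print(f"{small_slice_cells:,} cells in the small slice")
print(f"{key_bytes}-byte key, roughly {key_data_gb:.1f} GB of key data under the assumption above")
```

Even before any join is attempted, the key columns alone would occupy tens of gigabytes in that fully populated case, which is consistent with the slide's point that index maintenance is expensive.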


Simple bulk write-back of large data volumes

• Write-back is possible at an intermediate level.
• The allocation method and the allocation ratio can be specified.

[Figure: values written at an intermediate level (Product * Week, Product * Month, Account * Product Group * Month) are allocated down to the leaf level (Account * Product * Week).]

Syntax:

    UPDATE [CUBE] <Cube_Name>
    SET <tuple>.VALUE = <value> [, <tuple>.VALUE = <value> ...]
        [ USE_EQUAL_ALLOCATION
        | USE_EQUAL_INCREMENT
        | USE_WEIGHTED_ALLOCATION [BY <weight value_expression>]
        | USE_WEIGHTED_INCREMENT [BY <weight value_expression>] ]

Example:

    UPDATE CUBE [SalePlans]
    SET ([Measures].[SalesPlanQty],
         [Date].[Month].&[2012-12],
         [Product].[Product Family].[Product Family].&[Drink]) = 2560,
        ~~ = 4300
        USE_WEIGHTED_ALLOCATION BY [Measures].[Sales3MonthAvg_Ratio]

Lesson 2 | Case Study

Project Overview

Customer Requirement

System Architecture

Issue & Solution

Optimize Result

Interactive Simulation

User friendly UI

Project Overview

User
• About 60 subsidiaries
• 5,000 accounts
• 5,000 users
• 10,000 product items

Timeline (business consulting: 6 months; development and roll-out: 21 months)
• Business design consulting: 6 months
• Phase 1 roll-out start: 7 months
• Phase 2 roll-out & upgrade: 7 months
• Phase 3 roll-out & upgrade: 3 months
• CPFR: 6 months (4 months overlapping)

Resource
• Consulting: 3 people
• Roll-out: 12 to 15 people
• Development: 15 to 25 people

System Resource
• DB servers: 2
• Cube servers: 7
• Web servers: 4
• Storage: 6 TB

Customer Requirement

Performance
• Data query: less than 10 sec. (30 sec.)
  Mass data: over 3 million cells; 200,000 cells on average
• Data simulation: real time
• Data save: less than 1 min. (3 min.)

Simulation
• Rule-based data aggregation & disaggregation
• Formula
• Simulation data can be saved

Convenience
• User-friendly operation functions
• Display optimization

System Architecture

Hardware architecture: OLAP server partitions

[Figure: hardware diagram. Labels: partner users on the public Internet ([YY DMZ]), internal users on the intranet, L2 and L4 switches, WEB #1 and WEB #2, OLAP #1 to OLAP #5 on MSCS clusters, ODS, DW, and a SAN switch in front of the storage.]

• Storage (USP#6): OLAP#1 (372 GB), OLAP#2 (310 GB), OLAP#3 (557 GB), OLAP#4 (310 GB), ODS (2 TB), DW (4 TB)
• Storage (USP#2): OLAP#1 (300 GB), OLAP#2 (300 GB)
• OLAP#5 (420 GB)

Cube architecture: measure group partitions

[Figure: DW/DM tables (Table_1 ... Table_10, plus delta tables such as Table_1_D) feed the cubes on OLAP 1 to OLAP 3. A measure group has yearly data partitions (for example Measure Group 1_2011), a delta partition (Measure Group 1 Delta), and a write-back partition (Measure Group 10 Write-back). Saves from the UI go to the tables for the measure group delta. Server partitions split dimension members A to H.]

1) Data partition: the basic unit of OLAP processing; it reduces conflicts under simultaneous access. Partitioned by period (year) and subsidiary.
2) Data partition for delta: a ROLAP delta partition for UI disaggregation and incremental data inserts.
3) Write-back partition: enables write-back for cube disaggregation.
4) Tables for measure group delta: matched with the delta data partitions (2) of the cubes, including cubes on other OLAP servers.

Software architecture

[Figure: smart clients and legacy system clients reach the SSAS cubes on the OLAP servers in either a 2-tier or a 3-tier configuration (through the web servers); the DW and ODS servers run MS-SQL and are loaded through SSIS.]

• OLAP Server: load is balanced by separating SSAS cubes according to system performance.
• Smart Client: all user operations (querying, entering, and saving data) run in a .NET-based smart client environment.
• Web Server: cellset tag reduction, compression, and connection pooling optimize traffic for slow networks and overseas users.
• DW, ODS Server: SQL Server Enterprise
• Data I/F (ETL): SQL Server Integration Services

Issue & Solution

Issue | Solution
Inquiry performance | DB: use SQL Server Analysis Services; server partitions and cube partitions; cube model optimization; MDX generation optimization. UI: abbreviated data CellSet XML tags; compressed/decompressed communication; lazy binding when binding the grid; built-in calculation engine.
Rounding differences on save | Macro-level planning: use cube write-back (e.g., Statistical Forecast Adj, Marketing Plan, Top-down). Micro-level planning: use table write-back, implemented in the UI (e.g., Bottom-up, Weekly Consensus).
Leaf level must be recalculated during simulation | Use table write-back (implemented in the UI with the calculation engine). Example, editing the total amount: allocate the amount to the children, compute each quantity from the unit price, recalculate each item amount, then recalculate the total amount.
Cube lock | When there are many concurrent users, set ROLAP zero-base aggregation on the write-back partition.
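The leaf-level recalculation chain in the simulation row (edit the total amount, allocate it to the children, derive quantity from unit price, then recompute the amounts) can be sketched as follows. This is a minimal illustration, not the project's calculation engine: the function names, the proportional allocation rule, and the sample items are all assumptions.

```python
import math

def round_half_up(x):
    """Round to the nearest integer, with .5 rounding up."""
    return math.floor(x + 0.5)

def simulate_total_amount(items, new_total):
    """items: list of (unit_price, qty) pairs. Returns (new_items, recomputed_total)."""
    old_amounts = [price * qty for price, qty in items]
    old_total = sum(old_amounts)
    new_items = []
    for (price, _), old_amt in zip(items, old_amounts):
        target_amt = new_total * old_amt / old_total   # step 1: allocate the amount
        qty = round_half_up(target_amt / price)        # step 2: quantity from unit price
        new_items.append((price, qty, price * qty))    # step 3: recompute item amount
    recomputed_total = sum(amt for _, _, amt in new_items)  # step 4: recompute total
    return new_items, recomputed_total

items = [(10, 3), (20, 2)]                 # amounts 30 and 40, total 70
print(simulate_total_amount(items, 140))   # quantities divide evenly
print(simulate_total_amount(items, 105))   # quantities must be rounded
```

In the second call the recomputed total differs from the value the user typed, because quantities are whole units. That is the kind of rounding difference the "Rounding differences on save" row refers to.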

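Optimize Result

[Figure: average data loading time in seconds versus data size (number of cells, 20,000 to 250,000), old system versus new system.]

Key Features

• SQL Server: multiple DB engines used
  - Matrix queries run 5 to 20 times faster than on the RDB.
  - Large data volumes are optimized through partitions.
  - A hybrid approach uses ROLAP for real-time processing.

• Multiple connection modes to support overseas users
  - 2-tier: domestic users with a good network
  - 3-tier: users in overseas countries with a poor network
  - Compression/decompression is applied for optimization.
  - Mass option: on low-spec computers, decompression is done on the hard disk instead of in memory.

• Lazy binding to optimize loading of large data grids
  - The data set and the grid viewer set are separated.
  - An override technique binds, and re-applies events to, the visible part first.

• Calculation engine to improve calculation logic
  - An optimized algorithm for 2- to 3-level allocation and reference-value allocation
  - Data indexing to optimize finding and filtering

• Removal of unnecessary attributes from data CellSets and tag optimization
  - Entities and attributes not used by the client are removed.
  - XML tags are minimized, reducing the size to 30% or less.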
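The cellset slimming and compression described above can be illustrated with a small sketch. The XML layout, the element and attribute names, and the choice of zlib are all hypothetical; the slides do not show the actual cellset format or the compression library used.

```python
import zlib
import xml.etree.ElementTree as ET

# Hypothetical verbose cellset: long tag names plus attributes the client ignores.
verbose_xml = "<CellSet>" + "".join(
    f'<Cell CellOrdinal="{i}" FormatString="#,#" Language="1033">'
    f"<Value>{i * 10}</Value><FormattedValue>{i * 10}</FormattedValue></Cell>"
    for i in range(1000)
) + "</CellSet>"

def slim_cellset(xml_text):
    """Keep only the ordinal and the value, using one-letter tag names."""
    root = ET.fromstring(xml_text)
    out = ET.Element("s")
    for cell in root.iter("Cell"):
        slim = ET.SubElement(out, "c", o=cell.get("CellOrdinal"))
        slim.text = cell.findtext("Value")
    return ET.tostring(out, encoding="unicode")

slim_xml = slim_cellset(verbose_xml)
packed = zlib.compress(slim_xml.encode("utf-8"))        # compress before sending
restored = zlib.decompress(packed).decode("utf-8")      # decompress on the client

print(len(verbose_xml), len(slim_xml), len(packed))
```

The two steps are independent: tag reduction shrinks the payload and the parsing work on the client, while compression further reduces what crosses a slow network link.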

Interactive simulation

Key Features: various kinds of disaggregation and aggregation

• Multi-level allocation/aggregation
  - Allocation and aggregation between the Summary and Detail grids
  - Allocation and aggregation from the All value within a grid

• Reference measure selection for allocation
  - Allocation based on the measure that was edited
  - Allocation based on a selected reference measure is also possible

• Automatic Qty & Amt conversion
  - When Qty is edited, Amt is calculated automatically from Price.
  - When Amt is edited, Qty is calculated automatically from Price.

• Optimized rounding
  - Optimized handling of rounding remainders for quantities and amounts
  - After rounding half up, the total is reconciled by adding to or subtracting from items, starting with the item that has the largest value.

• Management of allocation ratios
  - Ratios required for background allocation
  - Ratios required for UI allocation

[Figure: Grid 1 (Summary: Forecast, Sales), Grid 2 (Item G1, Item G2: Forecast, Sales), and Grid 3 (Item1, Item2: Forecast, Sales) over Jan-11 to May-11. Values are disaggregated from Grid 1 down to Grid 3 and aggregated back up. Rule1: Forecast; Rule2: Sales.]

User friendly UI

Key Features: various UI functions

• Measure selection: users choose the measures they want to query.
• Filter & Find: find or filter items of interest within the retrieved data.
• Summary Tab: user-selectable summary grouping.
• Cell Coloring & Blocking: a recognizable color for each type of blocking; blocking for NPI/EOL, unassigned items, frozen periods, and so on.
• Menu & My Menu: docking panels leave a wider working area; a quick-access toolbar supports frequently used actions.
• Grouping Level: on input screens, selecting a grouping level lets the user enter FCST at the level of the grouped items.
• Collapse & Expand: rows and columns can be collapsed and expanded with the + and - buttons.
• Currency conversion: after entry in the local currency, values can be converted to USD and KWD.
• Excel Down & Upload: formulas are preserved on Excel download; upload saves to the DB.
• Chart/Graph: basic charts and graphs; supported in OLAP according to the user's selection.
• Detail data pop-up: detail data can be viewed through activated links.

[Figure: the All, Group, and Item levels against Qty, Price, and Amt, with numbered steps 1 to 6.]

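Lesson 3 | Large Volume Key Point

Cube Design

Hierarchies & Relationship

Set RelationshipType to 'Rigid'

Measure Group Merge

Using a measure dimension

Write-back and Locking

Cube Design

• Define attribute relationships wherever possible when building hierarchies.
• Use numeric key columns for dimensions with a large number of members 1). Character and numeric keys differ greatly in performance, especially for Distinct Count.
• Set RelationshipType to 'Rigid' wherever possible.
• Consider merging measure groups that share the same dimensions and granularity.
• Create cube partitions for measure groups with many rows, so that each partition holds 2 million rows or fewer, or 50 MB or less.
• Avoid many-to-many dimension relationships over many rows 2), and many-to-many relationships between measures.

※ The reference figures given may vary with the server environment, the cube design, and the data. 1) 500,000 or more 2) 1 million rows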

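Hierarchies & Relationship

When building a hierarchy, always connect attribute relationships where the dependency is clear.
• Case1: correctly connected attribute relationships (natural hierarchy: attribute relationships follow the hierarchy)
• Case2: poorly connected attribute relationships (unnatural hierarchy: attribute relationships ignored or not set)

[Figure: Products hierarchy with member counts. Group1: 10; Group2: 50 (5x); Product: 1,000 (20x); Fact.]

Why poorly connected attribute relationships hurt performance
• Unnecessary aggregations are created during aggregation design, while important aggregations are left out.
• Query performance then differs greatly, depending on whether an aggregation exists for the queried tuple or for a nearby lower-level tuple. This affects listing child members, grouping child members, and cases where the parent member must be referenced.

Test results (unit: millisec)

Operation | Scope | Cells | Case1 Duration | Case1 CPUTime | Case2 Duration | Case2 CPUTime | Case2/Case1 Duration | Case2/Case1 CPUTime
Select | Top level | 1 | 0 | 0 | 0 | 0 | - | -
Select | All products of one PL2 | 272 | 0 | 32 | 16 | 62 | ∞ % | 194%
Select | 1 cell | 1 | 0 | 0 | 0 | 0 | - | -
Update | Top level | 3,127 | 266 | 0 | 266 | 78 | 100% | -
Update | One PL2 | 272 | 251 | 0 | 204 | 16 | 81% | -
Update | 1 cell | 1 | 219 | 32 | 220 | 32 | 100% | 100%

Results
• Performance differs greatly depending on whether an aggregation exists for the queried tuple or its lower-level tuples.
• Hierarchies should be steered toward natural hierarchies wherever possible.

※ Note: the ratios may differ depending on the cube design, the aggregations, and the execution environment.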

Numeric key columns

• For dimensions with many members, a numeric key minimizes the work of assigning the internal unique identifier (DataID) and the lookups against it.
• Distinct operations are optimized.

[Figure: Analysis Services storage. Dimension data consists of attribute stores (key store with key hash, property store with name hash, relationship store with bitmap indexes) and hierarchy stores. Measure group data consists of fact data and aggregations. The storage engine retrieves both through the storage engine cache; attribute lookups use the key hash table and the member name hash table.]

• Key store: DataID to key member values
• Property store: DataID to attribute properties and translations
• Relationship store: DataID to the DataIDs of related attributes

Ex) Relationship store

DataID | Color | Size
567 | 2 | 3

DataID 567: Touring BK-T44U-60; Color 2: Black; Size 3: 46

Ex) Color bitmap index

DataID | Black | Blue | Red
1 | 1 | 0 | 0
2 | 0 | 1 | 0
3 | 1 | 0 | 0

For attributes with a large distinct count, set AttributeHierarchyOptimizedState to NotOptimized so that no bitmap index is built.
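To see why bitmap indexes become costly for attributes with many distinct values, the Color example above can be modelled conceptually. This is a teaching sketch only: the function names are made up, and the real Analysis Services on-disk format is not shown in the slides and is certainly more compact than plain Python lists.

```python
def build_bitmap_index(member_values):
    """One bitmap per distinct value; bit i is 1 when row i has that value."""
    n = len(member_values)
    index = {}
    for row, value in enumerate(member_values):
        index.setdefault(value, [0] * n)[row] = 1
    return index

def data_ids_matching(index, value):
    """Return 1-based DataIDs whose bit is set for the given value."""
    return [row + 1 for row, bit in enumerate(index.get(value, [])) if bit]

colors = ["Black", "Blue", "Black"]          # DataID 1, 2, 3 from the slide
color_index = build_bitmap_index(colors)
print(color_index)
print(data_ids_matching(color_index, "Black"))

# A key-like attribute: every member has its own value.
unique_keys = list(range(1000))
key_index = build_bitmap_index(unique_keys)
total_bits = sum(len(bits) for bits in key_index.values())
print(total_bits)
```

With three rows and two colors the index is tiny, but the uncompressed size grows as distinct values times rows. For a key-like attribute that is the square of the member count, for an index that rarely helps filtering, which is the motivation for turning it off on such attributes.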

Set RelationshipType to 'Rigid'

Set the type according to whether the attribute relationship can change over time.

Definitions
• Rigid: the relationship between members does not change over time.
• Flexible: the relationship between members changes over time.
• If nothing is set, the default is Flexible.

Performance impact
• With Flexible, aggregations are dropped and recomputed as part of an incremental update.
• With Rigid, Analysis Services retains the aggregations when the dimension is incrementally updated.

Example of the problem

[Figure: the Products dimension (Group1 > Group2 > Products) and the aggregations (Agg1, Agg2; Group1 Agg, Group2 Agg) of the 2010, 2011, and 2012 partitions. Product A is moved to a different Group2. A daily batch processes the dimension incrementally, and another batch fully processes only the most recent partition. Only the aggregations of the 2012 partition remain available to serve queries.]

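Measure Group Merge

Difference between merging and not merging measure groups that share the same dimensions and granularity
• Case1: 6 measures in a single measure group
• Case2: 6 measures, each in its own measure group (6 measure groups)

Test results (unit: millisec)

Operation | Scope | Cells | Case1 Duration | Case1 CPUTime | Case2 Duration | Case2 CPUTime | Case2/Case1 Duration | Case2/Case1 CPUTime
Select | Top level | 1 | 16 | 32 | 31 | 217 | 194% | 678%
Select | All products of one PL2 | 272 | 16 | 32 | 31 | 186 | 194% | 581%
Select | 1 cell | 1 | 0 | 32 | 16 | 112 | - | 350%

Results
• Duration roughly doubles, and CPU time grows with the number of measure groups combined.
• Forecast tables that must support write-back should therefore be kept separate.
• Read-only tables with the same granularity should be merged through a view wherever possible.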

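Using a measure dimension

Use MDX Scope for paired measures.
• Case1: several measures stored in one column, distinguished by a discriminator
• Case2: each measure stored in its own column, with MDX Scope statements

ex) Ways to implement Qty, Price, and Amt, which correspond to a measure dimension

Test results

Operation | Scope | Cells | Case1 Duration | Case1 CPUTime | Case2 Duration | Case2 CPUTime | Case2/Case1 Duration | Case2/Case1 CPUTime
Select | Top level | 1 | 16 | 80 | 16 | 32 | 100% | 40%
Select | All products of one PL2 | 272 | 48 | 142 | 31 | 48 | 65% | 34%
Select | 1 cell | 1 | 16 | 0 | 0 | 0 | 0% | -
Update | Top level | 3,127 | 12,281 | 21,407 | 14,250 | 15,172 | 116% | 71%
Update | One PL2 | 272 | 1,905 | 3,213 | 1,689 | 1,531 | 89% | 48%
Update | 1 cell | 1 | 250 | 110 | 251 | 62 | 100% | 56%

Results: overall, storing each measure in its own column and using MDX Scope is about 50% cheaper.
• Select: about 30% less duration and about 60% less CPU.
• Update: duration is similar, but CPU usage is about 40% lower.
• Data extraction and loading are also easier with Case2.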


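Write-back and Locking

Bottleneck test for concurrent write-back commits
• Case1: write-back commits executed one partition at a time, sequentially
• Case2: write-back commits executed on all partitions at the same time

Operation | Account partition | Cells | Case1 Duration | Case1 CPUTime | Case2 Duration | Case2 CPUTime | Case2/Case1 Duration | Case2/Case1 CPUTime
Update | A Account | 3,128 | 2,937 | 3,414 | | | |
Update | B Account | 3,128 | 14,188 | 9,047 | | | |
Update | C Account | 3,128 | 13,861 | 14,687 | | | |
Update | Total | 9,384 | 30,986 | 27,148 | 60,251 | 32,747 | 194% | 121%

Results
• Concurrent commits take longer than sequential commits.
• UPDATE statements execute within each session and are not part of the bottleneck, so run all UPDATE statements first and then commit once to keep the number of commits to a minimum.
• The locking mechanism for a commit differs from that of an RDB and consumes a lot of resources.
• To support many concurrent write-backs, consider separating the cubes or the servers.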


Lesson 4 | Write back method

Write-back Architecture

Write-back Method

Cube Write-back

Table Write-back

MS Intelligence planning

Write-back Architecture

[Figure: Analysis Services architecture. Server: Analysis Server with data sources, cubes, mining models, and the metadata repository on one or more disk stores; administered through Analysis Manager (Microsoft Management Console), the Analysis Add-in Manager, Enterprise Manager, and the management object model. Client: PivotTable Service serving custom applications and custom add-ins, with data sources for local cubes and local data mining models.]

[Figure: cube write-back flow. Clients (Excel What-If Analysis, the Management Studio MDX window, ADOMD.NET) issue UPDATE CUBE statements against the cube through the session cache and the server cache.]

1. UPDATE CUBE updates the session cache.
2. Commit.
3. Cube write-back: the write-back partition is processed.

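Write-back method

Method | Description
Cube Write-back | Direct cube write-back. A write-back partition must be configured on the cube.
Table Write-back | Increment data is inserted into a table. A cube partition linked to that table is required.

Method | Advantages | Disadvantages
Cube Write-back | Updates are possible at an intermediate level. All members beneath the tuple can be updated in one operation. | A commit triggers partition processing, so there is a locking cost. Final data must be extracted with MDX.
Table Write-back | Complex disaggregation requirements can be met. With ROLAP zero aggregation, no partition processing is needed. Final data is extracted with a query. | Only leaf tuples can be updated. Development is harder: it requires an understanding of data cellsets, the front end must calculate increment values, and inserts must match the fact table structure. Because there is no locking, duplicate values must be validated and corrected.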

Cube Write-back

• Direct cube write-back
• A write-back partition must be configured on the cube.
• Final data is extracted with MDX.

[Figure: clients (Excel What-If Analysis, the Management Studio MDX window, ADOMD.NET) write to a measure group made up of Partition 2011, Partition 2012, and a write-back partition, over the fact table in the database. (1) UPDATE CUBE, (2) Commit, (3) partition processing triggered by the commit. Final data is extracted with MDX.]

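Table Write-back

• Increment data is inserted into a table.
• A cube partition linked to that table is required. With ROLAP zero aggregation, no partition processing is needed.
• The front end must calculate the increment value.
• Final data is extracted with a query.

[Figure: a measure group made up of Partition 2011, Partition 2012, and a general partition for write-back, each over a fact table in the database.]

Example: the April Forecast for S-Phone II is changed from 120 to 115.

Product | Measure | March | April | May
S-Phone I | Sales | 80 | |
S-Phone I | Forecast | 85 | 90 | 80
S-Phone II | Sales | 105 | |
S-Phone II | Forecast | 110 | 120 | 130

Insert ~ Values(S-Phone II, April, Forecast, -5)

120 + (-5) = 115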
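The increment insert in this example can be tried out with an in-memory SQLite database standing in for the fact table. This is a sketch under assumptions: the table name `fact_forecast`, its columns, and the helper functions are hypothetical, and the real system inserts into a SQL Server fact table read by a ROLAP partition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fact_forecast (product TEXT, month TEXT, measure TEXT, value INTEGER)"
)
# Existing fact row: S-Phone II, April, Forecast = 120.
conn.execute("INSERT INTO fact_forecast VALUES ('S-Phone II', 'April', 'Forecast', 120)")

def current_value(product, month, measure):
    """The value a query (or a ROLAP partition) sees: the sum of all rows for the tuple."""
    row = conn.execute(
        "SELECT COALESCE(SUM(value), 0) FROM fact_forecast "
        "WHERE product = ? AND month = ? AND measure = ?",
        (product, month, measure),
    ).fetchone()
    return row[0]

def write_back(product, month, measure, new_value):
    """Front-end side: compute the increment and insert it as a new row."""
    delta = new_value - current_value(product, month, measure)
    if delta != 0:
        conn.execute(
            "INSERT INTO fact_forecast VALUES (?, ?, ?, ?)",
            (product, month, measure, delta),
        )
    return delta

delta = write_back("S-Phone II", "April", "Forecast", 115)
print(delta, current_value("S-Phone II", "April", "Forecast"))
```

The original row is never updated; the table accumulates increments, and the summed value is what readers see. That is why no partition processing is needed when the partition reads the table directly.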

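Cube Write-back vs. Table Write-back

Comparison of cube write-back with table insert & processing
• Case1: write-back performed against the cube's write-back partition
• Case2: insert into the table, then process the partition (the Forecast UI calculates the increments and inserts them)

Metric | Case1 | Case2 | Case2/Case1
Duration | 4,273 | 883 | 20%
CPUTime | 9,243 | 1,489 | 16%

Results
• Table write-back, where the Forecast UI builds the increments and inserts them, uses roughly one fifth of the resources and time.
• It can therefore relieve the bottleneck in systems with many concurrent write-backs, but because there is no locking step, duplicate values must be validated and corrected.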

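Locking and duplicate-value correction

Cube write-back: the same as an ordinary locking mechanism.

1. Session 1 reads 60.
2. Session 2 reads 60.
3. Session 2 writes 75.
4. Session 1 writes 66.

The writes are serialized by the lock (locking interval), so the stored value goes 60, 75, 66.

Table write-back: the data is saved directly into the table of the partition being written, that is, a delta table. Because no cube lock is used, a separate correction step is needed. Without correction the value goes 60, 75, 81; after correction it is 66.

Tuple | Read | Write | Delta value | Value | Step
A | | | null | 60 | ①②
A | 60 | 75 | 15 | 75 | ③
A | 60 | 66 | 6 | 81 | ④
A | | | -15 | 66 | corrected to the latest value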
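The lost-lock situation in this table can be replayed in a few lines. The sketch models the delta table as a plain list and uses made-up function names; how the real system detects which written value is the latest is not described in the slides, so the target value is simply passed in.

```python
def write_delta(delta_rows, read_value, write_value):
    """Each session stores only the difference from the value it read."""
    delta_rows.append(write_value - read_value)

def correct_to_latest(delta_rows, latest_written_value):
    """Append a correcting row so the sum equals the most recent written value."""
    correction = latest_written_value - sum(delta_rows)
    if correction != 0:
        delta_rows.append(correction)
    return correction

delta_rows = [60]                   # initial value of tuple A
write_delta(delta_rows, 60, 75)     # step 3: session 2 read 60, writes 75
write_delta(delta_rows, 60, 66)     # step 4: session 1 also read 60, writes 66
uncorrected = sum(delta_rows)       # both increments are counted
correction = correct_to_latest(delta_rows, 66)

print(uncorrected, correction, sum(delta_rows))
```

Both sessions computed their increment from the same stale read, so the increments stack. The correcting row cancels the stale increment and brings the tuple back to the value written last.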

Allocation Method

USE_EQUAL_ALLOCATION
The new value is divided equally, by count, across all leaf cells of the tuple.
<leaf cell value> = <New Value> / Count(leaf cells that are contained in <tuple>)

USE_EQUAL_INCREMENT
The difference between the new value and the existing value is divided equally, by count.
<leaf cell value> = <leaf cell value> + (<New Value> - <existing value>) / Count(leaf cells contained in <tuple>)

USE_WEIGHTED_ALLOCATION
The new value is allocated in proportion to the weight expression. Ex) A Forecast entered for product family A is allocated by the ratio of average sales over the past 3 months.
<leaf cell value> = <New Value> * Weight_Expression

USE_WEIGHTED_INCREMENT
The difference between the new value and the existing value is allocated in proportion to the weight expression. Ex) Existing entries are partly respected; only the increase or decrease is allocated by the weight expression.
<leaf cell value> = <leaf cell value> + (<New Value> - <existing value>) * Weight_Expression

※ Notes
- When used on a measure that contains integers, USE_WEIGHTED_ALLOCATION with a weight can return slightly inaccurate results because of incremental rounding.
- Weight_Expression must be a value, or an expression that evaluates to a value, between 0 and 1.
- If no weight expression is specified, Weight_Expression = <leaf cell value> / <existing value> is used.

Four allocation methods are provided; USE_WEIGHTED_ALLOCATION and USE_WEIGHTED_INCREMENT are the ones mainly used.
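The four formulas above can be written out directly. The sketch below applies them to a plain list of leaf values; it illustrates the arithmetic only, the function names are illustrative, and it says nothing about how Analysis Services executes UPDATE CUBE internally. Treating the default weight as leaf value divided by the existing total follows the note above.

```python
def default_weights(leaves):
    """Default weight: each leaf's share of the existing total."""
    existing = sum(leaves)
    if existing == 0:
        raise ValueError("default weights are undefined when the existing value is 0")
    return [v / existing for v in leaves]

def use_equal_allocation(leaves, new_value):
    return [new_value / len(leaves)] * len(leaves)

def use_equal_increment(leaves, new_value):
    step = (new_value - sum(leaves)) / len(leaves)
    return [v + step for v in leaves]

def use_weighted_allocation(leaves, new_value, weights=None):
    weights = default_weights(leaves) if weights is None else weights
    return [new_value * w for w in weights]

def use_weighted_increment(leaves, new_value, weights=None):
    weights = default_weights(leaves) if weights is None else weights
    diff = new_value - sum(leaves)
    return [v + diff * w for v, w in zip(leaves, weights)]

leaves = [10, 30]   # existing value of the tuple: 40
print(use_equal_allocation(leaves, 80))
print(use_equal_increment(leaves, 80))
print(use_weighted_allocation(leaves, 80))
print(use_weighted_increment(leaves, 80, weights=[0.5, 0.5]))
```

The ALLOCATION variants overwrite the leaves from the new total, while the INCREMENT variants keep the existing leaf values and distribute only the difference, which is why the latter "partly respect" earlier entries.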

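Allocation Method: data consistency when weights are used on an integer measure

• Case1: USE_WEIGHTED_ALLOCATION with a weight specified
• Case2: all other methods (USE_EQUAL_ALLOCATION, USE_EQUAL_INCREMENT, USE_WEIGHTED_INCREMENT)

Item | Case1 | Case2
Input value preserved | Approximated | Preserved
Processing | Each member's weight is applied and each value is rounded | Corrected in one of several ways, depending on the method
0 or NULL values | A value is assigned even if the initial value is NULL or 0, provided a separate weight is given | Becomes 0 when the existing value is 0 or NULL; may be non-zero for the first member

Results
• Unless approximate values are acceptable, consider a method other than USE_WEIGHTED_ALLOCATION with a weight.
• If the leaf-level values are small or must be exact, consider a method other than USE_WEIGHTED_ALLOCATION with a weight.

Allocation with a weight (existing 30, entered 61, result 60)
Step1. Allocate the new value in proportion to the weight expression.
Step2. Round each allocated value.

Member | Current | Weight Measure | Weight Ratio | Allocated value | Rounded
Group 1 | 30 | 200 | 1 | 61 | 60
Member1 | 3 | 14 | 0.07 | 4.27 | 4
Member2 | 7 | 50 | 0.25 | 15.25 | 15
Member3 | 9 | 62 | 0.31 | 18.91 | 19
Member4 | 11 | 54 | 0.27 | 16.47 | 16
Member5 | (null) | 20 | 0.10 | 6.10 | 6

Allocation without a weight (existing 18, rounded sum 37, final 38)
Step1. Calculate the overall multiplier for the entered member: 38/18 ≈ 2.11.
Step2. Multiply each child member by the multiplier and round the result.
Step3. Apply the difference from the total to the members one by one, starting with the member that has the largest difference.

Member | Existing | New | Scaled value | Rounded | Difference | Adjustment | Final
Group 1 | 18 | 38 | 38 | 37 | 1 | | 38
Member1 | 1 | | 2.11111111 | 2 | 0.11111111 | | 2
Member2 | 2 | | 4.22222222 | 4 | 0.22222222 | | 4
Member3 | 3 | | 6.33333333 | 6 | 0.33333333 | | 6
Member4 | 4 | | 8.44444444 | 8 | 0.44444444 | 1 | 9
Member5 | 5 | | 10.5555556 | 11 | -0.4444444 | | 11
Member6 | 3 | | 6.33333333 | 6 | 0.33333333 | | 6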
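Both worked tables can be reproduced numerically. The sketch below is an illustration of the steps as described on the slide, not the Analysis Services implementation: the function names are made up, rounding is round-half-up, and the reconciliation rule is the "largest difference first" rule from Step3.

```python
import math

def round_half_up(x):
    return math.floor(x + 0.5)

def weighted_allocation_rounded(new_total, weight_measures):
    """Case1: allocate by weight ratio, then round each member independently."""
    total_weight = sum(weight_measures)
    return [round_half_up(new_total * w / total_weight) for w in weight_measures]

def scale_round_reconcile(existing, new_total):
    """Case2: scale by new/existing, round, then hand out the leftover units."""
    factor = new_total / sum(existing)
    scaled = [v * factor for v in existing]
    result = [round_half_up(s) for s in scaled]
    leftover = new_total - sum(result)
    step = 1 if leftover > 0 else -1
    # Members whose rounding error points in the direction of the leftover go first.
    order = sorted(range(len(result)),
                   key=lambda i: (scaled[i] - result[i]) * step, reverse=True)
    for i in order[:abs(leftover)]:
        result[i] += step
    return result

case1 = weighted_allocation_rounded(61, [14, 50, 62, 54, 20])
case2 = scale_round_reconcile([1, 2, 3, 4, 5, 3], 38)
print(case1, sum(case1))
print(case2, sum(case2))
```

Case1 ends one unit short of the entered value because each member is rounded on its own, whereas Case2 spends the leftover unit on Member4, the member with the largest rounding difference, and so preserves the entered total.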

MS Intelligence planning

• Centralized data model with Analysis Services.
• Dimensional data modeling with PowerPivot for Excel.
• Form and report authoring through Excel 2010 PivotTables.
• Data entry and What-If analysis through Excel PivotTables.
• Online document storage and collaboration, with security and workflow for forms and reports, through SharePoint Server.

http://technet.microsoft.com/en-us/library/gg558556.aspx




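Using Excel What-If Analysis

• Enable write-back
• Set the allocation method
• Perform the write-back

[Figure: Excel What-If Analysis settings, with options numbered 1 to 4. The combination of options selected determines which allocation method is used: USE_EQUAL_ALLOCATION, USE_EQUAL_INCREMENT, USE_WEIGHTED_ALLOCATION, or USE_WEIGHTED_INCREMENT.]

Demo

Cube Write-back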

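Summary

Using Analysis Services, a multidimensional DB
• Where the RDB is slow or inconvenient to use, consider changing the back end.
• For matrix-style data queries such as forecasts, Analysis Services is the effective choice.
• It is close to essential when the DW data volume is large or a screen displays a large amount of data.

Choosing a write-back approach
• If the forecast requirements are not complex, implement them simply with Excel What-If Analysis.
• If the business requirements call for complex allocation, consider a custom implementation.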
