[164] pinpoint

Pinpoint: APM for large-scale distributed environments. Kang Woonduk (강운덕), NAVER Service Platform Development Center

Upload: naver-d2

Post on 07-Jan-2017


Contents

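1. Problems of distributed environments
2. Pinpoint features and technology
- CallStack Trace
- Distributed Transaction Trace
3. Troubleshooting in distributed environments
4. RPC Timeline Pattern
5. New features & future direction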

1. Problems of distributed environments

Why

Past

Now

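Problems

Situation

Tens to hundreds of servers

Many software modules

Services integrated in complex ways

Problem

No way to tell how the services are connected

Failures are caused by other services

Monitoring individual servers does not reveal the overall situation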

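Why it gets harder once the system is distributed

Network dependencies are very hard to observe

Existing approaches make it hard to identify the problem

Logging

Single WAS Monitoring

GC Log, Heap Dump, Thread Dump

System Monitoring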

Performance problems in a complex system

[Diagram: overseas Proxy, API Gateway, Service, DB, Cache, RPC]

Anyway, let's trace the slow request

Let's log slow requests in Tomcat

Let's log the HttpClient calls too

Let's log the Apache access log too

Forget it, let's just give up~

The reality of distributed architecture

One big monolith

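2. Introducing Pinpoint

Pinpoint

An APM tool for collecting performance data and analyzing problems in large-scale distributed systems

- APM (Application Performance Management)

Distributed transaction tracing

Automatic application topology discovery & visualization

Horizontal scalability

Code-level visibility

Collects performance data without modifying code

http://github.com/naver/pinpoint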

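What became possible?

Problems that previously went unnoticed can now be found

Problems are solved easily and quickly

The time needed to diagnose and fix a problem is greatly reduced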

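Architecture

[Diagram: Java Agent -> Collector -> HBase <- Web]

Profiled machines (Host 1, Host 2, ... Host n): each Host runs the Application in a JVM, with the Java Agent specified through a JVM option

Java Agent -> Collector: Send Profile Data over UDP/TCP (Thrift)

Collector -> HBase: HBase Write

Web -> HBase: HBase Read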


[Server map diagram: TomcatA (multiple instances), TomcatB, TomcatC, TomcatD, TomcatF, Mysql1, Mysql2, Cubrid, Cache, and two remote addresses (remote address…)]


CallStack Trace

Distributed Transaction Trace

TOMCAT A: HttpClient.execute()

TOMCAT B: Tomcat.receive();

Pinpoint's core features

CallStack Trace

Distributed Transaction Trace

CallStack Trace

ClassABC

AMethod(); -> BMethod(); -> CMethod();

AMethod() {
    BMethod() {
        CMethod() {
        }
    }
}

CallStack

CMethod();

BMethod();

AMethod();

CallStack Trace

JVM Classloader

Original method:

public void AMethod() {
    BMethod();
}

Method after the PinpointAgent has modified it:

public void AMethod() {
    AInterceptor.before();
    BMethod();
    AInterceptor.after();
}

The PinpointAgent intercepts the code at class-loading time and modifies the bytecode
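The class-loading hook described above can be sketched with the standard java.lang.instrument API. This is only an illustration under stated assumptions, not Pinpoint's actual agent: the class name SketchAgent, the nested LoggingTransformer, and the com/example/ package prefix are all hypothetical, and the sketch only reports which classes it would rewrite instead of changing any bytecode.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;

// Hypothetical agent class. It is package-private here so the snippet does not
// depend on a file name; a packaged agent class is conventionally public and
// named in the Premain-Class attribute of the agent jar's manifest.
final class SketchAgent {

    // The JVM calls premain before the application's main() when it is
    // started with -javaagent:<path to the agent jar>.
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new LoggingTransformer("com/example/"));
    }

    static final class LoggingTransformer implements ClassFileTransformer {
        private final String prefix; // internal class-name prefix, e.g. "com/example/"

        LoggingTransformer(String prefix) {
            this.prefix = prefix;
        }

        boolean isTarget(String className) {
            return className != null && className.startsWith(prefix);
        }

        @Override
        public byte[] transform(ClassLoader loader, String className,
                                Class<?> classBeingRedefined,
                                ProtectionDomain protectionDomain,
                                byte[] classfileBuffer) {
            if (isTarget(className)) {
                // A real agent would rewrite classfileBuffer here with a bytecode
                // library so that before()/after() calls wrap the method body.
                System.out.println("would instrument " + className);
            }
            return null; // null tells the JVM to keep the class bytes unchanged
        }
    }
}
```

Because the transformer runs while each class is being loaded, the application source never has to change, which matches the "no code modification" point made earlier in the deck.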

AMethod() {
    AInterceptor.before()
    BMethod() {
        BInterceptor.before()
        CMethod() {
            CInterceptor.before();
            CInterceptor.after();
        }
        BInterceptor.after()
    }
    AInterceptor.after()
}

CallStack Trace

Stack state at each step (FramePointer, StackFrame, Depth):

1. New Stack & Bind ThreadLocal: the stack holds only ROOT (Depth -1).
2. AInterceptor.before(): PUSH StackFrame for AMethod (Sequence: 0). Stack: ROOT (-1), AMethod (0).
3. BInterceptor.before(): PUSH StackFrame for BMethod (Sequence: 1). Stack: ROOT (-1), AMethod (0), BMethod (1).
4. CInterceptor.before(): PUSH StackFrame for CMethod (Sequence: 2). Stack: ROOT (-1), AMethod (0), BMethod (1), CMethod (2).
5. CInterceptor.after(): POP StackFrame CMethod. WriteQueue: C.
6. BInterceptor.after(): POP StackFrame BMethod. WriteQueue: C B.
7. AInterceptor.after(): POP StackFrame AMethod. WriteQueue: C B A.
8. EmptyStack: only ROOT (-1) remains. QueueFlush sends the WriteQueue (C B A) with a NetworkWrite.
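The push/pop sequence in these slides can be sketched as a small per-thread stack. This is a minimal illustration, assuming that Depth is the stack height at push time and Sequence is a per-trace counter; the names TraceStackSketch, Frame, before, after, and sent are hypothetical, and an in-memory list stands in for the network write.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class TraceStackSketch {

    /** One recorded method call. */
    static final class Frame {
        final String method;
        final int sequence; // order in which the call started
        final int depth;    // nesting level; the first real frame is 0 (ROOT is -1)

        Frame(String method, int sequence, int depth) {
            this.method = method;
            this.sequence = sequence;
            this.depth = depth;
        }

        @Override
        public String toString() {
            return method + "(seq=" + sequence + ",depth=" + depth + ")";
        }
    }

    // "New Stack & Bind ThreadLocal": each thread gets its own stack.
    private static final ThreadLocal<TraceStackSketch> CURRENT =
            ThreadLocal.withInitial(TraceStackSketch::new);

    private final Deque<Frame> stack = new ArrayDeque<>();
    private final List<Frame> writeQueue = new ArrayList<>();
    private final List<Frame> sent = new ArrayList<>(); // stands in for NetworkWrite
    private int nextSequence = 0;

    static TraceStackSketch current() {
        return CURRENT.get();
    }

    /** Interceptor.before(): PUSH a StackFrame. */
    void before(String method) {
        stack.push(new Frame(method, nextSequence++, stack.size()));
    }

    /** Interceptor.after(): POP the frame into the WriteQueue, flush when empty. */
    void after() {
        writeQueue.add(stack.pop());
        if (stack.isEmpty()) {
            sent.addAll(writeQueue); // QueueFlush
            writeQueue.clear();
            nextSequence = 0;
        }
    }

    List<Frame> sent() {
        return sent;
    }
}
```

Calling before("AMethod"), before("BMethod"), before("CMethod") and then after() three times leaves the frames in sent() in the order C, B, A, the same order as the WriteQueue in the slides.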

CallStack Trace

The records C B A are stored in HBase and read back by the Web UI.

[Diagram: the WAS records C, B, A laid out by Sequence (0, 1, 2) and Depth]

Distributed Transaction Trace

How to find the relationship between RPCs

Include a trace tag in the request

Http : HttpHeader

Distributed Transaction Trace

TraceId

- TransactionID: a GUID that identifies the whole message; every node is assigned the same ID

- SpanID, pSpanID: IDs used to order the parent-child relationships

[Diagram: Node 1, Node 2, Node 3, Node 4 connected by RPC 1, RPC 2, RPC 3]

TxId: Node1^Time^1, SpanId = 1, pSpanId = -1

TxId: Node1^Time^1, SpanId = 2, pSpanId = 1

TxId: Node1^Time^1, SpanId = 3, pSpanId = 2

TxId: Node1^Time^1, SpanId = 4, pSpanId = 2
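The three IDs can be sketched as a small value class. This is an illustration under assumptions rather than Pinpoint's own type: the class name TraceIdSketch, the methods newRoot and nextTraceId, and the way span IDs are supplied (a LongSupplier passed in by the caller) are hypothetical. Only the TxId layout (node^time^sequence) and the -1 parent for the root come from the slides.

```java
import java.util.function.LongSupplier;

final class TraceIdSketch {
    static final long NO_PARENT = -1L; // pSpanId of the first node in a transaction

    final String transactionId; // identical on every node of the transaction
    final long spanId;          // identifies this node's part of the transaction
    final long parentSpanId;    // spanId of the caller

    TraceIdSketch(String transactionId, long spanId, long parentSpanId) {
        this.transactionId = transactionId;
        this.spanId = spanId;
        this.parentSpanId = parentSpanId;
    }

    /** Start a new transaction on the first node. */
    static TraceIdSketch newRoot(String nodeId, long startTime, long sequence,
                                 LongSupplier spanIds) {
        String txId = nodeId + "^" + startTime + "^" + sequence;
        return new TraceIdSketch(txId, spanIds.getAsLong(), NO_PARENT);
    }

    /** TraceId to attach to an outgoing RPC: same TxId, new span, this span as parent. */
    TraceIdSketch nextTraceId(LongSupplier spanIds) {
        return new TraceIdSketch(transactionId, spanIds.getAsLong(), spanId);
    }
}
```

With a supplier that counts 1, 2, 3, 4, a root created on Node 1 gets SpanId 1 and pSpanId -1, its child gets SpanId 2 and pSpanId 1, and two further calls made from that child get SpanId 3 and 4, both with pSpanId 2, as in the diagram.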

Distributed Transaction Trace

TomcatA

@Controller
public class TestController {

    @RequestMapping("/test")
    @ResponseBody
    public String test() throws IOException {
        HttpGet get = new HttpGet("http://TomcatB/hello");
        HttpResponse response = httpClient.execute(get);
        return EntityUtils.toString(response.getEntity());
    }
}

TomcatB

@Controller
public class HelloController {

    @RequestMapping("/hello")
    @ResponseBody
    public String hello() {
        return "world!";
    }
}


Distributed Transaction Trace

TraceId creation

TRANSACTION_ID : TomcatA^start time^1

SPAN_ID : 10

PARENT_SPAN_ID : -1

Record the Spring Controller method information


Distributed Transaction Trace

Intercept the HttpClient call and store the next TraceId

TRANSACTION_ID : TomcatA^start time^1

SPAN_ID : 20 (newly issued)

PARENT_SPAN_ID : 10 (the parent's SpanId, 10)


Tag Request

TomcatB recognizes the TraceId in the header and acts as the child

TRANSACTION_ID : TomcatA^start time^1

SPAN_ID : 20 (newly issued)

PARENT_SPAN_ID : 10 (the parent's SpanId, 10)
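Tagging the request and recognizing the tag on the receiving side can be sketched with a plain header map. This is a minimal illustration, not Pinpoint's wire format: the header names below, the class TraceHeaderSketch, and the Ids holder are hypothetical, and a Map<String, String> stands in for the real HTTP request headers.

```java
import java.util.HashMap;
import java.util.Map;

final class TraceHeaderSketch {
    // Hypothetical header names, chosen only for this example.
    static final String TRANSACTION_ID = "X-Sketch-TransactionId";
    static final String SPAN_ID = "X-Sketch-SpanId";
    static final String PARENT_SPAN_ID = "X-Sketch-ParentSpanId";

    static final class Ids {
        final String transactionId;
        final long spanId;
        final long parentSpanId;

        Ids(String transactionId, long spanId, long parentSpanId) {
            this.transactionId = transactionId;
            this.spanId = spanId;
            this.parentSpanId = parentSpanId;
        }
    }

    /** Caller side (TomcatA): tag the outgoing request with the next TraceId. */
    static void inject(Map<String, String> headers, Ids next) {
        headers.put(TRANSACTION_ID, next.transactionId);
        headers.put(SPAN_ID, Long.toString(next.spanId));
        headers.put(PARENT_SPAN_ID, Long.toString(next.parentSpanId));
    }

    /**
     * Callee side (TomcatB): returns the IDs found in the headers, or null
     * when the request is untagged and a new root trace should be started.
     */
    static Ids extract(Map<String, String> headers) {
        String txId = headers.get(TRANSACTION_ID);
        String span = headers.get(SPAN_ID);
        String parent = headers.get(PARENT_SPAN_ID);
        if (txId == null || span == null || parent == null) {
            return null;
        }
        return new Ids(txId, Long.parseLong(span), Long.parseLong(parent));
    }

    /** Round trip for the TomcatA -> TomcatB example in the slides. */
    static Ids demo() {
        Map<String, String> headers = new HashMap<>();
        inject(headers, new Ids("TomcatA^1000^1", 20L, 10L));
        return extract(headers);
    }
}
```

The callee keeps the received TransactionID and SpanID as its own, so both servers report spans that share one transaction and can be linked through the parent SpanID.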


Distributed Transaction Trace

Collector -> HBase (TraceData)

RowKey: TomcatA^start time^1

SpanId 20 (pSpanId 10): Hello() call information


Distributed Transaction Trace

Collector -> HBase (TraceData), HBase -> WEB

RowKey: TomcatA^start time^1

SpanId 20 (pSpanId 10): Hello() call information

SpanId 10 (pSpanId -1): Test() call information
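Once all spans of a transaction sit under one row key, the reader can rebuild the call tree by linking each span to its parent SpanId. The following is a rough sketch of that step, assuming the spans have already been read into memory; the class SpanTreeSketch, the Span holder, and the text rendering are hypothetical and say nothing about how the Pinpoint Web module is actually implemented.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class SpanTreeSketch {

    static final class Span {
        final long spanId;
        final long parentSpanId; // -1 marks the root span
        final String info;

        Span(long spanId, long parentSpanId, String info) {
            this.spanId = spanId;
            this.parentSpanId = parentSpanId;
            this.info = info;
        }
    }

    /** Renders the spans of one transaction as an indented tree, root first. */
    static String render(List<Span> spans) {
        // Group spans by the span that called them.
        Map<Long, List<Span>> children = new HashMap<>();
        for (Span s : spans) {
            children.computeIfAbsent(s.parentSpanId, k -> new ArrayList<>()).add(s);
        }
        StringBuilder out = new StringBuilder();
        append(children, -1L, 0, out);
        return out.toString();
    }

    private static void append(Map<Long, List<Span>> children, long parentId,
                               int depth, StringBuilder out) {
        for (Span s : children.getOrDefault(parentId, Collections.emptyList())) {
            for (int i = 0; i < depth; i++) {
                out.append("  ");
            }
            out.append(s.info).append('\n');
            append(children, s.spanId, depth + 1, out);
        }
    }
}
```

For the two rows above, passing the Hello() span (20, parent 10) and the Test() span (10, parent -1) in either order yields Test() on the first line with Hello() indented beneath it.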

3. Troubleshooting in distributed environments

Before Pinpoint

What if a connected system fails…

Caused by: java.net.SocketTimeoutException: Read timed out
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(SocketInputStream.java:150)
    at java.net.SocketInputStream.read(SocketInputStream.java:121)
Caused by: ◂◊╩◌♪♦♂◘◦▸╫╛╟╤❶╦╧[afg00101101aj..
Caused by: …


With Pinpoint

[Screenshots: server map and trace views showing the remote system, A, and B]

A hard problem

[Diagram: overseas Proxy, API Gateway, Service, DB, Cache, RPC]

Visualizing the whole architecture

[Server map: Mobile-app, Server-WEB; Japan Proxy, Australia Proxy, US Proxy, Europe Proxy, South America Proxy; APIGATEWAY1 (APIGW-1), APIGATEWAY2 (APIGW-2); Service; DB; RPC, RPC…; Cache]

Visualizing the flow of an individual request

[Diagram: overseas Proxy (US Proxy) -> APIGateway -> Service -> RPC-A, RPC-B, RPC-C, RPC-D, MySql]

Overseas Proxy: 6 seconds

APIGateway: 6 seconds

Service response time: 6 seconds…

[Screenshots: per-request views for the overseas Proxy, APIGateway, and Service; labels XXX, YYY, ABC, http://A.naver, http://B.naver]

4. RPC Timeline Pattern

RPC Timeline Pattern

Time-distribution patterns in the RPC timeline and CallStack

RPC Timeline Pattern 1

A pattern where the TCP connection has a problem

[Timeline: Client execute, Server]

The TCP connect is delayed

Socket Option : ConnectTimeout, Socket Backlog

WebServer : Apache, Nginx

Network Switch : LoadBalancer (L4)

Client characteristics : HttpClient's internal retry logic

RPC Timeline Pattern 2

The network is slow

[Timeline: Client execute, Server]

When the server is located overseas

Check the network traffic and the server location

Use HTTP KeepAlive and HTTP2

Use compression such as Gzip
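As a small illustration of the compression point, the sketch below gzips a response body with the JDK's java.util.zip classes. The class name GzipSketch and its helper methods are hypothetical; in practice compression is usually switched on in the web server or framework configuration rather than written by hand, and how much it saves depends on the payload.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class GzipSketch {

    /** Compresses a payload before it is sent over a slow link. */
    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buffer)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Restores the original payload on the receiving side. */
    static byte[] gunzip(byte[] packed) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(packed))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = gz.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }
}
```

Repetitive text such as JSON tends to shrink considerably, so fewer bytes cross the slow network segment at the cost of some CPU time on both ends.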

RPC Timeline Pattern 3

The TargetServer is slow

[Timeline: Client execute, Server]

Processing on the TargetServer is slow

This can escalate into a total failure of the Client

Socket Timeout

Circuit breaker : Netflix Hystrix
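The Socket Timeout item can be illustrated with the JDK's HttpURLConnection, which exposes a connect timeout and a read timeout. This is a sketch under assumptions: the class TimeoutSketch and its open helper are hypothetical, and the timeout values in the usage note are placeholders that need tuning per service.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

final class TimeoutSketch {

    /**
     * Prepares (but does not yet open) an HTTP connection with explicit timeouts,
     * so a slow target server cannot hold the calling thread indefinitely.
     */
    static HttpURLConnection open(String url, int connectTimeoutMs, int readTimeoutMs) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setConnectTimeout(connectTimeoutMs); // limit on establishing the TCP connection
            conn.setReadTimeout(readTimeoutMs);       // limit on waiting for response data
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For example, open("http://TomcatB/hello", 1000, 3000) gives up on connecting after one second and fails a read that stalls for three seconds with a java.net.SocketTimeoutException, which the caller can handle instead of letting blocked threads pile up. A circuit breaker goes one step further by refusing calls to a target that keeps failing.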

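RPC Timeline Pattern 4

The response contains a lot of data

[Timeline: Client execute, Server]

After the Response is received, more data is read from the Stream

- Large file download

Usually a normal state

If this situation causes problems, a separate server needs to be set up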

5. New features & future direction

New in 1.5: Plugin System

Users can collect information on the APIs they need

Google Gson Plugin

- com.navercorp.pinpoint.plugin.gson.GsonPlugin

New in 1.5: Real Time enhancements

WAS ActiveThread Monitoring

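Future direction

Prediction, suggestion, pattern analysis

If JVM memory shows an OOM pattern -> warn

If instances of the same WAS show different response-time patterns -> warn

If a problematic Lib is in use -> suggest a version upgrade

If the JVM Version or Options are not advisable -> suggest recommended settings

Profiling non-Java segments as well

Collect performance data for the WebServer segment

- Apache, Nginx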

Q&A

Thank You