
Page 1: [오픈소스컨설팅] Open stack kilo with DVR_CEPH_v1.1

Neutron DVR and Ceph Integration

염진영 (Yeom Jin-young), [email protected]

2015.08.24

Open Source Consulting, Inc. (주식회사 오픈소스컨설팅)

Tel : 02-516-0711

e-mail : [email protected]

Page 2: [오픈소스컨설팅] Open stack kilo with DVR_CEPH_v1.1

2

About Neutron DVR and CEPH

Neutron

DVR

Page 3: [오픈소스컨설팅] Open stack kilo with DVR_CEPH_v1.1

3

Deployment environment

Physical servers: 6 (HP DL380 and others)

OpenStack: Kilo

CEPH: Hammer

Page 4: [오픈소스컨설팅] Open stack kilo with DVR_CEPH_v1.1

4

Table of Contents

1. Neutron

2. Neutron DVR

3. CEPH

4. OpenStack with CEPH

Page 5: [오픈소스컨설팅] Open stack kilo with DVR_CEPH_v1.1

5

SDN (Software Defined Network)

Source: http://2.bp.blogspot.com/-8uKSOcP-FDQ/T5ZhZpef-wI/AAAAAAAAAOE/lw-Pw0aMed4/s1600/FatTree.png

Problems with the current network architecture:

• Complex; limited ability to build flexible networks

• Difficult to change

• Difficult to cope with growing traffic

• Difficult to introduce new services

• Ongoing maintenance costs

• Poorly suited to cloud environments

Page 6: [오픈소스컨설팅] Open stack kilo with DVR_CEPH_v1.1

6

SDN (Software Defined Network)

<Software Defined Networks (SDN) Architecture>

SDN: decouples the hardware and software of traditional network equipment so that network functions implemented in software can be composed and operated dynamically, as needed, on commodity cloud infrastructure, realizing a software-centric network infrastructure. Examples: OpenFlow, Open vSwitch

Control Plane: manages the rules that determine how packets are handled and pushes them down to the data plane
Data Plane: receives packets and forwards or drops them according to the rules defined by the control plane

Page 7: [오픈소스컨설팅] Open stack kilo with DVR_CEPH_v1.1

7

Neutron

The OpenStack component that builds complex cloud network environments

Implemented on SDN principles

Uses technologies such as Open vSwitch, Linux bridges, Linux network namespaces, VXLAN, VLAN, and GRE

Supports multi-tenant networking

Provides load balancing, firewall, VPN, and other services

Offers a variety of plugins

Page 8: [오픈소스컨설팅] Open stack kilo with DVR_CEPH_v1.1

8

Table of Contents

1. Neutron

2. Neutron DVR

3. CEPH

4. OpenStack with CEPH

Page 9: [오픈소스컨설팅] Open stack kilo with DVR_CEPH_v1.1

9

Legacy Neutron

The network node provides:

IP forwarding

– Inter-subnet (east-west): traffic between VMs

– Floating IP (north-south): traffic between the external network and VMs

– Default SNAT (north-south): traffic from VMs to the external network

Metadata Agent

– Access to the Nova metadata service

Issues:

Performance degradation

Limited scalability

SPOF (Single Point of Failure)

Reference: http://www.slideshare.net/vivekkonnect/openstack-kilosummitdvrarchitecture20140506mastergroup?qid=74211292-5ccb-4c08-881b-f76b7f06a8d3&v=default&b=&from_search=1

Page 10: [오픈소스컨설팅] Open stack kilo with DVR_CEPH_v1.1

10

Neutron with DVR

The compute node provides:

IP forwarding

– Inter-subnet (east-west): traffic between VMs

– Floating IP (north-south): traffic between the external network and VMs

Metadata Agent

– Access to the Nova metadata service

Advantages:

Floating IP traffic and east-west traffic between VMs are handled directly on each compute node

Improved network performance

On failure, only the services on the affected node are impacted

Disadvantages:

Default SNAT traffic still has to pass through the network node (SPOF)

Public IPs must be assigned to the compute nodes

Compute node resources are consumed for packet handling

Page 11: [오픈소스컨설팅] Open stack kilo with DVR_CEPH_v1.1

11

DVR Installation

On top of the legacy setup, an L3 agent and a metadata agent are added to each compute node

Additional settings required to enable DVR:

neutron.conf (Neutron server): [DEFAULT] router_distributed=True

l3_agent.ini (network node): [DEFAULT] agent_mode=dvr_snat

l3_agent.ini (compute node): [DEFAULT] agent_mode=dvr

plugin.ini (OVS agent): [agent] enable_distributed_routing = True

Page 12: [오픈소스컨설팅] Open stack kilo with DVR_CEPH_v1.1

12

DVR Installation

Neutron Server

# vi /etc/neutron/neutron.conf
router_distributed=True

Network Node

# vi /etc/neutron/l3_agent.ini
agent_mode=dvr_snat
router_delete_namespaces=True

# vi /etc/neutron/plugin.ini
[ml2]
mechanism_drivers = openvswitch,l2population
[agent]
enable_distributed_routing = True
tunnel_types = vxlan
l2_population = True

Compute Node

# vi /etc/neutron/l3_agent.ini
agent_mode=dvr
router_delete_namespaces=True

# vi /etc/neutron/plugin.ini
[ml2]
mechanism_drivers = openvswitch,l2population
[agent]
enable_distributed_routing = True
tunnel_types = vxlan
l2_population = True
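After changing these files, restart the Neutron services and check that the L3 agents come up in the expected modes. A minimal verification sketch, assuming the standard Kilo CLI and RDO-style systemd unit names (both are assumptions, not from the original slides):

# systemctl restart neutron-server                                                        # controller
# systemctl restart neutron-l3-agent neutron-metadata-agent neutron-openvswitch-agent     # network and compute nodes
# neutron agent-list                                                                      # all agents should report alive
# neutron agent-show <l3-agent-uuid> | grep agent_mode                                    # expect dvr_snat on the network node, dvr on compute nodes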

Page 13: [오픈소스컨설팅] Open stack kilo with DVR_CEPH_v1.1

13

Packet Flow

Page 14: [오픈소스컨설팅] Open stack kilo with DVR_CEPH_v1.1

14

Terminology

Reference – OpenStack Networking – Juno – DVR & L3 HA: http://www.slideshare.net/janghoonsim/open-stack-networking-juno-l3-ha-dvr

Page 15: [오픈소스컨설팅] Open stack kilo with DVR_CEPH_v1.1

15

Basic network flow

IP addresses are assigned by the dhcp-agent on the network node

Every packet leaving a VM flows to the DVR default gateway port created in the DVR namespace

Reference – OpenStack Networking – Juno – DVR & L3 HA: http://www.slideshare.net/janghoonsim/open-stack-networking-juno-l3-ha-dvr

Page 16: [오픈소스컨설팅] Open stack kilo with DVR_CEPH_v1.1

16

Basic network flow

Subnet: 88.0.10.0/24

IP addresses are assigned by the dhcp-agent on the network node

Every packet leaving a VM flows to the DVR default gateway port created in the DVR namespace

Reference – OpenStack Networking – Juno – DVR & L3 HA: http://www.slideshare.net/janghoonsim/open-stack-networking-juno-l3-ha-dvr


Page 18: [오픈소스컨설팅] Open stack kilo with DVR_CEPH_v1.1

18

Basic network flow

Subnet: 88.0.10.0/24

IP addresses are assigned by the dhcp-agent on the network node

Every packet leaving a VM flows to the DVR default gateway port created in the DVR namespace

[root@compute01 ~]# ip netns exec qrouter-2ba64fb9-9440-4f39-8e40-649aed249055 ip a
qr-d3497e03-39: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
    link/ether fa:16:3e:9a:37:3b brd ff:ff:ff:ff:ff:ff
    inet 88.0.10.1/24 brd 88.0.10.255 scope global qr-d3497e03-39

DVR default gateway port

Reference – OpenStack Networking – Juno – DVR & L3 HA: http://www.slideshare.net/janghoonsim/open-stack-networking-juno-l3-ha-dvr


Page 20: [오픈소스컨설팅] Open stack kilo with DVR_CEPH_v1.1

20

Basic network flow

Subnet: 88.0.10.0/24

IP addresses are assigned by the dhcp-agent on the network node

Every packet leaving a VM flows to the DVR default gateway port created in the DVR namespace

[root@compute01 ~]# ip netns exec qrouter-2ba64fb9-9440-4f39-8e40-649aed249055 ip a
qr-d3497e03-39: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
    link/ether fa:16:3e:9a:37:3b brd ff:ff:ff:ff:ff:ff
    inet 88.0.10.1/24 brd 88.0.10.255 scope global qr-d3497e03-39

DVR default gateway port

DHCP IP: 88.0.10.4

Reference – OpenStack Networking – Juno – DVR & L3 HA: http://www.slideshare.net/janghoonsim/open-stack-networking-juno-l3-ha-dvr
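The DHCP side of this flow can be confirmed on the network node. A minimal sketch (the qdhcp namespace name matches the tenant network UUID and is illustrative here):

[root@network ~]# ip netns | grep qdhcp
# one qdhcp-<network-uuid> namespace per tenant network served by the dhcp-agent
[root@network ~]# ip netns exec qdhcp-<network-uuid> ip a | grep 'inet 88.0.10'
# the tap interface inside the namespace carries the DHCP port IP (88.0.10.4 in this example)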


Page 22: [오픈소스컨설팅] Open stack kilo with DVR_CEPH_v1.1

22

SNAT : Compute node

[root@compute01 ~]# ip netns exec qrouter-cda7f413-2981-4eda-95fc-79ee21964b64 ip a
24: qr-836e6efc-9e: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
    link/ether fa:16:3e:00:b0:5a brd ff:ff:ff:ff:ff:ff
    inet 192.168.100.1/24 brd 192.168.100.255 scope global qr-836e6efc-9e

IP: 88.0.10.4

[root@compute01 ~]# ip netns exec qrouter-2ba64fb9-9440-4f39-8e40-649aed249055 ip a
qr-d3497e03-39: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
    link/ether fa:16:3e:9a:37:3b brd ff:ff:ff:ff:ff:ff
    inet 88.0.10.1/24 brd 88.0.10.255 scope global qr-d3497e03-39

DVR default gateway port

88.0.10.0/24 : Private Subnet

SNAT traffic passes through the network node

When a VM sends a packet destined for an external network, the packet is first delivered to the DVR default gateway port

Reference – OpenStack Networking – Juno – DVR & L3 HA: http://www.slideshare.net/janghoonsim/open-stack-networking-juno-l3-ha-dvr

Page 23: [오픈소스컨설팅] Open stack kilo with DVR_CEPH_v1.1

23

SNAT : Compute node

[root@compute01 ~]# ip netns exec qrouter-cda7f413-2981-4eda-95fc-79ee21964b64 ip a
24: qr-836e6efc-9e: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
    link/ether fa:16:3e:00:b0:5a brd ff:ff:ff:ff:ff:ff
    inet 192.168.100.1/24 brd 192.168.100.255 scope global qr-836e6efc-9e

[root@compute01 ~]# ip netns exec qrouter-2ba64fb9-9440-4f39-8e40-649aed249055 ip rule ls
0:      from all lookup local
32766:  from all lookup main
32767:  from all lookup default
1476397569:     from 88.0.10.1/24 lookup 1476397569
[root@compute01 ~]# ip netns exec qrouter-2ba64fb9-9440-4f39-8e40-649aed249055 ip route show table 1476397569
default via 88.0.10.12 dev qr-6c64a24e-19

88.0.10.0/24 : Private Subnet

Default public gateway port configured on the network node

If the destination address is outside 88.0.10.0/24, the routing rules in the qrouter namespace forward the packet to the DVR default gateway port on the network node (88.0.10.12)

Reference – OpenStack Networking – Juno – DVR & L3 HA: http://www.slideshare.net/janghoonsim/open-stack-networking-juno-l3-ha-dvr

Page 24: [오픈소스컨설팅] Open stack kilo with DVR_CEPH_v1.1

24

SNAT : Network node

Once the DVR gateway port in the snat namespace receives the packet, if the destination is external the packet is forwarded out through the default public gateway port

Reference – OpenStack Networking – Juno – DVR & L3 HA: http://www.slideshare.net/janghoonsim/open-stack-networking-juno-l3-ha-dvr

Page 25: [오픈소스컨설팅] Open stack kilo with DVR_CEPH_v1.1

25

SNAT : Network node

[root@network ~]# ip netns exec snat-2ba64fb9-9440-4f39-8e40-649aed249055 ip a
26: sg-ff949255-82: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
    link/ether fa:16:3e:7b:a8:8c brd ff:ff:ff:ff:ff:ff
    inet 88.0.10.12/24 brd 88.0.10.255 scope global sg-ff949255-82
27: qg-86abfb37-dd: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
    link/ether fa:16:3e:a8:e1:76 brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.91/24 brd 192.168.0.255 scope global qg-86abfb37-dd

88.0.10.0/24 : Private Subnet
192.168.0.0/24 : Public Subnet

[root@network ~]# ip netns exec snat-2ba64fb9-9440-4f39-8e40-649aed249055 iptables -t nat -nL
target     prot opt source      destination
SNAT       all  --  0.0.0.0/0   0.0.0.0/0     to:192.168.0.91

Packets delivered to the DVR default gateway port (88.0.10.12) in the snat namespace on the network node are sent out to the external network according to the iptables SNAT rule

[root@network ~]# ip netns exec snat-2ba64fb9-9440-4f39-8e40-649aed249055 ip route show
default via 192.168.0.1 dev qg-86abfb37-dd
88.0.10.0/24 dev sg-ff949255-82 proto kernel scope link src 88.0.10.12
192.168.0.0/24 dev qg-86abfb37-dd proto kernel scope link src 192.168.0.91

Reference – OpenStack Networking – Juno – DVR & L3 HA: http://www.slideshare.net/janghoonsim/open-stack-networking-juno-l3-ha-dvr

Page 26: [오픈소스컨설팅] Open stack kilo with DVR_CEPH_v1.1

26

Floating IP

The DVR router namespace is replicated on the compute node, so a VM with a floating IP communicates with the external network directly, without passing through the network node

IP: 88.0.10.4

DVR default gateway port

88.0.10.1/24

88.0.10.0/24 : Private Subnet
192.168.0.0/24 : Public Subnet

Floating IP: 192.168.0.92

DNAT: 88.0.10.4 -> 192.168.0.92

Reference – OpenStack Networking – Juno – DVR & L3 HA: http://www.slideshare.net/janghoonsim/open-stack-networking-juno-l3-ha-dvr

Page 27: [오픈소스컨설팅] Open stack kilo with DVR_CEPH_v1.1

27

Floating IP

[root@compute01 ~]# ip netns exec fip-6bfe731c-436f-4003-8052-2144fd52cd49 ip a
2: fpr-2ba64fb9-9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether b6:47:cc:09:4f:15 brd ff:ff:ff:ff:ff:ff
    inet 169.254.31.29/31 scope global fpr-2ba64fb9-9
21: fg-c6ef4b33-29: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
    link/ether fa:16:3e:f3:f8:c0 brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.93/24 brd 192.168.0.255 scope global fg-c6ef4b33-29

[root@compute01 ~]# ip netns exec qrouter-2ba64fb9-9440-4f39-8e40-649aed249055 ip a
3: rfp-2ba64fb9-9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 8a:d4:74:fd:19:23 brd ff:ff:ff:ff:ff:ff
    inet 169.254.31.28/31 scope global rfp-2ba64fb9-9
       valid_lft forever preferred_lft forever
    inet 192.168.0.92/32 brd 192.168.0.92 scope global rfp-2ba64fb9-9
19: qr-6c64a24e-19: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
    link/ether fa:16:3e:3c:f1:ce brd ff:ff:ff:ff:ff:ff
    inet 88.0.10.1/24 brd 88.0.10.255 scope global qr-6c64a24e-19

[root@compute01 ~]# ip netns exec qrouter-2ba64fb9-9440-4f39-8e40-649aed249055 iptables -t nat -nL
Chain neutron-l3-agent-OUTPUT (1 references)
DNAT       all  --  0.0.0.0/0    192.168.0.92    to:88.0.10.4
Chain neutron-l3-agent-PREROUTING (1 references)
DNAT       all  --  0.0.0.0/0    192.168.0.92    to:88.0.10.4

88.0.10.0/24 : Private Subnet
192.168.0.0/24 : Public Subnet

When the VM's packet reaches the DVR router, it is handed to the rfp interface of the veth pair; the fpr interface in the fip namespace receives it and sends it out through the fg interface

One public IP from the floating IP range is assigned to the fg port, which serves as the gateway port of the fip namespace on the compute node

Page 28: [오픈소스컨설팅] Open stack kilo with DVR_CEPH_v1.1

28

Floating IP

[root@compute01 ~]# ip netns exec fip-6bfe731c-436f-4003-8052-2144fd52cd49 ip a
2: fpr-2ba64fb9-9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether b6:47:cc:09:4f:15 brd ff:ff:ff:ff:ff:ff
    inet 169.254.31.29/31 scope global fpr-2ba64fb9-9
21: fg-c6ef4b33-29: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
    link/ether fa:16:3e:f3:f8:c0 brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.93/24 brd 192.168.0.255 scope global fg-c6ef4b33-29

Proxy ARP: 192.168.0.92 (VM floating IP)

[root@compute01 ~]# ip netns exec qrouter-2ba64fb9-9440-4f39-8e40-649aed249055 iptables -t nat -nL
Chain neutron-l3-agent-OUTPUT (1 references)
DNAT       all  --  0.0.0.0/0    192.168.0.92    to:88.0.10.4
Chain neutron-l3-agent-PREROUTING (1 references)
DNAT       all  --  0.0.0.0/0    192.168.0.92    to:88.0.10.4

88.0.10.0/24 : Private Subnet
192.168.0.0/24 : Public Subnet

In the reverse direction, when an ARP request arrives for the public IP mapped to the VM, the fg virtual interface answers it via proxy ARP, accepts the packet itself, and forwards it to the VM

DST: 192.168.0.92

Reference – OpenStack Networking – Juno – DVR & L3 HA: http://www.slideshare.net/janghoonsim/open-stack-networking-juno-l3-ha-dvr
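Two things make this reverse path work, and both can be checked inside the fip namespace: the /32 host route toward the qrouter namespace and proxy ARP on the fg device. A quick check, assuming the Kilo L3 agent enables the kernel proxy_arp flag on fg (interface names are the ones shown above):

[root@compute01 ~]# ip netns exec fip-6bfe731c-436f-4003-8052-2144fd52cd49 ip route | grep 192.168.0.92
# expect a host route for the floating IP pointing at the rfp/fpr veth pair (169.254.31.28)
[root@compute01 ~]# ip netns exec fip-6bfe731c-436f-4003-8052-2144fd52cd49 cat /proc/sys/net/ipv4/conf/fg-c6ef4b33-29/proxy_arp
# 1 means the fg interface answers ARP requests for floating IPs hosted on this node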

Page 29: [오픈소스컨설팅] Open stack kilo with DVR_CEPH_v1.1

29

East-west traffic

vm001 (on host compute01) and vm003 (on compute02) are connected through the dvr-router

compute01

compute01

compute02

Page 30: [오픈소스컨설팅] Open stack kilo with DVR_CEPH_v1.1

30

East-west traffic (within a node)

Traffic between VMs on different subnets but on the same host

77.0.10.1/24 88.0.10.1/24

88.0.10.4/24 77.0.10.12/24

Reference – OpenStack Networking – Juno – DVR & L3 HA: http://www.slideshare.net/janghoonsim/open-stack-networking-juno-l3-ha-dvr

Page 31: [오픈소스컨설팅] Open stack kilo with DVR_CEPH_v1.1

31

East-west traffic (between nodes)

Traffic between VMs on different subnets and on different hosts

Packets are forwarded from compute01 to compute02

Reference – OpenStack Networking – Juno – DVR & L3 HA: http://www.slideshare.net/janghoonsim/open-stack-networking-juno-l3-ha-dvr

Page 32: [오픈소스컨설팅] Open stack kilo with DVR_CEPH_v1.1

32

East-west traffic

99.0.10.1/24

The DVR router applies its pre-populated ARP table and uses vm003's MAC as the destination MAC

Traffic between compute nodes uses per-host global DVR MAC addresses

"neutron.conf"
dvr_base_mac = fa:16:3f:00:00:00

[neutrondb]> select * from dvr_host_macs;
+-----------+-------------------+
| host      | mac_address       |
+-----------+-------------------+
| network   | fa:16:3f:89:8a:a1 |
| compute01 | fa:16:3f:c2:22:07 |
| compute02 | fa:16:3f:da:93:11 |
+-----------+-------------------+

77.0.10.1/24

77.0.10.12/24

77.0.10.1/24 99.0.10.1/24

99.0.10.3/24

[root@compute01 ~]# ip netns exec qrouter-d70b623f-9287-40cf-80d2-7b0a9f1e46ad ip neighbor
99.0.10.3 dev qr-d3497e03-39 lladdr fa:16:3e:d5:77:f1 PERMANENT
77.0.10.12 dev qr-50565694-ea lladdr fa:16:3e:33:63:22 PERMANENT

Reference – OpenStack Networking – Juno – DVR & L3 HA: http://www.slideshare.net/janghoonsim/open-stack-networking-juno-l3-ha-dvr
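These per-host DVR MACs appear in the OpenFlow tables programmed by the OVS agent. A quick way to spot them, using standard ovs-ofctl commands and the dvr_base_mac prefix above (a sketch, not from the original slides):

[root@compute01 ~]# ovs-ofctl dump-flows br-tun | grep 'fa:16:3f'
# flows rewriting the source MAC to this host's DVR MAC before frames leave over the VXLAN tunnels
[root@compute01 ~]# ovs-ofctl dump-flows br-int | grep 'fa:16:3f'
# flows matching frames that arrive from other hosts' DVR MACs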

Page 33: [오픈소스컨설팅] Open stack kilo with DVR_CEPH_v1.1

33

Table of Contents

1. Neutron

2. Neutron DVR

3. CEPH

4. OpenStack with CEPH

Page 34: [오픈소스컨설팅] Open stack kilo with DVR_CEPH_v1.1

34

SDS (Software Defined Storage)

Problems with the current storage architecture

Difficult to change / hard to cope with data growth / limits of the scale-up model / expansion / ongoing maintenance costs

Software-defined storage architecture

Image: http://www.vmware.com/

SDS (Software Defined Storage):
• The explosive growth in data volume is overwhelming hardware-based storage
• In response, storage provisioning and management are implemented in software
• Examples: GlusterFS, HDFS, CEPH

Expected benefits:

• Scale-out architecture

• Agile, flexible service delivery

• No lock-in to a specific storage vendor

• No investment in expensive proprietary hardware

• Lower operating costs

Page 35: [오픈소스컨설팅] Open stack kilo with DVR_CEPH_v1.1

35

About CEPH

RADOS & CRUSH reference: http://ceph.com/papers/weil-rados-pdsw07.pdf

A reliable, scalable, high-performance distributed storage system that provides object, block, and file storage

• Started with a research paper in 2007; Inktank was founded in 2012 to provide Ceph services, and Red Hat acquired it in April 2014

• Ceph scales out on commodity hardware, with fast recovery and replication built in by default

• Ceph is built on a storage cluster called RADOS (Reliable Autonomic Distributed Object Store)

• Files are stored within RADOS using the CRUSH (Controlled Replication Under Scalable Hashing) algorithm

Page 36: [오픈소스컨설팅] Open stack kilo with DVR_CEPH_v1.1

36

About CEPH

• Monitor:

Maintains the cluster map

Manages OSDs

Does not handle data reads and writes

Deployed in odd numbers for quorum, at least 3

• OSD Daemon:

Serves data reads and writes to clients

Handles data replication, recovery, and re-balancing

Health-checks itself and other OSDs

Stores data as objects

• Data read & write:

The client fetches the latest cluster map from a monitor

The client computes which OSD to read from or write to using CRUSH (cluster map + pool ID + file name)
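The same CRUSH computation can be reproduced from the command line; for example, with the volumes pool used later in this deck (the object name is illustrative):

# ceph osd map volumes volume-4f0df88b-d955-4dfa-9575-cd834004835d
# prints the pool id, the placement group, and the set of OSDs CRUSH selects for that object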

Page 37: [오픈소스컨설팅] Open stack kilo with DVR_CEPH_v1.1

37

목차

1. The need for SDN

2. Neutron DVR

3. CEPH

4. OpenStack with CEPH

Page 38: [오픈소스컨설팅] Open stack kilo with DVR_CEPH_v1.1

38

CEPH case study: CERN (European Organization for Nuclear Research)

Started in January 2013 (250 TB – OpenStack block storage, AFS/NFS)

3 petabytes

Runs virtual NFS on top of roughly 100 TB of ZFS

Runs OpenStack Cinder and Glance

Roughly 3,000 block devices, 1,200 volumes, 1,800 images

Two incidents so far – both recovered

Uses Ceph to replace NetApp and similar appliances

Software tuning

SSDs used for:

OSD journals (5–10x IOPS improvement)

Mon LevelDB – backfilling

Page 39: [오픈소스컨설팅] Open stack kilo with DVR_CEPH_v1.1

39

OpenStack and Ceph

Cinder & Glance integration using the Ceph block device (RBD)

Page 40: [오픈소스컨설팅] Open stack kilo with DVR_CEPH_v1.1

40

Ceph installation


OpenStack and Ceph communicate over a separate network, 10.0.0.0/24

Cinder & Glance integrated with Ceph

Page 41: [오픈소스컨설팅] Open stack kilo with DVR_CEPH_v1.1

41

Calamari on CentOS 7

Page 42: [오픈소스컨설팅] Open stack kilo with DVR_CEPH_v1.1

42

Glance with Ceph

[root@compute01 ~]# rbd -m 10.0.0.108:6789,10.0.0.109:6789 --user glance --pool images ls
7aad1189-4276-4061-a4d3-504ad1717185
8083b202-b175-4a27-964e-c2cd48c367f6

Glance image files created in Ceph's images pool
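Uploading a new image gives the same result; a short sketch, assuming a raw image file is available locally (file and image names are illustrative, not from the original slides):

# glance image-create --name cirros-raw --disk-format raw --container-format bare --file cirros-0.3.4-x86_64-disk.raw
# rbd -m 10.0.0.108:6789,10.0.0.109:6789 --user glance --pool images info <new-image-uuid>
# the new image UUID should appear in the pool as an RBD image (raw images are preferred for RBD-backed Glance)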

Page 43: [오픈소스컨설팅] Open stack kilo with DVR_CEPH_v1.1

43

Cinder with Ceph

[root@compute01 ~]# rbd -m 10.0.0.108:6789,10.0.0.109:6789 --user cinder --pool volumes ls
volume-4f0df88b-d955-4dfa-9575-cd834004835d
volume-54aa9734-a220-493b-a815-6f8ea5896dba
volume-940ce08f-3304-4794-9356-f0def9936617
volume-96d21ae5-35bc-4a7b-a665-8c9786177dae

Cinder volume files created in Ceph's volumes pool
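Each entry maps one-to-one to a Cinder volume ID; a quick cross-check (using a UUID from the listing above):

# cinder list | grep 4f0df88b-d955-4dfa-9575-cd834004835d
# rbd -m 10.0.0.108:6789,10.0.0.109:6789 --user cinder --pool volumes info volume-4f0df88b-d955-4dfa-9575-cd834004835d
# shows the size, object order, and features of the RBD image backing the Cinder volume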

Page 44: [오픈소스컨설팅] Open stack kilo with DVR_CEPH_v1.1

44

Nova with CEPH

[root@compute01 ~]# virsh dumpxml instance-00000040
<disk type='network' device='disk'>
  <driver name='qemu' type='raw' cache='none'/>
  <auth username='cinder'>
    <secret type='ceph' uuid='f6768a42-c7b3-4060-99d2-9ae0eab43e87'/>
  </auth>
  <source protocol='rbd' name='volumes/volume-96d21ae5-35bc-4a7b-a665-8c9786177dae'>
    <host name='10.0.0.108' port='6789'/>
    <host name='10.0.0.109' port='6789'/>
  </source>
  <backingStore/>
  <target dev='vda' bus='virtio'/>
  <serial>96d21ae5-35bc-4a7b-a665-8c9786177dae</serial>
  <alias name='virtio-disk0'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
</disk>

[root@compute01 ~]# netstat -alpt | grep 'qemu-kvm'
tcp 0 0 compute01:51250 10.0.0.109:acnet ESTABLISHED 9726/qemu-kvm
tcp 0 0 compute01:41708 10.0.0.108:smc-https ESTABLISHED 10839/qemu-kvm
tcp 0 0 compute01:33084 10.0.0.108:6807 ESTABLISHED 9726/qemu-kvm
tcp 0 0 compute01:41694 10.0.0.108:smc-https ESTABLISHED 9726/qemu-kvm
tcp 0 0 compute01:41206 10.0.0.108:acnet ESTABLISHED 10839/qemu-kvm
tcp 0 0 compute01:41209 10.0.0.108:acnet ESTABLISHED 9726/qemu-kvm
tcp 0 0 compute01:43613 10.0.0.109:6805 ESTABLISHED 10839/qemu-kvm
tcp 0 0 compute01:51251 10.0.0.109:acnet ESTABLISHED 10839/qemu-kvm
tcp 0 0 compute01:43598 10.0.0.109:6805 ESTABLISHED 9726/qemu-kvm
tcp 0 0 compute01:33083 10.0.0.108:6807 ESTABLISHED 10839/qemu-kvm

KVM -> Ceph block device: each qemu-kvm process holds direct TCP connections to the Ceph monitors and OSDs

Page 45: [오픈소스컨설팅] Open stack kilo with DVR_CEPH_v1.1

45

Openstack Integration with Ceph

Prerequisites

The compute nodes must be able to communicate with the Ceph MON and OSD nodes.

Check that the installed qemu-img supports "rbd":

# qemu-img | grep "Supported formats"
Supported formats: vvfat vpc vmdk vhdx vdi sheepdog rbd raw host_cdrom host_floppy host_device file qed qcow2 qcow parallels nbd iscsi gluster dmg cloop bochs blkverify blkdebug

Create the Ceph accounts and pools used by Cinder and Glance, and add the keyrings to ceph.conf:

# ceph osd pool create volumes 64
# ceph osd pool create images 64
# ceph-authtool --create-keyring /etc/ceph/ceph.client.glance.keyring
# ceph-authtool --create-keyring /etc/ceph/ceph.client.cinder.keyring
# ceph-authtool -C /etc/ceph/ceph.client.glance.keyring -n client.glance --cap osd 'allow rwx pool=images' --cap mon 'allow rwx' --cap mds 'allow' --gen-key
# ceph-authtool -C /etc/ceph/ceph.client.cinder.keyring -n client.cinder --cap osd 'allow rwx pool=volumes' --cap mon 'allow rwx' --cap mds 'allow' --gen-key
# ceph auth add client.glance -i /etc/ceph/ceph.client.glance.keyring
# ceph auth add client.cinder -i /etc/ceph/ceph.client.cinder.keyring
# vi /etc/ceph/ceph.conf
--------------------------------------------------------
[client.glance]
keyring = /etc/ceph/ceph.client.glance.keyring
[client.cinder]
keyring = /etc/ceph/ceph.client.cinder.keyring
--------------------------------------------------------
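The new pools and accounts can be verified with standard ceph commands before moving on:

# ceph osd lspools                 # volumes and images should be listed
# ceph auth get client.glance      # caps should read: mon 'allow rwx', osd 'allow rwx pool=images'
# ceph auth get client.cinder      # caps should read: mon 'allow rwx', osd 'allow rwx pool=volumes'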

Page 46: [오픈소스컨설팅] Open stack kilo with DVR_CEPH_v1.1

46

Openstack Integration with Ceph

Install the packages required to use RBD:

# yum install python-rbd ceph-common librados2 librbd1

Copy the Ceph conf and keyring files:

# scp ceph-mgmt:/etc/ceph/ceph.conf /etc/ceph/                    # controller, compute node
# scp ceph-mgmt:/etc/ceph/ceph.client.cinder.keyring /etc/ceph/   # controller, compute node
# scp ceph-mgmt:/etc/ceph/ceph.client.glance.keyring /etc/ceph/   # controller node

Configure libvirt access to Ceph RBD:

# uuidgen
f6768a42-c7b3-4060-99d2-9ae0eab43e87
# vi secret.xml
<secret ephemeral='no' private='no'>
  <uuid>f6768a42-c7b3-4060-99d2-9ae0eab43e87</uuid>
  <usage type='ceph'>
    <name>client.cinder secret</name>
  </usage>
</secret>
# virsh secret-define --file secret.xml
# virsh secret-list
[on ceph server]
# ceph auth get-key client.cinder | ssh ceph-mgmt tee /etc/ceph/cinder.keyring
# virsh secret-set-value --secret f6768a42-c7b3-4060-99d2-9ae0eab43e87 --base64 $(cat ~/cinder.keyring)
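The stored secret can be read back to confirm libvirt holds the Ceph key (the UUID is the one generated above):

# virsh secret-get-value f6768a42-c7b3-4060-99d2-9ae0eab43e87
# should print the base64-encoded client.cinder key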

Page 47: [오픈소스컨설팅] Open stack kilo with DVR_CEPH_v1.1

47

Openstack Integration with Ceph

Edit nova.conf:

# vi /etc/nova/nova.conf
[libvirt]
vif_driver=nova.virt.libvirt.vif.LibvirtGenericVIFDriver
images_type=rbd
images_rbd_pool=volumes
images_rbd_ceph_conf=/etc/ceph/ceph.conf
use_virtio_for_bridges=true
rbd_user=cinder
rbd_secret_uuid=f6768a42-c7b3-4060-99d2-9ae0eab43e87

Edit cinder.conf:

# vi /etc/cinder/cinder.conf
rbd_pool=volumes
rbd_user=cinder
rbd_ceph_conf=/etc/ceph/ceph.conf
rbd_secret_uuid=f6768a42-c7b3-4060-99d2-9ae0eab43e87
volume_driver=cinder.volume.drivers.rbd.RBDDriver

Edit glance-api.conf:

# vi /etc/glance/glance-api.conf
[glance_store]
stores=glance.store.rbd.Store
default_store=rbd
rbd_store_ceph_conf=/etc/ceph/ceph.conf
rbd_store_user=glance
rbd_store_pool=images
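After these edits the services must be restarted, and a simple end-to-end test confirms the integration. A sketch, assuming RDO/CentOS 7 service names (adjust for other packaging):

# systemctl restart openstack-glance-api openstack-cinder-volume    # controller
# systemctl restart openstack-nova-compute                          # each compute node
# cinder create --display-name ceph-test 1                          # create a 1 GB test volume
# rbd --user cinder --pool volumes ls | grep volume-                 # the new volume-<uuid> image should appear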

Page 48: [오픈소스컨설팅] Open stack kilo with DVR_CEPH_v1.1

48

OPEN

SHARE

CONTRIBUTE

ADOPT

REUSE

Page 49: [오픈소스컨설팅] Open stack kilo with DVR_CEPH_v1.1

49

Terminology

br-int (OVS integration bridge):
- Open vSwitch virtual switch
- Connects VMs, virtual routers, and br-tun
- Controls traffic according to the defined flow tables

br-tun (OVS tunnel bridge):
- Open vSwitch virtual switch
- Connects node to node over VXLAN tunnels
- Controls traffic according to the defined flow tables

br-ex (OVS external bridge):
- Open vSwitch virtual switch
- Connects to the external network

patch-port: connects the bridges defined in OVS (br-int, br-tun, br-ex) to one another

Linux network namespace: provides a network stack isolated from other network stacks; traffic is controlled with its own routing table, iptables, and ARP table

DVR namespace: the isolated network space in which the virtual router is created

snat namespace: the isolated network space in which the public IP and SNAT rules are created for SNAT

fip namespace: the isolated network space in which the public IP and DNAT rules are created for floating IP traffic

dhcp namespace: assigns IP addresses via the DHCP daemon

qr-xxx: virtual ports defined inside the DVR (qrouter) namespace, acting as the subnet gateways

veth-pair: a virtual patch cable; a pair of virtual network interfaces used to connect namespaces and Linux bridges

pre-populated ARP: each namespace already holds the MAC addresses for the various virtual networks
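All of the objects above can be inspected directly on a node with standard Open vSwitch and iproute2 commands (output varies per node):

# ovs-vsctl list-br              # expect br-int and br-tun, plus br-ex where external connectivity exists
# ovs-vsctl list-ports br-int    # VM and router ports plus the patch port toward br-tun
# ip netns                       # qrouter-*, fip-*, snat-*, qdhcp-* namespaces present on this node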

Page 50: [오픈소스컨설팅] Open stack kilo with DVR_CEPH_v1.1

50

Reference URLs

Slide 9: http://www.slideshare.net/vivekkonnect/openstack-kilosummitdvrarchitecture20140506mastergroup?qid=74211292-5ccb-4c08-881b-f76b7f06a8d3&v=default&b=&from_search=1

SDN image: http://2.bp.blogspot.com/-8uKSOcP-FDQ/T5ZhZpef-wI/AAAAAAAAAOE/lw-Pw0aMed4/s1600/FatTree.png

http://aitpowersurge.co.uk

OpenStack Networking – Juno – DVR & L3 HA (by Janghoon Sim): http://www.slideshare.net/janghoonsim/open-stack-networking-juno-l3-ha-dvr

CERN: http://www.slideshare.net/noggin143/20140509-cern-openstacklinuxtagv3?qid=99472383-7ae8-42ec-86c7-73d97338c8c7&v=qf1&b=&from_search=9