
Automate the Cloud with Terraform, Ansible, and DigitalOcean

Hi. I'm Brian.

— Programmer (http://github.com/napcs)

— Author (http://bphogan.com/publications)

— Musician (http://soundcloud.com/bphogan)

— Teacher

— Technical Editor @ DigitalOcean

The Plan

— Introduce Immutable Infrastructure

— Create a Server with Terraform

— Provision the Server with Ansible

— Add Another Server and a Load Balancer

— Review

Disclosure and Disclaimer

I am using DigitalOcean in this talk. I work for them. They're cool.

Want $10 of credit? https://m.do.co/c/1239feef68ae

Also we're hiring. http://do.co/careers

If you want to argue or make statements, I'll happily engage with you after the talk in exchange for a beer.

Rules

— This is based on my personal experience.

— If I go too fast, or I made a mistake, speak up.

— Ask questions any time.

— If you want to argue, buy me a beer later.

Immutable Infrastructure

Changing existing servers in production results in servers that aren't quite the same. This includes security updates! These changes lead to problems that are hard to diagnose and reproduce.

Snowflake servers and Configuration Drift

"Each server becomes unique"

— Software updates

— Security patches

— Newer versions installed on some servers

Infrastructure as Code

Rotate machines in and out of service.

— Create repeatable processes to bring up new servers quickly

— Use code to destroy them and replace them when they are out of date.

How?

— Base images (cloud provider)

— Infrastructure Management tools (Terraform)

— Configuration Management tools (Ansible)

A base setup with some things preconfigured. Your cloud provider has them or you can make your own. The more complex your image is, the more testing you'll need to do and the more time it'll take to bring up a new box.

Base Images

— Ready-to-go base OS with user accounts and services

— Barebones.

— Keep it low-maintenance.

Terraform

— Tool to Create and Destroy infrastructure components.

— Uses "providers" to talk to cloud services

— Define resources with code

— Provider to use, image, size, etc

Example Terraform Resource

resource "digitalocean_droplet" "web-1" { image = "ubuntu-16-04-x64" name = "web-1" region = "nyc3" ...}

Terraform and DigitalOcean

— DigitalOcean account

— Credit card or payment method hooked up

— SSH Key uploaded to DigitalOcean

— SSH Key fingerprint

— DigitalOcean API Key

Finding your Fingerprint

Getting an API Token

Demo: Create Server with Terraform

— Set up Terraform

— Configure and Install the DigitalOcean provider

— Create a host

Set up Terraform

$ mkdir cloud_tutorial
$ cd cloud_tutorial
$ touch provider.tf

We have two pieces of data to inject: our DO API key and our SSH key fingerprint. Set environment variables so you keep sensitive info out of your code and scripts.

Environment Variables

API key

$ echo 'export DO_API_KEY=your_digitalocean_api_token' >> ~/.bashrc

Fingerprint

$ echo 'export SSH_FINGERPRINT=your_ssh_key_fingerprint' >> ~/.bashrc

Make sure they saved!

$ . ~/.bashrc
$ echo $DO_API_KEY
$ echo $SSH_FINGERPRINT

Define a Provider

touch provider.tf

variable "do_api_key" {}variable "ssh_fingerprint" {}

provider "digitalocean" { token = "${var.digitalocean_token}"}

Install provider

$ terraform init

Initializing provider plugins...
- Checking for available provider plugins on https://releases.hashicorp.com...
- Downloading plugin for provider "digitalocean" (0.1.3)...

Define a server

touch web-1.tf

resource "digitalocean_droplet" "web-1" { image = "ubuntu-16-04-x64" name = "web-1" region = "nyc3" monitoring = true size = "1gb" ssh_keys = [ "${var.ssh_fingerprint}" ]}

output "web-1-address" { value = "${digitalocean_droplet.web-1.ipv4_address}"}

DigitalOcean's API lets you find the images and sizes available.

Get Images and Sizes from DigitalOcean API

curl -X GET -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DO_API_KEY" \
  "https://api.digitalocean.com/v2/images"

Sizes

curl -X GET -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DO_API_KEY" \
  "https://api.digitalocean.com/v2/sizes"
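The raw JSON responses are long. If you have jq installed (an assumption on my part; any JSON tool will do), you can pull out just the slugs you'll paste into the image and size fields:

curl -s -H "Authorization: Bearer $DO_API_KEY" \
  "https://api.digitalocean.com/v2/sizes" | jq -r '.sizes[].slug'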

See what will happen

$ terraform plan \
  -var "do_api_key=${DO_API_KEY}" \
  -var "ssh_fingerprint=${SSH_FINGERPRINT}"

Apply!

$ terraform apply \
  -var "do_api_key=${DO_API_KEY}" \
  -var "ssh_fingerprint=${SSH_FINGERPRINT}"

...

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.

Outputs:

web-1-address = 159.89.179.202

Demo

Ansible lets you define how things should be set up on your servers. It's designed to be idempotent, so you can run the same script over and over. Ansible will only change what needs changing. If you have more than one machine, you can run the commands on many machines at once. And you can define roles or use existing roles to add additional functionality.

Provision Server with Ansible

— Idempotent machine setup

— Define how things should be, not necessarily what to do

— Supports parallel execution

— Only needs SSH and Python on target machine

— Supports code reuse through roles
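Before writing a playbook, it's worth a quick connectivity check. A minimal sketch, assuming the inventory file we build in a moment: the ping module simply confirms Ansible can SSH in and run Python on the target.

$ ansible all -i inventory -u root -m ping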

Provision with Ansible

— Create Ansible configuration

— Create a configuration file

— Create an inventory file listing your servers

— Define a "playbook" of tasks

— Run the playbook.

The Inventory

— Lists all the hosts Ansible should work with

— Lets you put them into groups

— Lets you specify per-host or per-group options (keys, users, etc)

A Playbook

---
- hosts: all
  remote_user: deploy
  gather_facts: false

  tasks:
    - name: Update apt cache
      apt: update_cache=yes
      become: true

    - name: Install nginx
      apt:
        name: nginx
        state: installed
      become: true

Demo: Creating a Web Server with Ansible

— Create a deploy user

— Install Nginx

— Upload a Server Block (virtual host)

— Create web directory

— Enable server block

— Upload web page

Ansible connects to your servers over SSH and uses host key checking. When you first log in to a remote machine with SSH, the client asks if you want to add the server to your "known hosts." If you rebuild a server, or add a new one, you'll get this prompt when Ansible tries to connect. It's a nice security feature, but it stalls automation, so we'll turn it off. Add that setting to the new file.

By default, Ansible also makes a new SSH connection for each command it runs. This is slow, and as your playbooks get larger, it takes more and more time. You can tell Ansible to share SSH connections using pipelining. Note that this requires your servers to disable the requiretty setting for sudo users.

Create ansible.cfg

touch ansible.cfg

[defaults]
host_key_checking = False

[ssh_connection]
pipelining = True

Ansible uses an inventory file to list the servers it manages. We'll start with one. First we define a host called web-1 and assign it the IP address of our machine. We need to tell Ansible which private key file to use for SSH, and since we'll use the same one for all our servers, we create a group called servers. We put the web-1 host in the servers group, and then we set variables for that group. We're using Ubuntu 16.04, which ships with only Python 3. Ansible uses Python 2 by default, so we tell it to use Python 3 for all members of the servers group.

Creating an Inventory

touch inventory

web-1 ansible_host=xx.xx.xx.xx

[servers]
web-1

[servers:vars]
ansible_private_key_file='/Users/your_username/.ssh/id_rsa'
ansible_python_interpreter=/usr/bin/python3

Creating a Playbook

touch playbook.yml

---
- hosts: all
  remote_user: root

Adding a User

— Use the user module to add the user

— Can only use hashed passwords in playbooks

— Get a hashed password

Getting the password with Python

$ pip install passlib

$ python -c "from passlib.hash import sha512_crypt; import getpass; print(sha512_crypt.using(rounds=5000).hash(getpass.getpass()))"

(command taken shamelessly from Ansible docs)
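If you'd rather skip passlib, Debian and Ubuntu ship a mkpasswd utility (in the whois package) that produces the same SHA-512 crypt format. An alternative, not what's used in this talk:

$ sudo apt-get install whois
$ mkpasswd --method=sha-512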

This sets the username to deploy, sets the password, and adds the user to the sudo group. It also sets the user's shell. The append option says to add the new group rather than replace any existing groups. Finally, we tell Ansible never to change the password on subsequent runs; we want the state to be the same every time. If we need to change the password, we'll provision a new server from scratch and decommission this one.

Task to Create User

The password is d3ploy

  tasks:
    - name: Add deploy user and add to sudoers
      user:
        name: deploy
        password: $6$zsQNYitEkWYJzVYj$/6sa8XlOAbfWAtn2S7ww1ok.w1ipqQ1dfHY1Mlo6f9p/xFsp1sp0N9grxLyN6qMcnlvyx266vbPczJd0EacOC1
        groups: sudo
        append: true
        shell: /bin/bash
        update_password: on_create

Run the playbook

$ ansible-playbook -i inventory playbook.yml

PLAY [all] *********************************************************************

TASK [Gathering Facts] *********************************************************
ok: [web-1]

TASK [Add deploy user and add to sudoers] **************************************
changed: [web-1]

PLAY RECAP *********************************************************************
web-1 : ok=2 changed=1 unreachable=0 failed=0

On DigitalOcean, once you upload a public key to your account, password logins are disabled for all your users. The root user already gets your public key added, but subsequent users need your public key too. Ansible has a module for uploading your public key to a user.

Add public key auth for user

- name: add public key for deploy user
  authorized_key:
    user: deploy
    state: present
    key: "{{ lookup('file', '/Users/your_username/.ssh/id_rsa.pub') }}"

Since the user is already there, Ansible won't try to create it again. But it will add the key:

Apply the change to the server

$ ansible-playbook -i inventory playbook.yml

TASK [Add deploy user and add to sudoers] **************************************
ok: [web-1]

TASK [add public key for deploy user] ******************************************
changed: [web-1]

PLAY RECAP *********************************************************************
web-1 : ok=3 changed=1 unreachable=0 failed=0

Adding the Webserver Tasks

— Install package

— Update config file

— Create web directory

— Upload home page

We're creating another section in our file that sets a new remote user. Then we define a new set of tasks, and define a task that uses the apt module. We then add become: true to tell Ansible it should execute the command with sudo access.

Update Cache

- hosts: all
  remote_user: deploy
  gather_facts: false

  tasks:
    - name: Update apt cache
      apt: update_cache=yes
      become: true

To use sudo, you have to provide a password. Ansible is non-interactive, so if you run the playbook as-is, it will stall and then fail, saying no password was provided. You supply the sudo password by adding the --ask-become-pass flag.

Run Ansible and apply changes

$ ansible-playbook -i inventory playbook.yml \
  --ask-become-pass

SUDO password:

PLAY [all] *********************************************************************

...

TASK [Update apt cache] ********************************************************

changed: [web-1]

...

Let's install the Nginx web server on our box and set up a new default web site. Once again, use the apt module for this.

Installing Software

- name: Install nginx
  apt:
    name: nginx
    state: installed
  become: true

Now we'll create the new website directory by using the file module to create /var/www/example.com and make sure it's owned by the deploy user and group. This way we can manage the content in that directory as the deploy user rather than as the root user.

Create the Web Directory

- name: Create the web directory
  file:
    path: /var/www/example.com
    state: directory
    owner: deploy
    group: deploy
  become: true

We need to remove the default site. Nginx on Ubuntu stores server block configuration files in the /etc/nginx/sites-available directory. When a site is enabled, a symbolic link is created from that folder to /etc/nginx/sites-enabled. To disable a site, you remove the symlink from /etc/nginx/sites-enabled. This makes it easy to enable and disable configurations as needed.

Disabling the default web site

— Web site definitions are in /etc/nginx/sites-available

— Live sites are in /etc/nginx/sites-enabled

— Live sites are symlinks from sites-available to sites-enabled

— Remove the symlink to disable a site (the manual equivalent is sketched below).
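For reference, here's what that looks like by hand over SSH; the tasks that follow automate it. (example.com is the site we set up later in this talk.)

$ sudo rm /etc/nginx/sites-enabled/default
$ sudo ln -s /etc/nginx/sites-available/example.com /etc/nginx/sites-enabled/example.com
$ sudo systemctl reload nginx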

This task declares that the file at the destination should be absent. If it's already gone, nothing changes; if it exists, Ansible removes it.

Task to remove the default site

- name: Disable `default` site
  file:
    src: /etc/nginx/sites-available/default
    dest: /etc/nginx/sites-enabled/default
    state: absent
  notify: reload nginx
  become: true

The notify directive lets us tell Ansible to fire a handler. A handler is a task that responds to events from other tasks. In this case, we're saying "we've dropped the default Nginx web site configuration, so reload Nginx's configuration to make the change stick." To make this work, we have to define the handler itself.

Handlers

notify: reload nginx

Defining a Handler

tasks:
  ...

handlers:
  - name: reload nginx
    service:
      name: nginx
      state: reloaded
    become: true

Install and Configure nginx

$ ansible-playbook -u deploy -i inventory playbook.yml --ask-become-pass

TASK [Update apt cache] ********************************************************
changed: [web-1]

TASK [Install nginx] ***********************************************************
changed: [web-1]

TASK [Create the web directory] ************************************************
changed: [web-1]

TASK [Disable `default` site] **************************************************
ok: [web-1]

Templates

— Local files we can upload to the server

— Can use variables to change their contents

— Uses the Jinja language
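A template is just a file with optional Jinja expressions in it. As a hypothetical example (not the page we upload below), the built-in inventory_hostname variable works even with fact gathering disabled:

<h1>Welcome to {{ inventory_hostname }}</h1>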

Creating the Server Block with a Template

touch site.conf

server {
    listen 80;
    listen [::]:80;

    root /var/www/example.com/;
    index index.html;

    server_name example.com;

    location / {
        try_files $uri $uri/ =404;
    }
}

This task uses the template module, which uploads the template to the location on the server. Templates can have additional processing instructions which we'll look at later. Right now we'll just upload the file as-is.

Upload the file to the server

- name: Upload the virtual host
  template:
    src: site.conf
    dest: /etc/nginx/sites-available/example.com
  become: true

Enable the new host

- name: Enable the new virtual host
  file:
    src: /etc/nginx/sites-available/example.com
    dest: /etc/nginx/sites-enabled/example.com
    state: link
  become: true
  notify: reload nginx

Make a home page

touch index.html

<!DOCTYPE html>
<html lang="en-US">
  <head>
    <meta charset="utf-8">
    <title>Welcome</title>
  </head>
  <body>
    <h1>Welcome to my web site</h1>
  </body>
</html>

This time we don't use become: true because we want the file owned by the deploy user, and we've already made sure the /var/www/example.com directory is owned by the deploy user.

Upload the file

- name: Upload the home page
  template:
    src: index.html
    dest: /var/www/example.com

Deploy the site

ansible-playbook -u deploy -i inventory playbook.yml --ask-become-pass

Roles

— Reusable components

— Tasks

— Templates

— Handlers

— Sharable!

The tasks folder contains the task definitions. The handlers folder contains the definitions for our handlers, and the templates folder holds our template files. Create this structure:

Anatomy of a Role

▾ role_name/
    ▾ handlers/
        main.yml
    ▾ tasks/
        main.yml
    ▾ templates/
        some_template.j2

Create a role for our server

— Create website role

— Move tasks, handlers, and templates out of our playbook

— Add the role to the playbook

Create the Role structure

$ mkdir -p roles/website/{handlers,tasks,templates}
$ touch roles/website/{handlers,tasks}/main.yml
$ mv {index.html,site.conf} roles/website/templates

Move handler into roles/website/handlers/main.yml

---
- name: reload nginx
  service: name=nginx state=reloaded
  become: true

Move tasks into roles/website/tasks/main.yml

---
- name: Update apt cache
  apt: update_cache=yes
  become: true

- name: Install nginx
  apt:
    name: nginx
    state: installed
  become: true

- name: Create the web directory

...

- name: Enable the new virtual host
  file:
    src: /etc/nginx/sites-available/example.com
    dest: /etc/nginx/sites-enabled/example.com
    state: link
  become: true
  notify: reload nginx

Add role to playbook

- hosts: all
  remote_user: deploy
  gather_facts: false

  # all other stuff moved into the role

  roles:
    - website

Make sure it still works!

$ ansible-playbook -u deploy -i inventory playbook.yml --ask-become-pass

Demo: Roles

Scaling Out

— Add another host

— Add a load balancer

We'll just clone the web-1 definition with sed, replacing every occurrence of web-1 with web-2.

Create a web-2.tf file

sed -e 's/web-1/web-2/g' web-1.tf > web-2.tf

Create web-2

$ terraform apply \
  -var "do_api_key=${DO_API_KEY}" \
  -var "ssh_fingerprint=${SSH_FINGERPRINT}"

Update Inventory

web-1 ansible_host=xx.xx.xx.xx
web-2 ansible_host=xx.xx.xx.yy

[servers]
web-1
web-2

...

Provision the servers

$ ansible-playbook -u deploy -i inventory playbook.yml \
  --ask-become-pass

Add a Load Balancer

— Floating IP

— Two HAProxy or Nginx instances

— Each instance monitoring the other

— Each instance pointing to web-1 and web-2

OR

— DigitalOcean Load Balancer

We define the forwarding rule and a health check, and then we specify the IDs of the Droplets we want to configure.

Add a DO Load Balancer with Terraform

touch loadbalancer.tf

resource "digitalocean_loadbalancer" "web-lb" { name = "web-lb" region = "nyc3"

forwarding_rule { entry_port = 80 entry_protocol = "http"

target_port = 80 target_protocol = "http" }

healthcheck { port = 22 protocol = "tcp" }

droplet_ids = ["${digitalocean_droplet.web-1.id}","${digitalocean_droplet.web-2.id}" ]}

Show the Load Balancer IP

Add this to loadbalancer.tf:

output "web-lb-address" { value = "${digitalocean_loadbalancer.web-lb.ip}"}

Apply!

$ terraform apply \
  -var "do_api_key=${DO_API_KEY}" \
  -var "ssh_fingerprint=${SSH_FINGERPRINT}"

...

Outputs:

web-1-address = xx.xx.xx.xx

web-2-address = xx.xx.xx.yy

web-lb-address = xx.xx.xx.zz

And you have your infrastructure.
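A quick sanity check, substituting the real load balancer address from the output above for xx.xx.xx.zz; you should get the welcome page back from one of the two web servers:

$ curl -s http://xx.xx.xx.zz/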

Demo

Tear it down

$ terraform destroy \
  -var "do_api_key=${DO_API_KEY}" \
  -var "ssh_fingerprint=${SSH_FINGERPRINT}"

Rebuild

$ terraform apply \
  -var "do_api_key=${DO_API_KEY}" \
  -var "ssh_fingerprint=${SSH_FINGERPRINT}"

Add IPs to inventory... then:

$ ansible-playbook -u deploy -i inventory playbook.yml \
  --ask-become-pass

Going Forward

— Add more .tf files for your infra

— Add them to your loadbalancer.tf file

— Add new IPs to Inventory

— Provision them with Ansible

— Remove old hosts from loadbalancer when you make config changes or need security patches

— Investigate Ansible variables to handle domains, user accounts, passwords, etc.

— Add new IPs to the inventory automatically using Terraform's provisioners (one sketch follows below)
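Here's one sketch of that last idea, using Terraform's built-in local-exec provisioner to append a host line to the inventory when a Droplet is created. web-3 is a hypothetical new server; treat this as a starting point, not the talk's method:

resource "digitalocean_droplet" "web-3" {
  image    = "ubuntu-16-04-x64"
  name     = "web-3"
  region   = "nyc3"
  size     = "1gb"
  ssh_keys = ["${var.ssh_fingerprint}"]

  # local-exec runs on the machine running terraform, not on the Droplet
  provisioner "local-exec" {
    command = "echo 'web-3 ansible_host=${self.ipv4_address}' >> inventory"
  }
}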

Things I learned

— Using other people's Ansible roles is awful

— Build everything from scratch and read the docs

— Ansible module docs are great... if you know what module you need.

— StackOverflow is full of deprecated syntax. Use the Ansible Docs!

— Don't be clever. Be explicit. The DRY rule isn't always preferred. Or good.

Questions?

— Slides: https://bphogan.com/presentations/automate2018

— Twitter: @bphogan

Reminder: $10 of DO credit? https://m.do.co/c/1239feef68ae
