Faculty of Information Technology
Department of Databases
Specialization: Databases
Tomasz Kłosiński
9125
A Hybrid Cloud: Amazon Web Services and OpenStack
Master’s thesis written under the supervision of
prof. Krzysztof Stencel
Warsaw, March, 2015
Abstract: The thesis aims to explain the notion of a cloud (Chapter I) and how its usage has
influenced modern software delivery processes (Chapter II). It introduces an original software
project, developed using Chef and Vagrant, that shows the possibilities of a hybrid cloud based on
Amazon Web Services and OpenStack (Chapters III-IV). The principal problem of the thesis is the
question: how can one make use of a hybrid cloud? My project presents one of many possible ways
in which this question can be answered.
“In today’s computer industry, we still typically
install and maintain computers the way the
automotive industry built cars in the early 1900s. An
individual craftsman manually manipulates a machine
into being, and manually maintains it afterwards.
The automotive industry discovered first mass
production, then mass customization using standard
tooling. The systems administration industry has a
long way to go, but is getting there.”
— Steve Traugott and Joel Huddleston (www.infrastructures.org, circa 2003)
Table of Contents
Introduction
  Domain, topic and aim
  History of the research
Chapter I: The problem’s domain
  Cloud Computing
    Related concepts
    Definition
    History
    Layers
    Deployment models
  OpenStack
    Core Services
    Shared Services
    RDO OpenStack
  Amazon Web Services
Chapter II: Solutions of the problem
  DevOps
    Agile
    Infrastructure as a Code
Chapter III: Description of the project
  Assumptions and requirements
    Development environment
  System’s design
    Deployment flow
  System’s implementation
    Git
    Vagrant
    Chef
    Koji cookbook
  Tests
Chapter IV: Conclusion
  Potential applications
  Suggestions on further studies and investigations
    Testing
    Continuous Integration, Deployment and Delivery
Appendix A: Koji build system
  Architecture
    Koji-hub
    Koji-web
    Kojira
    Koji builder (kojid)
    Koji client
    Additional tools
Appendix B: Project’s Vagrant files
  Vagrantfile.vbox
  Vagrantfile.openstack
  Vagrantfile.aws
  Vagrantfile.production
Appendix C: Directories and files tree of the project
Figures
Bibliography
Introduction
Domain, topic and aim
The domain of this thesis is the concept of cloud computing and its practical applications. In particular, it
focuses on automation and configuration management in a hybrid cloud environment.
The topic of this thesis consists of two parts: theoretical and practical. The first part explains in detail
the concept of cloud computing and its various types. It also describes the two cloud technologies
that were used in the project: OpenStack and Amazon Web Services. The second part presents a
practical example of how to use those two cloud technologies: it describes how to create automation
scripts that deploy a Koji cluster.
The goal of this paper is, on the one hand, to present the possibilities that cloud computing opens up, using
OpenStack and Amazon Web Services as examples, and, on the other, to demonstrate its application with
configuration management and deployment automation (so-called “DevOps”) software: Vagrant and Chef.
I approached the problem from the practical side: let us imagine that our new IT business requires a fast and
easily scalable solution for building RPM packages. How could we solve this problem using a hybrid cloud? My
answer is: find software that builds RPMs (see Appendix A: Koji build system) and employ Chef and Vagrant
to treat Infrastructure as Code and make use of the best DevOps practices.
In fact, the chosen technologies are only a small part of the rich and diverse market of cloud technologies and
related software (including configuration management and deployment automation tools). There are a
number of both public cloud providers and private cloud solutions. However, it would be unreasonable to try
to describe them all: the paper would grow beyond any reasonable size. Instead, I decided to pick one public
and one private cloud and use them as examples of the general concept. To show that cloud computing
capabilities are not only “buzzwords” on advertising flyers but real, practical, enterprise-ready solutions,
I decided to show how to use Vagrant, Chef and Ruby scripting with them. There are, of course, plenty of other
applications that utilize the power of cloud computing but, again, my aim was to show that this is possible and
how it works, not to write a comprehensive review of all existing software stacks.
This thesis is inspired by my own professional experience. During my short career I have witnessed a major
shift in the Polish IT industry that is still ongoing: the adoption of cloud computing and the adjustment of
development, quality assurance and system administration procedures and technology stacks to the new paradigm.
New technology not only brings new software features but, more importantly, changes the
operations of organizations and businesses. It implies new ways of providing IT services, new methods of
managing IT infrastructure and new possibilities for developing new products.
The tools applied in my project are also inspired by my past professional experience. I have worked
with cloud computing technologies such as OpenStack, Eucalyptus, OpenNebula, AWS and Rackspace, and with
cloud-related tools such as Chef, Puppet, Ansible and Vagrant. Many others are used in current IT
development and operations. In my opinion, AWS and OpenStack are the two technologies striving to
dominate the cloud computing market.
History of the research
Before the project started, I had to choose the cloud technologies. I decided to use Amazon Web Services
and OpenStack because I consider them the most mature clouds and because I have the most experience with
these two.
The research was conducted on my personal laptop, running CentOS 7 Linux with RDO OpenStack installed, and
on my personal Amazon Web Services account. After reviewing the existing literature and Internet resources
related to my project, I started working on the code.
The project consists of one Chef cookbook and four Vagrant virtual environment configuration files (the main
one, Vagrantfile.production, deploys a Koji cluster on AWS and OpenStack). The cookbook is a group of scripts
that use Ruby as their reference language, with an extended DSL for specific tasks. Vagrant also uses Ruby
scripts for configuring virtual environments.
Since both the Chef cookbook and the Vagrant configuration files are executed like typical procedural programs,
the usage of Ruby in the project is very limited and omits more advanced programming topics (such as the
object-oriented paradigm). In fact, most of the code is written in Chef’s and Vagrant’s Domain-Specific Languages.
For the purpose of code development I established a version control repository. I decided to use Git
as the version control software and Bitbucket.com for my private web repository. I created four branches in
the Git repository for the development of four versions of the code:
VirtualBox version (for testing),
Amazon Web Services version,
OpenStack version,
Hybrid: AWS and OpenStack version.
As the first step, I had to document in detail the entire process of installing and configuring a Koji cluster.
For this purpose I used three virtual machines on VirtualBox. After documenting the installation and
configuration process, I started working on a Chef cookbook that would automate it.
Development of a cookbook requires a few tools that are included in the Chef Development Kit.
Additionally, for the purpose of deployment I needed Vagrant, a tool that can automatically deploy a set of
Chef cookbooks on a range of virtualization and cloud providers. I tested the cookbook on VirtualBox and, once
it was finished, deployed it on AWS and OpenStack. To achieve this result I prepared four configuration
files:
Vagrantfile.vbox for VirtualBox deployment
Vagrantfile.aws for Amazon EC2 deployment
Vagrantfile.openstack for OpenStack deployment
Vagrantfile.production for hybrid (Amazon EC2 and OpenStack) deployment
During the course of development I worked with three different Chef installations (for the types of Chef
versions see Chapter III: Description of the project, section Chef). First, for simplicity, I used Chef Solo for
the deployment. I had to abandon it because it did not support a very important cookbook feature: search
of the nodes. Then I switched to Chef Zero, a very useful tool for testing cookbooks (especially
when the search option is needed) because it is a minimal Chef Server that runs in RAM and does not require
configuration (this version is used in the Vagrant configuration file for VirtualBox). The third version of Chef
that I used is Hosted Enterprise Chef, which is simply a Chef Server operated by the Chef company (there is also
an on-premises version of Chef Server, but I did not use it in the project).
One rarely writes a Chef cookbook without using other cookbooks. Every new cookbook lists the cookbooks
it depends on in its metadata.rb file. Using the code of other projects in a new project is called “software
reusability”. Code reuse is a basic principle of modern object-oriented programming languages such as Java, C++,
Python or Ruby. The ability to reuse code relies in an essential way on the ability to build larger things from
smaller pieces and to identify commonalities among those parts.
My cookbook installs and configures Koji, the software used by Red Hat, Fedora, CentOS and other
organizations to build RPM packages on a mass scale. In my project I reused the cookbooks that configure
the services required by Koji (e.g. Apache HTTPD, PostgreSQL, NFS server) and a few others that were helpful in
Koji installation and configuration (e.g. EPEL yum repository, SELinux, iptables).
Currently there are no means of automatically deploying Koji using any of the popular automation and
configuration management tools (such as Chef, Puppet, Ansible or Salt) or other dedicated software. On the
Internet there is only outdated documentation on the Fedora Project’s wiki. This documentation and my personal
experience were the basis for the development of the cookbook. During my professional career I have worked
on bash scripts for automatic Koji installation and configuration. This experience helped me in the development
of similar (although more sophisticated) code in Ruby and Chef’s DSL.
The last step of my project was the development of the Vagrant virtual environment configuration files. The
first file was created for the purpose of testing the Koji cookbook. Before I started working on configuration
files for other Vagrant providers (Amazon EC2 and OpenStack), I decided that testing the cookbook would be
more comfortable, faster and cheaper on Vagrant's default provider (VirtualBox). This configuration file
defines four virtual machines: one with chef-zero installed, one with koji-hub installed, and two with
koji-builder installed. The second Vagrant configuration file defines three virtual machines deployed on Amazon
EC2: one koji-hub and two koji-builders (chef-zero was replaced by Hosted Enterprise Chef). The third
configuration file also uses three VMs, but deployed on OpenStack. The fourth and final configuration file
(production) deploys koji-hub on Amazon EC2 and two koji-builders on OpenStack.
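The layout of the four configuration files can be summarized in a few lines of plain Ruby. The sketch below is only an illustration and is not part of the project's Vagrant files: the names Machine, ENVIRONMENTS, providers_for and hybrid?, as well as the numbered builder names, are assumptions introduced here.

```ruby
# Illustrative model of the four Vagrant environments described above.
# All names in this sketch are hypothetical; the real definitions live in the Vagrantfiles.
Machine = Struct.new(:name, :role, :provider)

ENVIRONMENTS = {
  "Vagrantfile.vbox" => [
    Machine.new("chef-zero",      :chef_server, :virtualbox),
    Machine.new("koji-hub",       :hub,         :virtualbox),
    Machine.new("koji-builder-1", :builder,     :virtualbox),
    Machine.new("koji-builder-2", :builder,     :virtualbox)
  ],
  "Vagrantfile.aws" => [
    Machine.new("koji-hub",       :hub,     :aws),
    Machine.new("koji-builder-1", :builder, :aws),
    Machine.new("koji-builder-2", :builder, :aws)
  ],
  "Vagrantfile.openstack" => [
    Machine.new("koji-hub",       :hub,     :openstack),
    Machine.new("koji-builder-1", :builder, :openstack),
    Machine.new("koji-builder-2", :builder, :openstack)
  ],
  "Vagrantfile.production" => [
    Machine.new("koji-hub",       :hub,     :aws),
    Machine.new("koji-builder-1", :builder, :openstack),
    Machine.new("koji-builder-2", :builder, :openstack)
  ]
}.freeze

# Distinct providers used by the machines of one environment.
def providers_for(file)
  ENVIRONMENTS.fetch(file).map(&:provider).uniq
end

# An environment is hybrid when its machines span more than one provider.
def hybrid?(file)
  providers_for(file).size > 1
end

ENVIRONMENTS.each_key do |file|
  machines = ENVIRONMENTS[file]
  puts "#{file}: #{machines.size} machines on #{providers_for(file).join(', ')}"
end
```

In this model only Vagrantfile.production spans two providers, which is what makes it the hybrid deployment; the other three files differ only in the provider they target.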
As a result of the research I developed a Chef cookbook for Koji cluster installation and Vagrant
configuration files that make it possible to run it on a hybrid cloud: one part of the cluster on Amazon Web
Services and the other part on OpenStack.1
1 It would, however, be very easy to rewrite the Vagrant configuration files to run the cookbook on other cloud providers. The cookbook itself is provider-agnostic: it can run on any RPM-based Linux machine.
At the end of this brief introduction, I will briefly summarize each chapter. In the first chapter I describe in
detail what cloud computing is and what its taxonomies are; that is the domain of my project. The first chapter
also includes a description of Amazon Web Services and OpenStack, the two cloud
technologies that are used in the project.
The second chapter introduces DevOps, an umbrella term under which solutions to the problem stated in the
thesis have emerged in recent years. In those methodologies and software, IT Operations found new ways of
handling a dynamic and software-defined infrastructure based on hybrid clouds.
The third chapter constitutes the documentation of the software project that I have authored to solve the
problem stated in my thesis. It also contains descriptions of the technologies that were used in my project.
However, those descriptions are kept as short as possible, to make the document more
comfortable to read and more usable in practice. Documentation is an absolutely necessary element of
every IT system. “Ink is better than the best memory,” teaches an old Chinese proverb. As time passes, it
becomes harder to understand how a system works. However, any author of an IT system has to know that
voluminous documentation is as bad as a lack of it. Too much documentation makes a system unusable.
The fourth chapter contains the conclusions. It gives my opinion on whether the project turned out to be a
success or a failure and outlines what could have been done better. This chapter also contains suggestions for
further improvement of the project.
Chapter I: The problem’s domain
What is the “cloud”? Is it just the good old Internet? Cloud computing can be defined simply as storing and
accessing computer data and software on the Internet, rather than running them on a personal computer or office
server. In fact, programs such as Gmail or Office365 are commonly described as cloud computing technologies.
On the plus side, data and business programs run online, rather than exclusively on office
computers, which means that a company’s staff has access to them anytime, anywhere there is an Internet
connection.2
However, not everyone shares the enthusiasm for the new IT paradigm. In 2008 Oracle CEO Larry Ellison said
with regard to cloud computing that “the computer industry is the only industry that is more fashion-driven than
women’s fashion.”3
McKinsey Quarterly lists cloud computing as the seventh of its ten technology-enabled business trends to
watch. McKinsey values cloud computing for enabling new business models. Technology now enables
companies to monitor, measure, customize, and bill for asset use at a much more fine-grained level than ever
before. Asset owners can therefore create services around what have traditionally been sold as products.
Business-to-business (B2B) customers like these service offerings because they allow companies to purchase
units of a service and to account for them as a variable cost rather than undertake large capital investments.
Consumers also like this “paying only for what you use” model, which helps them avoid large expenditures, as
well as the hassles of buying and maintaining a product. This development has created a wave of computing
capabilities delivered as a service, including infrastructure, platform, applications, and content. And vendors
are competing, with innovation and new business models, to match the needs of different customers. 4
According to IDC analysts, the transformation towards cloud computing is the “third wave”, following
the “second wave” (the spread of PCs and computer networks) and the “first wave” (the era of mainframe
and terminal dominance). Frank Gens, vice director and chief analyst of IDC, says that “those companies that
haven’t adapted to the new model are already forgotten.” Thus, analysis of past events leads to the conclusion
that today, just as in 1986, when the PC appeared, many IT giants will have to decide in which direction to take
their future actions: whether to remain with the second wave or begin to develop solutions used in the third.5
A 2008 report by Gartner Group stated that cloud computing is the most important trend in the IT
world. According to Gartner analysts, these techniques are sufficiently mature to become profitable in a short
period of time. At the same time, knowledge about the potential benefits, costs and limitations associated with
these trends is becoming more and more widespread. Therefore, firms should decide whether to implement
these technologies.6
2 http://www.businessweek.com/smallbiz/content/oct2009/sb20091026_937390.htm (03/08/2014)
3 Anthony Velte, Toby J. Velte, Robert C. Elsenpeter, Cloud Computing, A Practical Approach, McGraw-Hill Prof Med/Tech, 2009, p. 3
4 https://www.mckinseyquarterly.com/Strategy/Growth/Clouds_big_data_and_smart_assets_Ten_tech-enabled_business_trends_to_watch_2647#Trend7 (03/08/2014)
5 http://www.computerworld.pl/artykuly/376694/Cloud.Computing.to.dopiero.poczatek.zmian.ktore.nastapia.w.branzy.IT.html (03/08/2014)
However, despite the overall enthusiasm, some also raise security concerns. Ernst & Young claims in its
report that only 50% of respondents have a documented information security strategy. More than half of
the respondents had not introduced any procedures to minimize the risks resulting from implementing cloud
computing technologies.7
Richard M. Stallman, founder of the GNU project and the Free Software Foundation, sees cloud computing as a
danger to the privacy of software users and to their freedom to use and modify the software according to
their needs. He claims that cloud computing takes control of the software away from the users. In his opinion
it can be more dangerous than proprietary software. With closed-source software, users usually obtain an
executable (binary) file without the source code. Without the source code, the users (or rather
user-programmers) cannot really study the program, so they cannot determine what it really does (for instance,
it may spy on you or send your personal information to its producer without your consent). Nor can they
change it. In the cloud computing model, by contrast, there is not even an executable file in the user’s hands:
users have access only to the interface of the program, while the executable is on the server. Thus users cannot
know exactly what this software really does, and it is even harder to change than proprietary software.
Moreover, cloud computing usually includes elements that can be classified as malicious software. In the case of
proprietary software, not many programs are “spyware”, that is, programs that send out data about users'
computing activities. With cloud computing, users have to send their data to the server in order to use it. This
fulfils the first criterion of spyware: the user no longer controls the data; the server’s owner
controls it. Thus cloud computing providers have dominant power over their users.8
Eric Schmidt does not share Stallman’s Orwellian predictions. On the contrary, he claims that those technologies
will serve people. He argues that when people have infinitely powerful personal devices, connected to infinitely
fast networks and servers with lots of content, a new kind of application becomes possible, and it will be
personal. It will use all of the computing power that is in the cloud, as we call it. So this vision of nearly
infinite computing power, network power, and these powerful devices is the basis of the next generation of
computing.9
6 http://www.computerworld.pl/news/162476/Gartner.Cloud.computing.i.Green.IT.najwazniejszymi.trendami.najblizszych.lat.html (03/08/2014)
7 http://www.computerworld.pl/news/376998/Raport.ErnstYoung.media.spolecznosciowe.i.chmura.grozne.dla.firmowych.danych.html (03/08/2014)
8 http://www.gnu.org/philosophy/who-does-that-server-really-serve.en.html (03/08/2014)
9 https://www.mckinseyquarterly.com/Googles_view_on_the_future_of_business_An_interview_with_CEO_Eric_Schmidt_2229 (03/08/2014)
In conclusion, there are both hopes and fears related to the development of cloud computing technologies.
Whether we are skeptical or enthusiastic about it, we can agree that it is a field that needs more
research and investigation.
Cloud Computing
On the one hand, cloud computing is often described as the on-demand delivery of IT resources via the Internet
with pay-as-you-go pricing.10 In this sense, cloud computing is about leasing servers and storage from a
provider (like Amazon Web Services). On the other hand, it is about much more: the cloud offers IT businesses
major cost savings and agility.
In addition, cloud computing offers significant scalability. With a single line of code it is possible to provision
thousands of servers, and one pays only for what is actually used. Furthermore, because pricing is
pay-as-you-go per hour, running one server for a thousand hours costs the same amount as running a thousand
servers for one hour.
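The equivalence between one server for a thousand hours and a thousand servers for one hour can be checked with a short calculation. The sketch below assumes a simple linear per-hour billing model; the hourly price and the name pay_as_you_go_cost are hypothetical and do not reflect any provider's actual price list.

```ruby
# Hypothetical hourly price in US dollars (0.10); real prices depend on provider,
# instance type and region. A Rational avoids floating-point rounding.
HOURLY_PRICE_USD = Rational(1, 10)

# Cost under simple per-hour, pay-as-you-go billing: servers x hours x hourly price.
def pay_as_you_go_cost(servers:, hours:, hourly_price: HOURLY_PRICE_USD)
  servers * hours * hourly_price
end

one_server_long    = pay_as_you_go_cost(servers: 1,    hours: 1000)
many_servers_short = pay_as_you_go_cost(servers: 1000, hours: 1)

puts "1 server x 1000 hours: $#{one_server_long.to_f}"
puts "1000 servers x 1 hour: $#{many_servers_short.to_f}"
puts "Same cost? #{one_server_long == many_servers_short}"
```

Under this assumption the cost depends only on the total number of server-hours, so a workload that can be parallelized finishes a thousand times sooner for the same money.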
Finally, cloud computing enables the automation of server provisioning. It supports the automation of software
development, testing and production delivery. Combining scalability with automation makes it possible to
build an application that responds to load.
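What "responding to load" means can be shown with a minimal scaling rule. The following sketch assumes that every server handles a fixed number of requests per second; that figure, the limits and the name desired_servers are hypothetical, and a real deployment would feed the result into the provisioning API of the chosen cloud.

```ruby
# Number of servers needed for a given load, assuming each server can handle
# `capacity_per_server` requests per second (a hypothetical, fixed figure).
# The result is kept between `min` and `max` so the fleet never scales to zero
# and never grows without bound.
def desired_servers(load_rps, capacity_per_server:, min: 1, max: 1000)
  needed = (load_rps.to_f / capacity_per_server).ceil
  needed.clamp(min, max)
end

[50, 450, 12_000].each do |load|
  servers = desired_servers(load, capacity_per_server: 100)
  puts "#{load} req/s -> #{servers} server(s)"
end
```

An automation tool would evaluate such a rule periodically and provision or release servers until the running count matches the desired one.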
Related concepts
Cloud computing has its origin in a few earlier IT concepts that dominated the “pre-cloud” IT industry.
Utility computing is the provision of computational, networking and storage resources as a metered service.
“Utility” indicates that the model works analogously to public utilities.11
Grid computing is a combination of distributed resources from various institutions (resource providers), to
meet the demands of clients consuming them.12
Distributed computing is computing over distributed autonomous computers that communicate only over a
network. Such systems are often treated differently from parallel computing systems or shared-memory
systems, where multiple computers share a common memory that is used for communication between the
processors.13
Virtualization enables running several virtual operating systems, independent of each other, on a single
physical host. Because the physical computer is utilized to the maximum, the return on investment is
significantly higher.14 Resource virtualization is at the heart of most cloud architectures. The concept of
virtualization allows an abstract, logical view of the physical resources, which include servers, data stores,
networks, and software. The basic idea is to pool physical resources and manage them as a whole. Individual
requests can then be served as required from these resource pools.15
10 http://aws.amazon.com/what-is-cloud-computing/ (03/02/2015)
11 John W. Rittinghouse, James F. Ransome, Cloud Computing: Implementation, Management and Security, CRC Press, 2009, p. 26
12 Borko Furht, Armando Escalante, Handbook of Cloud Computing, Springer, 2010, p. 185
13 Dinkar Sitaram, Geetha Manjunath, Moving To The Cloud: Developing Apps in the New World of Cloud Computing, Syngress, 2011, p. 381
14 John W. Rittinghouse, James F. Ransome, op. cit., p. 24
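The pooling idea from the virtualization paragraph above can be illustrated with a toy model. The sketch below assumes that the only pooled resource is CPU cores and uses a simple first-fit placement; the class ResourcePool and the host names are hypothetical and do not describe how any particular cloud schedules virtual machines.

```ruby
# Toy model of resource pooling: physical hosts are managed as one pool and
# individual requests are served from whichever host has room (first fit).
class ResourcePool
  # host_cpus: hash mapping a host name to its number of free CPU cores.
  def initialize(host_cpus)
    @free = host_cpus.dup
  end

  # Free cores across the whole pool, i.e. the logical view of the resources.
  def total_free
    @free.values.sum
  end

  # Reserve `cpus` cores on the first host with enough capacity.
  # Returns the host name, or nil when no single host can serve the request.
  def allocate(cpus)
    host, = @free.find { |_name, free| free >= cpus }
    return nil unless host

    @free[host] -= cpus
    host
  end

  # Return previously reserved cores to the pool.
  def release(host, cpus)
    @free[host] += cpus
  end
end

pool = ResourcePool.new("host-a" => 8, "host-b" => 16)
puts "12 cores placed on: #{pool.allocate(12)}"
puts "cores still free in the pool: #{pool.total_free}"
```

The caller asks the pool for capacity and never addresses a physical machine directly, which is the abstraction that lets a cloud serve many tenants from shared hardware.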
Definition
The term “cloud” has its origin in the symbol used in network diagrams to represent the Internet.16
Cloud computing is also a new business model replacing the old model based on the traditional data center.
However, the traditional data center does not necessarily go away to be replaced with a cloud. Sometimes the
traditional data center is the best fit. Nevertheless, for reasons of business agility and economics, the cloud
is becoming an increasingly important option for companies. Cloud computing can be perceived as the
foundation for the industrialization of computing.17
Key characteristics
The "five essential characteristics" were proposed by the National Institute of Standards and Technology18:
On-demand self-service – a user with the appropriate permissions can individually
provision computing resources when needed, without human interaction with
the service’s operator.
Broad network access – resources can be accessed by users over the network through
standardized solutions that enable heterogeneous usage independent of the type of device (e.g.,
mobile phones, laptops, PDAs).
Resource pooling – the provider's resources are pooled to serve numerous consumers in a
multi-tenant architecture, with various virtual capabilities dynamically allocated in
response to the demand generated by users.
Rapid elasticity – resources are provisioned quickly and elastically, often automatically, so the service
scales out rapidly. From the user's perspective the capabilities of the cloud seem almost unlimited.
Measured service – the cloud system controls and optimizes resources automatically, using its
metering functionality to adjust their utilization. Users' resource usage is monitored
constantly. Reports based on metering of the utilized service are used by both the user and the cloud
provider.19
15 Baun, C., Kunze, M., Nimis, J., Tai, S., Cloud Computing: Web-Based Dynamic IT Services, Springer, 2011, p. 5
16 John W. Rittinghouse, James F. Ransome, op. cit., p. 26
17 Judith Hurwitz, Robin Bloor, Marcia Kaufman, Fern Halper, Cloud Computing for Dummies, For Dummies, 2009, p. 19
18 Peter Mell, Timothy Grance, The NIST Definition of Cloud Computing, National Institute of Standards and Technology, U.S. Department of Commerce, Special Publication 800-145, September 2011, available at http://csrc.nist.gov/publications/nistpubs/800-145/SP800-145.pdf (20/10/2014)
19 David E. Y. Sarna, Implementing and Developing Cloud Computing Applications, Auerbach Publications, 2010, p. 16
Benefits and Drawbacks
The benefits of cloud computing are20:
1. Reduction of costs related to infrastructure implementation and maintenance.
2. Increased mobility of the global IT workforce.
3. More scalable and more flexible datacenters.
4. Faster time to market than with a traditional datacenter.
5. Transformation of facilities into an innovation-friendly environment.
6. Use of “green” technology and methods of operation.
7. Better affordability, which enables SMEs to develop and use high-performance software.
However, the cloud computing paradigm also carries significant potential risks21:
1. Potential problems with availability of the cloud.
2. Data lock-in.
3. Data privacy and traceability.
4. Compliance with national legislation by geographical data storage.
5. Data transfer bottlenecks.
6. Poor performance predictability.
7. Scalability of persistent storage space.
8. Errors in large, distributed systems.
9. Reputation and liability.
10. Software licenses.
History
The notion of cloud computing can be dated at least to 1961, when John McCarthy22 wrote that time-sharing
computer technology could in the future be transformed into a “utility computing” model, in which computer
resources or even applications would be provided remotely. At that time, in the 1960s, information technology was not
ready for this futuristic concept. When the idea was revived at the turn of the millennium, the term “cloud
computing” replaced and extended the previous one.23
Google became a pioneer during this revival thanks to several key factors.
● First, the collection of data and the processing of that data had to be as automated as possible.
20 John W. Rittinghouse, James F. Ransome, op. cit., p. 14
21 Baun, C., Kunze, M., Nimis, J., Tai, S., op. cit., p. 70
22 John McCarthy (September 4, 1927 – October 24, 2011) was an American computer scientist and cognitive scientist. McCarthy was one of the founders of the discipline of artificial intelligence. He coined the term "artificial intelligence" (AI), developed the Lisp programming language family, significantly influenced the design of the ALGOL programming language, popularized timesharing, and was very influential in the early development of AI. [Source: http://en.wikipedia.org/wiki/John_McCarthy_(computer_scientist) (20/01/2015)]
23 John W. Rittinghouse, James F. Ransome, op. cit., p. 26
● It had to be cost effective, so the infrastructure was constructed out of commodity components
(“cheap stuff that breaks”).
● Data had to be stored in a simple and fairly reliable manner to facilitate scaling (instead of using a
traditional database, Google created its own data store called GFS).
● New types of application development architectures and processing algorithms were needed (including the
MapReduce family, among others).
● Operations had to be automatic and dependable.
● Outages in parts of the application were tolerable.
In order to scale its search facility cheaply, Google created much of what can probably be recognized as the
first cloud.
Another interesting pioneer in cloud computing is Amazon. In its first years the company built its IT
infrastructure the traditional way: using big, heavy servers with relational databases. This model worked well
in the early days. As commerce on the Internet expanded, it became clear to Amazon that its computing
architecture had to change. At the same time, in order to build customer and vendor relationships, Amazon
had begun exposing individual services as callable services. This model accelerated the decomposition of many
of Amazon’s applications into individually callable services. In 2006 Amazon began to offer basic computing
resources (computing, storage, and network bandwidth) as highly flexible, easily provisioned services, all of
which could be paid for in a pay-as-you-go model.
Salesforce.com was the first public cloud service targeted at enterprise customers that kept
customers’ sensitive data outside of their own facilities. It introduced an easy, pay-as-you-go CRM (customer
relationship management) implementation that rose to a meaningful market share and then eventual
dominance, largely at the expense of the traditional, install-in-your-own-shop application with its overwrought,
often painful, and unintentionally costly implementation.
During the era of these three major cloud computing pioneers the vision of virtualized utility computing finally
began to come true:
Computing—computation, storage, communication—is relatively cheap, scales up or down as needed,
operates itself automatically, and always works.24
Layers
Each cloud computing layer can be characterized by properties of its own. Additionally, some layers are
subdivided into sub-layers and into their services. There are three basic layers of cloud computing: Software as
a Service (SaaS), Platform as a Service (PaaS) and Infrastructure as a Service (IaaS).
Clouds are divided into multiple services and interfaces which make them usable. Although each cloud
computing technology usually consists of multiple layers, it is mapped to the highest layer possible. This
is the layer through which potential users are primarily addressed. Cloud computing is currently under dynamic
24 Eric A. Marks, Bob Lozano, Executives Guide to Cloud Computing, Wiley, 2010, p.20
evolution, so the following classification is not intended to be complete; rather, it represents an outline of the
archetypal cloud services.25
Application Software
Cloud software applications (or Software as a Service, SaaS) directly serve the end user. In this model the
customers do not need to install the software locally. Instead they usually use it via a web
browser, because the interface of cloud software is mostly web-based. From the cloud architecture perspective, the SaaS
model can be developed and operated by the provider on the basis of the PaaS or IaaS models.26
Platform
The cloud services provided on the PaaS layer are targeted at developers. These are mostly programming
environments (PE) and execution environments (EE) in which software written in a specific programming
language can be executed. PaaS usually extends existing programming environments, e.g., by adding class
libraries which have a specific application focus.27
Infrastructure
The IaaS layer provides the users (typically system administrators) with an abstracted view of the hardware, i.e.,
computers, mass storage systems, networks, etc. Through its interface the user manages a number of resources
and can allocate a subset of them for their own use. Typically the user can
create or remove operating system images, scale required capacities, define network topologies,
connect volume storage to instances, and perform basic operations on instances: starting, stopping or destroying
them.28
Deployment models
Three models of cloud deployment are recognized by the National Institute of Standards and
Technology (NIST), Information Technology Laboratory.29
Public cloud
In this model the cloud infrastructure is owned by an organization selling cloud services. It is made available to
the general public.30
Computing resources in a public cloud are dynamically provisioned over the Internet via the web. Public clouds
(sometimes also called external clouds) are run by third-party companies. Typically cloud services from different
25 Baun, C., Kunze, M., Nimis, J., Tai, S., op. cit., p. 17
26 Ibid., p. 20
27 Ibid., p. 20
28 Ibid., p. 18
29 David E. Y. Sarna, op. cit., p. 17
30 Ibid., p. 17
companies are likely to be mixed together to form an organization’s IT infrastructure: cloud servers, storage
systems, and networks.31
Private cloud
This type of cloud infrastructure is operated solely by and for one organization. Its management may be
outsourced to a third party and it may exist on premises or off premises.32 A private cloud (also called an internal
cloud) is a model in which cloud services run on private networks. They are used exclusively by one
organization, which keeps full control over data, security, and quality of service. Private clouds are created and
administered by a company’s own IT department or outsourced to a third party.33
Hybrid cloud
This model is a combination of two or more private and public clouds that remain independent units but
are bound together by standardized technology that enables portability of data and applications (e.g., load balancing
between clouds).34 A hybrid cloud environment combines multiple public and private cloud models. Hybrid
clouds introduce the complexity of determining how to distribute applications across both a public and a private
cloud.35
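The distribution problem mentioned above can be made concrete with a small Python sketch. It shows one possible policy among many, chosen here only as an assumption for illustration: sensitive workloads must stay in the private cloud, while the remaining ones overflow to the public cloud once the private capacity is used up. The function name, the workload names and the capacity figures are all hypothetical.

```python
def place_workloads(workloads, private_capacity):
    """Assign each workload to the private or the public cloud.

    workloads: list of (name, required_units, sensitive) tuples.
    Sensitive workloads must stay private; the others overflow to the
    public cloud when the private cloud has no room left.
    """
    placement = {}
    remaining = private_capacity
    # Place sensitive workloads first so that they are not crowded out.
    for name, units, sensitive in sorted(workloads, key=lambda w: not w[2]):
        if units <= remaining:
            placement[name] = "private"
            remaining -= units
        elif sensitive:
            raise ValueError(f"no private capacity left for {name!r}")
        else:
            placement[name] = "public"
    return placement


workloads = [
    ("web-frontend", 4, False),
    ("customer-db", 6, True),
    ("batch-reports", 5, False),
]
placement = place_workloads(workloads, private_capacity=10)
```

With ten private units, the database and the web front end fit into the private cloud and the batch job is placed in the public one. A real policy would also have to weigh cost, latency and data transfer between the clouds.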
OpenStack
OpenStack is called a “cloud operating system” by its creators. It controls large pools of compute, storage,
and networking resources throughout a datacenter. The cloud is managed through a dashboard that gives
administrators control and empowers their users to individually provision resources through a web
interface.36
The project aims at creating an open source cloud computing platform for public and private clouds that provides
scalability without complexity. Initially it focused on the Infrastructure as a Service (IaaS) model, but the scope of
its projects and models is constantly growing. One of the core values on which the project is based is openness,
with both open standards and open source code. OpenStack has been released under the Apache 2.0
license.37 In addition, OpenStack promotes open standards through the OpenStack API.
The OpenStack project was created by Rackspace Hosting (a large US hosting firm) and NASA (the US space
agency). They decided to work together and released their internal cloud object storage and cloud compute
code bases (respectively) as one common open source project.38
31 Borko Furht, Armando Escalante, op. cit., p. 7
32 David E. Y. Sarna, op. cit., p. 17
33 Borko Furht, Armando Escalante, op. cit., p. 7
34 David E. Y. Sarna, op. cit., p. 17
35 Borko Furht, Armando Escalante, op. cit., p. 7
36 http://www.openstack.org/software/ (19/12/2014)
37 Apache 2.0 license is available online: http://www.apache.org/licenses/LICENSE-2.0.txt (22/01/2015)
38 Ken Pepple, Deploying OpenStack, O'Reilly Media, 2011, p. 1
Core Services
The project currently encompasses four main components: Compute (Nova), Object Store (Swift), Networking
(Neutron) and Dashboard (Horizon).
Compute (Nova)
In this service, users run instances of virtual machines on numerous hosts. This solution offers
scalability and redundancy. The major goal of this project is to be hardware and hypervisor agnostic.
OpenStack Compute is the basis for some public cloud providers, e.g., it runs Rackspace Open Cloud.39
Object Store (Swift)
This service provides storage that is massively scalable and at the same time can be built using
commodity hardware. The inspiration for OpenStack Object Store was Amazon's S3 storage service.
Users can keep data of almost unlimited size (limited by hardware resources) and extend their storage on
demand. OpenStack Object Storage is highly redundant and is therefore well suited for data archiving (e.g., logs) or
for providing a storage system that OpenStack Compute can use for instance templates (VM images).40
Networking (Neutron)
OpenStack supports many modes of networking. The three main ones are Flat networking, VLAN Manager and
Software Defined Networking (SDN). Software Defined Networking is an approach to networking in which
network administrators and cloud operators can programmatically define virtual network services. The
Software Defined Network component of OpenStack Networking is called Neutron.
Using Neutron, users can create complex networks in a secure multi-tenant environment. It overcomes the
issues often associated with the earlier networking modes, Flat and VLAN. In Flat networks all tenants work
within the same IP subnet. VLAN networking separates the tenant IP ranges with a VLAN ID, but VLANs are
limited to 4096 IDs, which is a problem for larger installations, and the user is still limited to a single IP range
within their tenant to run their applications.
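The limit of 4096 IDs follows from the 12-bit width of the VLAN identifier field in an IEEE 802.1Q tag. The short Python calculation below reproduces it. The helper tenants_supported and the assumption that every tenant needs a fixed number of VLANs are introduced only for this illustration.

```python
VLAN_ID_BITS = 12  # width of the VLAN identifier field in an IEEE 802.1Q tag

total_ids = 2 ** VLAN_ID_BITS
# IDs 0 and 4095 are reserved by the standard, so fewer are usable in practice.
usable_ids = total_ids - 2


def tenants_supported(vlans_per_tenant):
    """Upper bound on tenants if each one needs its own set of VLANs."""
    return usable_ids // vlans_per_tenant


single_vlan_tenants = tenants_supported(1)
three_vlan_tenants = tenants_supported(3)
```

Under these assumptions a cloud that gives every tenant three isolated networks runs out of VLAN IDs after 1364 tenants, which shows why the limit becomes a problem for larger installations.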
SDN in OpenStack is also a pluggable architecture: it makes it possible to plug in and control various switches,
firewalls and load balancers, and to provide functions such as Firewall as a Service, all of it software-defined,
giving full control over the complete virtual network infrastructure.41
Dashboard (Horizon)
Administering an OpenStack installation through a CLI allows control of the cloud environment, but a web
interface gives easier access to the cloud for users, operators and administrators. OpenStack Dashboard
39 Kevin Jackson, Cody Bunch, OpenStack Cloud Computing Cookbook, 2nd Edition, Packt Publishing, 2013, p. 52
40 Ibid., p. 86
41 Ibid., p. 168
provides a web interface that runs from an Apache server, using WSGI and Django. With OpenStack Dashboard
installed it is possible to manage all the core components of the OpenStack environment.42
Shared Services
The project also includes “OpenStack Shared Services”, which are used in common by the core services.43
Identity service (Keystone)
This service provides authentication and management of user accounts and roles for the OpenStack cloud.
The Identity service also authenticates and verifies connections between all other OpenStack cloud services, so
it is the first service that needs to be installed within an OpenStack environment. After authenticating a user or a
service, it sends back an authorization token, which is then passed between the services. Once validated, this token
serves as proof of the user’s authentication and allows the user to proceed to use any OpenStack service
(like Compute or Object Store). Configuration of the OpenStack Identity service includes creating
appropriate roles for users and services, tenants, the user accounts, and the service API endpoints that make
up the cloud infrastructure.44
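The token flow described above can be sketched in a few lines of Python. This is a toy model and not the Keystone API: the classes ToyIdentityService and ToyComputeService, their methods and the sample credentials are hypothetical and exist only to show how a token issued by one service is validated on behalf of another.

```python
import secrets


class ToyIdentityService:
    """Illustrative stand-in for an identity service; not the Keystone API."""

    def __init__(self, users):
        self._users = users   # user name -> (password, role)
        self._tokens = {}     # issued token -> user name

    def authenticate(self, user, password):
        """Return a fresh token for valid credentials, otherwise None."""
        record = self._users.get(user)
        if record is None or record[0] != password:
            return None
        token = secrets.token_hex(16)
        self._tokens[token] = user
        return token

    def validate(self, token):
        """Return (user, role) for a known token, otherwise None."""
        user = self._tokens.get(token)
        if user is None:
            return None
        return user, self._users[user][1]


class ToyComputeService:
    """A service that relies on the identity service to validate tokens."""

    def __init__(self, identity):
        self._identity = identity

    def list_instances(self, token):
        if self._identity.validate(token) is None:
            raise PermissionError("invalid or missing token")
        return []  # a real service would return the caller's instances


identity = ToyIdentityService({"alice": ("s3cret", "member")})
compute = ToyComputeService(identity)
token = identity.authenticate("alice", "s3cret")
instances = compute.list_instances(token)
```

The compute service never sees the password; it only passes the token back to the identity service, which is why the identity service has to be available before any other service can be used.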
Image Service (Glance)
This service allows users to register, discover, and retrieve virtual machine images. The images can be stored
in a variety of backend locations: a local filesystem, distributed filesystems such as OpenStack Object Storage (Swift),
and others.45
Block Storage (Cinder)
Data written to the disks of a running instance is not persistent: after the instance is terminated, any
disk writes are lost. Volumes are persistent storage that can be attached to running VM instances. A volume
works like an external USB drive that you can attach to an instance. Block volumes, like USB drives,
can be attached to only one instance at a time.
OpenStack Block Storage is very similar to Amazon EC2's Elastic Block Store; the difference is in how
volumes are presented to the running instances. Under OpenStack Compute, volumes can easily be managed
using an iSCSI-exposed LVM volume group named cinder-volumes, which must be present on any host
running the Cinder volume service.46 Cinder can also use hardware storage arrays or storage servers (like NFS,
GlusterFS, Nexenta, Ceph RBD, VMware, Windows Server 2012, Solaris ZFS, etc.) and other data transfer
protocols (like AoE, NFS, RBD, Fibre Channel, etc.).47
42 Ibid., p. 217
43 http://www.openstack.org/software/openstack-shared-services/ (13/01/2015)
44 Kevin Jackson, Cody Bunch, op. cit., p. 5
45 Ibid., p. 35
46 Ibid., pp. 151-152
47 Full list of supported storage systems and protocols is available at OpenStack wiki: https://wiki.openstack.org/wiki/CinderSupportMatrix (22/10/2014)
Telemetry Service
This service aggregates usage and performance data from other OpenStack services. It provides visibility into
the usage of the cloud across the data points and enables users and operators to view metrics globally or
by individual resources.
Orchestration Service
This is a template-driven service that allows application developers to automate the deployment of infrastructure.
It provides a template language that can specify compute, storage and networking configurations as well as
detailed post-deployment configuration to automate the full provisioning of services and applications. It also
integrates with the Telemetry service to provide automatic scaling of infrastructure resources according to load
requirements.
Database Service
The goal of this service is to enable users to utilize the features of a relational database in the cloud. Users
and database administrators can provision and manage multiple database instances. The service focuses on
providing resource isolation at high performance and automation of complex administrative tasks (like
deployment, configuration, patching, backups, restores, and monitoring).
RDO OpenStack
OpenStack is a complicated array of software services. To make installation and configuration easier and faster,
it is usually delivered in the form of “software distributions”. A typical OpenStack distribution
handles the installation and provides tools to manage and monitor the services. One of the major OpenStack
distributions, known as RDO, is developed by Red Hat’s engineers and the Fedora community.48
RDO is a freely available, community-supported distribution of OpenStack that runs on Red Hat Enterprise
Linux, CentOS, Fedora, and their derivatives. In addition to providing a set of software packages, it is also a
community in which users of the cloud computing platform on Red Hat and Fedora Linux operating systems can get help
and compare notes on running OpenStack.49
Amazon Web Services
Amazon Web Services (AWS) is a collection of remote computing services (web services) constituting a cloud
computing platform offered by Amazon.com. The two central services are Amazon EC2 and Amazon S3: they provide a
large computing capacity.50 AWS is a set of on-demand computing resources and services in the cloud, with
pay-as-you-go pricing. Using its resources instead of building a traditional datacenter is like purchasing
electricity from a power company instead of running one's own power generator. It thus provides many of the
same benefits: capacity exactly matches need, payment is made only for what is used, economies of scale result in
lower costs, and the service is provided by a vendor experienced in running large-scale networks.51
48 https://openstack.redhat.com/ (18/12/2014)
49 https://openstack.redhat.com/Frequently_Asked_Questions (18/12/2014)
50 http://en.wikipedia.org/wiki/Amazon_Web_Services (18/12/2014)
51 http://docs.aws.amazon.com/gettingstarted/latest/awsgsg-intro/gsg-aws-intro.html (18/12/2014)
Regions
AWS is divided into multiple, mutually independent regions located around the world. The isolation
between them enables designing highly available applications that span the globe with low-latency response
times to users.52 The map of current regions is presented in Figure 1.
Figure 1 AWS Global Infrastructure (Regions) (Source: http://aws.amazon.com/ (22/01/2015))
By selecting the region closest to the users, it is possible to deliver the best experience by minimizing latency. The
division into regions also makes it easy to start and stop new applications in different geographical regions if
needed. It allows one to “fail fast”, that is, to try new projects that would have been too expensive in a
traditional datacenter.
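Choosing the closest region can be reduced to picking the minimum of measured latencies, as the following Python sketch shows. The region identifiers are taken from the AWS region list, while the latency values and the function name closest_region are hypothetical and serve only as an illustration.

```python
def closest_region(latencies_ms):
    """Return the region with the lowest measured latency."""
    return min(latencies_ms, key=latencies_ms.get)


# Hypothetical round-trip times measured from a user's location, in milliseconds.
measured = {
    "eu-central-1": 28.0,
    "eu-west-1": 52.0,
    "us-east-1": 110.0,
    "ap-northeast-1": 260.0,
}
best = closest_region(measured)
```

In practice latency is only one criterion; legal requirements on data location, discussed next, may rule out the fastest region.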
Another advantage of using multiple regions is data privacy. Many companies are required to store data in a
specific region. The European Union requires that data about its citizens be stored in Europe. In this case, the
eu-west-1 (Dublin) or eu-central-1 (Frankfurt) region would be the best choice. The specific regions and locations are
listed in Figure 2.
Region          Location
ap-northeast-1  Asia Pacific (Tokyo)
ap-southeast-1  Asia Pacific (Singapore)
ap-southeast-2  Asia Pacific (Sydney)
eu-central-1    EU (Frankfurt)
eu-west-1       EU (Ireland)
sa-east-1       South America (Sao Paulo)
us-east-1       US East (N. Virginia)
us-west-1       US West (N. California)
us-west-2       US West (Oregon)
Figure 2 List of AWS Regions and Locations (Source: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html (28/11/2014))
There is also one additional region called GovCloud, which is specifically designed to store data for the U.S.
government. It is located in the Northwestern United States.
52 Brian Beach, Pro PowerShell for Amazon Web Services, Apress, 2014, p. 1
Regions make it possible to deliver an application from the location closest to its users and to build redundant
applications served from multiple regions. Amazon Web Services also offers another layer of redundancy
called availability zones.
Availability zones
Each region is divided into two or more availability zones (see Figure 3). Each availability zone (AZ) within a
region is a separate data center. The zones are isolated from each other's failures but connected to each other
with high-speed, low-latency links. Each AZ has separate power, cooling, and Internet access. Additionally, their locations
are chosen so that they are never in the same flood plain, etc. This allows designing highly available applications
that span multiple data centers.53
Figure 3 AWS - Availability Zones (Source: Brian Beach, Pro PowerShell for Amazon Web Services, Apress, 2014, p. 3)
Regions and availability zones are two layers of separation that make it possible to build highly available, low-latency
applications that would not have been possible in a pre-cloud data center. Only a handful of companies
around the globe have the resources to match this functionality in their own data centers.
Services
Amazon Web Services can be grouped into five major categories: Management, Storage, Network, Compute
and Monitoring (see Figure 4). Amazon currently provides more services than are presented in this
figure, but these are the most important ones and are also essential for the thesis project.54
53 Ibid., p. 3
54 For complete list of AWS services and products see: https://aws.amazon.com/products/ (22/01/2015).
Figure 4 AWS – Services (Source: Brian Beach, Pro PowerShell for Amazon Web Services, Apress, 2014, p. 3)
The services are accessed over HTTP, using the REST architectural style and the SOAP protocol. All services are
billed based on usage, but how usage is measured for billing varies from service to service.
Management
The services in the management category are used to access and configure AWS.55
AWS Management Console – a web GUI for configuring AWS.
Identity and Access Management (IAM) – controls access to an account. An administrator can
create users and groups and write policies to control access to resources.
Storage
The bottom of Figure 4 lists multiple storage options.56
Elastic Block Store (EBS) – a storage area network used to create disks for instances. It is a
network-based solution similar to iSCSI. It is possible to create volumes from 1GB to 1TB and manage
their I/O operations per second (IOPS).
Simple Storage Service (S3) – highly durable object storage in the cloud. It is used to store an
unlimited number of objects of up to 5TB each (a single PUT request is limited to 5GB). S3 uses HTTP/S to
read and write objects. It is designed for 99.999999999% durability.
55 Ibid., p. 4
56 Ibid., p. 5
Amazon Glacier – a low-cost cold storage solution. Glacier offers the same high durability
as S3 for about 1/10 of the cost, but stores data offline and requires advance notice to access the
data. This is a great alternative to tape backup.
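To give a feeling for what a durability of 99.999999999% means, the figure quoted for S3 above can be turned into an expected loss rate. The Python calculation below assumes, for illustration only, that the figure is the probability of a single object surviving one year and that the bucket holds ten million objects. Both assumptions and all variable names are hypothetical.

```python
from fractions import Fraction

# 99.999999999% durability, read here as the yearly survival probability
# of a single object (an interpretation assumed for this illustration).
durability = Fraction(99_999_999_999, 100_000_000_000)
annual_loss_probability = 1 - durability

stored_objects = 10_000_000  # hypothetical number of stored objects
expected_losses_per_year = stored_objects * annual_loss_probability
years_per_expected_loss = 1 / expected_losses_per_year
```

Under these assumptions the expected loss is one object in ten thousand years, which explains why such storage is attractive as a replacement for tape archives.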
Network
In the middle of Figure 4 there are multiple network services that work together.57
Virtual Private Cloud (VPC) – makes it possible to create a private network that isolates instances from those of
other AWS tenants, with a custom network topology and control over network security.
Elastic Load Balancers (ELB) – balance traffic between multiple servers across
availability zones. It is possible to create public ELBs on the Internet or use a private ELB to balance
traffic between layers of a multitier application.
Route 53 – Amazon’s managed DNS solution. It can balance traffic between multiple regions:
AWS determines which region is closest to the user and routes the user to it automatically.
Compute
At the top of the stack there are two compute services.58
Elastic Compute Cloud (EC2) – Amazon’s virtual server service. It is used to launch servers, called
instances, in the cloud. EC2 offers thousands of images and hardware configurations.
Relational Database Service (RDS) – Amazon’s managed database service. RDS supports MySQL,
Oracle, PostgreSQL, and Microsoft SQL Server. Users can install any of these on an EC2 instance, but
with RDS, Amazon manages the administration for them.
Monitoring
Finally, there is a collection of monitoring services.59
CloudWatch – used to monitor the environment. CloudWatch makes it possible to create custom alarms
and define what actions to take when an issue arises. For example, it can raise an alarm when CPU
utilization is above 80% for an extended period of time.
Auto Scaling – combined with CloudWatch, makes it possible to respond automatically to changing conditions.
For example, it can be used to build an application that automatically launches new instances when the
application is under high load.
Simple Notification Service (SNS) – Amazon’s notification system. CloudWatch can publish
messages to SNS whenever an alarm occurs. SNS can send events using e-mail, SMS text messages,
and many other options.
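The interplay of an alarm and automatic scaling described in the list above can be sketched in Python. The sketch does not use the AWS API. The functions alarm_breached and desired_instances, the choice of three consecutive samples as "an extended period" and the limit of ten instances are assumptions made for this illustration.

```python
def alarm_breached(cpu_samples, threshold=80.0, periods=3):
    """True if the last `periods` samples are all above `threshold` percent."""
    if len(cpu_samples) < periods:
        return False
    return all(sample > threshold for sample in cpu_samples[-periods:])


def desired_instances(current, cpu_samples, maximum=10):
    """Add one instance while the alarm is breached, up to `maximum`."""
    if alarm_breached(cpu_samples):
        return min(current + 1, maximum)
    return current


spike = [40.0, 95.0, 50.0, 60.0]      # a single spike does not trigger scaling
sustained = [70.0, 85.0, 90.0, 92.0]  # three consecutive samples above 80%

after_spike = desired_instances(2, spike)
after_sustained = desired_instances(2, sustained)
```

Requiring several consecutive samples above the threshold prevents short spikes from launching instances that would be idle a moment later.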
57 Ibid., p. 6
58 Ibid., p. 6
59 Ibid., p. 7
Chapter II: Solutions of the problem
Although the problem of utilizing a hybrid cloud is a new topic in the IT world, over the last few years it has stimulated
the emergence of new ideas, new businesses and new software. One of the novelties that cloud computing
made possible and facilitated is the wide range of configuration management and deployment automation
software. Outsourcing the infrastructure in the cloud computing model also required changes in the mindset,
methods and tooling of system administrators, software architects and developers.
Among the existing solutions to the problem are tools integrated into public and private clouds (e.g., OpenStack
Orchestration) and tools that can easily be employed to serve as glue between a public and a private cloud.
Although there are many software stacks and tools for this purpose, their creation was inspired by a new
movement that has sprung up among IT professionals working with cloud computing. Its name is
DevOps.
DevOps
Many individual aspects and traits of DevOps have been well known for years, whereas others are new. It
started as a movement that addressed the motivational conflict between the software development department
and the operations (systems administration) department in many companies. The conflict results from the different
goals and incentives of the departments. DevOps was invented as a set of practices and tools to improve
collaboration between development and operations. It integrates the complete delivery process in a holistic
way by providing processes and tools for Agile approaches to all parts of the software delivery process.60
Figure 5 Basic elements of DevOps software development method
DevOps is in fact not a proper name for this software development and delivery method, because its
name lacks a third crucial element: Quality Assurance. Perhaps “DevOpsQA” would be a better one. As Figure 5
presents, Development, Operations and Quality Assurance are three deeply interrelated elements. The
common area shared by the three circles can be understood as the essence of DevOps.
60 Michael Huttermann, DevOps for Developers, Apress, 2012, p. 12
[Figure 5 diagram labels: Development, Operations, Quality Assurance]
Agile
Agile is a set of methods and methodologies for IT teams. It enables them to organize work more efficiently
and make better business and engineering decisions. Agile covers the entire software development process,
including project management, software design and process improvement. Agile practices are usually
designed to be easy to use and adopt.
Agile requires the right mindset from its users more than the right skills or tools. Mindset determines how
effectively a team uses the practices. The Agile mindset facilitates sharing information within a team, so that its
members can make important project decisions together, instead of having a manager who makes all of
those decisions alone. It is about opening up planning, design, and process improvement to the entire
team.61
DevOps has a strong affinity with the Agile approach. The traditional view of operations treated the “Dev” side as the
“makers of a system” and the “Ops” side as the “people that take care of the system in production”. Over the last
decade, especially after the introduction of cloud computing, the IT industry has realized the harm that has been
done by treating these two as separate silos.
DevOps can be understood as an extension of Agile that prescribes close collaboration of customers, product
management, developers, operations and QA to iterate quickly towards a better product. Service delivery and its
configuration are a fundamental part of the value for the customer, and thus the product team needs to
include those concerns as a top-level item in the Agile-driven project. From this perspective, DevOps is simply
extending Agile principles beyond the boundaries of “the code” to the entire delivered service.62
Infrastructure as Code
Infrastructure was automated long before the emergence of Agile methodologies and the DevOps movement.
However, in the old days servers were mainly handcrafted by an individual engineer, whose scripts (if there
were any) were unreadable to others.
As Mike Loukides has put it:
“Perl was designed as a programming language for automating system administration. It didn’t take
long for leading-edge sysadmins to realize that handcrafted configurations and nonreproducible
incantations were a bad way to run their shops.”63
In recent years, new ideas and new software in the field of configuration management have emerged and
developed to replace both manual configuration and old-style shell and Perl automation scripts. The central
idea of the new tooling was to enable close collaboration between developers and operations engineers. Its aim
was also to provide greater transparency in complex infrastructure installations. This problem had to be
addressed because in recent years the number of such installations has been growing rapidly. With increasing
61 Andrew Stellman, Jennifer Greene, Learning Agile, O'Reilly Media, 2014, p. 2
62 http://theagileadmin.com/what-is-devops/ (06/01/2015)
63 http://radar.oreilly.com/2012/06/what-is-devops.html (23/01/2015)
complexity and more sophisticated integration of layers of IT systems developers were required to
understand operations, and operational engineer to know the development process. The infrastructure as
code paradigm can help to achieve these goals.64
Thanks to Agile, new methods of developing software emerged: continuous integration, test-driven development, build and deployment automation, and others. All of them were created mostly to automate as many parts of the lifecycle of a software product as possible. However, at the beginning the main focus was on the software itself, and the infrastructure on which the software runs was often perceived as a separate problem.
From a traditional perspective, infrastructure comprises items such as operating systems, servers, switches, and routers. It covers all of the environments of an organization together with supporting services, such as firewalls and monitoring services. In the Infrastructure as Code context, infrastructure often includes every part of the solution that is not the developed software application itself. In that sense, infrastructure also includes the middleware (web and application servers, databases, load balancers) as well as configuration files, software packages that are part of the operating system, crontabs, users, groups, etc.
However, infrastructure is set up and changed over time, even before the software goes into production. In cloud computing environments it has become increasingly common to rebuild the infrastructure from scratch. This created the need to document infrastructure well and to find a way of setting it up automatically.65
Infrastructure as code is a powerful concept and approach that promises to help repair the split-brain
phenomenon witnessed so frequently in organizations where developers and system administrators view
each other as enemies, to the detriment of the common good. Through co-design of the infrastructure code
that runs an application, we give operational responsibilities to developers. By focusing on design and the
software lifecycle, we liberate system administrators to think at higher levels of abstraction. These new
aspects of our professions help us succeed in building robust, scaled architectures. We open up a new way of
working—a new way of cooperating—that is fundamental to the emerging DevOps movement.66
Infrastructure as code emphasizes the need to handle the setup of infrastructure in the same way as the development of code: by picking the right language or tool for the job and developing a solution that suits the needs, turning it into an executable specification that can be applied to target systems efficiently and repeatedly.
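The idea of a specification that can be applied repeatedly can be illustrated with a minimal Ruby sketch. It is not taken from the thesis project or from Chef; the helper name ensure_line, the file name and the sample host entry are hypothetical. The point is only that the code describes a desired state, so a second run finds nothing left to do.

```ruby
require 'tmpdir'

# Hypothetical helper (not part of Chef): make sure `line` is present in the
# file at `path`. Returns :changed when the file had to be modified and
# :unchanged when it was already in the desired state.
def ensure_line(path, line)
  existing = File.exist?(path) ? File.readlines(path, chomp: true) : []
  return :unchanged if existing.include?(line)

  File.open(path, 'a') { |f| f.puts(line) }
  :changed
end

# Apply the same "specification" twice to a throwaway file.
Dir.mktmpdir do |dir|
  hosts = File.join(dir, 'hosts')
  first  = ensure_line(hosts, '10.0.0.5 kojihub')
  second = ensure_line(hosts, '10.0.0.5 kojihub')
  puts "first run: #{first}, second run: #{second}"
end
```

Configuration management tools apply the same principle to packages, services, users and files, which is what makes repeated runs against a target system safe.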
64 Michael Huttermann, op. cit., p. 136
65 Michael Huttermann, op. cit., p. 135
66 Stephen Nelson-Smith, Test-Driven Infrastructure with Chef, 2nd Edition, O'Reilly Media, 2013, p. 5
Chapter III: Description of the project
My solution to the problem stated in the thesis (how to utilize a hybrid cloud based on OpenStack and Amazon Web Services?) is a project based on a popular Ruby-based DevOps software stack: Chef and Vagrant. The project consists of one cookbook and four Vagrant configuration files. All of the cookbook's code and the Vagrant files are stored in a Git repository at bitbucket.com. The cookbook installs and configures a Koji cluster (see Appendix A: Koji build system), a software stack dedicated to building RPM packages.
Assumptions and requirements
To work properly, the system has specific requirements that have to be fulfilled before it is run. Firstly, the project was tested on the CentOS Linux operating system.67 The user of the system has to have access to OpenStack and AWS clouds, install the necessary software on his or her computer and set up some environment variables. Additionally, if the user would like to develop the project further, another set of tools is required.
The project assumes that its user has access to an OpenStack cloud and to Amazon Web Services. The former can be installed on any Linux box.68 Access to the latter can be obtained by registering on Amazon’s web site.69 Additionally, the user needs access to a Chef Server, either a local one or Hosted Chef.70 The last thing that the user needs to install on his or her computer to use the project is Vagrant.71 Because some data has to remain private and secure for each individual user of the project’s Vagrant configuration files, this kind of information is kept in the shell’s environment variables. Other variables are also kept there for convenience of configuration.
To run properly, the project requires the environment variables listed in Figure 6 to be exported with valid values. The script with environment variables is divided into four sections: Logging, AWS, OpenStack and Chef. Each section contains a number of variables; the bolded ones are those whose values have to be provided by the user of the script.
67 The project will most probably work on other Linux distributions without many changes. In the case of Microsoft Windows or Apple Mac OS X, significant changes would be required to run it.
68 RDO OpenStack documentation: https://openstack.redhat.com/Quickstart (23/01/2015)
69 AWS registration form: https://portal.aws.amazon.com/gp/aws/developer/registration/index.html (23/01/2015)
70 More information about Chef Server and Hosted Chef: https://www.chef.io/chef/choose-your-version/ (23/01/2015)
71 Vagrant can be downloaded from its website: https://www.vagrantup.com/ (23/01/2015)
#Logging
export VAGRANT_LOG=debug
export CHEF_LOG=debug
export VAGRANT_OPENSTACK_LOG=debug
# AWS
export EC2_ACCESS_KEY=USER_ACCESS_KEY
export EC2_SECRET_KEY=USER_SECRET_KEY
export EC2_URL=http://ec2.amazonaws.com
export S3_URL=https://s3.amazonaws.com:443
export AWS_ACCESS_KEY="${EC2_ACCESS_KEY}"
export AWS_SECRET_KEY="${EC2_SECRET_KEY}"
export AWS_KEYPAIR=USER_KEYPAIR
export AWS_PRIVATE_KEY_PATH=$HOME/.ssh/user_key
export AWS_SSH_USERNAME=ec2-user
export AWS_AMI_IMAGE=ami-799cf410
export AWS_REGION=us-east-1
export AWS_INSTANCE_TYPE=t1.micro
export AWS_SECURITY_GROUP=SEC_GROUP
# OpenStack
export OS_USERNAME=OPENSTACK_USERNAME
export OS_TENANT_NAME=OPENSTACK_TENANT_NAME
export OS_PASSWORD=OPENSTACK_PASSWORD
export OS_IP=OPENSTACK_IP_ADDRESS
export OS_AUTH_URL=http://$OS_IP:5000/v2.0/
export OS_PUBLIC_KEY_PATH=$HOME/.ssh/user_key.pub
export OS_PRIVATE_KEY_PATH=$HOME/.ssh/user_key
export OS_SSH_USERNAME=centos
export OS_FLAVOR=m1.small
export OS_IMAGE='CentOS'
export OS_FLOATING_IP_POOL=public
# Chef
export CHEF_ORG=USER_CHEF_ORG_NAME
export CHEF_SERVER=https://api.opscode.com/organizations/$CHEF_ORG
export CHEF_VALIDATION_CLIENT_NAME=koji-validator
export CHEF_VALIDATION_KEY_PATH=$HOME/.chef/koji-validator.pem
Figure 6 Shell environment variables
The first section contains variables that control the verbosity of the output: “debug” means that Vagrant, Chef and vagrant-openstack-provider will print a lot of information during the run.
The second section contains the configuration of Amazon Web Services. Five variables in this section require values from the user: the AWS access key and secret key, which authorize the user in AWS, and the key pair name and the path to the private key, so that Vagrant can log into the instance to configure it. The last variable names the security group, which has to have ports 22 (SSH) and 80 (Koji) open.
The third section relates to the OpenStack configuration. Here the user has to provide a username and password, the tenant name and the IP address of the OpenStack endpoint. As with AWS, the user has to provide a public and a private key so that Vagrant can log into the instance over SSH.
The last section is dedicated to Chef. There are two possibilities here, depending on the type of Chef Server used. If the user has access to Hosted Chef, it is enough to provide the name of the Chef organization. Otherwise, that is when Chef Server is installed locally, the entire CHEF_SERVER value has to be changed. Additionally, the user has to make sure that the name and path of the koji-validator key are correct.
Assuming that the file with the environment variables is named projectrc, the following command has to be used to export them: source projectrc.
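Because the Vagrant configuration reads these values from the environment, a missing export would otherwise only surface later as an empty string. The following Ruby sketch, which is not part of the thesis project, shows one way to check for that up front. The variable names come from Figure 6; the helper missing_vars and the choice of which variables to treat as required are assumptions made for illustration.

```ruby
# Variable names taken from Figure 6; which of them count as "required"
# is an illustrative choice, not something the project defines.
REQUIRED_VARS = %w[
  AWS_ACCESS_KEY AWS_SECRET_KEY AWS_KEYPAIR AWS_PRIVATE_KEY_PATH AWS_SECURITY_GROUP
  OS_USERNAME OS_PASSWORD OS_TENANT_NAME OS_AUTH_URL
  CHEF_SERVER CHEF_VALIDATION_KEY_PATH
].freeze

# Hypothetical helper: returns the names from `required` that are unset or
# blank in `env` (ENV itself or any Hash-like object).
def missing_vars(env, required = REQUIRED_VARS)
  required.select { |name| env[name].to_s.strip.empty? }
end

missing = missing_vars(ENV)
if missing.empty?
  puts 'All required variables are set.'
else
  warn "Missing environment variables: #{missing.join(', ')}"
end
```

Run after source projectrc, such a check reports every forgotten variable at once instead of failing on the first cloud API call.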
Development environment
Ruby
The project can be developed further. However, this requires installing specific development tools. The software tools used in the project depend on the Ruby execution environment (interpreter)72 and a few Ruby libraries (called “gems”)73.
Figure 7 lists the Ruby gems that are required for the project. The easiest way to provide them is to install and run Bundler. “Gemfile” is the name of the file that Bundler uses to download and install gems. The user then has to run the command: bundle install. Some gems have specific software requirements (for instance, some of them contain source code written in C that has to be compiled, so they need GCC and possibly Make, or another compilation toolchain).
source 'https://rubygems.org'
gem 'berkshelf'
gem 'test-kitchen'
gem 'kitchen-vagrant'
gem 'knife-ec2'
gem 'knife-openstack'
gem 'aws-sdk'
gem 'fog'
gem 'chef-api'
gem 'chef'
Figure 7 Gemfile
Bundler ensures that the appropriate dependencies are installed for a given project without unpleasant ordering issues or cyclical dependencies. Because a Gemfile is easy to share, a software project can be passed to other developers, machines or environments with confidence that the application and its dependencies will behave in the same way.74
72 Ruby is also included in ChefDK, so it does not have to be installed separately.
73 More on Ruby gems: https://rubygems.org/pages/about (25/01/2014)
ChefDK
Chef Development Kit (ChefDK) defines a common workflow for cookbook development, including unit and integration testing, identifying lint-like behavior, and dedicated tooling.75
It includes:
Cookbook dependency manager Berkshelf.
Test Kitchen integration testing framework.
ChefSpec, for cookbook's unit testing.
Foodcritic, a linting tool for doing static code analysis on cookbooks.
All of the basic Chef tools: Chef Client, Knife, Ohai and Chef Zero.
System’s design
The assumptions and requirements already give an impression of how the system is designed. The design of this project resembles a deployment flow: first, instances are created in the OpenStack and AWS clouds, then the Koji software packages are installed, then the specific configuration of the Koji cluster is applied, and finally Koji is started and tests are run.
Deployment flow
The execution of the system is best presented as an ordered flow of actions leading to a successful deployment of a Koji cluster on a hybrid cloud. Figure 8 visualizes the basic flow of this deployment.
Firstly, the user of the system has to download the sources from the Git repository using the command git clone [email protected]:tomasz_klosinski/koji-cookbook.git (provided that he or she has access to it). After downloading the “sources” of the cookbook and the infrastructure configuration, the user has to enter the directory that contains them. The next step is to run the command vagrant up, which starts the process of deploying the infrastructure. In this step Vagrant starts executing the Vagrantfile, which contains all the necessary information about the machines.
74 Stephen Nelson-Smith, op. cit., p. 37
75 https://docs.chef.io/#chef-dk-title (25/01/2015)
Figure 8 Flow of deployment
Instances in the clouds are started in the order in which they appear in the Vagrantfile. Vagrant first starts an instance with the Koji hub on Amazon EC2 and then an instance with the Koji builder on OpenStack. After the instances are created, Vagrant uses the SSH protocol to log into them. It first copies the cookbooks (using rsync) and then runs Chef Client to execute them. Chef Client registers a new “node” in Chef Server and provides basic information about it (collected by Ohai). After the cookbooks have been executed, the Koji cluster is ready for use.
System’s implementation
The implementation of the system is based on three DevOps technologies running on the Linux operating system: Git, Vagrant and Chef. Git is used with a remote repository at Bitbucket.com. Four plugins are installed to extend Vagrant’s configuration possibilities. Chef is used by Vagrant to provision software on cloud instances. Chef cookbooks and information about nodes are stored on the Hosted Chef server.
Git
A version control system (source code manager) is a central component of the project. It is not a mere addition to the project; it is its heart. It helps to stay sane when dealing with important files and collaborating on them. Using version control is a fundamental part of any infrastructure automation.
The entire history of changes to the project files is kept in a Git repository. The local repository is synchronized with a remote one at Bitbucket.com. Figure 9 presents the commit history of the Git repository in Bitbucket’s web interface.
Figure 9 Git: Bitbucket.com repository
Although it is possible to develop code without version control, in practice this is highly inefficient, especially when a project is developed by a group of developers and when it has different versions (which can be kept in separate branches). The thesis project had only one author, but it had a few versions. Additionally, version control makes it much easier to roll back changes that turned out to be mistakes.
Vagrant
Vagrant is a tool that runs the instances in the OpenStack and AWS clouds and starts Chef Client on them, which in turn handles the provisioning of Koji.
Vagrant enables configurable virtual infrastructure environments that are easy to reproduce. Such an environment can be built with many hypervisors or cloud providers and many provisioning technologies in a single consistent workflow. Machines (virtual machines or cloud instances76) are provisioned on top of VirtualBox, VMware, AWS, or any other provider. Then provisioning tools such as shell scripts, Chef, or Puppet can be used to automatically install and configure software on them.77
Vagrantfile
Each Vagrant project has a file that contains the definitions of its virtual machine instances; it is called the Vagrantfile. In this file we also configure the connection to the VM provider (hypervisor or cloud) and software provisioning (shell scripts and Chef).
Usually there is one Vagrantfile per project, although there may be more, in which case the active one is selected by the VAGRANT_VAGRANTFILE shell variable. The Vagrantfile should be version controlled. With a VCS it is easier to share the environment definition and collaborate on it. A Vagrantfile works the same way on each platform that Vagrant supports. The syntax of Vagrantfiles is Ruby, but knowledge of this programming language is not necessary to make modifications, since it is mostly simple variable assignment.78
76 In the thesis these two names are used interchangeably.
77 http://docs.vagrantup.com/v2/why-vagrant/index.html (12/11/2014)
Vagrant plugins
Vagrant is extensible through plugins.79 The thesis project uses four Vagrant plugins, which are described in Figure 10.
Plugin name Description
vagrant-omnibus Ensures the desired version of Chef is installed via the platform-specific Omnibus packages.80
vagrant-berkshelf Adds Berkshelf integration to the Chef provisioners. Vagrant Berkshelf will automatically download and install cookbooks onto instances.81
vagrant-aws Adds an AWS provider to Vagrant, allowing Vagrant to control and provision machines in EC2 and VPC.82
vagrant-openstack-provider Adds an OpenStack Cloud provider to Vagrant, allowing Vagrant to control and provision machines within OpenStack cloud.83
Figure 10 Vagrant: List of plugins used
Vagrant plugins are usually installed manually using the command vagrant plugin install NAME. However, this process can be automated in the Vagrantfile, and Figure 11 shows how to achieve that. In this block of code, the list of required plugins is defined first. Then Ruby’s each loop iterates over the list, binding each element to |plugin|, and installs the plugin unless it is already installed.
required_plugins = %w( vagrant-omnibus vagrant-berkshelf vagrant-aws vagrant-openstack-provider )
required_plugins.each do |plugin|
  # Install the plugin only when Vagrant does not have it yet.
  system "vagrant plugin install #{plugin}" unless Vagrant.has_plugin? plugin
end
Figure 11 Vagrantfile: plugins installation
Common elements
Once all plugins are installed, we can start analyzing the main part of the Vagrantfile. Firstly, we have to check whether a suitable version of Vagrant is installed.84 This is necessary because the API changed incompatibly between the first version (1.0.x) and the second version (1.1+) of Vagrant.85 Since then, Vagrant.configure takes the API version as an argument (in our case it is “2”).
78 http://docs.vagrantup.com/v2/vagrantfile/index.html (12/11/2014)
79 Full list of Vagrant plugins is available on Vagrant’s wiki: https://github.com/mitchellh/vagrant/wiki/Available-Vagrant-Plugins (27/11/2014)
80 https://github.com/chef/vagrant-omnibus (27/11/2014)
81 https://github.com/berkshelf/vagrant-berkshelf (27/11/2014)
82 https://github.com/mitchellh/vagrant-aws (27/11/2014)
83 https://github.com/ggiamarchi/vagrant-openstack-provider (27/11/2014)
84 http://docs.vagrantup.com/v2/vagrantfile/vagrant_version.html (27/11/2014)
85 http://docs.vagrantup.com/v2/vagrantfile/version.html (27/11/2014)
Vagrant.configure is the main element of the Vagrantfile. All of the configuration is placed in this Ruby block. The file is too long to reproduce in this section of the thesis, so for ease of reading it is explained excerpt by excerpt. The full file is provided in Appendix B: Project’s Vagrant files.
VAGRANTFILE_API_VERSION = "2"
Vagrant.require_version ">= 1.5.0"
Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|
  (...)
end
Figure 12 Vagrantfile: API’s and syntax’s version
We start the configuration by setting three options on the block variable “config”: nfs.functional, ssh.pty, and berkshelf.enabled. The first one disables NFS synchronization of the “vagrant” directory, which means that rsync will be used instead. The second forces Vagrant to use a pseudoterminal (pty) in the SSH session. The third enables the vagrant-berkshelf plugin to handle cookbook management.
config.nfs.functional = false
config.ssh.pty = true
config.berkshelf.enabled = true
Figure 13 Vagrantfile: Config specific to installation
Further in the Vagrantfile come the definitions of the instances (virtual machines). Each starts with vm.define, with the name of the instance as its argument. Inside the block of the VM definition are the details of its configuration (including networking, software provisioning and others).
Figure 14 shows an excerpt with fragments of the definitions of two virtual machines: “kojihub” and “kojibuilder”. Three options are set for both of them. The first is the hostname. The second is the synced folder (the directory to copy to the instance); in our case we synchronize the current directory (indicated by “.”) to the /vagrant directory on the VMs using rsync. The third relates to the Chef provisioner: Vagrant installs the latest Chef on the virtual machines using the Omnibus installer.
VMs are created in the order in which they appear in the Vagrantfile. Therefore “kojihub” is created first on Amazon EC2 and “kojibuilder” second on OpenStack.
# Koji hub on Amazon EC2
config.vm.define "kojihub" do |kojihub|
  kojihub.vm.hostname = "kojihub"
  kojihub.vm.synced_folder '.', '/vagrant', type: "rsync"
  # Set on the block variable of this machine definition.
  kojihub.omnibus.chef_version = :latest
  (...)
end
# Koji builder on OpenStack
config.vm.define "kojibuilder" do |kojibuilder|
  kojibuilder.vm.hostname = "kojibuilder"
  kojibuilder.vm.synced_folder ".", "/vagrant", type: "rsync"
  # Set on the block variable of this machine definition.
  kojibuilder.omnibus.chef_version = :latest
  (...)
end
Figure 14 Vagrantfile: Definitions of the VMs
Kojihub instance
Up to this point in the Vagrantfile, the configuration of the first instance was the same as that of the second. From here on, the configurations are explained one machine after another. The details of each machine are presented in the following excerpts, with comments on the given part of the code.
Figure 16 shows two variables regarding the VM image (“box” in Vagrant’s nomenclature) that Vagrant typically uses to run a virtual machine on a given hypervisor. Since in cloud environments the images are already provided, we set up a dummy box supplied by the developer of the provider’s plugin.
# Koji hub on Amazon EC2
config.vm.define "kojihub" do |kojihub|
  (...)
  kojihub.vm.box = "dummy.box"
  kojihub.vm.box_url = "https://github.com/mitchellh/vagrant-aws/raw/master/dummy.box"
  (...)
Figure 16 Vagrantfile: Amazon dummy box
The next part of the VM configuration is the provider. Since we would like to run the Koji hub in Amazon’s cloud, we have to configure the details of the connection. Most of the values are already set in environment variables.
# AWS provider
kojihub.vm.provider :aws do |aws, override|
  aws.access_key_id = "#{ENV['AWS_ACCESS_KEY']}"
  aws.secret_access_key = "#{ENV['AWS_SECRET_KEY']}"
  aws.keypair_name = "#{ENV['AWS_KEYPAIR']}"
  override.ssh.private_key_path = "#{ENV['AWS_PRIVATE_KEY_PATH']}"
  override.ssh.username = "#{ENV['AWS_SSH_USERNAME']}"
  aws.ami = "#{ENV['AWS_AMI_IMAGE']}"
  aws.region = "#{ENV['AWS_REGION']}"
  aws.instance_type = "#{ENV['AWS_INSTANCE_TYPE']}"
  aws.security_groups = ["#{ENV['AWS_SECURITY_GROUP']}" ]
  # Shell script passed to the instance as EC2 user data.
  aws.user_data = "#!/bin/bash
echo 'Defaults:#{ENV['AWS_SSH_USERNAME']} !requiretty' > /etc/sudoers.d/999-vagrant-cloud-init-requiretty
chmod 440 /etc/sudoers.d/999-vagrant-cloud-init-requiretty
mkdir -p /etc/chef/ohai/hints
touch /etc/chef/ohai/hints/ec2.json"
end
Figure 15 Vagrantfile: AWS provider
Now we just have to provide additional configuration in the user_data variable. When you launch an instance in Amazon EC2, you have the option of passing user data to the instance that can be used to perform common automated configuration tasks and even run scripts after the instance starts. You can pass two types of user data to Amazon EC2: shell scripts and cloud-init directives. You can also pass this data into the launch wizard as plain text, as a file (this is useful for launching instances via the command line tools), or as base64-encoded text (for API calls).86
In our case we provide a simple shell script that handles two things: firstly, it configures sudo so that it does not require a terminal (tty) for our user; secondly, it creates an empty /etc/chef/ohai/hints/ec2.json file, which is required by Chef’s Ohai to collect instance metadata from Amazon EC2.
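The same script can also be built with a Ruby heredoc, which keeps each shell command on its own line and avoids accidental line breaks inside a long quoted string. This is only a sketch of an alternative layout, not the way the project’s Vagrantfile is written; the helper name user_data_script and its parameters are hypothetical.

```ruby
# Hypothetical helper: builds the user-data script described above for a given
# SSH user and Ohai hint name ('ec2' or 'openstack').
def user_data_script(ssh_user, hint)
  sudoers = '/etc/sudoers.d/999-vagrant-cloud-init-requiretty'
  # The squiggly heredoc strips the common indentation, so the shebang
  # still ends up in the first column of the first line.
  <<~SCRIPT
    #!/bin/bash
    echo 'Defaults:#{ssh_user} !requiretty' > #{sudoers}
    chmod 440 #{sudoers}
    mkdir -p /etc/chef/ohai/hints
    touch /etc/chef/ohai/hints/#{hint}.json
  SCRIPT
end

puts user_data_script('ec2-user', 'ec2')
```

Under this assumption, the provider block could assign the returned string to its user data setting, and the OpenStack instance could reuse the helper with a different user name and hint.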
After this basic configuration of the instance, we move to the next step in the Vagrantfile: provisioning. In this part we configure Chef and use cookbooks to install and configure additional software (that is, the Koji hub in the case of this instance).
86 http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/user-data.html (12/11/2014)
kojihub.vm.provision "chef_client" do |chef|
chef.node_name = "koji"
chef.chef_server_url = "#{ENV['CHEF_SERVER']}"
chef.validation_key_path = "#{ENV['CHEF_VALIDATION_KEY_PATH']}"
chef.validation_client_name = "#{ENV['CHEF_VALIDATION_CLIENT_NAME']}"
chef.json =
{
"build-essential" => {
"compiletime" => true
},
postgresql: {
password: {
postgres: '123123',
port: 5432
}
},
apache: {
listen_ports: ['80', '443'],
listen_address: '0.0.0.0'
},
selinux: {
state: 'disabled'
}
}
chef.run_list = [
"recipe[nfs::server]",
"recipe[iptables::disabled]",
"recipe[koji::default]",
"recipe[koji::test]"
]
chef.delete_node = true
chef.delete_client = true
end
Figure 17 Vagrantfile: Chef provisioning of Koji hub
Figure 17 presents the part of the Vagrantfile that is dedicated to provisioning the Koji hub and other software using Chef. The first variable defines the name of the node registered in Chef Server (Hosted Chef in our case). The next three variables are provided through environment variables: the Chef Server URL, the validation key path and the validation client name.
The next two variables are converted by Vagrant into a JSON file that is passed to Chef for configuration. The first is a Ruby hash that overrides the default attributes of the cookbooks. In this example we set the build-essential cookbook to install, at compile time, the packages that are needed to compile Ruby gems written in C. Then we provide the password and port for PostgreSQL and the ports and listen address for Apache, and we set SELinux to disabled. The second variable is Chef’s run list, that is, the list of cookbook recipes to be executed on the node. In our example we install the NFS server, disable iptables, and run the default recipe of the Koji cookbook followed by its test recipe.
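How a Ruby hash with mixed key styles turns into JSON can be shown with the standard json library. The sketch below uses a shortened version of the attributes from Figure 17. The exact layout of the file that Vagrant writes is not shown in the thesis, so placing the run list under a run_list key is an assumption made for illustration.

```ruby
require 'json'

# A shortened version of the attribute hash from Figure 17, mixing a string
# key with symbol keys as the original does.
attributes = {
  'build-essential' => { 'compiletime' => true },
  apache: { listen_ports: %w[80 443], listen_address: '0.0.0.0' },
  selinux: { state: 'disabled' }
}
run_list = ['recipe[nfs::server]', 'recipe[koji::default]']

# Illustration only: symbol keys become plain JSON strings when serialized,
# so both key styles are equivalent once the data reaches Chef.
# The run_list key is an assumed layout, not taken from the thesis.
document = JSON.parse(JSON.generate(attributes.merge(run_list: run_list)))

puts JSON.pretty_generate(document)
puts document['selinux']['state']
```

The round trip through JSON.generate and JSON.parse makes the effect visible: after parsing, every key is a string regardless of how it was written in the Vagrantfile.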
At this point the configuration of the first VM ends, and we proceed to the next one.
Kojibuilder instance
The next instance has many similarities and analogous parts to the first one. Firstly, just as in the case of the first VM, we do not need an image for the OpenStack provider and we use a dummy image (box) instead.
The next part of the VM’s configuration is the OpenStack provider details.
# Koji builder on OpenStack
config.vm.define "kojibuilder" do |kojibuilder|
  (...)
  kojibuilder.vm.box = "dummy.box"
  kojibuilder.vm.box_url = "https://github.com/cloudbau/vagrant-openstack-plugin/raw/master/dummy.box"
  (...)
Figure 18 Vagrantfile: OpenStack dummy box
As in the case of the AWS instance, OpenStack requires credentials and SSH keys. Additionally, we need to provide the Keystone (OpenStack’s identity service) endpoint URL, the tenant name, the flavor, the image and the name of the floating IP pool. In our example all of those values are provided in environment variables.
The next variable relates to additional (persistent) storage attached to the instance: a volume. This volume was created earlier in OpenStack. At this point we only provide its id and the device name under which it will be accessible in the instance.
The last part, analogously to the AWS case, is the user data, which modifies the sudoers file so that sudo works over SSH without a terminal (tty). Next, we create the /etc/chef/ohai/hints/openstack.json file so that Chef’s Ohai collects metadata about the instance from OpenStack. Finally, a sequence of commands creates a partition on the attached volume, formats it with the ext4 filesystem and mounts it at /var/koji.
# OpenStack provider
kojibuilder.vm.provider :openstack do |os, override|
  os.username = "#{ENV['OS_USERNAME']}"
  os.password = "#{ENV['OS_PASSWORD']}"
  os.public_key_path = "#{ENV['OS_PUBLIC_KEY_PATH']}"
  override.ssh.private_key_path = "#{ENV['OS_PRIVATE_KEY_PATH']}"
  override.ssh.username = "#{ENV['OS_SSH_USERNAME']}"
  os.openstack_auth_url = "#{ENV['OS_AUTH_URL']}/tokens"
  os.tenant_name = "#{ENV['OS_TENANT_NAME']}"
  os.flavor = "#{ENV['OS_FLAVOR']}" # 'm1.small'
  os.image = "#{ENV['OS_IMAGE']}" # 'Fedora 20 x86_64'
  os.floating_ip_pool = "#{ENV['OS_FLOATING_IP_POOL']}" # 'public'
  os.volumes = [
    {
      id: 'f9976f16-3d9d-499a-86c1-42247588b3da',
      device: '/dev/vdb'
    }
  ]
  # Shell script passed to the instance as user data.
  os.user_data = "#!/bin/bash
echo 'Defaults:#{ENV['OS_SSH_USERNAME']} !requiretty' > /etc/sudoers.d/999-vagrant-cloud-init-requiretty
chmod 440 /etc/sudoers.d/999-vagrant-cloud-init-requiretty
mkdir -p /etc/chef/ohai/hints
touch /etc/chef/ohai/hints/openstack.json
(echo o; echo n; echo p; echo 1; echo ; echo; echo w) | fdisk /dev/vdb
mkfs.ext4 /dev/vdb1
mkdir -p /var/koji
mkdir -p /var/koji/mock
mkdir -p /var/koji/tmp
mount /dev/vdb1 /var/koji"
end
Figure 19 Vagrantfile: OpenStack provider
Figure 20 shows the provisioning of the Koji builder using Chef. As with the AWS instance, the first variable defines the name of the node registered in Chef Server (Hosted Chef). The next three variables are provided through environment variables: the Chef Server URL, the validation key path and the validation client name.
Again, the next two variables are converted by Vagrant into a JSON file that is passed to Chef for configuration. The first overrides the default attributes of the cookbooks: here we set SELinux to disabled. The second is the list of cookbook recipes: here we install the NFS client (using the default recipe), disable iptables and run the kojid recipe of the Koji cookbook.
# Enable provisioning with chef client
kojibuilder.vm.provision "chef_client" do |chef|
  chef.node_name = "kojibuilder"
  chef.chef_server_url = "#{ENV['CHEF_SERVER']}"
  chef.validation_key_path = "#{ENV['CHEF_VALIDATION_KEY_PATH']}"
  chef.validation_client_name = "#{ENV['CHEF_VALIDATION_CLIENT_NAME']}"
  chef.json =
  {
    selinux: {
      state: 'disabled'
    }
  }
  chef.run_list = [
    "recipe[nfs]",
    "recipe[iptables::disabled]",
    "recipe[koji::kojid]"
  ]
  chef.delete_node = true
  chef.delete_client = true
end
end
Figure 20 Vagrantfile: Chef provisioning of Koji builder
That is everything required to run a Koji cluster on a hybrid cloud using Amazon Web Services and OpenStack. The next section of this chapter discusses Chef and the Koji cookbook.
Chef
As the discipline of software development has matured, frameworks have emerged with the aim of reducing development time by minimizing the overhead of having to implement or manage low-level details that support the development effort. This allows developers to concentrate on rapid delivery of software that meets customer requirements.
Chef is a framework for infrastructure development: a supporting structure and package of associated benefits of direct relevance to framing one’s infrastructure as code. Chef provides an extensive library of primitives for managing just about every conceivable resource that is used in the process of building up an infrastructure, a language for modeling infrastructure, and a consistent abstraction layer that allows developers and system administrators to design and build scalable environments without getting dragged into operating system and low-level implementation details. It also provides design patterns and approaches for producing consistent, shareable, and reusable components.87 It was initially written in Ruby, but the latest version is a mixture of Erlang and Ruby.
Chef is a set of DevOps tools that enable managing both physical and cloud servers. Combined with a version control system, it makes it possible to create exact clones of infrastructure environments with a full change history (allowing a rollback to any version or the creation of new branches of the infrastructure’s configuration). Thanks to Chef’s “Search”, it is easy to configure applications that require knowledge about the infrastructure (for instance about cookbooks applied to other servers, their network configuration, etc.). The advantage of Chef is that once servers are automated with it, replicating the whole infrastructure becomes very easy.
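The kind of question that Search answers can be imitated in plain Ruby. The sketch below does not call Chef at all: the node list, the IP addresses and the helper nodes_with_recipe are hypothetical stand-ins for the data that a Chef Server would index. In a real recipe the lookup would be sent to the server instead of filtering an in-memory array.

```ruby
# Hypothetical node data, standing in for what a Chef Server would index.
NODES = [
  { name: 'kojihub',     recipes: %w[nfs::server koji::default], ipaddress: '10.0.0.5' },
  { name: 'kojibuilder', recipes: %w[nfs koji::kojid],           ipaddress: '10.0.0.6' }
].freeze

# Hypothetical stand-in for a search: all nodes whose run list contains `recipe`.
def nodes_with_recipe(nodes, recipe)
  nodes.select { |node| node[:recipes].include?(recipe) }
end

# Example: find every builder so that the hub could be told where they are.
builders = nodes_with_recipe(NODES, 'koji::kojid')
puts builders.map { |n| "#{n[:name]} #{n[:ipaddress]}" }
```

A hub configuration could use such a result to list its builders without hard-coding their addresses, which is the kind of infrastructure knowledge the paragraph above refers to.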
Chef consists of three logical components: Server, Workstation and Node (in practice a Workstation is a special form of a node). Chef Server holds the configuration data for every node registered with it. The Workstation holds the local Chef repository (it is the Chef user’s personal computer). A node is a client that is registered with the Chef Server. It has an agent known as Chef Client installed on it.
Cookbooks are used to automate the configuration of a node. A Chef cookbook is the basic building block of the automation. It defines a complete scenario for a node, for instance the installation of packages and their configuration. Cookbooks hold the type of configuration that needs to be done on a node.88
Chef Client
A Chef node needs to have an agent, known as Chef Client, installed on it. It is used to interact with the Chef Server and to pull the configuration that needs to be applied to the node.
The process conducted by Chef Client is as follows: first, it registers the node with the Chef Server; then it downloads the required cookbooks into the local cache and compiles the required recipes. Finally, it configures the node and brings it to the expected state.89
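The separation between compiling recipes and configuring the node can be sketched with a toy Ruby class. This is an illustration of the two phases only, not Chef’s implementation; the class ToyRun, its methods and the package and service names are hypothetical.

```ruby
# Illustration only: a toy two-phase run. This is not Chef's implementation.
class ToyRun
  attr_reader :resources, :log

  def initialize
    @resources = []
    @log = []
  end

  # "Recipe DSL": calling these methods only records a resource.
  def package(name)
    @resources << [:package, name]
  end

  def service(name)
    @resources << [:service, name]
  end

  # Phase 1: evaluate the recipe block to build the ordered resource list.
  def compile(&recipe)
    instance_eval(&recipe)
    self
  end

  # Phase 2: act on each recorded resource in order (here we only log).
  def converge
    @resources.each { |type, name| @log << "#{type} #{name}: configured" }
    self
  end
end

run = ToyRun.new.compile do
  package 'koji-hub'
  service 'httpd'
end
puts run.resources.length
puts run.converge.log
```

Nothing is changed on the system while the recipe block is evaluated; the work happens only in the second phase, once the full ordered list of resources is known.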
Chef Client is the agent that runs on systems managed by Chef, and the primary mechanism by which such systems communicate with the Chef Server. chef-client uses the framework’s library of primitives to configure resources on a system by talking to a central server API to retrieve data.90
87 Stephen Nelson-Smith, op. cit., p. 50 88 Navin Sabharwal, Manak Wadhwa, Automation through Chef Opscode, Apress, 2014, p. 4 89 Navin Sabharwal, Manak Wadhwa, op. cit., p. 5 90 Stephen Nelson-Smith, op. cit., p. 51
Ohai
Ohai is a built-in tool that comes with Chef and provides node attributes to the Chef Client so that a node can
be configured. Chef Client requires some information about the node whenever it runs. Ohai detects certain
attributes of that particular node and provides them to the Chef Client whenever required. Ohai can also be
used as a stand-alone component for discovery purposes. It can provide a variety of details, from networking
to platform information.91
Ohai is a system profiling tool that gathers large quantities of data about the system, from network and user
data to software and kernel versions. Ohai is extensible: plugins can be written (usually in Ruby) that
furnish data in addition to the defaults. The collected data is emitted in a machine-parseable and
human-readable format (JSON), and is used to build up a database of facts about each system that is managed by Chef.92
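As a rough illustration of how such JSON facts can be consumed, the sketch below parses a small, hand-written sample with Ruby’s standard json library. The sample values and the helper name summarize_facts are assumptions made for this example; real Ohai output contains many more keys.

```ruby
require 'json'

# Hand-written sample resembling a tiny subset of Ohai output (hypothetical values).
SAMPLE_OHAI_JSON = <<~JSON
  {
    "hostname": "koji-hub",
    "ipaddress": "10.0.0.5",
    "platform": "centos",
    "platform_version": "6.5",
    "kernel": { "release": "2.6.32-431.el6.x86_64" }
  }
JSON

# Hypothetical helper: returns a one-line summary of the parsed facts.
def summarize_facts(json_text)
  facts = JSON.parse(json_text)
  "#{facts['hostname']} (#{facts['ipaddress']}) runs " \
    "#{facts['platform']} #{facts['platform_version']}, " \
    "kernel #{facts['kernel']['release']}"
end

puts summarize_facts(SAMPLE_OHAI_JSON)
```

Inside a recipe the same facts would normally be read from the node object (for example node['hostname']) rather than parsed by hand.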
Chef Server
The Chef Server component is written in Erlang and uses a JSON-oriented document datastore. The whole Chef
framework is driven via a RESTful API, of which the Knife command-line tool is a client.
The server is open sourced, under the Apache 2.0 license, and is considered a reference implementation of
the Chef Server API. The API is also implemented as a hosted software-as-a-service offering. The hosted
version, called Hosted Chef, offers a fully resilient, highly available, multitenant environment. The platform is
free to use for fewer than five nodes, so it’s the ideal way to experiment with and gain experience with the
framework, tool, and API. A single standalone Chef server can handle up to 10,000 nodes.
The Chef server also provides an indexing service. All information gathered about the resources managed by
Chef is indexed and searchable, meaning that Chef becomes a coordination point for dynamic, data-driven
infrastructures. It is possible to issue queries for any combination of attributes—for example, VMware
servers on VLAN 102 or MySQL slaves running CentOS 5. This opens up tremendously powerful capabilities –
a simple example would be a dynamic load balancer configuration that automatically includes the web
servers that match a given query to its pool of backend nodes.
The most important thing to understand is that the Chef server is fundamentally nothing more than a
publishing platform with an API, an index, and a dependency solver. All interactions, without exception, are
via the REST API.93
Chef Server is a centrally located server that holds all the data related to the registered nodes (i.e.,
cookbooks, node objects, and metadata). The agent (Chef Client) runs on every node; it gets the
configuration data from the server and then applies the configuration to that node. This approach
distributes the configuration effort throughout the organization rather than concentrating it on a single server.
There are two different types of Chef server: Hosted Enterprise Chef and On Premises Chef Server.94
Additionally, Chef can be used without a client/server architecture by means of Chef Solo.
91 Navin Sabharwal, Manak Wadhwa, op. cit., p. 5 92 Stephen Nelson-Smith, op. cit., p. 51 93 Stephen Nelson-Smith, op. cit., pp. 52-53 94 https://www.chef.io/chef/choose-your-version/ (10/11/2014)
Hosted Enterprise Chef
Enterprise Chef is the paid version of the Chef server. It comes in two types of installation: an on-premises
installation (i.e., in your datacenter behind your own firewall) and a hosted version, in which Chef is offered
as a service hosted and managed by the company behind Chef.
The major difference between the enterprise version and the open source version is that the enterprise
version comes with high-availability deployment support and has additional reporting and security
features.
On Premises Chef
The open source Chef server has most of the capabilities of the enterprise version, but it also has certain
limitations. It can be installed only in stand-alone mode (i.e., it is not available in the hosted model). The
open source Chef components need to be installed on a single server, and this version offers neither the
levels of security nor the reporting capabilities available in the enterprise version.95
Search feature
Search is an essential part of Chef. It can be used from Knife or inside a cookbook.
The Chef server maintains an index of your data (environments, nodes, roles). The search index lets you
query the indexed data and use the results within a recipe. The query syntax supports exact, wildcard,
range, and fuzzy matching. A search can be run from various places in Chef: from within a recipe, from Knife,
or from the management console. The search engine in a Chef installation is based on Apache Solr.96
The result of a search query can be used in a recipe. The following code shows a simple search query in a
recipe:
search(:node, "attribute:value")
The result of a search query can be stored in a variable and then used anywhere within the recipe.
Figure 21 shows a fragment of the builder.rb recipe. The query returns the servers that have the koji::hub
recipe applied; the code then iterates over the result set and writes to Chef’s log a string with the name and
IP address of each host found.
95 Navin Sabharwal, Manak Wadhwa, op. cit., p. 6 96 Navin Sabharwal, Manak Wadhwa, op. cit., pp. 90-91
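# Find every node that has the koji::hub recipe applied.
kojihubs = search(:node, 'recipes:koji\:\:hub')
# The block variable is named hub so that it does not shadow the recipe's own node object.
kojihubs.each do |hub|
  Chef::Log.info("#{hub['hostname']} has IP address #{hub['ipaddress']}")
end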
Figure 21 Chef: Search in a recipe
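The backslashes in the query of Figure 21 are needed because a colon separates the key from the pattern in the search syntax, so the double colon of a recipe name has to be escaped. A minimal sketch of this in plain Ruby follows; the helper recipe_query is hypothetical and is not part of Chef.

```ruby
# Hypothetical helper (not part of Chef): builds a search query string that
# matches nodes having the given recipe applied.
def recipe_query(recipe_name)
  # Colons inside the recipe name are escaped with a backslash because a
  # colon separates key and pattern in the query syntax.
  escaped = recipe_name.gsub(':') { '\:' }
  "recipes:#{escaped}"
end

puts recipe_query('koji::hub')
```

Inside a recipe, the returned string could presumably be passed to search, for example search(:node, recipe_query('koji::hub')).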
Knife
A workstation is a system that is used to manage Chef. There can be multiple workstations for a single Chef
server. A workstation is simply a machine where Knife is used to manage the Chef Server.
Knife is a command-line tool used to interact with the Chef server. The complete management of the Chef
server is done using Knife.
Some of the functions of knife include:
Managing nodes.
Uploading cookbooks and recipes.
Managing roles and environments.
Knife is a multipurpose command-line tool that facilitates system automation, deployment, and integration.
It provides command and control capabilities for managing physical, virtual, and cloud environments across a
range of Linux, Unix, and Windows platforms. It is also the primary means by which the underlying model
that makes up the Chef framework is managed. Knife is extensible and has a pluggable architecture.97
Figure 22 shows the content of the .chef/knife.rb98 configuration file used in the thesis’s project. Site-specific
and sensitive values are read from environment variables (some of them are shared with the Vagrantfile’s
configuration). The values provided directly in the file include the default author’s name, email and
copyright notice for new cookbooks created using Knife.
97 Stephen Nelson-Smith, op. cit., p. 52 98 See https://docs.chef.io/config_rb_knife.html for more information on knife configuration options.
Knife is very useful for investigating nodes and their attributes. It also gives access to Chef’s search, and it
can be used to test whether a new server configuration was applied.
Some of the useful Knife commands include:
knife node list
knife search node 'recipes:cookbook\:\:recipe'
knife search node 'recipes:cookbook\:\:recipe' -a attribute_name
knife node show -l node_name
The first command lists the nodes registered in the Chef Server. The second searches for nodes that have
“cookbook::recipe” applied. The third additionally shows a given attribute of those nodes. The fourth shows
all information about a node in a human-readable format.
Other Chef tools
Chef also includes a few other tools that were not used in the project but are worth mentioning99:
Chef Shell – an interactive debugging console that provides command-line access to the framework’s
libraries, the API, and the local system’s data.
99 Stephen Nelson-Smith, op. cit., p. 51
current_dir = File.dirname(__FILE__)
log_level :info
log_location STDOUT
node_name "workstation"
client_key "#{current_dir}/workstation.pem"
validation_client_name "#{ENV['CHEF_VALIDATION_CLIENT_NAME']}"
validation_key "#{ENV['CHEF_VALIDATION_KEY_PATH']}"
chef_server_url "#{ENV['CHEF_SERVER']}"
cache_type 'BasicFile'
cache_options( :path => "#{ENV['HOME']}/.chef/checksums" )
cookbook_copyright "Tomasz Kłosiński"
cookbook_license "All rights reserved"
cookbook_email "[email protected]"
# AWS
knife[:aws_access_key_id] = ENV['AWS_ACCESS_KEY_ID']
knife[:aws_secret_access_key] = ENV['AWS_SECRET_ACCESS_KEY']
# OpenStack
knife[:openstack_auth_url] = "#{ENV['OS_AUTH_URL']}/tokens"
knife[:openstack_username] = "#{ENV['OS_USERNAME']}"
knife[:openstack_password] = "#{ENV['OS_PASSWORD']}"
knife[:openstack_tenant] = "#{ENV['OS_TENANT_NAME']}"
Figure 22 Chef: knife.rb configuration file
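An interpolated lookup such as "#{ENV['CHEF_SERVER']}" yields an empty string when the variable is unset, so a misconfigured shell may only be noticed later. The sketch below shows one possible way to fail early instead; the helper required_env is hypothetical and is not a Knife or Chef feature.

```ruby
# Hypothetical helper: returns the value of an environment variable or
# raises a descriptive error when it is missing or empty.
def required_env(name, env = ENV)
  value = env[name]
  raise ArgumentError, "environment variable #{name} is not set" if value.nil? || value.empty?

  value
end

# Demonstration with an explicit hash instead of the real environment.
sample_env = { 'CHEF_SERVER' => 'https://chef.example.com' }
puts required_env('CHEF_SERVER', sample_env)
```

In a configuration file such a helper could wrap each lookup, at the cost of making every listed variable mandatory.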
Chef Solo – a fully featured standalone configuration management tool that allows access to a
subset of Chef’s features without using a Chef server; suitable for simple deployments.
Chef Apply – a lightweight tool for configuring a machine to perform a function with a single
command, needing no configuration or Chef server.
Berkshelf
Berkshelf is not part of the Chef framework; rather, it is a tool that complements it.
In the early days of Chef, users had to ensure manually that all dependent cookbooks were installed. They had
to download each of them by hand, only to find out that every downloaded cookbook brought in another
set of dependent cookbooks. In the IT world this problem is known as “dependency hell”.100
To fix this, Knife gained the “site install” subcommand, which installs all the dependencies locally for the user.
This was still not an optimal solution, however, since the cookbook directory in the user’s repository became
cluttered with all the dependent cookbooks. Usually the user did not care about those cookbooks and did
not want to see or manage them. Additionally, Knife’s “site install” always downloaded the current version of
the dependent cookbooks, although in some situations a particular version of a cookbook was needed.
Sharing the list of cookbooks was also problematic.
Berkshelf was created to fix these problems. It works like Bundler for Ruby gems, managing cookbook
dependencies for the user. It downloads all the defined dependencies recursively. Instead of polluting the
user’s Chef repository, it stores all the cookbooks in a central location (usually ~/.berkshelf.d/). The user just
commits the Berkshelf dependency file (called Berksfile) to the repository, and every other person sharing this
repository, or every build server, can download and install all the dependent cookbooks based on it.101
Berkshelf shares twin goals of Bundler:
Ensure that the appropriate dependencies are installed for a given problem without encountering
unpleasant ordering issues or cyclical dependencies.
Ensure code can be shared between other developers, or other machines or environments, and be
confident the code and its dependencies will behave in the same way.
Berkshelf solves these problems for cookbooks, only in place of a Gemfile, Berkshelf has a Berksfile. As
soon as a cookbook relies on recipes from other cookbooks and uses include_recipe, its metadata.rb file has
to be updated to specify an explicit dependency on the cookbook that provides the required recipe or
LWRP. That is perfectly reasonable and to be expected. However, solving cookbook dependencies manually
and recursively soon becomes tiresome, and so does uploading cookbooks in the right order, one at a time.
Berkshelf takes these pains away by providing a local dependency-solving solution and by functioning as a
Chef API client for uploading cookbooks.
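The idea of resolving dependencies recursively and deriving an order in which every cookbook follows its dependencies can be sketched as a depth-first traversal. This is a simplified illustration, not Berkshelf’s actual resolver: it ignores version constraints, and both the dependency graph and the name upload_order are assumptions made for the example.

```ruby
# Hypothetical dependency graph: cookbook name => cookbooks it depends on.
DEPENDENCIES = {
  'koji'       => %w[database postgresql yum-epel],
  'database'   => %w[postgresql],
  'postgresql' => [],
  'yum-epel'   => %w[yum],
  'yum'        => []
}.freeze

# Returns cookbook names ordered so that every cookbook appears after all
# of its dependencies (depth-first traversal with cycle detection).
def upload_order(cookbook, graph, resolved = [], in_progress = [])
  return resolved if resolved.include?(cookbook)
  raise ArgumentError, "cyclic dependency at #{cookbook}" if in_progress.include?(cookbook)

  in_progress.push(cookbook)
  graph.fetch(cookbook).each do |dependency|
    upload_order(dependency, graph, resolved, in_progress)
  end
  in_progress.pop
  resolved.push(cookbook)
end

puts upload_order('koji', DEPENDENCIES).join(', ')
```

With the sample graph, the dependencies come first and the koji cookbook itself comes last, which is the order in which a manual upload would have to proceed.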
100 http://en.wikipedia.org/wiki/Dependency_hell (20/11/2014) 101 Matthias Marschall, Chef Infrastructure Automation Cookbook, Packt Publishing, 2013, p. 25
Berkshelf provides considerably more functionality than this. It’s pivotal to an entire Chef development
workflow, dubbed “The Berkshelf Way” by the group of developers from Riot Games, the company behind
Berkshelf, who open sourced it and its component tools.102
As presented in Figure 23, a Berksfile consists of a source directive, a metadata directive and a list of
dependent cookbooks. The source is the URL of the location from which the dependent cookbooks will be
downloaded. In our case this is Chef’s central community cookbook repository, called Chef Supermarket
(supermarket.getchef.com). The metadata directive indicates that Berkshelf will also download and manage
the dependent cookbooks listed in the metadata.rb file.
Koji cookbook
A cookbook is the basic unit of configuration and policy definition in Chef. The Koji cookbook defines a
complete scenario for the deployment and configuration of a Koji cluster. A Chef cookbook is written in Ruby,
and Chef’s extended DSL (Domain Specific Language) can be used in it to declare Chef resources.
A cookbook plays the following roles in the Chef ecosystem103:
A cookbook defines the files that need to be distributed for that component onto the client.
It defines the attribute values that should be present on the nodes.
It provides definitions for reusability of code.
It provides libraries which can be used to extend the functionality of chef.
It provides recipes that specify the resources and the order of execution of code on the client.
It provides templates for file configurations.
It provides metadata which can be used to specify any kind of dependency, version constraints, and so
on.
102 Stephen Nelson-Smith, op. cit., p. 173 103 Navin Sabharwal, Manak Wadhwa, op. cit., p. 87
source "https://supermarket.getchef.com"
metadata
cookbook "resource-control"
cookbook "apache2"
cookbook "database"
cookbook "hostsfile"
cookbook "postgresql"
cookbook "selinux"
cookbook "yum"
cookbook "yum-epel"
cookbook "chef-zero"
cookbook "chef"
cookbook "iptables"
cookbook "ohai"
Figure 23 Berkshelf: Berksfile
Metadata
Cookbook metadata stores certain information about the cookbook. This information is provided in the
metadata.rb file, which is located in the cookbook directory.
Metadata can be used to specify the following important things104:
Dependencies: If the cookbook is dependent on any other cookbook.
Description: What the cookbook is actually doing.
Supported OS list.
Name of the cookbook.
Version of the cookbook.
The project’s cookbook metadata is presented in Figure 24. It is divided into two sections: the first provides
basic information about the cookbook (name, maintainer, description, etc.); the second contains the list of
dependent cookbooks (each line starts with “depends” followed by the name of the cookbook).
These dependencies are required for the Koji cookbook to run properly. Traditionally they were managed
manually by the cookbook user. Nowadays, however, dependencies are installed and uploaded to the Chef
Server by Berkshelf on the basis of the Berksfile (which also pulls in the dependencies from metadata.rb).
Attributes
An attribute is a specific detail about a node. Attributes tell the chef-client about the current state of the
node, the state of the node at the end of the previous chef-client run, and what the state of the node should
be at the end of the current chef-client run.
Attributes are defined by:
The state of the node itself
104 Ibid., p. 118
name 'koji'
maintainer 'Tomasz Kłosiński'
maintainer_email '[email protected]'
license 'All rights reserved'
description 'Installs/Configures Koji'
long_description IO.read(File.join(File.dirname(__FILE__), 'README.md'))
version '0.1.2'
depends "database"
depends "apache2"
depends "postgresql"
depends "hostsfile"
depends "nfs"
depends "selinux"
depends "yum"
depends "yum-epel"
Figure 24 Cookbook: metadata
Cookbooks (in attribute files and/or recipes)
Roles
Environments
During every chef-client run, the chef-client builds the attribute list from three sources: data about the node
collected by Ohai; the node object that was saved to the Chef server at the end of the previous chef-client
run; and the rebuilt node object from the current chef-client run, after it has been updated for changes to
cookbooks (attribute files and/or recipes), roles, and/or environments, and for any changes to the state of
the node itself.105
After the node object is rebuilt, all of attributes are compared, and then the node is updated based on
attribute precedence. At the end of every chef-client run, the node object that defines the current state of
the node is uploaded to the Chef server so that it can be indexed for search.
Attributes enable us to override values of the cookbook configuration. Default values of variables are usually
hardcoded in a cookbook, but they can easily be changed through Chef’s JSON mechanism. By overriding
default values set in cookbooks, users can inject their own values.106
An attribute file is located in the attributes/ subdirectory of a cookbook (for example, attributes/default.rb).
When a cookbook is run against a node, the attributes contained in all attribute files are evaluated in the
context of the node object. Node methods (when present) are used to set attribute values on a node.
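The effect of attribute precedence can be illustrated with plain Ruby hashes. The sketch below is a deliberate simplification: it models only four of Chef’s precedence levels, and the names deep_merge and effective_attributes as well as the sample values are assumptions made for this example, not Chef’s implementation.

```ruby
# Precedence levels from lowest to highest (a simplified subset of Chef's levels).
PRECEDENCE = %i[default normal override automatic].freeze

# Recursively merges two hashes; values from `higher` win on conflicts.
def deep_merge(lower, higher)
  lower.merge(higher) do |_key, low_value, high_value|
    if low_value.is_a?(Hash) && high_value.is_a?(Hash)
      deep_merge(low_value, high_value)
    else
      high_value
    end
  end
end

# Combines the per-level attribute hashes into the view a recipe would read.
def effective_attributes(levels)
  PRECEDENCE.reduce({}) do |merged, level|
    deep_merge(merged, levels.fetch(level, {}))
  end
end

levels = {
  default:   { 'koji' => { 'domain' => 'example.com', 'database' => { 'name' => 'apache' } } },
  override:  { 'koji' => { 'domain' => 'build.example.org' } },
  automatic: { 'hostname' => 'koji-hub' }
}

attributes = effective_attributes(levels)
puts attributes['koji']['domain']
puts attributes['koji']['database']['name']
```

In this sample the override level replaces only the domain, while the database name set at the default level survives the merge.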
105 https://docs.chef.io/attributes.html (12/11/2014) 106 Matthias Marschall, op. cit., p. 98
node.default['koji']['domain'] = "example.com"
node.default['koji']['database']['name'] = "apache"
node.default['koji']['database']['user'] = "apache"
node.default['koji']['database']['ipaddress'] = "127.0.0.1"
node.default['koji']['database']['password'] = "apache"
node.default['koji']['hub']['topdir'] = "/mnt/koji"
node.default['koji']['client']['server'] = "http://koji.#{node['koji']['domain']}/kojihub"
node.default['koji']['client']['weburl'] = "http://koji.#{node['koji']['domain']}/koji"
node.default['koji']['client']['topurl'] = "http://kojipkgs.#{node['koji']['domain']}/koji"
node.default['koji']['kojira']['server'] = "http://koji.#{node['koji']['domain']}/kojihub"
node.default['koji']['kojira']['weburl'] = "http://koji.#{node['koji']['domain']}/koji"
node.default['koji']['kojira']['topurl'] = "http://kojipkgs.#{node['koji']['domain']}/koji"
node.default['koji']['kojid']['server'] = "http://koji.#{node['koji']['domain']}/kojihub"
node.default['koji']['kojid']['weburl'] = "http://koji.#{node['koji']['domain']}/koji"
node.default['koji']['kojid']['topurl'] = "http://kojipkgs.#{node['koji']['domain']}/koji"
Figure 25 Cookbook: Attributes
Figure 25 lists the default attributes of the Koji cookbook (the attributes/default.rb file). Among them are the
domain name, the database configuration details, the main directory of the Koji hub and the connection
details for the Koji client, Kojira and kojid.
Templates
A template is a Chef resource that is used to manage the contents of a configuration file. The file is generated
from an ERB (Embedded Ruby) template. Templates are stored in the templates/default subdirectory of the
cookbook.107
Embedded Ruby allows Ruby code to be embedded within a pair of <% and %> delimiters. These embedded
code blocks are then evaluated in place (they are replaced by the result of their evaluation).108 Chef uses
Erubis109 as its ERB implementation.
There are two types of delimiters in ERB:
<%= %> is used to print the value of a variable or Ruby expression into the generated file.
<% %> is used to embed Ruby logic into the template file (for instance, to loop over a list); the
<%- -%> variant additionally trims surrounding whitespace.110
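The two kinds of tags can be tried outside Chef. The sketch below uses the ERB class from Ruby’s standard library rather than Erubis, and a plain hash stands in for Chef’s node object; the template text, the extra key and the helper render_client_config are assumptions made for this example.

```ruby
require 'erb'

# Template text modeled on the Koji client configuration; the loop is added
# only to demonstrate a logic tag.
CLIENT_TEMPLATE = <<~TEMPLATE
  [koji]
  server = <%= node[:koji][:client][:server] %>
  <% node[:koji][:client][:extra].each do |key, value| %>
  <%= key %> = <%= value %>
  <% end %>
TEMPLATE

# Renders the template against a plain hash standing in for Chef's node object.
def render_client_config(node)
  ERB.new(CLIENT_TEMPLATE).result(binding)
end

sample_node = {
  koji: {
    client: {
      server: 'http://koji.example.com/kojihub',
      extra: { 'weburl' => 'http://koji.example.com/koji' }
    }
  }
}

puts render_client_config(sample_node)
```

The output tags insert values into the text, while the logic tags only control the loop; without a trim mode they leave empty lines behind in the rendered file.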
/etc/koji.conf
The file /etc/koji.conf is generated from the client-koji.conf.erb template. It provides the basic configuration
for the Koji client, that is, the details of the connection to the Koji hub: server URL, web URL, top directory
URL and top directory path on the server.
107 Navin Sabharwal, Manak Wadhwa, op. cit., p. 113 108 Stephen Nelson-Smith, op. cit., p. 234 109 Erubis website: http://www.kuwata-lab.com/erubis/ (03/01/2015) 110 Matthias Marschall, op. cit., p. 103
[koji]
;configuration for koji cli tool
;url of XMLRPC server
server = <%= node[:koji][:client][:server] %>
;url of web interface
weburl = <%= node[:koji][:client][:weburl] %>
;url of package download site
topurl = <%= node[:koji][:client][:topurl] %>
;path to the koji top directory
topdir = <%= node[:koji][:client][:topdir] %>
Figure 26 Cookbook: Template /etc/koji.conf
/etc/httpd/conf.d/kojihub.conf
This file is generated from the httpd-kojihub.conf.erb template. It configures the XML-RPC server running
under mod_wsgi in Apache. As Figure 27 shows, the template does not use any attributes; the default values
from the Koji hub installation are used instead.
/etc/koji-hub/hub.conf
This file is generated from the hub.conf.erb template. It handles the configuration of the Koji hub: its
connection to the database, its main directory (where Koji stores packages and repositories), the URL of Koji
web and other variables. With the settings shown in Figure 28, a user’s first login creates that user in the
database, and Koji notifies the package maintainer when a build succeeds.
Alias /kojihub /usr/share/koji-hub/kojixmlrpc.py
<Directory "/usr/share/koji-hub">
Options ExecCGI
SetHandler wsgi-script
Order allow,deny
Allow from all
</Directory>
Alias /kojifiles "/mnt/koji/"
<Directory "/mnt/koji">
Options Indexes
AllowOverride None
Order allow,deny
Allow from all
</Directory>
Figure 27 Cookbook: Template /etc/httpd/conf.d/kojihub.conf
[hub]
DBName = <%= node[:koji][:database][:name] %>
DBUser = <%= node[:koji][:database][:user] %>
DBHost = <%= node[:koji][:database][:ipaddress] %>
DBPass = <%= node[:koji][:database][:password] %>
KojiDir = <%= node[:koji][:hub][:topdir] %>
LoginCreatesUser = On
KojiWebURL = <%= node[:koji][:hub][:weburl] %>
NotifyOnSuccess = True
Figure 28 Cookbook: Template /etc/koji-hub/hub.conf
/etc/kojira/kojira.conf
This file is generated from the kojira.conf.erb template. It contains the configuration of the Kojira service,
which is responsible for keeping the main RPM repository of Koji in order (that is, it deletes old builds). In the
configuration file, presented in Figure 29, attributes provide the server URL and the top directory path; the
rest of the variables keep their default values.
/etc/kojid/kojid.conf
This file is generated from the kojid.conf.erb template. It controls the Koji builder (kojid) service. Chef
attributes provide the credentials, the Koji hub URL, the top directory and the top URL; the other variables
keep their default values.
Recipes
Recipes are the configuration units in Chef that are actually deployed on the client and used to configure
the system. They are written in Ruby and Chef’s DSL. A recipe is normally a collection of resources with a bit
of Ruby code. A recipe is stored in a cookbook and helps in configuring nodes. It can be used in any other
recipe. Every recipe is executed top-down.111
111 Navin Sabharwal, Manak Wadhwa, op. cit., pp. 88-89
[kojira]
user=kojira
password=kojira
server=<%= node[:koji][:kojira][:huburl] %>
topdir=<%= node[:koji][:kojira][:topdir] %>
logfile=/var/log/kojira.log
with_src=no
Figure 29 Cookbook: Template /etc/kojira/kojira.conf
[kojid]
user = <%= node['hostname'] %>
password = <%= node['hostname'] %>
topdir=<%= node[:koji][:kojid][:topdir] %>
workdir=/var/koji/tmp
mockdir=/var/koji/mock
mockuser=kojibuilder
vendor=Koji
packager=Koji
distribution=Koji
mockhost=koji-linux-gnu
server=<%= node[:koji][:kojid][:huburl] %>
topurl=<%= node[:koji][:kojid][:topurl] %>
Figure 30 Cookbook: Template /etc/kojid/kojid.conf
Recipes are authored in Ruby, a programming language designed to read and behave in a predictable manner.
A recipe is mostly a collection of resources, defined using patterns (resource names, attribute-value pairs, and
actions). A recipe must define everything that is required to configure part of a system, and it has to be
stored in a cookbook. One recipe may be included in another, or it may depend on one or more other recipes.
A recipe may use the results of a search query and read the contents of a data bag (including an encrypted
data bag). It may tag a node to facilitate the creation of arbitrary groupings. It must be added to a run-list
before it can be used by the chef-client, and it is always executed in the same order as listed in the run-list.112
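How a run-list and include_recipe calls combine into a single processing order can be sketched in plain Ruby. The model below assumes that a recipe is entered at the point where it is first listed or included and that later references to it are ignored; the inclusion map and the name load_order are hypothetical and do not reflect the project’s actual recipes.

```ruby
# Hypothetical map: recipe name => recipes it pulls in with include_recipe.
INCLUDES = {
  'koji::default'     => %w[yum-epel::default koji::client koji::hub],
  'koji::hub'         => %w[koji::client],
  'koji::client'      => [],
  'yum-epel::default' => []
}.freeze

# Expands a run-list into the order in which recipes are first entered.
# A recipe that was already entered is skipped on later references.
def load_order(run_list, includes, loaded = [])
  run_list.each do |recipe|
    next if loaded.include?(recipe)

    loaded << recipe
    load_order(includes.fetch(recipe, []), includes, loaded)
  end
  loaded
end

puts load_order(%w[koji::default koji::client], INCLUDES).join(' -> ')
```

In the sample, koji::client appears both in the run-list and in two inclusion lists, yet it is entered only once.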
default.rb
The default recipe is executed when a run-list names a cookbook without specifying which of its recipes to
run. Figure 31 presents the default recipe of the Koji cookbook. It first ensures that the EPEL yum repository
is installed by including the “yum-epel” recipe (from the yum-epel cookbook). Then the default recipe
includes the other Koji recipes: client, hub, database, kojira and builder.
In other words, the entire Koji stack is installed except for the Koji builder daemon (kojid), which has its own
recipe. The test recipe is also not included by default.
112 http://docs.chef.io/recipes.html (11/12/2014)
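# Since Chef 11, attributes have to be written through a precedence level such as node.default.
node.default['yum']['epel']['enabled'] = true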
include_recipe "yum-epel"
include_recipe "koji::client"
include_recipe "koji::hub"
include_recipe "koji::database"
include_recipe "koji::kojira"
include_recipe "koji::builder"
Figure 31 Cookbook: default.rb recipe
client.rb
The client recipe installs and configures the Koji client. First, it creates a user for the Koji client (kojiadmin), a
“.koji” directory in the user’s home directory and a symlink to the main configuration file. Thanks to this, a
person who wants to use Koji only needs to switch to the kojiadmin user. The recipe then installs the koji
package and, using a template resource, provides the configuration file /etc/koji.conf, setting its owner to
root and its mode to 0644.
user "kojiadmin" do
  supports :manage_home => true
  comment "kojiadmin"
  home "/home/kojiadmin"
  shell "/bin/bash"
  password "123123"
end

directory "/home/kojiadmin/.koji" do
  owner "kojiadmin"
  group "kojiadmin"
  mode 00755
  action :create
end

# The link resource sets the owner of the symlink with owner (not user).
link "/home/kojiadmin/.koji/config" do
  owner "kojiadmin"
  to "/etc/koji.conf"
end

# Package name without the stray trailing space.
package "koji" do
  action :install
end

template "/etc/koji.conf" do
  source "client-koji.conf.erb"
  mode 0644
  owner "root"
  group "root"
end
Figure 32 Cookbook: client.rb recipe
hub.rb
The hub recipe is the main part of the cookbook. As Figure 33 indicates, this recipe installs and configures the
Koji hub (server). First, the recipe ensures that the proper FQDN is set in the /etc/hosts file. Then the user
“apache” is created (the user under which the Koji hub service will run). The next step installs the Koji hub
packages: koji-hub, httpd, mod_ssl and mod_wsgi. The two node.default lines that follow modify the Apache
service’s configuration, which Koji requires to work optimally. In the next steps we create the
/etc/httpd/conf.d/ directory and place the Koji hub web server’s configuration file in it. Lastly, using a
ruby_block, we append the line “Include conf.d/*.conf” to /etc/httpd/conf/httpd.conf.
hostsfile_entry "127.0.0.1" do
  hostname 'koji.example.com'
  aliases ['kojihub.example.com', 'kojiweb.example.com',
           'kojipkgs.example.com']
  unique true
  comment 'Append by Recipe koji::hub'
  action :append
end

user "apache" do
  supports :manage_home => true
  comment "apache"
  home "/home/koji"
  shell "/bin/bash"
  password "123123"
end

%w{koji-hub httpd mod_ssl mod_wsgi}.each do |pkg|
  package pkg do
    action :install
  end
end

node.default['apache']['prefork']['maxrequestworkers'] = 100
node.default['apache']['worker']['maxrequestworkers'] = 100

directory "/etc/httpd/conf.d/" do
  owner "root"
  group "root"
  mode 00755
  action :create
end

template "/etc/httpd/conf.d/kojihub.conf" do
  source "httpd-kojihub.conf.erb"
  mode 0440
  owner "root"
  group "root"
end

ruby_block "Add 'Include conf.d/*.conf' to /etc/httpd" do
  block do
    # The block form closes the file; the directive must stay on a single line.
    File.open("/etc/httpd/conf/httpd.conf", 'a') do |file|
      file.puts "Include conf.d/*.conf"
    end
  end
end
Figure 33 Cookbook: hub.rb recipe - part 1
In the next part of the file (Figure 34), we provide the main configuration file of the Koji hub,
/etc/koji-hub/hub.conf. Then we establish a directory hierarchy for Koji in /mnt/koji and export it as an NFS
share. Finally, we create the /var/www/html/koji directory.
template "/etc/koji-hub/hub.conf" do
source "hub.conf.erb"
mode 0440
owner "root"
group "root"
end
%w{koji koji/packages koji/repos koji/work koji/scratch}.each do |dir|
directory "/mnt/" + dir do
owner "apache"
group "apache"
mode 00755
action :create
end
end
nfs_export "/mnt/koji" do
network '*'
writeable true
sync true
options ['no_root_squash', 'insecure']
end
directory "/var/www/html/koji" do
owner "apache"
group "apache"
mode 00755
action :create
end
Figure 34 Cookbook: hub.rb recipe - part 2
database.rb
include_recipe 'build-essential::default'
include_recipe "postgresql::server"
include_recipe "database::postgresql"

node.default['postgresql']['pg_hba'] = [
  {:comment => '# TYPE DATABASE USER IP-ADDRESS METHOD',
   :type => 'local', :db => 'all', :user => 'all', :addr => nil, :method => 'trust'},
  {:comment => '# TYPE DATABASE USER IP-ADDRESS METHOD',
   :type => 'local', :db => 'apache', :user => 'apache', :addr => nil, :method => 'trust'},
  {:comment => '# TYPE DATABASE USER IP-ADDRESS METHOD',
   :type => 'local', :db => 'apache', :user => 'apache', :addr => nil, :method => 'trust'},
  {:comment => '# TYPE DATABASE USER IP-ADDRESS METHOD',
   :type => 'host', :db => 'apache', :user => 'all', :addr => '127.0.0.1/32', :method => 'trust'},
  {:comment => '# TYPE DATABASE USER IP-ADDRESS METHOD',
   :type => 'host', :db => 'template1', :user => 'all', :addr => '127.0.0.1/32', :method => 'trust'},
  {:comment => '# TYPE DATABASE USER IP-ADDRESS METHOD',
   :type => 'host', :db => 'apache', :user => 'postgres', :addr => '0.0.0.0/0', :method => 'md5'}
]

connection_user_postgres = {
  :host => '127.0.0.1',
  :port => node['postgresql']['config']['port'],
  :username => 'postgres',
  :password => node['postgresql']['password']['postgres']
}

execute "Create a postgresql user for koji but grant no privileges" do
  user "postgres"
  exists = <<-EOH
  psql -U postgres -d template1 -c \'\\du\' | grep -c apache
  EOH
  cwd "/var/lib/pgsql/"
  command "psql -U postgres -d template1 -c \"CREATE ROLE apache PASSWORD 'apache' NOSUPERUSER NOCREATEDB NOCREATEROLE LOGIN;\""
  not_if exists, :user => "postgres"
end
Figure 35 Cookbook: database.rb - part 1
The database recipe installs and configures the PostgreSQL server for the Koji hub. It requires the
build-essential recipe in order to build the Ruby gem that makes it possible to connect to the PostgreSQL
service and manipulate the schema from the recipe’s code. The next recipe it requires is “postgresql::server”,
which installs the PostgreSQL server, followed by the “database::postgresql” recipe, which provides resources
for manipulating the database, its users and its schema. It also allows new databases and tables to be created.
In the next step we configure pg_hba, the configuration file that restricts access to the PostgreSQL
database. Further, we define a connection to the database. Then we create a database user for Koji called
“apache”; the not_if guard skips this step when the role already exists. In the next excerpt of the code
(Figure 36) we start by creating a database for Koji, also called “apache”, and after that we define a
connection to this database (as the previously created “apache” user). The next step executes the SQL script
that creates the schema of Koji’s database; a lock file prevents the script from being run twice.
At the end we set PostgreSQL to listen on all addresses, and we start the service and enable it at boot time.
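The execute resources of this recipe carry not_if guards so that repeated chef-client runs do not repeat work that is already done. The principle can be sketched in plain Ruby as follows; the helper run_unless and the in-memory role list are assumptions made for illustration and only mimic the behaviour of a guard.

```ruby
# Hypothetical helper mimicking a not_if guard: the action runs only when
# the guard reports that the desired state does not exist yet.
# Returns :skipped or :executed.
def run_unless(guard)
  return :skipped if guard.call

  yield
  :executed
end

roles = [] # stands in for the list of database roles
role_exists = -> { roles.include?('apache') }

first  = run_unless(role_exists) { roles << 'apache' }
second = run_unless(role_exists) { roles << 'apache' }
puts "#{first}, #{second}, roles=#{roles.inspect}"
```

The second call is skipped because the guard already finds the role, which is what makes a repeated run harmless.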
postgresql_database "apache" do
connection connection_user_postgres
provider Chef::Provider::Database::Postgresql
template 'DEFAULT'
encoding 'DEFAULT'
tablespace 'DEFAULT'
connection_limit '-1'
owner 'apache'
action :create
end
connection_user_apache = {
:host => '127.0.0.1',
:port => node['postgresql']['config']['port'],
:username => 'apache',
:password => node['postgresql']['password']['postgres'],
:database_name => 'apache'
}
execute "run schema script: /usr/share/doc/koji-1.9.0/docs/schema.sql" do
  user "postgres"
  exists = <<-EOH
  cat /var/lib/pgsql/lock.txt | grep -c lock
  EOH
  command "psql -U apache -d apache < /usr/share/doc/koji-1.9.0/docs/schema.sql && echo lock > /var/lib/pgsql/lock.txt"
  not_if exists, :user => "postgres"
end
node.default['postgresql']['config']['listen_addresses'] = '*'
service "postgresql" do
supports :status => true, :restart => true, :reload => true
action [ :enable, :start ]
end
Figure 36 Cookbook: database.rb recipe - part 2
kojira.rb
The next recipe installs Kojira. We start by installing the koji-utils package (it includes the kojira service).
Then we restart the Apache server. We add a user and password for the Kojira service in the Koji hub
database and grant this user the repo permission. In the next step we provide the configuration file
/etc/kojira/kojira.conf. At the end we start the service and enable it at boot.
package "koji-utils" do
action :install
end
service "httpd" do
action :restart
end
bash "Add kojira user and grant repo permissions in koji" do
  user "postgres"
  cwd "/var/lib/pgsql"
  exists = <<-EOH
  psql -U apache -d apache -c "select * from users where name='kojira'" | grep -c kojira
  EOH
  code <<-EOH
  koji --user=admin --password=admin add-user kojira
  psql apache -h127.0.0.1 --command "UPDATE users SET password='kojira' WHERE name='kojira';"
  koji --user=admin --password=admin grant-permission repo kojira
  EOH
  not_if exists, :user => 'postgres'
end
template "/etc/kojira/kojira.conf" do
source "kojira.conf.erb"
mode 0440
owner "root"
group "root"
end
service "kojira" do
supports :status => true, :restart => true, :reload => true
action [ :enable, :start ]
end
Figure 37 Cookbook: kojira.rb recipe
builder.rb
Builder is a recipe that adds existing Koji builders to the Koji hub. First we search the Chef Server for any
node that has the “koji::kojid” recipe applied. Then we iterate over the result to configure each node. In the
configuration block we start by adding the host to the Koji hub, then we increase its capacity to “4.0”.
In the next step we provide a user and password for this node so that it can use these credentials to authenticate
against the Koji hub. Then we grant permissions to this user so that it has access to Koji’s repository.
kojibuilders = search(:node, 'recipes:koji\:\:kojid')
kojibuilders.each do |kojid|
  execute "koji add-host #{kojid['hostname']} x86_64" do
    user "root"
    cwd "/root"
    exists = <<-EOH
    koji --user=admin --password=admin list-hosts | awk '{ print $1 }' | grep -Fx #{kojid['hostname']} | grep -c #{kojid['hostname']}
    EOH
    command "koji --user=admin --password=admin add-host #{kojid['hostname']} x86_64"
    not_if exists
  end
  execute "koji edit-host --capacity=4.0 #{kojid['hostname']}" do
    user "root"
    cwd "/root"
    exists = <<-EOH
    koji --user=admin --password=admin list-hosts --channel=createrepo | awk '{ print $1 }' | grep -Fx #{kojid['hostname']} | grep -c #{kojid['hostname']}
    EOH
    command "koji --user=admin --password=admin edit-host --capacity=4.0 #{kojid['hostname']}"
    not_if exists
  end
  bash "Add kojid user" do
    user "postgres"
    cwd "/var/lib/pgsql"
    exists = <<-EOH
    psql -U apache -d apache -c "select name from users where name='#{kojid['hostname']}'" | grep -c #{kojid['hostname']}
    EOH
    code <<-EOH
    koji --user=admin --password=admin add-user #{kojid['hostname']};
    EOH
    not_if exists
  end
Figure 38 Cookbook: builder.rb recipe - part 1
  bash "Set kojid user's password" do
    user "postgres"
    cwd "/var/lib/pgsql"
    exists = <<-EOH
    psql -U apache -d apache -c "select password from users where name='#{kojid['hostname']}'" | grep -c #{kojid['hostname']}
    EOH
    code <<-EOH
    psql -U apache -d apache -h127.0.0.1 --command "UPDATE users SET password='#{kojid['hostname']}' WHERE name='#{kojid['hostname']}';";
    EOH
    not_if exists
  end
  bash "Grant repo permissions to user kojid" do
    user "postgres"
    cwd "/var/lib/pgsql"
    exists = <<-EOH
    psql -U apache -d apache -c "select * from user_perms up join users u on up.user_id=u.id where u.name='#{kojid['hostname']}' and up.perm_id='3'" | grep -c #{kojid['hostname']}
    EOH
    code <<-EOH
    koji --user=admin --password=admin grant-permission repo #{kojid['hostname']};
    EOH
    not_if exists
  end
end
Figure 39 Cookbook: builder.rb recipe - part 2
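The guards in builder.rb decide whether a host is already registered by taking the first column of the `koji list-hosts` output and matching it exactly against the host name. A minimal Ruby sketch of that check follows. The helper name `host_registered?` and the sample output are assumptions for illustration; the real column layout of `koji list-hosts` may differ.

```ruby
# Returns true when `hostname` is the exact first column of some line in
# the text printed by `koji list-hosts`. This mirrors the
# awk '{ print $1 }' | grep -Fx step used in the recipe guards.
def host_registered?(list_hosts_output, hostname)
  list_hosts_output.each_line.any? do |line|
    line.split.first == hostname
  end
end

# Hypothetical sample output, invented for this example.
sample = <<~OUT
  kojibuilder   Y   Y    0.0/4.0  x86_64
  kojibuilder2  Y   Y    0.0/2.0  x86_64
OUT

puts host_registered?(sample, "kojibuilder")  # exact match on the first column
puts host_registered?(sample, "kojibuild")    # a prefix alone does not count
```

Matching the whole first column rather than a substring is what prevents "kojibuilder" from being mistaken for "kojibuilder2".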
kojid.rb
The kojid recipe is separate from the others because it can be applied to servers other than the Koji hub.
Additionally, we can have more than one Koji builder. The recipe starts by including the PostgreSQL client recipe,
include_recipe "postgresql::client"
%w{koji koji/mock koji/tmp}.each do |dir|
directory "/var/" + dir do
owner "root"
group "root"
mode 00755
action :create
end
end
kojihubs = search(:node, 'recipes:koji\:\:hub')
kojihubs.each do |node|
Chef::Log.info("#{node['hostname']} has IP address #{node['ipaddress']}")
end
unless kojihubs.empty?
  # system() returns true only when ping exits with status 0. The backtick
  # form returns a non-empty string in both cases, which is always truthy.
  if system("ping -q -c3 #{kojihubs.first['ipaddress']} >/dev/null 2>&1")
    kojihubip = kojihubs.first['ipaddress']
  end
  if !kojihubs.first['cloud'].nil? &&
     system("ping -q -c3 #{kojihubs.first['cloud']['public_ipv4']} >/dev/null 2>&1")
    #kojihubip = kojihubs.first['eip_address']
    kojihubip = kojihubs.first['cloud']['public_ipv4']
  end
else
  kojihubip = "127.0.0.1"
end
hostsfile_entry "#{kojihubip}" do
hostname 'koji.example.com'
aliases ['kojihub.example.com', 'kojiweb.example.com',
'kojipkgs.example.com']
unique true
comment 'Append by Recipe koji::kojid'
action :append
end
hostsfile_entry "127.0.0.1" do
hostname "#{node['hostname']}.example.com"
unique true
comment 'Append by Recipe koji::kojid'
action :append
end
Figure 40 Cookbook: kojid.rb recipe - part 1
so that we can connect to the PostgreSQL database of the Koji hub. Then we create a directory hierarchy for kojid
in /var/koji. In the next step we search the Chef Server for a node that has the hub recipe applied and assign
the public IP address of this node to the variable kojihubip. Using this variable we add the Koji hub IP address
to the /etc/hosts file.
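The address selection described above can be sketched as a small pure function. This is a simplified illustration, not the recipe itself: the function name `choose_hub_ip`, the injected `reachable` callable (a ping wrapper in a real recipe) and the sample addresses are all assumptions made for the example.

```ruby
# Picks the address a builder should use for the Koji hub.
# `hubs` is a list of node-like hashes as returned by a Chef search;
# `reachable` is any callable that answers true/false for an address.
# Falls back to 127.0.0.1 when no hub is known, and returns nil when a
# hub exists but none of its addresses is reachable.
def choose_hub_ip(hubs, reachable)
  return "127.0.0.1" if hubs.empty?

  hub = hubs.first
  public_ip = hub.dig("cloud", "public_ipv4")
  candidates = [public_ip, hub["ipaddress"]].compact
  candidates.find { |ip| reachable.call(ip) }
end

# Usage with made-up addresses: only the private network answers here.
hubs = [{ "ipaddress" => "10.0.0.5",
          "cloud" => { "public_ipv4" => "203.0.113.10" } }]
only_private = ->(ip) { ip.start_with?("10.") }
puts choose_hub_ip(hubs, only_private)
```

Passing the reachability check in as a parameter keeps the selection logic testable without sending real ping packets.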
In the next part we create the /mnt/koji directory and mount an NFS share from the Koji hub server under it.
Then we create in it the hierarchy of directories required by the kojid service. After this step we install the
EPEL yum repository and then koji-builder and related packages. Then we provide the /etc/kojid/kojid.conf
configuration file using a template.
directory "/mnt/koji" do
owner "root"
group "root"
mode 00755
action :create
end
mount "/mnt/koji" do
device "kojihub.example.com:/mnt/koji"
fstype "nfs"
options "rw"
action [:mount, :enable]
end
%w{koji/mock koji/tmp }.each do |dir|
directory "/mnt/" + dir do
owner "root"
group "root"
mode 00755
action :create
end
end
include_recipe "yum-epel"
%w{mock setarch rpm-build createrepo koji-builder}.each do |pkg|
package pkg do
action :install
end
end
template "/etc/kojid/kojid.conf" do
source "kojid.conf.erb"
mode 0440
owner "root"
group "root"
end
Figure 41 Cookbook: kojid.rb recipe - part 2
In the next part we check whether the Koji hub is reachable from this kojid node (in other words, we check that
there are no network issues between the Koji hub and the Koji builder and that no firewall is blocking the connections).
If this test passes and the Koji hub is available, we start the kojid service. At this point the Koji cluster should be
ready to use.
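The reachability test can also be written as a standalone helper using only Ruby’s standard library. The sketch below is an assumption-laden illustration rather than the recipe’s code: the method name `port_open?` and its `timeout` keyword are invented for the example, and it relies on the `connect_timeout` option of `Socket.tcp`.

```ruby
require "socket"

# Returns true if a TCP connection to host:port can be opened within
# `timeout` seconds. Returns false when the connection is refused, times
# out at the socket level, or the host name cannot be resolved.
def port_open?(host, port, timeout: 5)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError
  false
end
```

A builder could call `port_open?("koji.example.com", 80)` before starting kojid, using the host name that the recipe writes to /etc/hosts.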
test.rb
The test recipe is described in the next section of this chapter, which covers tests of the deployment.
Other files
A cookbook repository usually also contains files not directly related to Chef. These include, among others,
README.md, CHANGELOG.md and LICENSE. The first contains a description of the cookbook and its usage documentation,
the second is a log of changes made to the cookbook and the last is the license applied to the cookbook.
If the cookbook is kept in a Git repository, then it will also contain a .gitignore file, which lists the files and
directories (or glob patterns matching files and directories) that the Git version control system should
ignore. The Chefignore file works analogously: it determines which files and directories to ignore when uploading
cookbooks to the Chef Server (using Berkshelf or Knife).
The Koji cookbook provides only basic information in these files: author name and email, and license (“All rights
reserved”). The CHANGELOG will be updated once the cookbook is made public for the Chef community.
README.md contains a short description, installation requirements and a how-to on using the cookbook.
kojihubAvailable = false
ruby_block "check if koji-hub accepts connections on port 80" do
  block do
    server = kojihubip
    port = 80
    begin
      Timeout.timeout(5) do
        Socket.tcp(server, port){}
      end
      Chef::Log.info('connections open')
      kojihubAvailable = true
    rescue
      Chef::Log.fatal('connections refused')
    end
  end
end
service "kojid" do
  supports :status => true, :restart => true, :reload => true
  action [ :enable, :start ]
  # The guard runs at converge time, after the ruby_block above has set the
  # flag. A plain `if` around the resource is evaluated at compile time,
  # when the flag is still false.
  only_if { !kojihubs.empty? && kojihubAvailable } # and userExistsInDB
end
Figure 42 Cookbook: kojid.rb recipe - part 3
.gitignore
*~
*#
.#*
\#*#
.*.sw[a-z]
*.un~
pkg/
# Berkshelf
.vagrant
/cookbooks
Berksfile.lock
# Bundler
Gemfile.lock
bin/*
.bundle/*
.kitchen/
.kitchen.local.yml
*sublime*
Chefignore
.DS_Store
Icon?
nohup.out
ehthumbs.db
Thumbs.db
# EDITORS #
\#*
.#*
*~
*.sw[a-z]
*.bak
REVISION
TAGS*
tmtags
*_flymake.*
*_flymake
*.tmproj
.project
.settings
mkmf.log
*sublime*
## COMPILED ##
a.out
*.o
*.pyc
*.so
*.com
*.class
*.dll
*.exe
*/rdoc/
# SCM #
.git
*/.git
.gitignore
.gitmodules
.gitconfig
.gitattributes
.svn
*/.bzr/*
*/.hg/*
*/.svn/*
# Berkshelf #
cookbooks/*
tmp
# Cookbooks #
CONTRIBUTING
CHANGELOG*
# Vagrant #
.vagrant
Vagrantfile
Figure 43 Cookbook: .gitignore and chefignore
The Koji cookbook’s .gitignore and Chefignore files are presented in Figure 43. Generally they are complementary.
Files that should not be visible in the Git repository usually should not be included in the uploaded cookbook either.
However, there may be exceptions to this rule.
Tests
Tests were conducted in two ways: manually and automatically. Manual tests involved logging in to the
machine via SSH, checking service availability and status, and checking the logs of the services installed and
configured by Chef. Some information can also be obtained via the Koji client: for instance, it is possible to test
authentication, list Koji builder hosts or build a package. However, I decided to test these capabilities using a
shell script.
Automatic tests consist of two scripts: a Ruby recipe and a shell script. The first is the test.rb recipe of the Koji
cookbook. As Figure 44 shows, the recipe first copies a cookbook file
(files/default/centos6.sh) to /tmp/centos6.sh, then checks whether any Koji builders are available. If there are
kojid hosts, then the test script is executed; otherwise the string "No kojid hosts found!" is sent to the Chef log.
cookbook_file "/tmp/centos6.sh" do
source "centos6.sh"
mode '0744'
path "/tmp/centos6.sh"
action :create_if_missing
end
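# Backticks run the command and return its output; wrapping them in a
# quoted string would leave plain text whose to_i is always 0.
kojidHostsNumber = `koji --user=admin --password=admin list-hosts --quiet | wc -l`.to_i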
if kojidHostsNumber > 1
execute "run /tmp/centos6.sh" do
command "sh /tmp/centos6.sh"
end
else
Chef::Log.fatal("No kojid hosts found!")
end
Figure 44 Cookbook: test.rb recipe
Figure 45 shows the contents of the centos6.sh script. The script consists of four parts. First, we add a tag to Koji
defining a new distribution (CentOS 6 in this case) and an additional sub-tag that attaches the “x86_64” architecture to
it.
In the second part we add two yum repositories to this tag: the main centos6 repository and the EPEL repository. We
then define a build target for the tag.
In the third part of the script we create virtual yum groups for building RPM packages. There are two groups:
one for building SRPMs and another for building RPMs. Each group is assigned the list of packages required for
building. For building SRPMs we need tools such as bash, cvs, gnupg, make, redhat-rpm-config, rpm-build,
shadow-utils, wget and rpmdevtools, whereas for building RPMs we need bash, bzip2, coreutils, cpio, diffutils,
findutils, gawk, gcc, grep, sed, gcc-c++, gzip, info, patch, redhat-rpm-config, rpm-build, shadow-utils, tar,
unzip, util-linux-ng, which and make. After that, the repository related to this target is regenerated.
The fourth part of the script builds an actual RPM package (the nginx web server). First, it downloads a src.rpm (SRPM)
package. Then we use it to build a scratch (test) version. Finally, we add the package to the Koji database and run the
final build. As a result of this script we obtain an RPM package in the /mnt/koji/packages directory.
# tags
koji --user=admin --password=admin add-tag dist-centos6
koji --user=admin --password=admin add-tag --parent dist-centos6 --arches "x86_64" dist-centos6-build
# external repos
koji --user=admin --password=admin add-external-repo -t dist-centos6-build dist-centos6-repo http://centos.bio.lmu.de/6/os/\$arch/
koji --user=admin --password=admin add-external-repo -t dist-centos6-build dist-epel6-repo http://ftp-stud.hs-esslingen.de/pub/epel/6/\$arch/
koji --user=admin --password=admin add-target dist-centos6 dist-centos6-build
# virtual build yum groups
koji --user=admin --password=admin add-group dist-centos6-build build
koji --user=admin --password=admin add-group dist-centos6-build srpm-build
koji --user=admin --password=admin add-group-pkg dist-centos6-build build bash bzip2 coreutils cpio diffutils findutils gawk gcc grep sed gcc-c++ gzip info patch redhat-rpm-config rpm-build shadow-utils tar unzip util-linux-ng which make
koji --user=admin --password=admin add-group-pkg dist-centos6-build srpm-build bash cvs gnupg make redhat-rpm-config rpm-build shadow-utils wget rpmdevtools
koji --user=admin --password=admin regen-repo dist-centos6-build
# build rpm
wget http://nginx.org/packages/rhel/6/SRPMS/nginx-1.6.2-1.el6.ngx.src.rpm -O /tmp/nginx-1.6.2-1.el6.ngx.src.rpm
koji --user=admin --password=admin build --scratch dist-centos6 /tmp/nginx-1.6.2-1.el6.ngx.src.rpm
koji --user=admin --password=admin add-pkg --owner koji dist-centos6 nginx
koji --user=admin --password=admin build dist-centos6 /tmp/nginx-1.6.2-1.el6.ngx.src.rpm
Figure 45 Tests: Building nginx for CentOS 6
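The first two parts of the script follow a fixed pattern (tag, build sub-tag, external repositories, target), so they could be generated for any distribution name. The Ruby sketch below only assembles the command strings and executes nothing. The function name `tag_setup_commands` and its parameters are assumptions for illustration; the koji subcommands and options are the ones used in Figure 45.

```ruby
# Credentials prefix used throughout the test script.
KOJI = "koji --user=admin --password=admin".freeze

# Builds the koji CLI invocations that define a distribution tag, its
# build sub-tag, the external repositories and the build target, in the
# same order as the test script. The commands are returned, not run.
def tag_setup_commands(dist, arches:, repos:)
  build_tag = "#{dist}-build"
  cmds = [
    "#{KOJI} add-tag #{dist}",
    "#{KOJI} add-tag --parent #{dist} --arches \"#{arches}\" #{build_tag}"
  ]
  repos.each do |name, url|
    cmds << "#{KOJI} add-external-repo -t #{build_tag} #{name} #{url}"
  end
  cmds << "#{KOJI} add-target #{dist} #{build_tag}"
  cmds
end

# Usage: reproduce the CentOS 6 setup with one external repository.
cmds = tag_setup_commands(
  "dist-centos6",
  arches: "x86_64",
  repos: { "dist-centos6-repo" => 'http://centos.bio.lmu.de/6/os/$arch/' }
)
puts cmds
```

Returning the commands instead of running them makes the sequence easy to inspect or to hand to an execute resource.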
Chapter IV: Conclusion
The aim of this research project, as stated in the Introduction, was to show whether and, if so, how a hybrid
cloud can be utilized. The results demonstrate that with DevOps techniques and software it is
possible to productively employ a mixed cloud environment even within a single project.
I personally believe that the project is a success for two reasons. Firstly, the project works as intended: it
provides an automated way to establish a Koji cluster in the clouds. Secondly, I learnt in depth the
practical side of writing code and adjusting sophisticated configurations, which is my personal achievement.
The project was a success on OpenStack and Amazon Web Services. However, it seems that few
changes are required to adjust it to other Vagrant providers. The cookbook and Vagrant files are more
universal and cloud-independent than I assumed when I started writing this thesis.
Potential applications
The project has an array of potential applications that can be successfully employed in the IT industry. A potential
user of the project is anyone who needs an easy and fast method of deploying a Koji cluster in the clouds to build
RPM packages. These are mostly organizations that provide their software for RPM-based Linux distributions.
Koji is used predominantly by Fedora and CentOS developers. Those organizations use Koji in a traditional
datacenter provided by Red Hat. For instance, Fedora uses 91 Koji builders for building packages for three
different processor architectures.113
Therefore, the most probable application would be by those projects or by a new organization
aiming to create a new RPM-based Linux distribution.
Suggestions on further studies and investigations
The project still leaves a lot of room for improvement. First of all, there are a few drawbacks that
could have been avoided and a few things that could be done better, especially testing of the code. As a suggestion
for further development I would recommend extending the project to include Continuous Integration, from source
code commit to automated integration tests in the clouds.
Testing
The project lacks proper use of the testing frameworks available for Chef and Ruby. In fact, while writing this code
I lacked a proper testing mindset. As a non-developer I wrote the code in an old-fashioned and ineffective
way, that is, by writing tests at the very end of the project.
Of the modern, Agile development methodologies, the practice most crucial for creating good code and guarding
against unwanted side effects is test-driven development (TDD). For infrastructure developers, the
practice is difficult to introduce and implement. However, it promises the biggest return on investment. TDD
is a widely adopted way of software development that facilitates the creation of highly reliable and
maintainable code.
113 List of Fedora’s koji builders: http://koji.fedoraproject.org/koji/hosts?start=0&state=enabled&order=name (02/12/2014)
The philosophy of TDD is encapsulated in the phrase Red, Green, Refactor. This is an iterative approach that
follows six steps114:
1. Write a test based on requirements.
2. Run the test and watch it fail.
3. Write the simplest code you can to make the test pass.
4. Run the test and watch it pass.
5. Improve the code as required to make it perform well, be readable, and reusable, but without
changing its behavior.
6. Repeat the cycle.
Had I followed this procedure, it would clearly have helped me prevent the scope from growing and would have
revealed design problems early.
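The cycle listed above can be illustrated with a very small Ruby example using Minitest, a test library bundled with standard Ruby distributions. The helper `hosts_line` is hypothetical and exists only for this sketch; it is not part of the Koji cookbook. The tests are written first (red), and the implementation below them is the simplest code that satisfies them (green).

```ruby
require "minitest/autorun"

# Steps 1-2 (red): the tests are written first and fail for as long as
# `hosts_line` is not defined.
class HostsLineTest < Minitest::Test
  def test_formats_ip_hostname_and_aliases
    line = hosts_line("10.0.0.5", "koji.example.com", ["kojihub.example.com"])
    assert_equal "10.0.0.5 koji.example.com kojihub.example.com", line
  end

  def test_works_without_aliases
    assert_equal "127.0.0.1 builder.example.com",
                 hosts_line("127.0.0.1", "builder.example.com")
  end
end

# Steps 3-4 (green): the simplest implementation that makes both tests pass.
# `hosts_line` is a hypothetical helper used only in this example.
def hosts_line(ip, hostname, aliases = [])
  ([ip, hostname] + aliases).join(" ")
end
```

Step 5 (refactor) would then change the implementation, for instance to validate the address, while the same two tests keep guarding its behaviour.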
Another testing-related mistake I made was not using the tools available for testing Ruby
code and Chef cookbooks. Tools that I should have used in the project include Cucumber and Leibniz for
acceptance testing, Test Kitchen with Serverspec and Bats for integration testing, ChefSpec and RSpec for
unit testing, and lastly Foodcritic for linting and static analysis.
Continuous Integration, Deployment and Delivery
An interesting continuation of the project would be implementing a Continuous Integration process that extends
the automation from a new code commit through the testing phase to deployment. This approach would
integrate the automation of both software development and its deployment on infrastructure.
One possible solution would be to use Jenkins CI to implement a build pipeline. Jenkins (using an
external plugin) is able to control Vagrant's machines. A brief idea of how this could be done is presented by
Michael Huttermann in his book "DevOps for Developers" (p. 144). However, in our case we would use Chef
instead of Puppet.
114 Stephen Nelson-Smith, op. cit., p. 126
Appendix A: Koji build system
Koji is an RPM build system open sourced by Red Hat and currently used by Red Hat, Fedora, CERN, CentOS,
Amazon, TomTom and many other organizations.115
The term "build system" may mean different things to different people. From the developer's (or RPM
package maintainer’s) perspective, Koji is a service that accepts build requests and farms them out to
different machines in a cluster of Koji builders. Koji tracks the resulting packages in its database
and supports a tagging system for organizing them. Koji has a web interface, a command line interface, and a
rich XML-RPC interface. Originally Koji was limited to building RPMs, but now it also supports building Java
packages via Maven.116 In this project, however, Koji is configured to build RPM packages.
Architecture
The Koji system is divided into four components: koji-hub, koji-web, kojira and the Koji builder (kojid). All of them
are written in the Python programming language. Koji-hub and koji-web (Koji’s optional web interface) run on top of
the Apache web server. Koji-hub stores data about packages, builds, users, tags and other metadata in a
PostgreSQL database. Kojira is a service that maintains order in Koji's internal repository by collecting
“garbage” (old, unused RPM packages and their builds). The Koji builder is a service that handles the building of
packages. There can be more than one builder, and they can be installed on machines with different
processor architectures (e.g. x86_64, PPC, ARM) to compile packages for those architectures. As a good
practice, Koji files can be kept on an NFS share. Koji components and dependent services, like PostgreSQL or NFS,
can run on one server or on separate servers.
To build an RPM package, the Koji builder creates a chroot environment called a buildroot; the package is built
inside this isolated environment. The Koji builder first collects other packages in the buildroot: those
needed to compile the code and build the RPM, and those that are dependencies of the program being built. To achieve
this, the Koji builder uses a tool called Mock. It enables users to reproduce the build environment and debug the
process of RPM building.
Once completed, a build is imported into Koji's database and tagged. Koji's tagging system is very flexible and
can support build configurations for many different projects in the same instance.
115 http://fedoraproject.org/wiki/Koji/RunsHere (10/11/2014)
116 http://opensource.com/life/11/7/free-sake-story-koji (10/11/2014)
Figure 46 Koji services diagram
Koji-hub
Koji-hub is the central element of the Koji system and works as a mediator between all other Koji components, the
database and the filesystem. It is an XML-RPC server running under mod_wsgi in Apache. Koji-hub is passive in
that it only receives XML-RPC calls and relies upon the build daemons and other components to initiate
communication. Koji-hub is the only component that has direct access to the database (PostgreSQL) and is
one of the two components that have write access to the file system. Koji-hub also serves as an
authentication system for other Koji services and users.
Koji-web
Koji-web is a set of scripts that run in mod_wsgi and use the Cheetah templating engine to provide a web
interface to Koji. It acts as a client to koji-hub, providing a visual interface to perform a limited amount of
administration. Koji-web exposes a lot of information and also provides a means for certain operations, such
as cancelling builds. It is an optional element that lets users browse packages, users, Koji
builders, running tasks (e.g. package building tasks) and build logs. It is not used to manage the Koji
system (the Koji client is used for that). Its main goal is to visualize packages and their building
tasks.117
Kojira
Kojira is a daemon that keeps the build root repodata updated. It is responsible for removing redundant build
roots and cleaning up after a build request is completed. This service maintains order in Koji's internal
repository: it updates the repository and deletes unused elements. It is visible neither to Koji’s
users nor to its administrator, although it plays a major role in the system.
Koji builder (kojid)
The Koji builder is a service that builds packages. Koji-hub can manage one or more Koji builders. kojid is the build
daemon that runs on each of the build machines. Its primary responsibility is polling for incoming build
requests and handling them accordingly. Essentially, kojid asks koji-hub for work. Koji also has support for
tasks other than building; creating install images is one example. kojid is responsible for handling these tasks
as well. kojid uses mock for building and creates a fresh buildroot for every build. kojid is written in Python
and communicates with koji-hub via XML-RPC.
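The pull-based relationship between a passive hub and polling builders can be shown with a toy in-memory model. Everything in this sketch is an assumption made for illustration: the classes `ToyHub` and `ToyBuilder` and their methods do not correspond to Koji's real XML-RPC API, and no network communication takes place.

```ruby
# Toy model of the pull-based design: the hub only stores tasks and
# results, and never contacts a builder on its own.
class ToyHub
  attr_reader :results

  def initialize
    @open_tasks = []
    @results = {}
  end

  # A user submits a build request.
  def submit(package)
    @open_tasks << package
  end

  # Called by a builder; returns nil when there is nothing to do.
  def take_task
    @open_tasks.shift
  end

  # Called by a builder once the "build" is finished.
  def report(package, builder_name)
    @results[package] = builder_name
  end
end

# The builder initiates all communication by asking the hub for work.
class ToyBuilder
  def initialize(name, hub)
    @name = name
    @hub = hub
  end

  # One polling cycle: ask for work, "build" it, report back.
  # Returns true if a task was handled, false if the queue was empty.
  def poll_once
    package = @hub.take_task
    return false if package.nil?

    @hub.report(package, @name)
    true
  end
end

hub = ToyHub.new
hub.submit("nginx")
builder = ToyBuilder.new("kojibuilder", hub)
builder.poll_once
puts hub.results.inspect
```

Because builders pull their work, adding another builder in a different cloud requires no change on the hub side beyond registering the host.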
Mock
Mock is a tool for building packages. It can build packages for different architectures and different Fedora or
RHEL versions than the build host has. Mock creates chroots and builds packages in them. Its only task is to
reliably populate a chroot and attempt to build a package in that chroot.118
The Koji builder runs mock to build RPM packages. For each build kojid creates a directory in /var/lib/mock/
beginning with “dist-”, in which mock downloads dependencies and build tools, creates its chroot directory
(buildroot) to build the package and provides logs from the build.
Koji client
Koji-client is a CLI tool written in Python that provides many hooks into Koji. It allows the user to query much
of the data as well as perform actions such as adding users and initiating build requests.
117 Example of koji-web interface: http://koji.fedoraproject.org/koji/ (22/09/2014)
118 https://fedoraproject.org/wiki/Projects/Mock (02/10/2014)
Additional tools
Authentication options
There are three methods of authentication in Koji: login/password, SSL certificates and Kerberos. During
installation the Koji administrator has to choose one of them.
YUM repository generation
A YUM repository can be created either manually, by copying the packages from /mnt/koji and running
createrepo, or it can be generated from Koji’s tag/target using mash.
Integration with Source Control Manager
Koji can build sources either from an SRPM file or by pulling them from an SCM (such as a Git repository).
Supported SCMs are Git, SVN, Mercurial and CVS.
ISO generation
Revisor and its fork Pungi are tools that build an ISO image (live or installation) of a Linux system from a YUM
repository. These tools do not integrate with Koji, but they are often used once the YUM repository is ready to
use. It is a simple way to produce one’s own Linux distribution.
GPG signing of packages
Sigul is a tool for signing RPM packages with a GPG key, used among others in the Fedora project. It integrates
easily with Koji. However, the user can also sign packages in the traditional way, i.e. using the rpm --sign command.
Appendix B: Project’s Vagrant files
This Appendix provides the Vagrant configuration files for each infrastructure: the VirtualBox version, the
OpenStack version, the Amazon Web Services version and the “production” version, which covers a mixed
infrastructure of OpenStack and Amazon Web Services.
Vagrantfile.vbox
# -*- mode: ruby -*-
# vi: set ft=ruby :
# Vagrant required plugins installation.
required_plugins = %w( vagrant-omnibus vagrant-berkshelf vagrant-aws )
required_plugins.each do |plugin|
system "vagrant plugin install #{plugin}" unless Vagrant.has_plugin? plugin
end
# Vagrant conflicting plugins
# vagrant-omnibus
# Vagrantfile API/syntax version.
VAGRANTFILE_API_VERSION = "2"
Vagrant.require_version ">= 1.5.0"
Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|
config.berkshelf.enabled = true
config.berkshelf.berksfile_path = "Berksfile"
# CentOS has "Defaults requiretty" in /etc/sudoers
# config.ssh.pty = true
# Chef zero
config.vm.define "zerodev" do |zerodev|
zerodev.vm.hostname = "zerodev"
zerodev.vm.synced_folder '.', '/vagrant'
# Set the version of chef to install using the vagrant-omnibus plugin
zerodev.omnibus.chef_version = :latest
# Every Vagrant virtual environment requires a box to build off of.
# If this value is a shorthand to a box in Vagrant Cloud then
zerodev.vm.box = "chef/centos-6.5"
# Assign this VM to a host-only network IP, allowing you to access it
# via the IP. Host-only networks can talk to the host machine as well as
# any other machines on the same network, but cannot be accessed (through this
# network interface) by any external networks.
zerodev.vm.network :private_network, ip: "33.33.33.3"
zerodev.vm.network :forwarded_port, host: 4000, guest: 4000
# Provision chef-zero using chef_solo
zerodev.vm.provision "chef_solo" do |chef_solo|
chef_solo.log_level = :debug
chef_solo.data_bags_path = "./data_bags/"
chef_solo.json =
{
'build-essential' => {
compile_time: true
},
'chef-zero' => {
install: true,
start: true
},
'chef' => {
server_url: "http://127.0.0.1:4000"
}
}
chef_solo.run_list = [
"recipe[iptables::disabled]",
"recipe[build-essential::default]",
"recipe[chef-zero::default]",
"recipe[chef::client]"
]
end
# Run chef-zero
zerodev.vm.provision :shell,
:inline => "/opt/chef/embedded/bin/chef-zero -H 0.0.0.0 -p 4000 -d"
# Run chef-client
zerodev.vm.provision :shell, :path => "scripts/bash_scripts/chef-configuration.sh", :args => "http://127.0.0.1:4000"
end
# Koji hub
config.vm.define "kojidev" do |kojidev|
kojidev.vm.hostname = "kojidev"
kojidev.vm.synced_folder '.', '/vagrant'
# Set the version of chef to install using the vagrant-omnibus plugin
kojidev.omnibus.chef_version = :latest
# Every Vagrant virtual environment requires a box to build off of.
# If this value is a shorthand to a box in Vagrant Cloud then
kojidev.vm.box = "chef/centos-6.5"
# Assign this VM to a host-only network IP, allowing you to access it
# via the IP. Host-only networks can talk to the host machine as well as
# any other machines on the same network, but cannot be accessed (through this
# network interface) by any external networks.
kojidev.vm.network :private_network, ip: "33.33.33.30"
# Enable provisioning with chef client/
kojidev.vm.provision "chef_client" do |chef|
chef.log_level = :debug
chef.chef_server_url = "http://33.33.33.3:4000"
chef.validation_key_path = "/home/tklosinski/m/koji/.chef-backup/dummy_key.pem"
chef.json =
{
postgresql: {
password: {
postgres: '123123'
}
},
apache: {
listen_ports: ['80', '443'],
listen_address: '0.0.0.0'
},
selinux: {
state: 'disabled'
}
}
chef.run_list = [
"recipe[iptables::disabled]",
"recipe[koji::kojid]",
"recipe[koji::default]"
]
end
end
end
Vagrantfile.openstack
# -*- mode: ruby -*-
# vi: set ft=ruby :
require 'rubygems'
require 'fog'
require 'chef'
# Vagrant required plugins installation.
required_plugins = %w( vagrant-omnibus vagrant-berkshelf vagrant-aws vagrant-openstack-provider chef ) # vagrant-chef-kojibuilder )
required_plugins.each do |plugin|
system "vagrant plugin install #{plugin}" unless Vagrant.has_plugin? plugin
end
# Vagrantfile API/syntax version.
VAGRANTFILE_API_VERSION = "2"
Vagrant.require_version ">= 1.5.0"
Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|
# CentOS has "Defaults requiretty" in /etc/sudoers
config.ssh.pty = true
config.berkshelf.berksfile_path = "./Berksfile"
# Koji builder on OpenStack
config.vm.define "kojibuilder" do |kojibuilder|
kojibuilder.vm.hostname = "kojibuilder"
#kojibuilder.vm.synced_folder '.', '/vagrant', :disabled => true
kojibuilder.vm.synced_folder ".", "/vagrant", type: "rsync", :rsync_excludes => ['bar/', 'foo/']
# Set the version of chef to install using the vagrant-omnibus plugin
kojibuilder.omnibus.chef_version = :latest
# This is not used in fact, Vagrant just requires some box.
kojibuilder.vm.box = "dummy.box"
kojibuilder.vm.box_url = "https://github.com/cloudbau/vagrant-openstack-plugin/raw/master/dummy.box"
# OpenStack provider
kojibuilder.vm.provider :openstack do |os, override|
os.public_key_path = "#{ENV['OS_PUBLIC_KEY_PATH']}"
override.ssh.private_key_path = "#{ENV['OS_PRIVATE_KEY_PATH']}"
override.ssh.username = "#{ENV['OS_SSH_USERNAME']}"
os.openstack_auth_url = "#{ENV['OS_AUTH_URL']}/tokens"
os.username = "#{ENV['OS_USERNAME']}"
os.password = "#{ENV['OS_PASSWORD']}"
os.tenant_name = "#{ENV['OS_TENANT_NAME']}"
os.flavor = "#{ENV['OS_FLAVOR']}" # 'm1.small'
os.image = "#{ENV['OS_IMAGE']}" # 'Fedora 20 x86_64'
os.floating_ip_pool = "#{ENV['OS_FLOATING_IP_POOL']}" # 'public'
end
# Enable provisioning with chef client/
kojibuilder.vm.provision "chef_client" do |chef|
chef.node_name = "kojibuilder"
chef.log_level = :debug
chef.chef_server_url = "#{ENV['CHEF_SERVER']}"
chef.validation_key_path = "#{ENV['CHEF_VALIDATION_KEY_PATH']}"
chef.validation_client_name = "#{ENV['CHEF_VALIDATION_CLIENT_NAME']}"
chef.json =
{
selinux: {
state: 'disabled'
}
}
chef.run_list = [
"recipe[koji::kojid]"
]
chef.delete_node = true
chef.delete_client = true
end
end
end
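The Vagrantfile above reads all OpenStack and Chef settings from environment variables with `"#{ENV['...']}"`, which silently yields an empty string when a variable is unset. A small helper that fails early with a clear message could be placed at the top of such a file. This is a sketch under stated assumptions: the helper name `require_env` is invented for the example, and the sample values in the usage are made up.

```ruby
# Returns a hash with the values of all required variables taken from
# `env`, or raises one error naming every variable that is missing or
# empty. `env` defaults to the process environment but can be any
# hash-like object, which keeps the helper easy to test.
def require_env(names, env = ENV)
  missing = names.select { |name| env[name].to_s.empty? }
  unless missing.empty?
    raise ArgumentError, "missing environment variables: #{missing.join(', ')}"
  end

  names.map { |name| [name, env[name]] }.to_h
end

# Usage with a plain hash standing in for the environment (made-up values).
settings = require_env(
  %w[OS_AUTH_URL OS_USERNAME],
  { "OS_AUTH_URL" => "http://keystone.example.com:5000/v2.0",
    "OS_USERNAME" => "demo" }
)
puts settings["OS_USERNAME"]
```

Checking all variables in one pass reports every missing setting at once instead of failing on the first one during provisioning.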
Vagrantfile.aws
# -*- mode: ruby -*-
# vi: set ft=ruby :
require 'rubygems'
require 'chef'
Chef::Config.from_file(File.join(File.dirname(__FILE__), '.chef',
'knife.rb'))
# Vagrant required plugins installation.
required_plugins = %w( chef vagrant-omnibus vagrant-berkshelf vagrant-aws
vagrant-openstack-provider )
required_plugins.each do |plugin|
system "vagrant plugin install #{plugin}" unless Vagrant.has_plugin? plugin
end
# Vagrantfile API/syntax version.
VAGRANTFILE_API_VERSION = "2"
Vagrant.require_version ">= 1.5.0"
Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|
config.berkshelf.enabled = true
config.nfs.functional = false
config.ssh.pty = true
# Koji hub
config.vm.define "koji" do |koji|
koji.vm.hostname = "koji"
koji.vm.synced_folder '.', '/vagrant'
# Set the version of chef to install using the vagrant-omnibus plugin
koji.omnibus.chef_version = :latest
# This is not used in fact, Vagrant just requires some box.
koji.vm.box = "dummy.box"
koji.vm.box_url = "https://github.com/mitchellh/vagrant-aws/raw/master/dummy.box"
# AWS provider
koji.vm.provider :aws do |aws, override|
aws.access_key_id = "#{ENV['AWS_ACCESS_KEY']}"
aws.secret_access_key = "#{ENV['AWS_SECRET_KEY']}"
aws.keypair_name = "#{ENV['AWS_KEYPAIR']}"
override.ssh.private_key_path = "#{ENV['AWS_PRIVATE_KEY_PATH']}"
override.ssh.username = "#{ENV['AWS_SSH_USERNAME']}"
aws.ami = "#{ENV['AWS_AMI_IMAGE']}"
aws.region = "#{ENV['AWS_REGION']}"
aws.instance_type = "#{ENV['AWS_INSTANCE_TYPE']}"
aws.security_groups = [ "koji" ]
aws.user_data = "#!/bin/bash
echo 'Defaults:#{ENV['AWS_SSH_USERNAME']} !requiretty' > /etc/sudoers.d/999-vagrant-cloud-init-requiretty
chmod 440 /etc/sudoers.d/999-vagrant-cloud-init-requiretty
mkdir -p /etc/chef/ohai/hints
touch /etc/chef/ohai/hints/ec2.json"
aws.tags = {
'Name' => 'koji'
}
end
# Enable provisioning with Chef client.
koji.vm.provision "chef_client" do |chef|
# chef.arguments = "--splay 75"
chef.node_name = "koji"
chef.chef_server_url = Chef::Config[:chef_server_url]
chef.log_level = Chef::Config[:log_level]
chef.validation_key_path = Chef::Config[:validation_key]
chef.validation_client_name = Chef::Config[:validation_client_name]
chef.json =
{
"build-essential" => {
"compiletime" => true
},
postgresql: {
password: {
postgres: '123123',
port: 5432
}
},
apache: {
listen_ports: ['80', '443'],
listen_address: '0.0.0.0'
},
selinux: {
state: 'disabled'
}
}
chef.run_list = [
"recipe[iptables::disabled]",
"recipe[koji::kojid]",
"recipe[koji::default]",
"recipe[koji::test]"
]
chef.delete_node = true
chef.delete_client = true
end
end
end
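The user_data scripts passed to the providers differ only in the SSH user name and the Ohai hint file. As a minimal sketch, not part of the original project, the shared part could be generated by a small helper. The name cloud_init_user_data and its parameters are hypothetical, and the squiggly heredoc assumes Ruby 2.3 or later.

```ruby
# Hypothetical helper (not in the original project): builds the cloud-init
# script that disables requiretty for the SSH user and creates an Ohai hint.
# ssh_username: login used by Vagrant; hint: Ohai hint name, e.g. "ec2".
def cloud_init_user_data(ssh_username, hint)
  <<~SCRIPT
    #!/bin/bash
    echo 'Defaults:#{ssh_username} !requiretty' > /etc/sudoers.d/999-vagrant-cloud-init-requiretty
    chmod 440 /etc/sudoers.d/999-vagrant-cloud-init-requiretty
    mkdir -p /etc/chef/ohai/hints
    touch /etc/chef/ohai/hints/#{hint}.json
  SCRIPT
end

# Possible use inside a provider block (assumption, shown as a comment):
#   aws.user_data = cloud_init_user_data(ENV['AWS_SSH_USERNAME'], 'ec2')
```

Keeping the script in one place would avoid the line-wrapping mistakes that are easy to make when the same multi-line string is repeated in several Vagrantfiles.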
Vagrantfile.production
# -*- mode: ruby -*-
# vi: set ft=ruby :
require 'rubygems'
require 'chef'
Chef::Config.from_file(File.join(File.dirname(__FILE__), '.chef',
'knife.rb'))
# Vagrant required plugins installation.
required_plugins = %w( chef vagrant-omnibus vagrant-berkshelf vagrant-aws
vagrant-openstack-provider )
required_plugins.each do |plugin|
system "vagrant plugin install #{plugin}" unless Vagrant.has_plugin? plugin
end
# Vagrantfile API/syntax version.
VAGRANTFILE_API_VERSION = "2"
Vagrant.require_version ">= 1.5.0"
Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|
config.nfs.functional = false
config.ssh.pty = true
config.berkshelf.enabled = true
# Koji hub on Amazon EC2
config.vm.define "kojihub" do |kojihub|
kojihub.vm.hostname = "koji"
kojihub.vm.synced_folder '.', '/vagrant'
# Set the version of chef to install using the vagrant-omnibus plugin
kojihub.omnibus.chef_version = :latest
# The box is not actually used; Vagrant just requires one to be set.
kojihub.vm.box = "dummy.box"
kojihub.vm.box_url = "https://github.com/mitchellh/vagrant-aws/raw/master/dummy.box"
# AWS provider
kojihub.vm.provider :aws do |aws, override|
aws.access_key_id = "#{ENV['AWS_ACCESS_KEY']}"
aws.secret_access_key = "#{ENV['AWS_SECRET_KEY']}"
aws.keypair_name = "#{ENV['AWS_KEYPAIR']}"
override.ssh.private_key_path = "#{ENV['AWS_PRIVATE_KEY_PATH']}"
override.ssh.username = "#{ENV['AWS_SSH_USERNAME']}"
aws.ami = "#{ENV['AWS_AMI_IMAGE']}"
aws.region = "#{ENV['AWS_REGION']}"
aws.instance_type = "#{ENV['AWS_INSTANCE_TYPE']}"
aws.security_groups = [ "koji" ]
aws.user_data = "#!/bin/bash
echo 'Defaults:#{ENV['AWS_SSH_USERNAME']} !requiretty' > /etc/sudoers.d/999-vagrant-cloud-init-requiretty
chmod 440 /etc/sudoers.d/999-vagrant-cloud-init-requiretty
mkdir -p /etc/chef/ohai/hints
touch /etc/chef/ohai/hints/ec2.json"
aws.tags = {
'Name' => 'koji'
}
end
# Enable provisioning with Chef client.
kojihub.vm.provision "chef_client" do |chef|
chef.node_name = "koji"
chef.chef_server_url = "#{ENV['CHEF_SERVER']}"
chef.validation_key_path = "#{ENV['CHEF_VALIDATION_KEY_PATH']}"
chef.validation_client_name = "#{ENV['CHEF_VALIDATION_CLIENT_NAME']}"
chef.json =
{
"build-essential" => {
"compiletime" => true
},
postgresql: {
password: {
postgres: '123123',
port: 5432
}
},
apache: {
listen_ports: ['80', '443'],
listen_address: '0.0.0.0'
},
selinux: {
state: 'disabled'
}
}
chef.run_list = [
"recipe[nfs::server]",
"recipe[iptables::disabled]",
"recipe[koji::default]",
"recipe[koji::test]"
]
chef.delete_node = true
chef.delete_client = true
end
end
# Koji builder on OpenStack
config.vm.define "kojibuilder" do |kojibuilder|
kojibuilder.vm.hostname = "kojibuilder"
#kojibuilder.vm.synced_folder '.', '/vagrant', :disabled => true
# Vagrant's rsync exclude option is spelled rsync__exclude (double underscore).
kojibuilder.vm.synced_folder ".", "/vagrant", type: "rsync", rsync__exclude: ['bar/', 'foo/']
# Set the version of chef to install using the vagrant-omnibus plugin
kojibuilder.omnibus.chef_version = :latest
# The box is not actually used; Vagrant just requires one to be set.
kojibuilder.vm.box = "dummy.box"
kojibuilder.vm.box_url = "https://github.com/cloudbau/vagrant-openstack-plugin/raw/master/dummy.box"
# OpenStack provider
kojibuilder.vm.provider :openstack do |os, override|
os.username = "#{ENV['OS_USERNAME']}"
os.password = "#{ENV['OS_PASSWORD']}"
os.public_key_path = "#{ENV['OS_PUBLIC_KEY_PATH']}"
override.ssh.private_key_path = "#{ENV['OS_PRIVATE_KEY_PATH']}"
override.ssh.username = "#{ENV['OS_SSH_USERNAME']}"
os.openstack_auth_url = "#{ENV['OS_AUTH_URL']}/tokens"
os.tenant_name = "#{ENV['OS_TENANT_NAME']}"
os.flavor = "#{ENV['OS_FLAVOR']}" # 'm1.small'
os.image = "#{ENV['OS_IMAGE']}" # 'Fedora 20 x86_64'
os.floating_ip_pool = "#{ENV['OS_FLOATING_IP_POOL']}" # 'public'
os.volumes = [
{
id: 'f9976f16-3d9d-499a-86c1-42247588b3da',
device: '/dev/vdb'
}
]
os.user_data = "#!/bin/bash
echo 'Defaults:#{ENV['OS_SSH_USERNAME']} !requiretty' > /etc/sudoers.d/999-vagrant-cloud-init-requiretty
chmod 440 /etc/sudoers.d/999-vagrant-cloud-init-requiretty
mkdir -p /etc/chef/ohai/hints
touch /etc/chef/ohai/hints/openstack.json
(echo o; echo n; echo p; echo 1; echo ; echo; echo w) | fdisk /dev/vdb
mkfs.ext4 /dev/vdb1
mkdir -p /var/koji
mkdir -p /var/koji/mock
mkdir -p /var/koji/tmp
mount /dev/vdb1 /var/koji"
end
# Enable provisioning with Chef client.
kojibuilder.vm.provision "chef_client" do |chef|
chef.node_name = "kojibuilder"
chef.chef_server_url = "#{ENV['CHEF_SERVER']}"
chef.validation_key_path = "#{ENV['CHEF_VALIDATION_KEY_PATH']}"
chef.validation_client_name = "#{ENV['CHEF_VALIDATION_CLIENT_NAME']}"
chef.json =
{
selinux: {
state: 'disabled'
}
}
chef.run_list = [
"recipe[nfs]",
"recipe[iptables::disabled]",
"recipe[koji::kojid]"
]
chef.delete_node = true
chef.delete_client = true
end
end
end
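All credentials and provider settings in the Vagrantfiles above are read from environment variables, so an unset variable silently becomes an empty string. As a hedged sketch, not part of the original project, a check like the following could report missing variables before any instance is created. The helper name missing_env_vars and the two constant names are hypothetical; the variable names themselves are the ones used in the listings.

```ruby
# Hypothetical pre-flight check (not in the original project).
# Variable names are taken from the AWS and OpenStack provider blocks.
REQUIRED_AWS_VARS = %w(
  AWS_ACCESS_KEY AWS_SECRET_KEY AWS_KEYPAIR AWS_PRIVATE_KEY_PATH
  AWS_SSH_USERNAME AWS_AMI_IMAGE AWS_REGION AWS_INSTANCE_TYPE
).freeze

REQUIRED_OS_VARS = %w(
  OS_USERNAME OS_PASSWORD OS_PUBLIC_KEY_PATH OS_PRIVATE_KEY_PATH
  OS_SSH_USERNAME OS_AUTH_URL OS_TENANT_NAME OS_FLAVOR OS_IMAGE
  OS_FLOATING_IP_POOL
).freeze

# Returns the names that are unset or blank in env (ENV by default).
def missing_env_vars(names, env = ENV)
  names.select { |name| env[name].to_s.strip.empty? }
end

# Possible use at the top of a Vagrantfile (assumption, shown as a comment):
#   missing = missing_env_vars(REQUIRED_AWS_VARS + REQUIRED_OS_VARS)
#   abort "Missing environment variables: #{missing.join(', ')}" unless missing.empty?
```

Passing the environment as a parameter keeps the helper easy to exercise with a plain Hash instead of the real ENV.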
Appendix C: Directory and file tree of the project
.
|-- attributes
|   `-- default.rb
|-- Berksfile
|-- CHANGELOG.md
|-- .chef
|   `-- knife.rb
|-- chefignore
|-- contributors.txt
|-- files
|   `-- default
|       `-- centos6.sh
|-- Gemfile
|-- .gitignore
|-- .kitchen.yml
|-- libraries
|-- LICENSE
|-- metadata.rb
|-- providers
|-- README.md
|-- recipes
|   |-- builder.rb
|   |-- database.rb
|   |-- default.rb
|   |-- hub.rb
|   |-- kojid.rb
|   |-- kojira.rb
|   `-- test.rb
|-- resources
|-- roles
|-- scripts
|   |-- bash_scripts
|   |   |-- chef-configuration.sh
|   |   |-- chef-install.sh
|   |   |-- dev
|   |   |   |-- by_chef.sh
|   |   |   `-- from_source.sh
|   |   `-- knife.sh
|   `-- ruby_scripts
|       |-- aws.rb
|       |-- chefapi.rb
|       |-- chef.rb
|       |-- chef-zero.rb
|       |-- destroy_instances.rb
|       |-- json.rb
|       |-- metadata.rb
|       |-- openstack.rb
|       `-- ridley.rb
|-- templates
|   `-- default
|       |-- client-koji.conf.erb
|       |-- httpd-kojihub.conf.erb
|       |-- httpd-ssl.conf.erb
|       |-- httpd-web.conf.erb
|       |-- hub.conf.erb
|       |-- kojid.conf.erb
|       |-- kojira.conf.erb
|       |-- openssl.cnf.erb
|       |-- ssl.conf
|       `-- web.conf.erb
|-- test
|   `-- integration
|       `-- default
|-- Thorfile
|-- Vagrantfile.production
|-- Vagrantfile.aws
|-- Vagrantfile.openstack
`-- Vagrantfile.vbox
18 directories, 49 files
Figures
Figure 1 AWS Global Infrastructure (Regions) .................................................................................................... 22
Figure 2 List of AWS Regions and Locations ....................................................................................................... 22
Figure 3 AWS - Availability Zones ....................................................................................................................... 23
Figure 4 AWS – Services ...................................................................................................................................... 24
Figure 5 Basic elements of DevOps software development method ................................................................. 26
Figure 6 Shell environment variables ................................................................................................................. 30
Figure 7 Gemfile .................................................................................................................................................. 31
Figure 8 Flow of deployment .............................................................................................................................. 33
Figure 9 Git: Bitbucket.com repository ............................................................................................................... 34
Figure 10 Vagrant: List of plugins used ............................................................................................................... 35
Figure 11 Vagrantfile: plugins installation .......................................................................................................... 35
Figure 12 Vagrantfile: API and syntax version ..................................................................................................... 36
Figure 13 Vagrantfile: Config specific to installation .......................................................................................... 36
Figure 14 Vagrantfile: Definitions of the VMs .................................................................................................... 36
Figure 15 Vagrantfile: AWS provider .................................................................................................................. 38
Figure 16 Vagrantfile: Amazon dummy box ....................................................................................................... 38
Figure 17 Vagrantfile: Chef provisioning of Koji hub .......................................................................................... 39
Figure 18 Vagrantfile: OpenStack dummy box ................................................................................................... 40
Figure 19 Vagrantfile: OpenStack provider ......................................................................................................... 41
Figure 20 Vagrantfile: Chef provisioning of Koji builder ..................................................................................... 42
Figure 21 Chef: Search in a recipe ...................................................................................................................... 45
Figure 22 Chef: knife.rb configuration file .......................................................................................................... 47
Figure 23 Berkshelf: Berksfile ............................................................................................................................. 49
Figure 24 Cookbook: metadata ........................................................................................................................... 50
Figure 25 Cookbook: Attributes .......................................................................................................................... 51
Figure 26 Cookbook: Template /etc/koji.conf .................................................................................................... 52
Figure 27 Cookbook: Template /etc/httpd/conf.d/kojihub.conf ........................................................................ 53
Figure 28 Cookbook: Template /etc/koji-hub/hub.conf ..................................................................................... 53
Figure 29 Cookbook: Template /etc/kojira/kojira.conf ...................................................................................... 54
Figure 30 Cookbook: Template /etc/kojid/kojid.conf......................................................................................... 54
Figure 31 Cookbook: default.rb recipe ............................................................................................................... 55
Figure 32 Cookbook: client.rb recipe .................................................................................................................. 56
Figure 33 Cookbook: hub.rb recipe - part 1 ........................................................................................................ 57
Figure 34 Cookbook: hub.rb recipe - part 2 ........................................................................................................ 58
Figure 35 Cookbook: database.rb recipe - part 1 ............................................................................................... 59
Figure 36 Cookbook: database.rb recipe - part 2 ............................................................................................... 60
Figure 37 Cookbook: kojira.rb recipe .................................................................................................................. 61
Figure 38 Cookbook: builder.rb recipe - part 1 ................................................................................................... 62
Figure 39 Cookbook: builder.rb recipe - part 2 ................................................................................................... 63
Figure 40 Cookbook: kojid.rb recipe - part 1 ...................................................................................................... 64
Figure 41 Cookbook: kojid.rb recipe - part 2 ...................................................................................................... 65
Figure 42 Cookbook: kojid.rb recipe - part 3 ...................................................................................................... 66
Figure 43 Cookbook: .gitignore and chefignore.................................................................................................. 68
Figure 44 Cookbook: test.rb recipe ..................................................................................................................... 68
Figure 45 Tests: Building nginx for CentOS 6 ...................................................................................................... 69
Figure 46 Koji services diagram .......................................................................................................................... 73
Bibliography
Baun, C., Kunze, M., Nimis, J., & Tai, S. (2011). Cloud Computing: Web-Based Dynamic IT Services. Springer.
Beach, B. (2014). Pro PowerShell for Amazon Web Services. Apress.
Furht, B., & Escalante, A. (2010). Handbook of Cloud Computing. Springer.
Hurwitz, J., Bloor, R., Kaufman, M., & Halper, F. (2009). Cloud Computing for Dummies. For Dummies.
Huttermann, M. (2012). DevOps for Developers. Apress.
Marschall, M. (2013). Chef Infrastructure Automation Cookbook. Packt Publishing.
Mell, P., & Grance, T. (2011). The NIST Definition of Cloud Computing. National Institute of Standards and
Technology, U.S. Department of Commerce. Retrieved from
http://csrc.nist.gov/publications/nistpubs/800-145/SP800-145.pdf
Nelson-Smith, S. (2013). Test-Driven Infrastructure with Chef, 2nd Edition. O'Reilly Media.
Pepple, K. (2011). Deploying OpenStack. Sebastopol: O'Reilly Media.
Rittinghouse, J. W., & Ransome, J. F. (2009). Cloud Computing: Implementation, Management and Security.
USA: CRC Press.
Sabharwal, N., & Wadhwa, M. (2014). Automation through Chef Opscode. Apress.
Sarna, D. E. (2010). Implementing and Developing Cloud Computing Applications. Auerbach Publications.
Sitaram, D., & Manjunath, G. (2011). Moving To The Cloud: Developing Apps in the New World of Cloud
Computing. Syngress.
Stellman, A., & Greene, J. (2014). Learning Agile. O'Reilly Media.
Velte, A., Velte, T. J., & Elsenpeter, R. C. (2009). Cloud Computing: A Practical Approach. USA: McGraw-Hill
Prof Med/Tech.