Chapter 1. Overview

Table of Contents
Goals of Wigwam
How Wigwam Is Organized
Frequently Used Commands in Wigwam
Terminology

Goals of Wigwam

Wigwam is a framework for managing the development and operation of server applications. The standard example is a website and all the component services that make up that website.

Wigwam has three main goals:

Collaboration

The primary goal of Wigwam is to enable a large (or small) group of people to work cooperatively on a project without slowing each other down or breaking each other's work in the process. This is similar to the convenience that source code control or revision control provides. If you're not familiar with the concept of revision control, see Appendix F and read about CVS.

In a traditional software project, each person working on the project gets their own copy of all the source code (a playpen) so they can change and test all components of the software without interference from anyone else. The changes they make and the changes from others are merged and tracked using a revision control system.

For some packages, such as a GUI utility run by a user or a library, this is the only type of change management necessary.

But with a server application such as a website, you also need to manage the operation of the software. This involves more than just a few software libraries linked into one executable. It means integrating and configuring external programs as well as the source code for the project itself.

The configuration files and other software integration data, such as the details of starting and stopping services, need to be managed by revision control along with the source code. For example, in a project that maintains a webserver, the source-code repository needs to store all the information required to run the site at a given moment. It will probably list the webserver's hostname and the port each service is configured to bind to. Likewise, revision control retains all the configuration files, which will almost certainly record the location of the database server at a given time.

Ideally, the revision control system should only contain source files: it should contain no generated files. However, that ideal is difficult to attain, because most websites depend on many libraries and, most likely, on configuration files as well.

Here is a concrete example of the problem. Suppose someone committed a configuration file for the application server that included a connection to some.machine.com:3301 for some database. Unfortunately, some other developer might be testing against a local server, say theirmachine.machine.com:3394, and picking up the committed file would point their playpen at the wrong database. Worse yet, accidentally committing such a change could break the live running server application.

Perhaps you're now thinking that configuration should be left out of revision control. Actually, that can cause similar problems. If new software and its configuration are introduced into a project but never propagated to the other developers or to the live application server, those installations can break, because they end up running new code that depends on services that were never configured there.

For example, developer A sets up a new database service as part of his website. He then configures (or hardcodes) the application logic to connect to that database. Then he checks all of his changes into revision control. Developer B, or even the live website, then gets a copy of the changes. There is no way for developer B or the live site to know that the site now depends on the new database, so their versions of the site will break.

A hybrid approach is needed, in which the small amount of information that must vary between playpens (such as user IDs, network ports, and network names) is allowed outside of revision control.

Wigwam accomplishes this by giving users standard places to store this type of information and standard methods for getting at it and using it for configuration. More specifically, Wigwam loads a set of environment variables from per-role and per-cluster configuration files. Startup scripts for services can use those variables to generate runtime config files or scripts from templates that are checked into the project or provided by a package. (See the Section called Reusing the Effort of Others for an introduction to the Wigwam package system.) This way, almost everything is tracked by the revision control system, and the difference between development, testing, and live installations of the project is minimized.
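The template idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Wigwam's actual mechanism: the variable names (DB_HOST, DB_PORT), the template text, and the render_config helper are all hypothetical.

```python
from string import Template

# Hypothetical template that would be checked into revision control.
# Only the placeholders vary between playpens.
TEMPLATE = Template("database_host = $DB_HOST\ndatabase_port = $DB_PORT\n")


def render_config(template, variables):
    """Fill a checked-in template with per-installation values.

    Template.substitute raises KeyError when a placeholder has no value,
    so a missing setting fails when the config is generated rather than
    when the service first tries to connect.
    """
    return template.substitute(variables)


# Values like these would come from per-role or per-cluster files
# (or os.environ); they are invented here for illustration.
playpen_vars = {"DB_HOST": "theirmachine.machine.com", "DB_PORT": "3394"}
live_vars = {"DB_HOST": "some.machine.com", "DB_PORT": "3301"}

print(render_config(TEMPLATE, playpen_vars))
print(render_config(TEMPLATE, live_vars))
```

The template is identical for every installation, so it can safely live in revision control; only the small dictionary of values differs between a playpen and the live site.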

Deployment

In order to make our projects' uptime as great as possible, we have developed a simple testing methodology. Developers each work on their own machine, or on a dev cluster, whichever they prefer. Once they have reached the release goals, they stage the project (which consists of tagging and publishing), at which point the producers and QA engineers may test it. Sites may be "staged" to the developer's machine, a dedicated staging machine, or a staging cluster. Finally, once the project has been tested, the same tag should be published to the live cluster.

It is important to everyone that publication is reliable -- any misstep in publication can cause the site to be down, which can be costly. To complicate the problem, we use load balancing, so we have to run our projects on multiple machines.

We try to depend as little as possible on the environment of the servers where our sites are deployed. A Wigwam project should include copies of all the external applications that the site might need, such as web server software (e.g. Apache) or a database application (e.g. MySQL). The configuration files that manage these are also stored as part of the Wigwam project, rather than in the default locations on the system. (It would be rather difficult to run multiple instances of Apache, for example, if they all attempted to read their config file from /etc/httpd/conf/httpd.conf!)
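As a rough illustration of why project-local configuration matters, the sketch below builds a start command that points Apache at a config file inside the project instead of the system default. The conf/httpd.conf layout, the playpen paths, and the httpd_command helper are assumptions made for this example and are not taken from Wigwam; the -f option is httpd's standard way of naming the configuration file.

```python
from pathlib import Path


def httpd_command(project_root, httpd_binary="httpd"):
    """Build an argv list that starts Apache with a project-local config.

    The conf/httpd.conf layout under the project root is an assumption
    made for this sketch; -f is httpd's option for naming the config file.
    """
    config_path = Path(project_root) / "conf" / "httpd.conf"
    return [httpd_binary, "-f", str(config_path)]


# Two playpens on one machine get two different config files,
# so neither instance reads /etc/httpd/conf/httpd.conf.
print(httpd_command("/home/alice/playpen"))
print(httpd_command("/home/bob/playpen"))
```

Each instance would still need its own port and log locations inside its config file, which is the kind of per-playpen value kept outside revision control.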

As with all source-control systems, developers acquire versions of the project on their local machine, called playpens, for debugging and development. Ideally, the playpens will be identical to the real site.

Once a certain version of the site has been tested, it must be tagged, meaning that the exact version of every file on the site is recorded under a unique name.

Since there is no ambiguity about what a tag means, developers and producers can easily coordinate testing a particular tagged version. We call this phase of testing staging, and we have clusters dedicated to it.

Then the tag can be published. If something unexpectedly goes wrong, you should always be able to revert to an older tag. Ideally, only one developer (the "release engineer") should publish to the live site so that all changes can be tracked centrally.
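The tag, publish, and revert cycle can be modeled with a small in-memory sketch. It is a toy under stated assumptions, not Wigwam's interface: the Site class, its method names, the tag names, and the file revisions are all hypothetical.

```python
class Site:
    """Toy model of tagging, publishing, and reverting (not Wigwam's API)."""

    def __init__(self):
        self.tags = {}       # tag name -> {file path: revision}
        self.history = []    # tags published so far, oldest first

    def tag(self, name, file_revisions):
        # A tag is immutable: reusing a name would make it ambiguous.
        if name in self.tags:
            raise ValueError(f"tag {name!r} already exists")
        self.tags[name] = dict(file_revisions)

    def publish(self, name):
        # Only a previously recorded tag can be published.
        if name not in self.tags:
            raise KeyError(f"unknown tag {name!r}")
        self.history.append(name)

    def revert(self):
        # Drop the current tag and fall back to the one published before it.
        if len(self.history) < 2:
            raise RuntimeError("no older tag to revert to")
        self.history.pop()
        return self.history[-1]

    @property
    def live(self):
        return self.history[-1] if self.history else None


site = Site()
site.tag("release-1", {"index.html": "1.4", "conf/httpd.conf.in": "1.2"})
site.tag("release-2", {"index.html": "1.5", "conf/httpd.conf.in": "1.2"})
site.publish("release-1")
site.publish("release-2")
site.revert()
print(site.live)  # release-1
```

Because a tag pins the revision of every file, reverting is just a matter of publishing an earlier name again; nothing has to be reconstructed by hand.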