Wigwam is a framework for managing the development and operation of server applications. The standard example is a website and all the component services that make up that website.
Wigwam has three main goals:
The first is to support development, debugging, quality control, and change management.
The second is deployment. Usually the goal is to make our services highly available by deploying the final product on multiple load-balanced machines, but Wigwam works equally well for managing a service intended to run on only one machine.
The third is code reuse, which saves much development and debugging time.
The primary goal of Wigwam is to enable a large (or small) group of people to work cooperatively on a project without slowing each other down or breaking each other's work in the process. This is similar to the convenience that source code control or revision control provides. If you're not familiar with the concept of revision control, see Appendix F and read about CVS.
In a traditional software project, each person working on the project gets their own copy of all the source code (a playpen) so they can change and test all components of the software without interference from anyone else. The changes they make and the changes from others are merged and tracked using a revision control system.
For some packages, such as a GUI utility run by a user or a library, this is the only type of change management necessary.
But with a server application such as a website, you also need to manage the operation of the software. This involves more than just a few software libraries linked into one executable. It means integrating and configuring external programs as well as the source code for the project itself.
The configuration files and other software integration data, such as the details of starting and stopping services, need to be managed by revision control along with the source code. In projects that maintain a webserver, for example, the source-code repository needs to store all the information required to run the site at a given moment. It will probably list the webserver's hostname and the ports the services are configured to bind to. Likewise, revision control retains all the configuration files, which surely contain the location of the database server at a given time.
Ideally, the revision control system should only contain source files: it should contain no generated files. However, that is difficult to attain, given that most websites depend on many libraries, and furthermore they probably depend on configuration files.
Here is a concrete example of the problem. Suppose someone committed a configuration file to the revision control system for my application server that included a connection to some.machine.com:3301 for some database. Unfortunately, some other developer might be testing against a local server, say theirmachine.machine.com:3394. Worse yet, accidentally committing such a change could break the live running server application.
Perhaps you're now thinking that configuration should be left out of revision control. Actually, that can cause similar problems. If new software and its configuration are introduced into a project but not propagated to other developers or to the live application server, those installations can break when they run new code that depends on the new services being configured properly.
For example, developer A sets up a new database service as part of his website. He then configures (or hardcodes) the application logic to connect to that database, and checks all of his changes into revision control. Developer B, or even the live website, then gets a copy of the changes. There is no way for developer B or the live site to know that the site now depends on the new database, so their versions of the site will break.
A hybrid approach is needed, where the small amount of information that must vary between playpens (such as user IDs, network ports, and network names) is allowed outside of revision control.
Wigwam accomplishes this by giving users standard places to store this type of information and standard methods for retrieving it and using it for configuration. More specifically, Wigwam loads a set of environment variables from per-role and per-cluster configuration files. Startup scripts for services can use those variables to generate runtime config files or scripts from templates that are checked into the project or provided by a package. (See the Section called Reusing the Effort of Others for an introduction to the Wigwam package system.) This way, almost everything is tracked by the revision control system, and the difference between development, testing, and live installations of the project is minimized.
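The template idea can be illustrated with a small Python sketch. It is not Wigwam's implementation: the KEY=VALUE file format, the function names parse_env_file and render_config, and the variable names DB_HOST and DB_PORT are all assumptions made for this example. The hostnames are the ones from the database example earlier in this section.

```python
import string


def parse_env_file(text):
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments.

    The file format is an assumption for this sketch, not Wigwam's format.
    """
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env


def render_config(template_text, *env_layers):
    """Merge variable layers (later layers override earlier ones) and
    substitute them into a template that is kept under revision control.

    string.Template.substitute raises KeyError if a variable is missing,
    so an incomplete configuration fails at startup instead of silently.
    """
    merged = {}
    for layer in env_layers:
        merged.update(layer)
    return string.Template(template_text).substitute(merged)


# Hypothetical per-cluster settings and per-playpen overrides.
cluster_env = parse_env_file("DB_HOST=some.machine.com\nDB_PORT=3301\n")
playpen_env = parse_env_file(
    "# local test database\nDB_HOST=theirmachine.machine.com\nDB_PORT=3394\n"
)

# The template is the checked-in file; only the variables differ per install.
template = "database = ${DB_HOST}:${DB_PORT}\n"

live_config = render_config(template, cluster_env)
dev_config = render_config(template, cluster_env, playpen_env)
```

In this sketch the template is the only file that needs to be committed, so a developer testing against a local database never has to commit a modified configuration file.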
In order to make our projects' uptime as great as possible, we have developed a simple testing methodology. Developers each work on their own machine, or on a dev cluster, whichever they prefer. Once they have reached the release goals, they stage the project (which consists of tagging and publishing), at which point the producers and QA engineers may test it. Sites may be "staged" to the developer's machine, a dedicated staging machine, or a staging cluster. Finally, once the project has been tested, the same tag should be published to the live cluster.
It is important to everyone that publication is reliable: any misstep in publication can bring the site down, which can be costly. To complicate the problem, we use load balancing, so we have to run our projects on multiple machines.
We try to depend as little as possible on the environment of the servers where our sites are deployed. A Wigwam project should include copies of all the external applications which the site might need, such as web server software (e.g. Apache) or a database application (e.g. MySQL). The configuration files which manage these are also stored as part of the Wigwam project, rather than in the default locations on the system. (It would be rather difficult to run multiple instances of Apache, for example, if they all attempted to read their config file /etc/httpd/conf/httpd.conf!)
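As a rough illustration of keeping the configuration inside the project, the sketch below builds an Apache command line that points at a config file under a playpen directory instead of /etc/httpd/conf/httpd.conf. The -d (server root) and -f (config file) options are standard httpd options; the conf/httpd.conf layout under the playpen, the example paths, and the function name apache_command are assumptions for this example, not Wigwam's actual layout.

```python
from pathlib import PurePosixPath


def apache_command(playpen_root):
    """Return an httpd argument list that reads its config from the playpen.

    Assumes a hypothetical layout with the config at <playpen>/conf/httpd.conf.
    The command is only built here, not executed.
    """
    root = PurePosixPath(playpen_root)
    config = root / "conf" / "httpd.conf"
    # -d sets the server root, -f names the configuration file to load.
    return ["httpd", "-d", str(root), "-f", str(config)]


# Two playpens on one machine each get their own config file.
alice_cmd = apache_command("/home/alice/playpen")
bob_cmd = apache_command("/home/bob/playpen")
```

Each instance would still need its own port in its config file, which is the kind of per-playpen value described earlier.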
As with all source-control systems, developers acquire versions of the project on their local machine, called playpens, for debugging and development. Ideally, the playpens will be identical to the real site.
After testing a certain version of the site, it must be tagged, meaning that the exact version of every file on the site is recorded under a unique name.
Since there is no ambiguity about what a tag means, developers and producers can easily coordinate testing a particular tagged version. We call this phase of testing staging, and we have clusters dedicated to it.
Then the tag can be published. If something unexpectedly goes wrong, you should always be able to revert to an older tag. Ideally, only one developer (the "release engineer") should publish to the live site so that all changes can be tracked centrally.
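To make the idea of a tag concrete, here is a minimal sketch that records the exact content of every file under a name and reports what has changed since. A real revision control system such as CVS tags file revisions rather than content hashes; the hashing approach, the file names, the tag name release-1, and the functions make_tag and changed_files are assumptions used only for illustration.

```python
import hashlib


def make_tag(files):
    """Record the exact version of every file as a SHA-256 content hash.

    `files` maps a path to the file's bytes.
    """
    return {path: hashlib.sha256(data).hexdigest() for path, data in files.items()}


def changed_files(tag, files):
    """List the paths whose content differs from the tag, including
    files added or removed since the tag was made."""
    current = make_tag(files)
    return sorted(
        path for path in set(tag) | set(current)
        if tag.get(path) != current.get(path)
    )


# Hypothetical site contents at the moment of tagging.
site = {
    "conf/httpd.conf.in": b"Listen ${PORT}\n",
    "src/index.html": b"<h1>Welcome</h1>\n",
}
tags = {"release-1": make_tag(site)}

# Development continues after the tag is made.
site["src/index.html"] = b"<h1>Welcome back</h1>\n"
site["src/new.html"] = b"<p>new page</p>\n"

differences = changed_files(tags["release-1"], site)
```

Because the tag pins every file, staging and the live cluster can be checked against the same record, and reverting means publishing an older tag again.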
In the software world, people like to reuse software they've written for one problem to solve another, similar problem, because no one wants to duplicate effort. Usually this type of software (the Apache webserver, for example) gets bundled up into some sort of release, such as a tarball. This package is then used by other projects, which install and configure it for their own environment. In many environments the effort of installing, and sometimes configuring, is taken care of for you by a package management system (such as Debian's dpkg, Red Hat's rpm, Sun's pkg, or InstallShield on Windows).
Since an important feature of Wigwam is the ability to integrate existing software into your project, Wigwam 3.0 provides a package management system.
Wigwam's packaging system is independent of any packaging system provided by the operating system. Projects and their packages can be installed anywhere, any number of times. This has many benefits:
All changes to support software and configuration files are kept under revision control along with the project's code. System administration errors, such as forgetting to update configuration files on some of the computers in a cluster, are much less likely. If a mistake is made in configuration, it can easily be rolled back to a previous state.
Two or more playpens can run on a single computer without interfering with one another. They may even be running different versions of software. For example, one developer can test a new version of Apache while another developer works with the old version on the same machine. Similarly, multiple projects can be hosted on a single computer even if they have conflicting software requirements.