Drupal websites are not just applications; they are complex applications. They encompass design, functionality, and content, and they’re updated at a furious pace.
In order to master the process for iteration, developers need a mental model for the interplay of code, content, and configuration.
These comprise the modern website’s Holy Trinity.
The Minimum Viable Workflow
Safely making frequent changes to a running website requires some basic capabilities, which I call the Minimum Viable Workflow. In a perfect world, all of these workflow steps are scripted or completely automated, leading to lower friction for developers, and improved reliability via reduced human error.
Being able to iterate requires a minimum of three separate operational instances of the website: three environments. This is necessary to separate development work from quality assurance and approval, and from the actual live or “production” environment for the site. For safety, these environments must be isolated from one another. Development work can have unpredictable side-effects and cannot be allowed to disrupt the production environment. Likewise, quality assurance and testing should not block development.
These environments must also be as close to identical as possible. Post-deployment surprises and the phrase “it worked on my machine” are the most frequent slayers of team agility and release cadence.
Minimum Workflow Steps
In the MVW (Minimum Viable Workflow), a developer does her work in the Development Environment. While she is active, that environment is unstable; it may appear broken or incomplete to an outside observer.
Code changes are committed to version control, creating a reliable, restorable record of progress. When the work is ready for a stakeholder (e.g., project manager, site owner, QA team) to review and approve, the developer deploys it into the Test Environment.
Deploying into Test often also includes cloning the content (and if possible the configuration) from the Live Environment. For minor changes, this may not be necessary, but for a non-trivial release you want to see a precise preview of what deploying into production will look like.
After approval is granted, code changes are deployed into the Live Environment. Thanks to high-quality testing, there are no surprises: the site users and administrators get access to the new design or functionality, and there is much rejoicing.
Now it's time to start iterating again!
Before the next cycle begins, the developer will want to refresh her environment with the latest content and config from Live. Working against an up-to-date picture of reality reduces errors. (See Image A)
If the Live environment is fast-moving, or the current work in progress lags by a few days, performing refreshes as part of the normal developer workday is advisable. This is usually up to the judgment of the developer, although for larger teams on complex projects, a team-wide standard is a good idea.
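The cycle described in this section can be sketched as a small Python model. This is only an illustration of the direction of travel (code moves up from Dev to Live, content moves down from Live to Dev); the Environment class and the function names are hypothetical and stand in for whatever deployment tooling a team actually uses.

```python
from dataclasses import dataclass, field


@dataclass
class Environment:
    """One isolated instance of the site (hypothetical model, for illustration)."""
    name: str
    code: str = "v1"                             # identifier of the deployed commit
    content: dict = field(default_factory=dict)  # stands in for database and files


def deploy_code(source: Environment, target: Environment) -> None:
    """Code moves up: Dev -> Test -> Live."""
    target.code = source.code


def pull_content(source: Environment, target: Environment) -> None:
    """Content moves down as a complete copy, never a partial merge."""
    target.content = dict(source.content)


def release_cycle(dev: Environment, test: Environment, live: Environment,
                  new_code: str) -> None:
    """Run one iteration of the Minimum Viable Workflow."""
    dev.code = new_code       # developer commits work in Dev
    deploy_code(dev, test)    # work is deployed to Test for review
    pull_content(live, test)  # preview the release against Live content
    # ... stakeholder review and approval happen here ...
    deploy_code(test, live)   # approved code goes to Live
    pull_content(live, dev)   # refresh Dev before the next cycle begins
```

Note that nothing ever copies content upward or code downward; keeping those two flows one-directional is what makes each step predictable.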
A Note About Syncing Content
In this workflow, the only reliable and rational way to synchronize content between environments is a complete copy. Partial synchronization or “database diffing” is asking for trouble. On the filesystem, copying between environments can be done efficiently with rsync or similar utilities.
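As a rough sketch of the filesystem half of that copy, the Python helper below assembles an rsync command line without running it. The host name and paths are made-up placeholders; the flags are standard rsync options (-a for archive mode, -z for compression, --delete to remove files that no longer exist on the source, so the target ends up a complete copy rather than a merge).

```python
import shlex
from typing import List


def build_rsync_command(source_dir: str, target_dir: str) -> List[str]:
    """Build an rsync invocation that mirrors source_dir into target_dir."""
    # A trailing slash on the source copies the directory's contents
    # rather than nesting the directory itself inside the target.
    source = source_dir.rstrip("/") + "/"
    target = target_dir.rstrip("/") + "/"
    return ["rsync", "-az", "--delete", source, target]


# Hypothetical example: pull uploaded files from Live into a local Dev copy.
command = build_rsync_command("live.example.com:/var/www/files", "/var/www/files")
print(shlex.join(command))
```

Because --delete removes files on the target, the direction matters: this should only ever point from Live toward Test or Dev.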
This can be a challenge for large websites. Data has mass, and for sites with significant content these synchronization operations take time. Where this time cuts into team velocity, it is possible to use a “representative sample” by pruning the database of older content and then syncing with the trimmed database dump.
However, testing code changes against a pristine copy before release is still a must.
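The “representative sample” idea can be illustrated with a few lines of Python. This sketch assumes content rows are available as dictionaries with a changed timestamp; the field names and the 90-day default are hypothetical choices, and a real site would prune the database dump itself with tooling suited to its schema.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def representative_sample(rows: List[Dict], now: datetime,
                          max_age_days: int = 90) -> List[Dict]:
    """Keep only content changed within the last max_age_days."""
    cutoff = now - timedelta(days=max_age_days)
    return [row for row in rows if row["changed"] >= cutoff]


# Hypothetical rows standing in for content records.
rows = [
    {"nid": 1, "changed": datetime(2013, 1, 5)},
    {"nid": 2, "changed": datetime(2014, 3, 1)},
    {"nid": 3, "changed": datetime(2014, 3, 20)},
]
recent = representative_sample(rows, now=datetime(2014, 4, 1))
print([row["nid"] for row in recent])
```

The trimmed set is only a convenience for day-to-day development; the final pre-release test still runs against the full copy.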
The pattern of “pulling data back” also guarantees that any configuration stored in the database — which, by default, is most configuration — is reliably shared between all environments. This helps keep developers on the same page when creating new functionality.
Feature Branching for Development
Websites are usually a team sport, but sharing environments is a recipe for frustration, and organizing parallel work on separate features inside a shared environment is a recipe for disaster: one developer’s changeset overwrites another’s, or the two create unexpected incompatibilities. Events like these destroy productivity and lower team morale.
The answer is to use multiple development environments along with a version control pattern known as “feature branching.” A developer or team will make a branch for their work, build on that branch, and then propose a merge back to the mainline (usually master) when work is complete and ready for release. (See Image B)
This allows multiple features to proceed without interfering with one another and creates a clean process for integrating changes. If a single release involves both features, teams are able to coordinate integration by merging between branches. If one feature goes out before the other, the feature still in development will pull in changes from the released feature via master.
If feature work is very ambitious or moving at a slow pace, it may be advisable to set up a separate branch specifically for integration. This allows you to keep the master branch clear for small tweaks and bug fixes for production, while any hairy challenges for integrating larger feature changes are worked out on the side.
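The branching pattern above maps onto a handful of everyday Git commands. The Python sketch below just generates those command strings for each stage; the feature/ prefix, the origin remote name, and the function names are assumptions made for illustration, not a convention the workflow requires.

```python
from typing import List


def start_feature(name: str, base: str = "master") -> List[str]:
    """Commands to branch off the mainline for new feature work."""
    return [
        f"git checkout {base}",
        f"git pull origin {base}",
        f"git checkout -b feature/{name}",
    ]


def catch_up(name: str, base: str = "master") -> List[str]:
    """Commands to pull released changes from the mainline into a feature branch."""
    return [
        f"git checkout feature/{name}",
        "git fetch origin",
        f"git merge origin/{base}",
    ]


def finish_feature(name: str, base: str = "master") -> List[str]:
    """Commands to merge a completed feature back into the mainline."""
    return [
        f"git checkout {base}",
        f"git merge --no-ff feature/{name}",
        f"git push origin {base}",
    ]


# Hypothetical feature name; print the sequence for a full branch lifecycle.
for step in start_feature("gallery") + catch_up("gallery") + finish_feature("gallery"):
    print(step)
```

For a dedicated integration branch, the same helpers apply with base set to that branch's name instead of master.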
The lifecycle for the average website is no longer one big launch every few years. This industry-wide shift towards continual improvement and iterative innovation demands new patterns for professional development work. As developers we should celebrate and embrace this shift; we have much to gain from the cycle of build, measure, learn.
This workflow may seem onerous to some, and there are certainly simple situations and sites that can get by with less. I emphasize simplicity, in large part to help teams avoid modeling the complexity of their organizational structure into the system, an urge which usually does more harm than good.
However, as Drupal powers more ambitious websites with larger and more diverse teams, these workflows will become more widely demanded and practiced. Much like the adoption of version control, most developers I know who have started using a multi-environment workflow have little desire to regress to the old ways.
Doing it right is the ticket to great rewards.
Elements of a Drupal Website
- Code: Website code includes Drupal core, any contrib or custom modules, and the theme. All code must be tracked in version control. The Drupal project itself – and most of the industry – has standardized on Git as the version control system of choice; the workflow described here relies on the functionality and conventions of Git.
- Content: The data that fills up the website is its content. With Drupal, content can be created, edited, or deleted without a developer’s involvement. That's the whole point, and that means how we manage content is critical to our success.
- Configuration: The third component of a Drupal website is its configuration – application settings which control how the code and content interact, often exposed through the admin user interface. Configuration is traditionally a big challenge. In versions prior to Drupal 8, there is no core method for how configuration is described or managed. The best answers are CTools exportables and interfaces like the Features module. In the best case, these contributed modules allow you to export configuration to code, where it can be tracked in version control and deployed along with accompanying functional or style changes. Managing configuration means extra work up front for developers, but for complex projects that see rapid iteration, it is well worth the investment.
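To make the idea of “exporting configuration to code” concrete, here is a generic Python sketch. It is not the format used by CTools or the Features module; it only shows the underlying principle, under the assumption that settings can be represented as a plain dictionary: write them out as deterministic text so they can be committed and diffed like any other code. The setting names are invented for the example.

```python
import json


def export_config(settings: dict) -> str:
    """Serialize settings as stable, diff-friendly text for version control."""
    # Sorted keys and fixed indentation mean identical settings always
    # produce identical output, so a diff shows only real changes.
    return json.dumps(settings, indent=2, sort_keys=True) + "\n"


# Hypothetical settings, as they might be read from two environments.
from_dev = {"site_name": "Example", "cache": {"enabled": True, "lifetime": 300}}
from_live = {"cache": {"lifetime": 300, "enabled": True}, "site_name": "Example"}

print(export_config(from_dev))
```

Two environments with the same settings yield byte-identical exports regardless of the order in which the values were stored, which is what makes the exported file useful as a record in version control.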
Image: Confluence of the Inauc River by Sundeep Bhardwaj is licensed under CC BY- 3.0