Any improvements made anywhere besides the bottleneck are an illusion. — Gene Kim
How To Identify And Resolve Your Pinch Points
Rush hour is a cruel juxtaposition: drivers who are ready, willing, and able to reach their destinations as quickly as possible instead find themselves creeping along (or at a dead stop), because the roads are over capacity and nobody coordinates or optimizes the travel plans of the individual drivers.
Most of those drivers also know that by consulting Google Maps and Waze, they can discover the cause of the jam-up, their distance to it, and estimated time of delay; they can then make the decision to stick it out or take the next exit and proceed through the streets.
How motor vehicles flow through a network of highways and byways is a good analogy for how work flows through an organization. When the volume of your company’s work ramps up, there comes a moment when a single stage becomes the rate-limiting constraint for the entire system. Your first step is to locate that damn bottleneck.
And Waze won’t help.
Making Work and Workflow Visible
While the system-wide constraint may be obvious in some organizations, it's not always apparent until the flow of work increases to the degree that the bottleneck is overloaded and potentially damaged, making things worse (e.g. a car accident causing traffic to back up even more). The following three exercises will help identify the problem area.
Exercise #1: Inventory the Four Types of Work
As organizations grow, so does the quantity and variety of projects that are being worked on simultaneously across different teams and divisions. Unfortunately, the net result is that it becomes more difficult to quantify and prioritize what is being done. Therefore, the first order of business is simply to locate this information. You can perform this exercise digitally, using a spreadsheet, but it's much more powerful if done against a wall with index cards. Either way you'll gain insight into your organization.
To start, create four columns on the wall and label them: Business Projects, Internal Projects, Changes, and Unplanned Work. Then, fill out a single index card for each active project or recurring activity that falls under each heading. To make sure you're not missing anything, invite other members of the team to review.
Here are some descriptions of the four groups.
- Business Projects: Any product or service delivered to a client. (As a rule of thumb, if it's revenue-generating, it's a business project.)
- Internal Projects: Any work done to improve or maintain the state of the organization. Examples include infrastructure improvements, training, hiring, attending trade shows, strategy meetings, etc.
- Changes: Anything that can disrupt the ability to deliver products and services to a customer, or anything that can disrupt internal projects. Examples include server configurations, software updates, etc.
- Unplanned Work: Typically referred to as “fire fighting” because it's often chaotic and resource intensive. It's only productive in the sense that it restores operations.
For most organizations, the results of this exercise can be shocking. You may find that your internal projects outnumber your business projects. You may find that as much as 30-35% of everyone’s time is spent on unplanned work. This can be a bitter pill to swallow, particularly if project managers and business developers forecast timelines under the assumption that 80-90% of an individual's time can and will be spent on business projects.
While this type of information is incredibly valuable, it's generally not enough to identify the key bottleneck, although it should reveal how much effort is going toward client versus non-client work.
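If you keep the cards in a spreadsheet rather than on a wall, the client versus non-client split can be tallied in a few lines of Python. This is only a minimal sketch: the card titles, the hours, and the share_by_type helper are all made up for illustration, and it assumes you have some rough record of hours logged per card.

```python
from collections import Counter

# Hypothetical cards: (title, work type, hours logged last week).
cards = [
    ("Client site build", "Business Projects", 60),
    ("Client support retainer", "Business Projects", 20),
    ("Staff training", "Internal Projects", 15),
    ("Server configuration update", "Changes", 10),
    ("Production outage", "Unplanned Work", 35),
]


def share_by_type(cards):
    """Return each work type's percentage of the total logged hours."""
    hours = Counter()
    for _title, work_type, logged in cards:
        hours[work_type] += logged
    total = sum(hours.values())
    return {work_type: round(100 * h / total, 1) for work_type, h in hours.items()}


shares = share_by_type(cards)
for work_type, pct in shares.items():
    print(f"{work_type}: {pct}%")
```

With these invented numbers, a quarter of the week went to unplanned work, which is exactly the kind of figure that should feed back into how timelines are forecast.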
Exercise #2: Identify the Work States
The different roles that exist within an organization tend to work at different stages in the pipeline of building Drupal sites, so it's valuable to learn whether or not the current distribution of projects and their various states matches the current makeup of your organization.
To do this, we need to introduce the concept of an agile or kanban board, in which we create a column for each major transition point of a project. While the specific breakdown will be unique to your organization, we can generalize this for many Drupal projects with the following buckets:
- Sales: The process of converting leads to contracts.
- Discovery: The process of further refining the requirements, design, and overall blueprint.
- Development: The actual prototype or build phase.
- User Acceptance Testing: Ensuring requirements are met before delivery.
- Deploy: Ensuring a smooth launch.
To complete this exercise, take the cards that you used in the previous step and color code them by type. Then place them in the appropriate column. You may need to add additional columns for projects in a backlog or “paused” state, but, as best you can, populate the board with anything actively worked on in the past week.
Ideally the distribution matches your team's capacity. However, you may start to see disturbing patterns emerging. If the development stage is already at full capacity and a significant number of projects are in discovery, it might prove wise to delay any further lead conversion until enough projects move out of development, in order to reduce the number in progress in discovery.
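One way to make that comparison concrete is to divide the number of cards in each column by the number of projects that stage can reasonably carry. The sketch below assumes you can estimate such a capacity per stage; every number in it, and the stage_load helper itself, is hypothetical.

```python
# Hypothetical numbers: active cards per column, and how many projects
# each stage's team can reasonably carry at once.
in_progress = {"Sales": 4, "Discovery": 6, "Development": 5,
               "User Acceptance Testing": 1, "Deploy": 1}
capacity = {"Sales": 5, "Discovery": 4, "Development": 5,
            "User Acceptance Testing": 3, "Deploy": 2}


def stage_load(in_progress, capacity):
    """Return (stage, load ratio) pairs, most heavily loaded stage first."""
    loads = [(stage, in_progress[stage] / capacity[stage]) for stage in capacity]
    return sorted(loads, key=lambda pair: pair[1], reverse=True)


loads = stage_load(in_progress, capacity)
for stage, ratio in loads:
    if ratio > 1:
        flag = "over capacity"
    elif ratio == 1:
        flag = "at capacity"
    else:
        flag = "ok"
    print(f"{stage}: {ratio:.0%} ({flag})")
```

In this made-up example, discovery is carrying half again as much work as it can handle while development is already full, which is the pattern described above.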
Personally, I find it critical to perform this exercise every week, so that all teams across all stages can begin to anticipate when work may ramp up or slow down over the next two to four weeks. Armed with this knowledge, the organization can be proactive in moving some projects through the pipeline faster or pause other projects in order to even out the pipeline. (See Image A.)
Exercise #3: Identify How Work Flows
This process is the most detailed of the three, but it's valuable in unearthing processes that are overly complex or overly reliant on an individual. Here, we start with the end-to-end flow of how work moves forward and backward from stage to stage. We take it one step further by identifying the processes within each stage to uncover the true path of work as it flows through the organization.
To demonstrate this, I'm going to provide a hypothetical flow diagram at both the global level and within a particular stage (development). We'll then review how this information can be used to identify a bottleneck.
In Image B, we see the overall flow of work between the different stages of the website delivery pipeline. Note that we have to be realistic with ourselves: work can flow backwards, particularly in cases where a shortcut or an error introduced upstream wasn't captured. We have also explicitly noted the reliance on tools, systems, and other IT related needs to perform work at each stage.
In Image C, we zoom in on a hypothetical development workflow. To help surface our reliance on particular roles or individuals, we add that contextual data to each step. Similar to Image B, we're also explicit about scenarios where we go back upstream to business development or requirements analysis.
Although it can take a bit of time to think through each state with this much detail, the benefits are immense.
Suppose you discover that nearly every project has 10-20 hours of work that needs to go back to sales because of a misunderstanding about what was in or out of scope on a fixed-bid project. From the perspective of flow throughout the entire system, that is unplanned work that ultimately may require ripping up and replacing work in the discovery and development phases. While there is no way to prevent every possible type of miscommunication, knowing the heavy cost of these movements back upstream should motivate you to spend more time upstream ensuring that requirements are properly conveyed, so that the overall flow is optimized.
Suppose you discover that a single individual is required at nearly every step of the process. What happens if that team member has to take a sick day or gets pulled into another project? This can quickly result in a bottleneck because any number of steps in the development process could be stalled simultaneously.
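If the workflow from Image C is written down as a list of steps and the people able to perform each one, this kind of dependency can be surfaced mechanically. The following is a rough sketch only; the step names, the people, and the single_person_steps helper are invented for illustration and are not taken from the diagrams.

```python
from collections import defaultdict

# Hypothetical development workflow: each step and the people who can do it.
steps = {
    "Environment setup": ["Alex"],
    "Site building": ["Alex", "Sam"],
    "Theming": ["Jo"],
    "Code review": ["Alex"],
    "Deployment": ["Alex"],
}


def single_person_steps(steps):
    """Map each person to the steps that only they can perform."""
    sole = defaultdict(list)
    for step, people in steps.items():
        if len(people) == 1:
            sole[people[0]].append(step)
    return dict(sole)


risks = single_person_steps(steps)
# List the people with the most sole-owner steps first.
for person, owned in sorted(risks.items(), key=lambda kv: len(kv[1]), reverse=True):
    print(f"{person} is the only person who can do: {', '.join(owned)}")
```

Anyone who appears against several steps here is a candidate for the sick-day scenario described above.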
By the time you've completed all three of these exercises, the biggest bottleneck in the organization should become apparent. And given that this bottleneck is the single biggest obstacle to increasing the flow throughout the organization (e.g. roadwork on a highway during rush hour slows all cars), addressing it should become a top priority.
Breaking the Bottleneck
There are three basic themes we need to follow to maximize the flow through the system-wide bottleneck:
- Protect: Flow through the bottleneck should only be interrupted when absolutely necessary. Remember, “An hour lost at the bottleneck is an hour lost for the entire system.” (Eliyahu Moshe Goldratt)
- Elevate: Here, we optimize the bottleneck (hire more people, introduce efficiencies, reduce the number of steps, etc.).
- Subordinate: Work should only be released to the bottleneck as capacity is available. Any release of work prior to that can result in the bottleneck becoming overwhelmed and cause thrashing.
For more details on these terms, I highly recommend reading Goldratt’s The Theory of Constraints. For now, we can use these simplified definitions as a basis for each of the strategies listed below.
Hiring more people is the most obvious solution. If a particular segment of your company faces a consistent mismatch between personnel and workloads (see exercise #2), then hiring more people will theoretically allow more work to flow through. However, this is often the most costly of all solutions, particularly if your bottleneck is at the development stage, given that the Drupal community has been suffering from a talent shortage for years.
Does everyone within your organization work on their highest value tasks at all times? Chances are if you completed exercise #3, then you identified certain high-demand individuals working on things that could easily be offloaded to someone else with much more available capacity. A great example of this occurs during the QA process for cross browser testing, which is something that nearly everyone in a Drupal shop can assist with. By identifying areas that can be offloaded to others, overall flow can increase.
Role vs Person Dependencies
It's incredibly helpful to have smart, effective people in an organization who can jump in and get things done. Unfortunately, with great success often comes an over-reliance on those same individuals. This can become a dangerous habit. If unresolved, a single person could become a dependency in many different stages, as you may have identified in exercise #3.
If you're still not sure whether you have a person-specific dependency, perform the hit-by-a-bus thought experiment. If this person were hit by a bus tomorrow and someone else had to step in, could the work continue? If the answer is no, then you have a bona fide person dependency that should be resolved quickly.
Documentation can be incredibly valuable, both for emergencies as well as for training and/or delegating certain tasks to others. It also externalizes all the tribal knowledge in everyone's head. Knowledge transfer of any kind ensures that there is at least some redundancy.
If you took the time to dive deep into exercise #3, you may have been shocked at how often simple five-minute tasks had to change hands multiple times before they were completed. Using this information, you can see where steps can be consolidated, eliminated, or replaced by fewer, more efficient steps.
Piggybacking off the concept of reducing waste, you may discover that certain steps require the same manual intervention over and over again, resulting in both human error and dependency on a particular individual. In these circumstances, automation can improve the throughput of work through the bottleneck.
A great example here might be the creation of a common virtual machine that a developer can download and boot up so that they spend as much time as possible on the task at hand rather than fighting with server settings and configurations.
Controlling Release of Work
It's counterintuitive, but how quickly work is released to all downstream teams can break an organization. For example, ramping up the number of converted sales while the team responsible for performing discovery is already overbooked will only result in thrashing. It's better to slow down the number of leads until some space is created by work leaving that stage.
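The idea of releasing work only as capacity opens up can be expressed as a simple work-in-progress limit. The sketch below is one possible way to model it; the release_work function, the limit of three, and the lead and project names are assumptions chosen purely for illustration.

```python
from collections import deque


def release_work(backlog, in_progress, wip_limit):
    """Move items from the backlog into in_progress only while there is room."""
    released = []
    while backlog and len(in_progress) < wip_limit:
        item = backlog.popleft()
        in_progress.append(item)
        released.append(item)
    return released


# Hypothetical example: discovery can handle three projects at once.
backlog = deque(["Lead A", "Lead B", "Lead C"])
discovery = ["Project 1", "Project 2"]

first = release_work(backlog, discovery, wip_limit=3)   # only one slot is free
discovery.remove("Project 1")                           # a project leaves discovery
second = release_work(backlog, discovery, wip_limit=3)  # the freed slot is refilled

print(first, second, list(backlog))
```

Nothing enters discovery until something leaves it, so the remaining lead waits in the backlog instead of piling onto an overbooked team.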
Prioritize Internal Projects
In exercise #1, you may have identified a significant number of internal initiatives ranging in importance from mission critical to nice-to-have. Unfortunately, if the number of internal projects exceeds a certain threshold, you can cause both internal projects and client projects to stagnate.
A certain level of self-discipline is in order. Prioritize the top 5-10 initiatives that will truly result in a big impact (i.e., improve throughput in the bottleneck). By eliminating unnecessary work in progress (WIP), you've created the space to complete these important items, which then frees up more time and resources to work on the next batch.
Taming the Traffic Jam
Would these principles reduce the quantity and severity of traffic jams within a particular network? While we can never fully avoid the effects of a bad accident producing a mile-long parking lot, we can become proactive in how we can prevent, avoid, or work around more common situations (e.g. planned roadwork). The good news is that bottlenecks in a company’s workflow pose simpler problems with simpler solutions.
As Drupalistas and Drupal agencies, we need to improve our ability to execute, innovate, and ultimately deliver kickass Drupal websites to our clients.