From September 2013

Drupal Static Caching

Are You Giving Your Functions Some Static?

Drupal at scale is possible, and indeed, even powerful. Ask someone what they think of Drupal, though, and more often than not they'll tell you that they've heard it's slow. I've seen a lot of poorly-performing Drupal sites in my line of work, and caching is by far the most common reason for the gap between possibility and practice. Even the most basic Drupal installation brings an excellent multi-tier caching architecture to the table, but unfortunately it's easy for developers to break it.

Perhaps the most frustrating caching problem is when developers miss easy opportunities to leverage static caching in their custom modules. By storing computed function results in static PHP variables, further calls to the same function within a request can be made hundreds or thousands of times faster. Taking advantage of this technique requires minimal developer effort: if a result has already been computed, return it; otherwise, compute it and store it in the cache before returning it.

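function apachesolr_static_response_cache($searcher, $response = NULL) {
  // One cached response per searcher, shared for the whole request.
  $_response = &drupal_static(__FUNCTION__, array());

  // A response was passed in: store a copy of it.
  if (is_object($response)) {
    $_response[$searcher] = clone $response;
  }
  // Nothing cached yet for this searcher: make the miss explicit.
  if (!isset($_response[$searcher])) {
    $_response[$searcher] = NULL;
  }
  return $_response[$searcher];
}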

The Apache Solr module uses static caching in several places, such as ensuring that only one Solr search will be performed per request, even when there are several search-related blocks on the page.
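
Reduced to its essentials, the pattern needs nothing more than a PHP static variable. The following is a minimal sketch rather than code from any module: example_expensive_lookup() and example_miss_count() are hypothetical names, and the string manipulation merely stands in for slow work.

```php
<?php

// Hypothetical helper: counts how often the expensive branch really runs.
function example_miss_count($increment = FALSE) {
  static $count = 0;
  if ($increment) {
    $count++;
  }
  return $count;
}

// Hypothetical function whose results are cached for the rest of the request.
function example_expensive_lookup($key) {
  static $cache = array();

  // Cache hit: return the stored result without recomputing it.
  if (array_key_exists($key, $cache)) {
    return $cache[$key];
  }

  // Cache miss: do the work once, store it, then return it.
  example_miss_count(TRUE);
  $cache[$key] = strtoupper(strrev($key));
  return $cache[$key];
}

// A thousand calls with the same argument trigger a single computation.
for ($i = 0; $i < 1000; $i++) {
  example_expensive_lookup('menu');
}
$misses_after_loop = example_miss_count();
echo $misses_after_loop, "\n"; // Prints 1.
```

Using array_key_exists() instead of isset() means a legitimately NULL result is also treated as a hit, which matters when "nothing found" is itself expensive to determine.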

Like any caching solution, the performance benefits of static caching depend on whether the speed benefit of cache hits outweighs the performance overhead associated with cache misses. The largest performance gains come from caching functions that are time-consuming, repeated often within a single PHP execution, and expected to return the same value more often than not. This is a well-defined set of conditions, and a lot of Drupal code meets them.

Many core functions that do heavy lifting are statically cached: menu.module uses a tree data structure, which means that it does a lot of recursive, potentially repetitive work. By leveraging static caching, it can dramatically reduce the time spent processing menu items and menus used in multiple places on the page. Likewise, the queries involved in Drupal's taxonomy system are relatively complex, and a given taxonomy term might show up dozens of times on a list or gallery page. Five functions in taxonomy.module contain drupal_static() implementations to speed up this repeated work.

Static caching is also seen in core for functions that are very fast, but called in many places and expected to almost never change. drupal_is_front_page() and drupal_page_is_cacheable() are simple checks, but a slightly more involved example can be found in ip_address(). The first pass through this function may have to process the X-Forwarded-For header to determine the user's true IP address, and that requires string manipulation. While not particularly slow, string parsing is still much slower than the overhead from an additional function call and memory access.

This all sounds great so far, but what if the cached value is no longer accurate? Even though static caching only persists within a single request, it's still possible for the cached data to become stale. It's easy to invalidate the static cache from inside the caching function, but the cache variable's scope prevents other functions from accessing it. This means that a naive static caching implementation can introduce other problems.
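
The staleness problem is easy to reproduce. In this minimal sketch, example_color_store() is a hypothetical stand-in for a database row or other underlying data, and example_get_color_naive() caches it in an ordinary static variable that nothing else can reach.

```php
<?php

// Hypothetical backing store standing in for a database row.
function example_color_store($new_value = NULL) {
  static $color = 'blue';
  if (isset($new_value)) {
    $color = $new_value;
  }
  return $color;
}

// Naive static cache: only this function can see or clear $cached.
function example_get_color_naive() {
  static $cached;
  if (!isset($cached)) {
    $cached = example_color_store();
  }
  return $cached;
}

$first = example_get_color_naive();   // Reads the store: 'blue'.
example_color_store('green');         // The underlying data changes...
$second = example_get_color_naive();  // ...but the cache still says 'blue'.

echo $first, ' ', $second, ' ', example_color_store(), "\n";
```

The code that changed the stored value has no way to tell the getter that its cached copy is out of date.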

Prior to Drupal 7, it was up to each function whether to expose a way to invalidate the caches it contained. Some did so by tacking an ungainly 'invalidate cache' argument onto their function signature, but most didn't even bother doing that. The drupal_static() API was added in Drupal 7 so that functions can expose their static variables through a consistent interface, allowing other functions to modify or reset them even though they are outside the caching function's scope.
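
A rough sketch of how this looks in practice follows. The example_* functions are hypothetical, and the drupal_static() and drupal_static_reset() definitions are simplified stand-ins included only so the snippet runs outside Drupal; inside Drupal 7 the function_exists() guard skips them and the core versions are used.

```php
<?php

// Simplified stand-ins for running outside Drupal (not the core code).
if (!function_exists('drupal_static')) {
  function &drupal_static($name, $default_value = NULL, $reset = FALSE) {
    static $data = array(), $default = array();
    if (array_key_exists($name, $data)) {
      if ($reset) {
        $data[$name] = $default[$name];
      }
      return $data[$name];
    }
    if ($reset) {
      // Nothing stored under this name yet, so there is nothing to reset.
      return $data;
    }
    $default[$name] = $data[$name] = $default_value;
    return $data[$name];
  }

  function drupal_static_reset($name) {
    drupal_static($name, NULL, TRUE);
  }
}

// Hypothetical backing store standing in for a database row.
function example_price_store($new_value = NULL) {
  static $price = 10;
  if (isset($new_value)) {
    $price = $new_value;
  }
  return $price;
}

// Cached getter: its static variable is registered under the function name.
function example_get_price() {
  $price = &drupal_static(__FUNCTION__);
  if (!isset($price)) {
    $price = example_price_store();
  }
  return $price;
}

// Setter in a different scope: it can still invalidate the getter's cache.
function example_set_price($value) {
  example_price_store($value);
  drupal_static_reset('example_get_price');
}

$before = example_get_price();
example_set_price(25);
$after = example_get_price();
echo $before, ' ', $after, "\n"; // Prints "10 25".
```

Because the cache is keyed by name rather than hidden in a local static, the setter needs no special argument on the getter's signature to keep it accurate.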

The ip_address() function is a good example of the value of an API for static caching. Many sites have geolocation functionality (such as redirecting users to a default language or showing local content) and properly testing it is notoriously hard. The Smart IP module implements testing with a form of IP spoofing that checks user permissions and debug variables:

if (user_access('administer smart_ip') && variable_get('smart_ip_debug', FALSE)) {
  // Debug mode: use the configured test IP, falling back to the real one.
  $ip = variable_get('smart_ip_test_ip_address', ip_address());
  $location = smart_ip_get_location($ip);
}

This approach is fine for development, but what if you wanted to do automated testing of geolocated pages on your production site? The testing account would need to be given access privileges (which prevents testing of anonymous user functionality), and the IP address changes would affect site administrators too. Worse, every tweak to the testing IP address would result in a write to the variable table, which can cause performance degradation across the entire Drupal environment.

Because ip_address() exposes its static cache via drupal_static(), you could instead modify the function's return value. This would allow testing of any IP-based functionality (including geolocation) without requiring user accounts, permissions, or interaction with the variable table. Switching back to the normal IP address partway through the request just requires a call to drupal_static_reset(), at which point calling ip_address() again would parse the headers as if it were being called for the first time.

lists 180 calls to drupal_static() in core Drupal 7 modules, and expands that number to 563. Not every call to drupal_static() is a static cache, and not every static cache is implemented with drupal_static(), but this still gives us an idea of how important static caching is for improving Drupal's performance. Have you made sure that your contrib modules and custom development are leveraging it just as effectively?
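
To make the ip_address() override described above concrete, here is a minimal sketch. The helper example_with_test_ip() is hypothetical, the IP addresses are documentation-range placeholders, and the drupal_static(), drupal_static_reset(), and ip_address() definitions are simplified stand-ins so the snippet runs outside Drupal (the stand-in ip_address() ignores reverse proxies entirely). Inside Drupal 7 the guards skip them in favor of core.

```php
<?php

// Simplified stand-ins for running outside Drupal (not the core code).
if (!function_exists('drupal_static')) {
  function &drupal_static($name, $default_value = NULL, $reset = FALSE) {
    static $data = array(), $default = array();
    if (array_key_exists($name, $data)) {
      if ($reset) {
        $data[$name] = $default[$name];
      }
      return $data[$name];
    }
    if ($reset) {
      // Nothing stored under this name yet, so there is nothing to reset.
      return $data;
    }
    $default[$name] = $data[$name] = $default_value;
    return $data[$name];
  }

  function drupal_static_reset($name) {
    drupal_static($name, NULL, TRUE);
  }
}

if (!function_exists('ip_address')) {
  function ip_address() {
    $ip_address = &drupal_static(__FUNCTION__);
    if (!isset($ip_address)) {
      $ip_address = isset($_SERVER['REMOTE_ADDR']) ? $_SERVER['REMOTE_ADDR'] : '127.0.0.1';
    }
    return $ip_address;
  }
}

// Hypothetical helper: run $callback while ip_address() reports $test_ip.
function example_with_test_ip($test_ip, $callback) {
  // Overwrite the cached value through the shared static variable.
  $ip = &drupal_static('ip_address');
  $ip = $test_ip;

  $result = call_user_func($callback);

  // Clear the override so the next call detects the real address again.
  drupal_static_reset('ip_address');
  return $result;
}

// Only for the standalone demo: give the command line a client address.
if (!isset($_SERVER['REMOTE_ADDR'])) {
  $_SERVER['REMOTE_ADDR'] = '198.51.100.4';
}

$spoofed = example_with_test_ip('203.0.113.7', 'ip_address');
$restored = ip_address();
echo $spoofed, ' ', $restored, "\n";
```

Nothing here touches permissions or the variable table; the override lives and dies within the current request.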
