From August 2011
Article:

Hooked On Security

As a content management system, Drupal has long recognized that security is a key component of managing content. A CMS in which the wrong people can edit a page is not very useful. Drupal has a number of systems to manage access.

First is Drupal's permissions system. Any code can easily check to determine if a given user has a given permission in the system and alter its behavior, or reject the user entirely if not.
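
As a rough illustration of that kind of check, here is a minimal sketch of a Drupal 7 page callback built around user_access(). The module name "example", the two permission names, and the admin path are hypothetical, and the permissions would still need to be declared in hook_permission().

<?php
/**
 * Page callback for a hypothetical report page.
 */
function example_report_page() {
  // user_access() checks the current user unless an account object is
  // passed as the second argument.
  if (!user_access('view example reports')) {
    // Reject the user entirely with a 403 response.
    return MENU_ACCESS_DENIED;
  }

  $output = t('Report contents go here.');

  // Alter behavior for users who hold a second, broader permission.
  if (user_access('administer example reports')) {
    $output .= ' ' . l(t('Configure reports'), 'admin/config/example');
  }
  return $output;
}
?>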

Second -- and more importantly -- is the node access system.

Drupal's node access system actually has two parts: Runtime access control and the grants system. Both have existed for years, but saw major improvements in Drupal 7 as a result of a meeting at DrupalCon Szeged. Let's have a look at how these systems work to keep content secure.

First, we need to understand what sort of access we are controlling. There are five operations that a user can perform on a node: Create, Read (view), Update, Delete, and List. The first four are obvious, but the final, List, is a special case of "Read". It's a special case because, when listing nodes, we have tighter constraints on how we can determine access. We'll see why in a moment.

In code, anywhere we want to check a user's access to a node we can simply call node_access($op, $node), where $op is one of "view", "update", "delete", or "create" and $node is a fully-loaded node object or, in the case of create, the node type to create. To check a user other than the current user, pass a user object as the optional third parameter. node_access() will return either boolean true or boolean false.
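
The calls described above might look like the following sketch. It assumes a bootstrapped Drupal 7 site; the node ID, the user ID, and the "article" node type are placeholders rather than anything the system requires.

<?php
// Placeholder IDs, for illustration only.
$node = node_load(42);

// Can the current user update this node?
if ($node && node_access('update', $node)) {
  // Show an edit link, for instance.
}

// For 'create', pass the node type's machine name instead of a node object.
$can_create = node_access('create', 'article');

// To check a specific user, pass the account as the third parameter.
$account = user_load(7);
$can_view = $node && $account && node_access('view', $node, $account);
?>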

node_access() runs a number of checks. If the user has "bypass node access" permission, it returns true unconditionally. In earlier versions of Drupal this permission was part of the overly-broad "administer nodes" permission. If the user does not have "access content" permission, it will return false unconditionally. That is useful for limiting a site to authenticated users only.

Now we get to the meat of the node access system, where we can take control. First, node_access() invokes hook_node_access() with the node, operation, and user in question. In Drupal 6 and earlier, hook_access() was a pseudo-hook, and only the module that defined the node type could implement it. For Drupal 7, that was replaced with hook_node_access() to allow all modules to influence the access of any node. Implementations of hook_node_access() may return one of three constant values: NODE_ACCESS_ALLOW, NODE_ACCESS_DENY, or NODE_ACCESS_IGNORE. Returning nothing is equivalent to returning NODE_ACCESS_IGNORE.

A given user will have access to perform an operation, say "update," if and only if at least one module returns NODE_ACCESS_ALLOW and no module returns NODE_ACCESS_DENY. Because Drupal will deny access to a node by default, it is rare for access control modules to explicitly deny access, as that prevents other modules from granting access.
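
To make the "deny wins, otherwise any allow wins" rule concrete, here is a small standalone sketch of how the hook results could be combined. It is not the node module's actual code: the SKETCH_* constants and the function name are hypothetical stand-ins, so the snippet can run outside Drupal.

<?php
// Hypothetical stand-ins for the NODE_ACCESS_* constants.
define('SKETCH_ALLOW', 'allow');
define('SKETCH_DENY', 'deny');
define('SKETCH_IGNORE', NULL);

/**
 * Combines the values returned by every hook implementation.
 *
 * @param array $results
 *   One entry per module that implemented the hook.
 *
 * @return bool|null
 *   FALSE if any module denied, TRUE if at least one module allowed and
 *   none denied, NULL if no module expressed an opinion.
 */
function sketch_combine_node_access(array $results) {
  // A single explicit denial overrides everything else.
  if (in_array(SKETCH_DENY, $results, TRUE)) {
    return FALSE;
  }
  // With no denial, one grant is enough.
  if (in_array(SKETCH_ALLOW, $results, TRUE)) {
    return TRUE;
  }
  // Nobody decided; the caller falls through to its remaining checks.
  return NULL;
}
?>

With these definitions, sketch_combine_node_access(array(SKETCH_ALLOW, SKETCH_DENY)) returns FALSE, which is why a module that denies explicitly shuts out every other module's grant.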

Familiar node access permissions such as "edit page content" or "edit own article content" are provided by the node module's implementation of its own hook and -- as of Drupal 7 -- are provided for all node types regardless of the module that created them. They can now also be disabled by setting the node_permissions_$type variable to false. That's useful if you are using some other access logic and want to entirely disable the permission-based controls. (Just remember to re-enable them in your uninstall hook.)
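
A minimal sketch of that toggle, as it might appear in a module's .install file, follows. The module name "example" and the "article" node type are assumptions, as is the premise that deleting the variable restores the default (enabled) behavior.

<?php
/**
 * Implements hook_install().
 */
function example_install() {
  // Turn off the node module's permission-based checks for one node type.
  variable_set('node_permissions_article', 0);
}

/**
 * Implements hook_uninstall().
 */
function example_uninstall() {
  // Remove our override so the permission-based checks apply again.
  variable_del('node_permissions_article');
}
?>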

As an example, let's say we want to allow users to edit (but not delete) their own nodes, but only for an hour after each node is posted. That allows a user to correct typos they find right after hitting submit (which always happens) or to remove an inappropriate remark, but not to go back and change the node days later. The simple implementation is shown in the following code snippet. As we can see, if one of our conditions fails we do not deny access; we simply do not grant access and let other modules decide what to do. (Making the affected node types and time period configurable is left as an exercise for the reader.)

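<?php
/**
 * Implements hook_node_access().
 */
function example_node_access($node, $op, $account) {
  // The operation name for editing is 'update'. On 'create', $node is a type
  // name rather than an object, but the $op check runs first, so the
  // property reads below only happen for real nodes.
  if ($op == 'update'
      && $node->uid == $account->uid
      && $node->created > (REQUEST_TIME - 3600)) {
    // The user's own node, less than an hour old: allow editing.
    return NODE_ACCESS_ALLOW;
  }
  // Otherwise express no opinion and let other modules decide.
  return NODE_ACCESS_IGNORE;
}
?>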

This is incredibly powerful, and yet was completely impossible in Drupal 6 unless our module defined the node type in the first place. In Drupal 7, it's one if() statement.

There are two other checks made:

  1. If no module decided to either grant or deny access to a node, we check to see if the node is unpublished and the user has the "view own unpublished content" permission. If so, and it's the user's own node, permission is granted.
  2. The check of last resort is the node access grants system. This is Drupal's most fine-grained -- but least understood -- access system. It is also the only one that can handle List operations. Consider the case of listing the 10 most recent forum posts on a site where not all users have access to all forums. If we simply queried for the 10 most recent nodes of type forum, we could get a number of nodes that the user shouldn't see because they don't have access to view them. If we wanted to filter those out, we would have to load all those nodes and then run node_access() on each of them in turn, after which we might have fewer than 10 nodes left! We could then query again for more nodes and repeat the check, but could easily find ourselves back in the same situation.

List operations are tricky because the filtering must be done in the search operation itself, such as a database query. To that end, Drupal includes a node_access table in the database that acts as a giant materialized access lookup table. Any module may inject rules into it, keyed by group (usually, but not always, user ID) and node ID.

The node_access() function will check that table directly for a record, but in practice the grants system is more useful when running listing queries. Listing queries for nodes must always use the db_select() query builder and be tagged with the "node_access" tag. That in turn fires hook_query_node_access_alter(), which allows the node module to add an extra join to the query itself to filter out nodes that the user doesn't have access to according to the node_access table. An even better approach is to always run node listing queries using the EntityFieldQuery query builder. Although it does not offer as much fine-grained control as SQL, it will translate the listing query for whatever storage engine is in use for nodes: SQL, MongoDB, Cassandra, etc. It will also apply the node_access filter appropriate to that backend.
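
Returning to the "10 most recent forum posts" case, the two approaches might be sketched as follows. Both functions assume a bootstrapped Drupal 7 site; the function names are hypothetical, and the filter on published nodes is an added assumption, not something the access system requires.

<?php
/**
 * Lists recent forum node IDs with a tagged db_select() query.
 */
function example_recent_forum_nids_sql() {
  return db_select('node', 'n')
    ->fields('n', array('nid'))
    ->condition('n.type', 'forum')
    ->condition('n.status', NODE_PUBLISHED)
    ->orderBy('n.created', 'DESC')
    ->range(0, 10)
    // The tag is what lets the node module add its access filtering.
    ->addTag('node_access')
    ->execute()
    ->fetchCol();
}

/**
 * Lists recent forum node IDs with EntityFieldQuery.
 */
function example_recent_forum_nids_efq() {
  $query = new EntityFieldQuery();
  $result = $query
    ->entityCondition('entity_type', 'node')
    ->entityCondition('bundle', 'forum')
    ->propertyCondition('status', NODE_PUBLISHED)
    ->propertyOrderBy('created', 'DESC')
    ->range(0, 10)
    ->execute();
  // Results are keyed by entity type, then by entity ID.
  return isset($result['node']) ? array_keys($result['node']) : array();
}
?>

Either list of IDs can then be handed to node_load_multiple(). Because the filtering happens inside the query, the range of 10 applies only to nodes the user may actually see.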

The grants system has several hooks of its own, which we don't have space to cover here. For now, understand that in practice it is only useful for controlling view listings. Update and Delete usually both require a full node anyway, which makes it easier, and more flexible, to just use hook_node_access().

Drupal 7's access control system has improved dramatically. The introduction of a single hook in the right place has made possible functionality that didn't exist before, and modules such as Organic Groups and Workbench are already taking advantage of it to build more efficient and powerful functionality. It will be exciting to see what other new capabilities module developers come up with in contrib.