Write a Migrate Process Plugin, Learn Drupal 8
A few of us were coaching Campbell Vertesi on porting the CSV source to Drupal 8 and he asked as an aside about mapping US states he had in a taxonomy vocabulary to taxonomy IDs during a migration. Glad you asked! The answer gives us an example for quite a few concepts in Drupal 8, so let’s dig in! We will go over the code line by line.
Plugins
This particular class is a plugin. Plugins are normal objects in a predefined directory with a little metadata. For example, field widgets and formatters are plugins: they get a field and they return a form or a render array. We can change the formatter freely; only the type and meaning of the inputs and the output are fixed. Image effects are another good example. Migrate uses plugins for everything: sources, processing, destinations. See more.
Namespaces, PSR-4
Line 8 contains a namespace declaration: the first part is Drupal, then the module name migrate_plus, then the rest. For a plugin, the rest typically starts with a Plugin part, followed by the name of the defining module (migrate) and finally the type of plugin (process) if the defining module has several. Not every plugin type requires such a long namespace; for example, entities simply use Entity after the module name: Drupal\taxonomy\Entity. Drupal 8 will look for classes of the migrate_plus module under modules/migrate_plus/src (and all the other usual places for modules), and the rest of the path is the same as the namespace. This is specified by the PSR-4 standard, so this class is in the directory modules/migrate_plus/src/Plugin/migrate/process (sneak preview: a few lines later we will find the class name is TermReference and so the filename is TermReference.php).
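To make the namespace-to-path mapping concrete, here is a minimal sketch in plain PHP. The psr4_path() helper is hypothetical and written only for illustration; it is not Drupal's autoloader, and it assumes a single namespace prefix registered for a single base directory.

```php
<?php

/**
 * Maps a class name to a file path the way PSR-4 describes.
 *
 * Hypothetical helper for illustration; not Drupal's actual autoloader.
 */
function psr4_path(string $class, string $prefix, string $base_dir): ?string {
  $prefix = trim($prefix, '\\') . '\\';
  $class = ltrim($class, '\\');
  // The class must live under the registered namespace prefix.
  if (strncmp($class, $prefix, strlen($prefix)) !== 0) {
    return NULL;
  }
  // Everything after the prefix maps one-to-one onto directories.
  $relative = substr($class, strlen($prefix));
  return rtrim($base_dir, '/') . '/' . str_replace('\\', '/', $relative) . '.php';
}

echo psr4_path(
  'Drupal\migrate_plus\Plugin\migrate\process\TermReference',
  'Drupal\migrate_plus',
  'modules/migrate_plus/src'
), "\n";
```

With the prefix and directory from this section, the call resolves to modules/migrate_plus/src/Plugin/migrate/process/TermReference.php.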
Use Statements
Lines 10-16 contain use statements. Writing use some\namespace\class allows us to write just class in later code, and the Drupal coding standards require this. It really is just syntactic sugar: you can even use non-existing classes. As an aside, many of us have found the PhpStorm IDE very convenient for Drupal 8 development: for example, it takes care of the file placement and naming from the previous section and adds these use statements automatically for you.
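The claim that use is only syntactic sugar can be checked with a few lines of plain PHP. This is a small sketch, and the namespace Some\Missing\Thing is made up: no such class exists anywhere.

```php
<?php

// Hypothetical namespace and class: nothing named Some\Missing\Thing exists.
use Some\Missing\Thing;

// The short name is expanded when the file is compiled, so nothing is loaded.
echo Thing::class, "\n";

// Only a real lookup of the class notices that it is missing.
var_dump(class_exists(Thing::class));
```

The use line itself raises no error; PHP only goes looking for the class when code actually needs it.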
Annotations
Lines 21-23 contain an annotation. Annotations are a very useful feature in sane languages (like Python), so much so that the PHP community has implemented them in user space… several times. As such, Drupal 8 uses the annotation syntax of Doctrine on classes and PHPUnit annotations on tests. The Doctrine annotations are pretty close to a PHP array except that {} is used instead of array(). We can see a very simple example here: this is using the MigrateProcessPlugin annotation and the plugin definition is array('id' => 'term_reference'). Every plugin must have at least an id. In previous versions of Drupal you would’ve used a hook_migrate_process_info returning an array keyed by the same id and some data. Although the info hooks are gone, the alter hooks are still here: for example, migrate_process_info_alter is a valid hook (although at this moment undocumented, as its utility is severely limited). Other similar hooks, however, are much more useful, for example hook_entity_info_alter.
MigrateProcessPlugin itself is a class in the Drupal\migrate\Annotation namespace, and it’s useful to know this because this class is the nexus of information about process plugins.
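To show how metadata can live in a comment at all, here is a toy sketch in plain PHP. It is not how Doctrine parses annotations: the read_plugin_id() function and the TermReferenceSketch class are hypothetical, and the regular expression only understands this one simple case.

```php
<?php

/**
 * Stand-in for the plugin class discussed in this post.
 *
 * @MigrateProcessPlugin(
 *   id = "term_reference"
 * )
 */
class TermReferenceSketch {}

/**
 * Toy reader: pulls the id out of the docblock with a regular expression.
 *
 * Doctrine's real parser is far more capable; this only shows the idea.
 */
function read_plugin_id(string $class): ?string {
  $doc = (new ReflectionClass($class))->getDocComment();
  $pattern = '/@MigrateProcessPlugin\(.*?id\s*=\s*"([^"]+)"/s';
  if ($doc !== FALSE && preg_match($pattern, $doc, $matches)) {
    return $matches[1];
  }
  return NULL;
}

echo read_plugin_id(TermReferenceSketch::class), "\n";
```

The point is only that the docblock is available at runtime through reflection, which is what makes user space annotations possible in PHP.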
Classes, Base Classes and Interfaces
Line 25 contains the class name, a base class and an interface. Interfaces are one of the fundamental building blocks of Drupal 8. An interface provides a contract: classes that implement it agree to provide certain functionality, so that they can be used the same way as any other class implementing the interface. In other words, every such class will have certain methods which take a certain kind of input and provide a certain kind of output. Interfaces are absolutely fundamental to plugins, since any code interacting with a plugin will only know about the methods the interface requires and nothing about the details of the plugin itself. Because of this, plugin types can require their plugins to implement a specific interface, and Drupal will throw an exception if they don’t.
Base classes are not a language feature, but they are typical of Drupal 8: these classes contain some useful common logic for implementing an interface. Extending these instead of implementing an interface from scratch is very strongly recommended (although not mandatory at all). Some interfaces do not have a base class, for example ContainerFactoryPluginInterface.
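The relationship between an interface, a base class and a concrete plugin can be sketched in a few lines of plain PHP. All names here are hypothetical, and the real Migrate process interface has a richer transform() signature than this one-argument version.

```php
<?php

// Hypothetical names throughout; Migrate's real interface is richer.
interface ProcessSketchInterface {
  public function transform($value);
}

// A base class holds logic every plugin of this type would otherwise repeat.
abstract class ProcessSketchBase implements ProcessSketchInterface {
  protected $configuration;

  public function __construct(array $configuration = []) {
    $this->configuration = $configuration;
  }
}

// The plugin only supplies what is specific to it.
class Uppercase extends ProcessSketchBase {
  public function transform($value) {
    return strtoupper($value);
  }
}

// Calling code relies on the interface alone and rejects anything else.
function run_plugin($plugin, $value) {
  if (!$plugin instanceof ProcessSketchInterface) {
    throw new InvalidArgumentException('Plugin must implement ProcessSketchInterface.');
  }
  return $plugin->transform($value);
}

echo run_plugin(new Uppercase(), 'ny'), "\n";
```

Note that run_plugin() never mentions Uppercase: it would work unchanged with any other class that honours the same contract.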
Services, Injection
We will skip the constructor for now and talk about the create method starting on line 40, which is required for implementing ContainerFactoryPluginInterface, and then we will cover the constructor briefly.
Previous versions of Drupal were often strongly coupled: hardwired function calls were the norm. In Drupal 8 a lot of functionality is provided by so-called services. There is a service for all sorts of things: working with entities, logging information, installing modules etc. Services are retrieved from the container. The container itself is an object, and its most used method (by far) is get, as visible on line 46. You can find the services provided by core here. Because the container provides so many things, it is not good practice to pass the container to an object and store it there. Doing so makes a class harder to understand (and to test), as it can basically depend on anything. Instead, only the static create method gets the container; it passes the necessary services to the constructor, and the class itself now has clean dependencies.
By far the most commonly used service is the entity manager: the getDefinition method gives us the entity type object, the equivalent of entity_get_info in Drupal 7. The getStorage method gives us the storage object, which in turn can query and load entities of a particular type. (Then the entity objects can save themselves.) If we are not coding a nice little plugin, then the entity manager can also be accessed at \Drupal::entityManager(). The Drupal class has methods for the most common functionality. Most of these methods are just wrapping a $container->get() call, so this list is also useful as a list of services. See more on services.
So the create method grabs the taxonomy term storage object and passes it to the constructor. The constructor in turn calls the base class constructor, which initializes the common plugin properties; our constructor then initializes our own properties. Most importantly, the term storage is now available to every method in the class.
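The create-then-construct pattern described in this section can be sketched without Drupal at all. Everything below is a hypothetical stand-in: ContainerSketch is a toy (Drupal's real container comes from Symfony and does far more), and the service id term_storage_sketch is invented.

```php
<?php

// Toy container; Drupal's real container does far more than this.
class ContainerSketch {
  private $services = [];

  public function set(string $id, $service): void {
    $this->services[$id] = $service;
  }

  public function get(string $id) {
    if (!isset($this->services[$id])) {
      throw new RuntimeException("Unknown service: $id");
    }
    return $this->services[$id];
  }
}

// Hypothetical stand-in for a term storage object.
class TermStorageSketch {}

class TermLookupSketch {
  protected $termStorage;

  // The constructor lists exactly what the class depends on.
  public function __construct(TermStorageSketch $term_storage) {
    $this->termStorage = $term_storage;
  }

  // Only this static factory ever sees the container.
  public static function create(ContainerSketch $container) {
    return new static($container->get('term_storage_sketch'));
  }

  public function getTermStorage() {
    return $this->termStorage;
  }
}

$container = new ContainerSketch();
$container->set('term_storage_sketch', new TermStorageSketch());
$plugin = TermLookupSketch::create($container);
var_dump($plugin->getTermStorage() instanceof TermStorageSketch);
```

Because the constructor takes the storage directly, a test can build TermLookupSketch with a fake storage and never touch a container.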
Entity Query
We have a getTermId helper method, not required by any interface. It cannot be, as interfaces have public methods only. This method queries the term storage for the terms in the specified vocabulary. This perhaps looks familiar, almost like a database query in Drupal 7. This, however, is for entities only, and the condition method is extremely powerful: for example, to find nodes posted by users who joined in the last hour, use condition('uid.entity.created', REQUEST_TIME - 3600, '>'). Also, in general, using SQL queries was already discouraged in Drupal 7, but in Drupal 8 it’s safe to assume that accessing the database directly is just doing it wrong.
The entity query returns a list of entity ids and then we load those terms. The next interesting tidbit is $term->name->value: this is one of the ways to access a field value in D8, but here it’s mostly just for demo; using the proper method $term->label() is strongly preferred. This $entity->fieldname->propertyname chain can continue: we can write $node->uid->entity->created->value to get the created time of the node author.
The entity query condition closely mirrors this syntax: change the arrows to dots, optionally drop the main property (in this case value), and you get the previously mentioned condition('uid.entity.created', ...) to query the same. The Entity API is a really powerful feature of Drupal 8.
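The arrows-to-dots rule is mechanical enough to write down as a tiny function. This is a hypothetical helper in plain PHP, not part of the Entity API, and it assumes the main property is named value, which is not true of every field type.

```php
<?php

/**
 * Turns a property chain such as 'uid->entity->created->value' into the
 * dotted path an entity query condition would use.
 *
 * Hypothetical helper; assumes the main property is called 'value'.
 */
function chain_to_condition_path(string $chain, string $main_property = 'value'): string {
  $parts = explode('->', $chain);
  // Dropping the trailing main property is optional in a real condition.
  if (count($parts) > 1 && end($parts) === $main_property) {
    array_pop($parts);
  }
  return implode('.', $parts);
}

echo chain_to_condition_path('uid->entity->created->value'), "\n";
```

For the chain from this section the helper yields uid.entity.created, the same path used in the condition example earlier.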
Process Plugins
Finally we arrive at the transform method, which is the only method required from a process plugin. Migrate works by reading a row from a source plugin, then running each property through a pipeline of process plugins, and then handing the resulting row to a destination plugin. Each process plugin gets the current value and returns a value. Core provides quite a number of these; a list can be found here. Most process plugins are really small: the average among the core process plugins is a mere 58 LoC (lines of code), and there is only one above 100 LoC: the migration process plugin, which is used to look up previously migrated identifiers, and even that is only 196 LoC.
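The pipeline idea can be sketched with plain PHP callables. This is only an illustration under stated assumptions: real process plugins are objects with a transform method, while the process_row() function, the row layout and the state lookup table below are all invented.

```php
<?php

/**
 * Runs each destination property through its own pipeline of callables.
 *
 * Hypothetical sketch: real process plugins are objects with a transform()
 * method, and this row layout is invented for illustration.
 */
function process_row(array $source_row, array $pipelines): array {
  $destination_row = [];
  foreach ($pipelines as $property => $steps) {
    $value = $source_row[$property] ?? NULL;
    // Each step receives the current value and returns the next one.
    foreach ($steps as $step) {
      $value = $step($value);
    }
    $destination_row[$property] = $value;
  }
  return $destination_row;
}

// Invented lookup table standing in for a query against term storage.
$state_term_ids = ['new york' => 12, 'ohio' => 35];

$pipelines = [
  'state' => [
    'trim',
    'strtolower',
    function ($name) use ($state_term_ids) {
      return $state_term_ids[$name] ?? NULL;
    },
  ],
];

print_r(process_row(['state' => '  New York '], $pipelines));
```

The last step plays the role of the term reference plugin from this post: it turns a state name into an id, and everything before it is just cleanup.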
In our case the actual functionality is just one line of code after all this setup. Of course this doesn’t include error handling etc.
So there you have it: in order to be able to run this single line of code, we needed to put a file in the right directory, containing the right namespace and classname, implement the right interfaces, get a service from the container and run an entity query.