Maximilian Rabus

There are various options for importing content into a Drupal website. One of these is a migration from csv files. Used correctly, this can make content creation much easier: with the necessary preparatory work, thousands of pages can be migrated at the touch of a button. Managing the content in csv files simplifies not only the initial migration but also later updates of these pages.

In this article, we will first discuss a migration via drush using a simple example. We will then look at how to carry out a migration without drush and how to change the process dynamically.

Preparation

Before we start with the actual migration, some preparations must first be made:

  • The modules migrate, migrate_tools, migrate_plus and migrate_source_csv must be activated.

  • You should also ensure that the latest version of drush is installed.

Once this has been done, we can move on to creating a simple csv file.

The csv file

A csv file resembles a table: the first line is usually used to name the columns, and each following line supplies one value per column. These columns represent values that can be assigned to any entity. Imagine, for example, a content type "people" for which fields are defined whose values can be filled by the columns specified in the csv file.

Our example csv file "people":

id,first_name,last_name,age,date_of_birth,married,hair_color,picture
1,Paul,Watson,24,1993-01-08,0,brown,paul.png
2,John,Miller,36,1980-06-24,1,black,john.png

As a formatted table this would look like this:

id  first_name  last_name  age  date_of_birth  married  hair_color  picture
1   Paul        Watson     24   1993-01-08     0        brown       paul.png
2   John        Miller     36   1980-06-24     1        black       john.png

One piece of content of this type can now be created for each line of the csv file that is not the header line.
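To make the relationship between header line and data lines concrete, here is a minimal sketch in plain PHP. It is not how migrate_source_csv reads the file; the helper name parse_people_csv() is made up for this illustration.

```php
<?php

/**
 * Parses csv text into rows keyed by the column names of the header line.
 *
 * Illustration only: parse_people_csv() is a hypothetical helper, not part
 * of Drupal or migrate_source_csv.
 */
function parse_people_csv(string $csv): array {
  $lines = preg_split('/\R/', trim($csv));
  // The first line names the columns.
  $header = str_getcsv(array_shift($lines), ',', '"', '\\');
  $rows = [];
  foreach ($lines as $line) {
    // Every further line becomes one row: column name => value.
    $rows[] = array_combine($header, str_getcsv($line, ',', '"', '\\'));
  }
  return $rows;
}

$csv = <<<CSV
id,first_name,last_name,age,date_of_birth,married,hair_color,picture
1,Paul,Watson,24,1993-01-08,0,brown,paul.png
2,John,Miller,36,1980-06-24,1,black,john.png
CSV;

$rows = parse_people_csv($csv);
echo count($rows), "\n";                                        // prints 2
echo $rows[0]['first_name'], ' ', $rows[1]['hair_color'], "\n"; // prints Paul black
```

Each row is an array such as ['id' => '1', 'first_name' => 'Paul', ...], which is roughly the shape in which the values are later available to the migration under their column names.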

The yml file

In order to assign the individual values of the csv file to the fields of the content type, a configuration file is required in addition to the csv file. The file for our "people" example could be named as follows:

"migrate_plus.migration.migrate_node_people"

"migrate_node_people" can be replaced by any name. The only important thing is the "migrate_plus.migration" in front of it so that the configuration recognizes this file as a migration.

The structure of such a file is explained step by step below.

id: people_csv_import
label: 'Example csv migration'
source:
 plugin: csv
 path: 'public://csv_file.csv'
 header_row_count: 1
 keys:
   - id

The first two lines

id: people_csv_import
label: 'Example csv migration'

give the migration an ID and a label to make it easier to identify. The next lines

source:
 plugin: csv
 path: 'public://csv_file.csv'

open the "source" section. Here we specify that we want to use a csv file within the public directory as the source for the migration. The following line states that the first line of the file contains the headings of the individual columns and must not be imported as data.

header_row_count: 1

"keys" lists the columns whose values distinguish the individual rows from each other. In our case this is only one column, the id.

keys:
- id
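Because the key columns have to identify every row uniquely, it can be worth checking the csv data for duplicates before an import. The following is a minimal sketch in plain PHP; find_duplicate_keys() is a hypothetical helper and the third row is made up to provoke a duplicate.

```php
<?php

/**
 * Returns the key values that occur more than once in the given rows.
 *
 * Hypothetical helper for illustration; not part of the migrate API.
 */
function find_duplicate_keys(array $rows, array $keys): array {
  $seen = [];
  $duplicates = [];
  foreach ($rows as $row) {
    // Build one identifier from all key columns of the row.
    $parts = [];
    foreach ($keys as $key) {
      $parts[] = $row[$key];
    }
    $id = implode('|', $parts);
    if (isset($seen[$id])) {
      $duplicates[$id] = $id;
    }
    $seen[$id] = TRUE;
  }
  return array_values($duplicates);
}

$rows = [
  ['id' => '1', 'first_name' => 'Paul'],
  ['id' => '2', 'first_name' => 'John'],
  // Made-up row that reuses an id.
  ['id' => '2', 'first_name' => 'Jane'],
];
print_r(find_duplicate_keys($rows, ['id']));
```

For the two rows of our example file the function returns an empty array. With the made-up third row it reports the id 2.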

Within the "process" section

process:
 field_complete_name:
   plugin: concat
   source:
     - first_name
     - last_name
   delimiter: _
 field_age: age
 field_date_of_birth: date_of_birth
 field_married: married
 field_hair_color:
   plugin: entity_generate
   source: hair_color
 field_picture:
   plugin: entity_generate
   source: picture
   default_values:
     uri: 'public://picture.png'

the values of the csv file are assigned to the fields of the content type "people".

"field_complete_name"

is a text field that should contain the complete name of the person. To achieve this, the "concat" plugin is used. This plugin concatenates the listed source values into one string. An optional delimiter is inserted between the values. The output for the first data line of the csv file would therefore be

"Paul_Watson"

A complete list of all plugins can be found here: https://www.drupal.org/node/2129651.
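The effect of "concat" with and without a delimiter can be imitated in a few lines of plain PHP. This is only a sketch of the behavior described above, not the plugin's implementation, and concat_columns() is a hypothetical name.

```php
<?php

/**
 * Joins the given source columns of a row, roughly what "concat" does.
 *
 * Hypothetical helper; not the plugin's actual implementation.
 */
function concat_columns(array $row, array $source, string $delimiter = ''): string {
  $values = [];
  foreach ($source as $column) {
    $values[] = $row[$column];
  }
  // The delimiter is only placed between the values.
  return implode($delimiter, $values);
}

$row = ['first_name' => 'Paul', 'last_name' => 'Watson'];
echo concat_columns($row, ['first_name', 'last_name'], '_'), "\n"; // prints Paul_Watson
echo concat_columns($row, ['first_name', 'last_name']), "\n";      // prints PaulWatson
```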

The next field

field_age: age

contains the age of the person as a number. Since the source has already been specified above and the value does not need to be changed, the name of the column can be assigned directly.

field_date_of_birth: date_of_birth

is a date field. This value can also be assigned directly. The only thing to note is the correct format of the date in the csv file: Year-Month-Day (e.g. 1999-06-24) for a date field that only contains the date, and Year-Month-DayTHour:Minute:Second (e.g. 1999-06-24T12:04:13) for a date field that contains the date and time. A way to accept a different format and still obtain a correct date is discussed later.
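Since a wrongly formatted date is easy to overlook in a large csv file, the values can be checked before the import. The sketch below uses PHP's DateTime class; matches_date_format() is a hypothetical helper and not part of the migrate API.

```php
<?php

/**
 * Checks whether a value matches a PHP date format exactly.
 *
 * Hypothetical helper for checking csv values before an import.
 */
function matches_date_format(string $value, string $format): bool {
  // The "!" resets all fields that the format does not mention.
  $date = DateTime::createFromFormat('!' . $format, $value);
  // Formatting the parsed date back must reproduce the input, which also
  // rejects overflowing values such as 1993-02-31.
  return $date !== FALSE && $date->format($format) === $value;
}

var_dump(matches_date_format('1993-01-08', 'Y-m-d'));                    // bool(true)
var_dump(matches_date_format('1999-06-24T12:04:13', 'Y-m-d\TH:i:s'));    // bool(true)
var_dump(matches_date_format('08.01.1993', 'Y-m-d'));                    // bool(false)
```

Note the backslash in front of the "T": without it, PHP treats the letter as a format character instead of a literal.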

field_married: married

is a Boolean. The only thing to note here is that a 0 (false) or a 1 (true) is specified in the csv file.

In our example there is a taxonomy

"hair_color"

whose terms are referenced by the field

"field_hair_color".

Here we make use of "entity_generate", another plugin, which searches the referenced taxonomy for a term whose name matches the source value. If no such term exists yet, it is created, added to the taxonomy and assigned to the field.
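The "look up, otherwise create" logic can be pictured with a small model in plain PHP. An array stands in for the taxonomy here; lookup_or_create_term() is a hypothetical function and says nothing about how the plugin is implemented internally.

```php
<?php

/**
 * Returns the ID of the term with the given name and creates the term first
 * if it does not exist yet.
 *
 * The array stands in for the taxonomy; this is a hypothetical model, not
 * the plugin's code.
 */
function lookup_or_create_term(array &$terms, string $name): int {
  $id = array_search($name, $terms, TRUE);
  if ($id === FALSE) {
    // No term with this name yet: create it with the next free ID.
    $id = $terms ? max(array_keys($terms)) + 1 : 1;
    $terms[$id] = $name;
  }
  return $id;
}

$terms = [];
$brown = lookup_or_create_term($terms, 'brown');
$black = lookup_or_create_term($terms, 'black');
$again = lookup_or_create_term($terms, 'brown');
echo "$brown $black $again\n"; // prints 1 2 1
```

The third call finds the existing term "brown" and returns its ID instead of creating a duplicate, which is the property that makes repeated imports safe.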

"field_picture"

is a reference to a picture. Here, too, we use "entity_generate" to use a possibly already existing file instead of creating a new one. In addition, "default_values" is used, which provides a way to fill additional fields of an entity with values if it does not yet exist. Here we specify the path to the file that we want to use as an image.

Note that fields that have not been created manually for an entity (e.g. "langcode", "status", etc.) can also be filled with values in this way.

The next step is to specify the type of content to be created. This mapping also belongs to the "process" section:

 type:
   plugin: default_value
   default_value: people

Finally, the "destination" section specifies the kind of entity to be created. In our case, a "node" of the type "people" is created.

destination:
 plugin: 'entity:node'

Below is the configuration file as a whole:

id: people_csv_import
label: 'Example csv migration'
source:
 plugin: csv
 path: 'public://csv_file.csv'
 header_row_count: 1
 keys:
   - id

process:
 field_complete_name:
   plugin: concat
   source:
     - first_name
     - last_name
   delimiter: _
 field_age: age
 field_date_of_birth: date_of_birth
 field_married: married
 field_hair_color:
   plugin: entity_generate
   source: hair_color
 field_picture:
   plugin: entity_generate
   source: picture
   default_values:
     uri: 'public://picture.png'

 type:
   plugin: default_value
   default_value: people

destination:
 plugin: 'entity:node'

To be able to use the configuration file as a migration, it must first be imported into the current configuration. This can be done via the interface under Administration > Configuration > Development > Synchronize or directly at the path

admin/config/development/configuration/single/import

Importing the migration

The migrate_tools module adds several drush commands that make it possible to import the migration that has just been created. The command

"drush migrate-status" ("drush ms")

can be used to call up an overview of all currently existing migrations.

By executing

"drush migrate-import people_csv_import" ("drush mi people_csv_import")

the migration with the specified id is run. In our example, two "nodes" of the type "people" are created. Other useful drush commands:

  • "drush migrate-rollback" ("drush mr")

    This command undoes a migration that has already been imported.

  • "drush migrate-stop" ("drush mst")

    Stops a running migration.

  • "drush migrate-reset-status" ("drush mrs")

    Resets the status of a migration to "idle". Only migrations that have the status "idle" can be imported.

  • "drush migrate-messages" ("drush mmsg")

    Displays all messages (e.g. error messages) of a migration.

As already mentioned, it is also possible to update migrations that have already been imported. If values in the csv file have been changed, an update can be carried out with the command

"drush mi migration_id --update"

Miscellaneous

id: people_csv_import
migration_tags:
 - example
 - import
migration_group: csv
label: 'Example csv migration'

It is possible to define tags and a group within the configuration file. The commands just listed can then be applied not only to individual migrations, but also to all migrations of a group or tag, e.g. with "drush mi --group=csv" or "drush mi --tag=example", to save time.

Not every value of the csv file has to be used and vice versa, not every field of the created entity has to be assigned a value. Even fields that are "required" do not necessarily have to be assigned a value. The corresponding entities are created anyway. This can become a problem if the entities are manually edited and saved again. The validation of the form will then fail because a required field has not yet been assigned a value. It is therefore advisable to always assign a value to fields that are "required".
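One way to catch this early is to compare the field names of the "process" section with a list of required fields. The sketch below does this in plain PHP; unmapped_required_fields() is a hypothetical helper, and the list of required fields is an assumption that has to be maintained by hand.

```php
<?php

/**
 * Lists the required fields for which the process section has no mapping.
 *
 * Hypothetical helper; the list of required fields is supplied by hand here.
 */
function unmapped_required_fields(array $process, array $required): array {
  return array_values(array_diff($required, array_keys($process)));
}

// Field names from the example configuration; the values are shortened.
$process = [
  'field_complete_name' => 'concat',
  'field_age' => 'age',
  'field_married' => 'married',
  'type' => 'default_value',
];
// Assumption for this sketch: these two fields are marked "required".
$required = ['field_complete_name', 'field_date_of_birth'];

print_r(unmapped_required_fields($process, $required));
```

Here the result contains "field_date_of_birth", because the shortened mapping leaves it out.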

Custom plugins

With the "migrate" module, we have the option of writing our own plugins to change the data in the csv file in a certain way. These plugins can be used in the same way as those already mentioned. Below is an example of a plugin that formats a timestamp into the correct format for a date field.

<?php

namespace Drupal\example_module\Plugin\migrate\process;

use Drupal\migrate\MigrateExecutableInterface;
use Drupal\migrate\ProcessPluginBase;
use Drupal\migrate\Row;

/**
 * Formats a timestamp so it can be used in a migration.
 *
 * @MigrateProcessPlugin(
 *   id = "format_timestamp",
 * )
 */
class FormatTimestamp extends ProcessPluginBase {

 /**
  * {@inheritdoc}
  */
 public function transform($value, MigrateExecutableInterface $migrate_executable, Row $row, $destination_property) {
   $date_formatter = \Drupal::service('date.formatter');
   // "T" must be escaped, otherwise it is read as a format character.
   return $date_formatter->format((int) $value, 'custom', 'Y-m-d\TH:i:s');
 }

}

This class was placed in a sample module under

example_module/src/Plugin/migrate/process/FormatTimestamp.php

because it is a "process" plugin. A plugin for "destination" would accordingly be placed under

example_module/src/Plugin/migrate/destination/DestinationPluginClass.php

The id within the annotation "@MigrateProcessPlugin" is the name that can be used later within the configuration file to call this plugin.

Within the "transform" function, the timestamp from the csv file is formatted into a valid date. When called, the function thus sits between the value of the csv file ("value") and the value that is finally assigned to the corresponding field (the "return" value). In addition to the value of the csv file, other parameters, such as the current row, are also passed so that it is possible to change values based on other information.
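The conversion itself can be tried out without Drupal. The following sketch uses PHP's gmdate() instead of Drupal's date formatter service, and the timestamp is a made-up example value. It also shows why the letter "T" in the format string needs a backslash.

```php
<?php

// A made-up timestamp for illustration: 8 January 1993, 00:00:00 UTC.
$timestamp = 726451200;

// Unescaped, "T" is a format character (timezone abbreviation), so the
// result is not a valid value for a date field.
$unescaped = gmdate('Y-m-dTH:i:s', $timestamp);

// Escaped with a backslash, "T" is output literally.
$escaped = gmdate('Y-m-d\TH:i:s', $timestamp);

echo $unescaped, "\n";
echo $escaped, "\n"; // prints 1993-01-08T00:00:00
```

gmdate() always works in UTC. Whether the date formatter in a given site should be passed an explicit timezone depends on how the field stores its values, so that part is left open here.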

If there is now a timestamp in the csv file where there was previously a valid date, the above example can be used as follows:

field_date_of_birth:
 plugin: format_timestamp
 source: date_of_birth

Migrations without drush

The above already offers many options for migrating content to a Drupal website. However, if you want to have a little more control over the migration process, all the commands described can also be executed without drush within the code.

use Drupal\migrate\MigrateExecutable;
use Drupal\migrate\MigrateMessage;

// Get plugin manager for migration configs.
$manager = \Drupal::service('plugin.manager.config_entity_migration');

// Get all migrations.
$migrations = $manager->createInstances([]);

// Select people_migration with id of yml file.
$people_migration = $migrations['people_csv_import'];

// Start migration.
$executable = new MigrateExecutable($people_migration, new MigrateMessage());
$executable->import();

The plugin manager for migrations returns all migrations currently in the configuration when "createInstances([])" is called with an empty array. The "import()" function at the end is the same function that "drush mi" calls. This piece of code, used in the right place, has the same effect as executing the migration via drush. An update of this migration can be executed with the following addition:

// Update this migration on next import.
$people_migration->getIdMap()->prepareUpdate();

This flags the migration so that its rows are updated the next time it is imported. Without this addition, a repeated import skips the rows that have already been imported and changes nothing.
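Why a second import does nothing until an update is prepared can be pictured with a small model in plain PHP. The functions import_rows() and prepare_update() are hypothetical and heavily simplified; Drupal's real ID map lives in database tables and knows more states.

```php
<?php

/**
 * Minimal model of an import with an ID map.
 *
 * $id_map remembers which source IDs were imported and whether they are
 * flagged for an update. Hypothetical code for illustration only.
 */
function import_rows(array $rows, array &$id_map): array {
  $processed = [];
  foreach ($rows as $row) {
    $id = $row['id'];
    if (isset($id_map[$id]) && !$id_map[$id]['needs_update']) {
      // Already imported and not flagged: skip the row.
      continue;
    }
    $processed[] = $id;
    $id_map[$id] = ['needs_update' => FALSE];
  }
  return $processed;
}

/**
 * Flags every imported row for an update, similar in spirit to prepareUpdate().
 */
function prepare_update(array &$id_map): void {
  foreach (array_keys($id_map) as $id) {
    $id_map[$id]['needs_update'] = TRUE;
  }
}

$rows = [['id' => '1'], ['id' => '2']];
$id_map = [];
$first = import_rows($rows, $id_map);
$second = import_rows($rows, $id_map);
prepare_update($id_map);
$third = import_rows($rows, $id_map);
echo count($first), ' ', count($second), ' ', count($third), "\n"; // prints 2 0 2
```

The first run imports both rows, the second run skips them, and only after the flag has been set are both rows processed again.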

If you do not want to import the corresponding "yml" file of the migration via the interface, you can store the file in a module under

example_module/config/install

The file is then automatically added to the configuration when the module is installed.

Please note that the file is not deleted from the configuration when the module is uninstalled. This leads to an error during a new installation, as the file to be installed already exists. The following addition within an "example_module.install" is therefore important. It deletes the file from the configuration when the module is uninstalled in order to avoid error messages:

/**
 * Implements hook_uninstall().
 */
function example_module_uninstall() {
 \Drupal::configFactory()
  ->getEditable('migrate_plus.migration.migrate_node_people')
  ->delete();
}

Migration Events

The "migrate" module dispatches a series of events during the run of the "import()" function. An "EventSubscriber" can react to these events in order to specifically influence the import of a migration.

Such an "EventSubscriber" is registered as a service in the "services.yml" file of the module.

services:
 example_module.event_subscriber:
  class: Drupal\example_module\EventSubscriber\MigrationEventSubscriber
  tags:
   - { name: event_subscriber }

An "EventSubscriber" can now be created under the specified path.

<?php

namespace Drupal\example_module\EventSubscriber;

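use Drupal\migrate\Event\MigrateEvents;
use Drupal\migrate\Event\MigrateImportEvent;
use Drupal\migrate\Event\MigratePostRowSaveEvent;
use Drupal\migrate\Event\MigratePreRowSaveEvent;
use Symfony\Component\EventDispatcher\EventSubscriberInterface;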

/**
 * Event Subscriber to handle migration events.
 *
 * @package Drupal\example_module\EventSubscriber
 */
class MigrationEventSubscriber implements EventSubscriberInterface {

 /**
  * Event which fires before the start of the import.
  *
  * @param \Drupal\migrate\Event\MigrateImportEvent $event
  * Current MigrateImportEvent object.
  */
 public function migratePreImport(MigrateImportEvent $event) {
  $migration = $event->getMigration();
 }

 /**
  * Event which fires before the row is going to be saved.
  *
  * @param \Drupal\migrate\Event\MigratePreRowSaveEvent $event
  * Current MigratePreRowSaveEvent object.
  */
 public function migratePreRowSave(MigratePreRowSaveEvent $event) {
  $migration = $event->getMigration();
  $row = $event->getRow();
 }

 /**
  * Event which fires after the Row has been saved.
  *
  * @param \Drupal\migrate\Event\MigratePostRowSaveEvent $event
  * Current MigratePostRowSaveEvent object.
  */
 public function migratePostRowSave(MigratePostRowSaveEvent $event) {
  $migration = $event->getMigration();
  $row = $event->getRow();
 }

 /**
  * Event which fires after the import has been executed.
  *
  * @param \Drupal\migrate\Event\MigrateImportEvent $event
  * Current MigrateImportEvent object.
  */
 public function migratePostImport(MigrateImportEvent $event) {
  $migration = $event->getMigration();
 }

 /**
  * {@inheritdoc}
  */
 public static function getSubscribedEvents() {
  $events[MigrateEvents::PRE_IMPORT][] = ['migratePreImport'];
  $events[MigrateEvents::PRE_ROW_SAVE][] = ['migratePreRowSave'];
  $events[MigrateEvents::POST_ROW_SAVE][] = ['migratePostRowSave'];
  $events[MigrateEvents::POST_IMPORT][] = ['migratePostImport'];

  return $events;
 }
}

Within "getSubscribedEvents()", a function can be defined for each of the "MigrateEvents" that is to be called when the respective event occurs. The event object contains the migration and, for the row events, also the object of the current row of the csv file. These objects can be changed here as required.

Conclusion

When used correctly, migrations are a powerful tool that can simplify the creation and updating of huge amounts of data with a little preparatory work. I hope that this guide has answered a few questions about migration and will make it easier for everyone to create and import migrations in the future.