Drupal Migration Sucks
Why is it so hard to debug?
At the time of writing there are still hundreds of thousands of sites running on Drupal 7, despite the fact that security support for it has officially ended. There are several likely reasons for this, though there is one in particular that deserves its own write-up.
Migrating is Hard
When faced with migrating a site from Drupal 7 to a newer version, two options present themselves:
- Use the Migrate API like a real man 💪
- Use the Feeds module like a sissy boy 😿
Using Feeds Instead
There are actually some strong cases for using the Feeds module:
- Simpler data sources - For straightforward CSV, RSS, or similar structured data imports without any need to manipulate data, Feeds can be quicker to set up.
- One-time imports - If you’re not migrating an active site, it can be easier to download everything as a set of CSV files (using Views Data Export for example) and manipulate the data manually before importing.
- Recurring imports - Feeds is good for scheduled, repeated imports from consistent sources like RSS feeds.
- Content creation by end-users - Feeds can be configured to allow site users to upload and import their own content within defined parameters, through a fairly user-friendly UI. No need to write PHP or YAML.
- No command-line - In environments where you can’t use Drush or other CLI tools, the UI is the only option. Migrate Tools provides a UI for the Migrate API, but if you’re going down that route you’re probably not going to use it (and you probably have shell access).
- Learning curve - Spending the time figuring out how to use the Migrate API might not be justified, even for developers or experienced site builders.
Conversely, there are some good reasons to use the Migrate API instead:
- Data manipulation - Migrate API offers process plugins for data manipulation that Feeds can’t match, even with Feeds Tamper.
- Configuration migration - Migrate API provides migrations for everything: fields, content types, users, module and site configuration. The command drush migrate:status provides a summary of all the content and config entities available across all migrations; that is to say, those provided by core, by Migrate Tools, and also your own artisanal hand-spun YAML files.
- Version control - Migration configurations are stored in YAML files that can be version-controlled and deployed across environments.
- Entity references - Better handling of entity references and maintaining relationships between content. If necessary this can include the creation of ‘stub’ entities that are populated in a subsequent migration.
- Dependency management - The ability to define migration dependencies ensures everything is created in the correct order (taxonomy vocabularies and their fields, then the terms, then the content that references them).
- Drush integration - It’s a lot quicker to use the CLI once you get down to the nitty gritty.
- Data validation - More sophisticated options for validating data during the import process, including the option to skip rows based on specific (or empty) values, and to log a custom message when that happens.
- Plugin-based - The Migrate API is built in such a way that makes it relatively easy for you to extend with your own source and process plugins.
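To make the data validation and entity reference points a bit more concrete, here is a minimal sketch of a process section. It skips any row with an empty title (logging a message as it goes) and resolves tag references through an earlier migration. The source column names (title, tag_id) and the my_tags migration ID are assumptions for illustration; skip_on_empty and migration_lookup are core process plugins.

```yaml
process:
  # Skip the whole row if the source title is empty, and record why.
  title:
    plugin: skip_on_empty
    method: row
    source: title
    message: 'Skipped: source row has no title'
  # Translate the old tag ID into the ID of the term created by my_tags.
  # If that term hasn't been migrated yet, migration_lookup creates a stub by default.
  field_tags:
    plugin: migration_lookup
    migration: my_tags
    source: tag_id
```

The logged messages should then show up when you run drush migrate:messages against the migration.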
So having decided you’re a real manly man who’s capable of using the Migrate API and has too much time on their hands, you then have to get to grips with it.
Anatomy of a Migration
Drupal migrations follow the Extract, Transform, Load (ETL) paradigm, and these stages are defined in the source, process and destination keys of a migration YAML file. You will generate a YAML file for each type of entity to be imported; a migration can only have one destination plugin.
Let’s use a migration for the ‘Article’ content type as an example:
id: articles
label: 'Content - Articles'
migration_group: site_content
source:
  # One source plugin is configured here
process:
  # Several process plugins go here; at least one for every value that must be set on the destination entity
  # Each process plugin may be as simple as setting a default value or copying a value from the source
destination:
  plugin: entity:node
  default_bundle: article
migration_dependencies:
  required:
    - my_tags
Extract (source)
Drupal’s migration system makes use of source plugins that connect to and extract data from various sources. These plugins can query SQL databases, parse CSV, XML, or JSON files, connect to REST APIs etc. Core provides some plugins, and more are available. You can also write your own if you think you’re hard enough.
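As a rough sketch of what a source section can look like when pulling straight from the old site's database, the following uses core's d7_node source plugin. The node type and the connection key are assumptions here: this only works if a matching database connection named migrate has been defined in settings.php.

```yaml
source:
  # Core source plugin that reads nodes from a Drupal 7 database.
  plugin: d7_node
  # Only fetch nodes of this type.
  node_type: article
  # Assumed name of the database connection key defined in settings.php.
  key: migrate
```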
Transform (process)
Transformation is the process of getting the data into the right format for Drupal to store, and may consist of:
- Mapping sources to destination fields
- Mapping specific values to other values
- Handling nested values (such as a text field’s format and value)
- Handling multiple values
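The four cases above can be sketched in a single process section. This is an illustration rather than a working migration: the source columns (state, keywords) and the field_keywords destination field are made up, while static_map, default_value and explode are core process plugins.

```yaml
process:
  # Straight copy from a source column to a destination field.
  title: title
  # Map specific source values to other values.
  status:
    plugin: static_map
    source: state
    map:
      published: 1
      draft: 0
    default_value: 0
  # Nested values: set each sub-property of the body field separately.
  body/value: body
  body/format:
    plugin: default_value
    default_value: basic_html
  # Multiple values: split a comma-separated string into an array.
  field_keywords:
    plugin: explode
    source: keywords
    delimiter: ','
```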
Load (destination)
The transformed data is then loaded into the database. The destination plugin handles creating new entities (nodes, taxonomy terms, fields, content types etc.) and updating existing entities.
The system maintains a “map” of migrated items, tracking which source items correspond to which destination entities, enabling updates and rollbacks.
Golden Contrib
The Migrate Tools and Migrate Plus modules are pretty much a hard requirement. You may as well install them before you start.
Migrate Tools
The Migrate Tools module extends Drupal’s core migration framework with powerful utilities that make migrations more manageable:
- Drush Commands: Execute and manage migrations via the command line with commands like:
  drush migrate:import my_migration
  drush migrate:status
  drush migrate:rollback my_migration
- Migration UI: A simple interface for viewing migration status and executing operations through the Drupal admin interface. If you’ve come this far it’s pretty much redundant; just use Drush.
- Migration Groups: Organize related migrations together and execute them as a unit.
- Detailed Messaging: Improved error handling and messaging to troubleshoot migration issues.
Migrate Plus
Migrate Plus further enhances the migration system with:
- Additional Source Plugins: Support for XML, JSON and SOAP sources via HTTP.
- Migration Groups: Configure collections of migrations via YAML files.
- Process Plugins: Additional transformation tools including:
  - Entity lookup
  - Entity generate
  - Conditionally skip items
  - URL handling
  - Transliteration
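A migration group is just another config file. Here is a minimal sketch, assuming a group called site_content and a shared database connection key called migrate; the label and description are placeholders.

```yaml
# migrate_plus.migration_group.site_content.yml
id: site_content
label: 'Site content'
description: 'Content migrated from the old Drupal 7 site'
source_type: 'Drupal 7'
# Settings merged into every migration that belongs to this group.
shared_configuration:
  source:
    key: migrate
```

Migrations opt in with migration_group: site_content, and Migrate Tools can then run the lot with drush migrate:import --group=site_content.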
Migrations: Structure
Migrations all have a source key, a process key, and a destination key that correspond to the terms Extract, Transform and Load in “ETL”. They may also specify their dependencies in the migration_dependencies key.
Migration definition:
id: my_articles
label: 'Article Content'
migration_group: my_migration_group
source:
  plugin: csv
  path: 'public://migrations/articles.csv'
  header_row_count: 1
  keys:
    - id
process:
  title: title
  body/value: body
  body/format:
    plugin: default_value
    default_value: 'full_html'
  field_tags:
    plugin: entity_lookup
    source: tag_ids
    entity_type: taxonomy_term
    bundle: tags
    value_key: name
destination:
  plugin: entity:node
  default_bundle: article
migration_dependencies:
  required:
    - my_tags
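The my_tags dependency has to exist as a migration in its own right. Here is a sketch of what it might look like, assuming a hypothetical tags.csv with id and name columns. It mirrors the CSV source keys used in the definition above; which keys the csv plugin accepts depends on the version of Migrate Source CSV you have installed.

```yaml
id: my_tags
label: 'Taxonomy - Tags'
migration_group: my_migration_group
source:
  plugin: csv
  # Hypothetical file with 'id' and 'name' columns.
  path: 'public://migrations/tags.csv'
  header_row_count: 1
  keys:
    - id
process:
  # Copy the CSV name column to the term name.
  name: name
destination:
  plugin: entity:taxonomy_term
  default_bundle: tags
```

Because my_articles lists my_tags as required, the terms get created first and the entity_lookup on field_tags has something to find.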
Debugging Migrations is Really Hard
That is, until you know what you’re doing. There is an informative page on the topic that I didn’t find until I’d already wasted many hours. You literally just have to google ‘debugging drupal migrations’ and it pops right up. F.
https://www.drupal.org/docs/drupal-apis/migrate-api/debugging-migrations
This is key:
To investigate data within a migration, you can print data by combining the callback plugin and var_dump():
process:
  dump_sourcevar:
    plugin: callback
    callable: var_dump
    source: sourcevar
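If you’d rather not have var_dump() output sprayed across your terminal, core also ships a log process plugin that records the value and passes it along unchanged, so it can sit in the middle of a pipeline. A minimal sketch, assuming a source column called title:

```yaml
process:
  title:
    # Record the incoming value as a migrate message, then pass it on.
    - plugin: log
      source: title
    # Continue the pipeline with the same value.
    - plugin: callback
      callable: trim
```

The logged values should end up alongside the migration’s other messages, which you can read with drush migrate:messages.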
When a migration fails you’ll need to reset its status before it can be rolled back: drush migrate:reset-status my_migration
Best Practices
- Version Control: Store migration configurations in code for consistent deployment.
- Testing: Test with a subset of data before running full migrations. You can limit the number of rows to process with drush migrate:import my_migration --limit=500
- Backup ALL your shit: If you’re using a custom plugin (or migrating to a custom entity type) you may not be able to roll back without some further work. Run ddev snapshot -n <snapshot_name> before you try a migration for the first time.
Resources
Official Documentation
- Drupal.org Migration Guide - Comprehensive official documentation covering migration concepts and implementation
- Migrate API Overview - Core concepts of the Migrate API
- Migrate Tools Documentation - Official documentation for the Migrate Tools module
- Migrate Plus Documentation - Documentation for the Migrate Plus module
Tutorials and Guides
- Lullabot’s Complete Migration Guide - Multi-part series covering migration fundamentals to advanced topics
- Drupalize.me Migration Courses - Paid training with in-depth video tutorials
- Agaric Migration Handbook - Comprehensive guide to Drupal migrations
Code Examples
- Migrate Upgrade - Module for upgrading from earlier Drupal versions
- GitHub - Migration Examples - Various migration projects and examples on GitHub
Community Resources
- Drupal Migration Contributor Group - Connect with migration experts
- #migration channel on Drupal Slack - Real-time help from the community
- Drupal StackExchange - Questions and answers related to migration
Tools
- Migration Builder - Helps generate migration configurations
- Migrate Devel - Development tools for debugging migrations
- Migrate Manifest - Run migrations from a manifest file
Blogs and Articles
- Pantheon’s Migration Guide - Step-by-step migration tutorials
- Mediacurrent Migration Resources - Articles on migration approaches and techniques
- Acquia’s Migration Best Practices - Migration strategies from Acquia
Advanced Topics
- Views Migration - Migrate Views configurations
- Migrate File - Tools for migrating files and media
- Migrate Source CSV - Enhanced CSV source plugin
Case Studies
- Tag1 Consulting Migration Case Studies - Real-world migration examples
- Chapter Three Migration Projects - Practical migration approaches