Automating Tasks with Drupal
Periodic Imports, Lightweight ETL, and CRM / ERP / Marketing Automation Integrations
Drupal is often perceived as a content-oriented CMS. Technically, it is also a robust automation framework, capable of handling periodic imports, lightweight ETL flows, and complex integrations with third-party systems (CRM, ERP, marketing automation).
This article focuses on the essential mechanisms, proven patterns, and recommended technical choices, through to external system integration.
1. Types of Automations in Drupal
Before implementing anything, it is essential to identify Drupal's exact role in the flow.
Common Use Cases
- Periodic imports (XML, CSV, JSON, REST API)
- Data synchronization (one-way or bidirectional)
- Data enrichment and transformation
- Triggering automated actions
- Exposing business APIs
Drupal can act as:
- Consumer: it retrieves data
- Processor: it transforms data
- Producer: it exposes or pushes data
In many projects, Drupal fills all three roles simultaneously.
2. Technical Foundations in Drupal
2.1 The Limits of Native Drupal Cron
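Drupal's cron (hook_cron()) is a poor fit for business-critical automations:
- Uncontrolled frequency (with the Automated Cron module, runs depend on site traffic)
- No parallelism: hooks run one after another in a single process
- No built-in retry after a failure
- Execution tied to Drupal's own cron run rather than to a dedicated scheduler
It should only be considered for tasks that are:
- Lightweight
- Free of critical business stakes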
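As an illustration of a task that does fit hook_cron(), here is a minimal sketch of a housekeeping job throttled with a stored timestamp, so it does not depend on how often cron fires. The module name my_module, the state key, and the my_module_import_log table with its created column are assumptions made for the example.

```php
<?php

/**
 * Implements hook_cron().
 *
 * Hypothetical housekeeping task: purge old rows from an assumed custom
 * table 'my_module_import_log', at most once per day.
 */
function my_module_cron(): void {
  $state = \Drupal::state();
  $now = \Drupal::time()->getRequestTime();

  // Cron may fire more often than needed, so throttle with a timestamp.
  $last_run = (int) $state->get('my_module.last_cleanup', 0);
  if ($now - $last_run < 86400) {
    return;
  }

  // Delete log rows older than 30 days (table and column are assumptions).
  \Drupal::database()->delete('my_module_import_log')
    ->condition('created', $now - 30 * 86400, '<')
    ->execute();

  $state->set('my_module.last_cleanup', $now);
}
```

If this job is skipped or fails once, nothing of business value is lost, which is what makes it acceptable here.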
2.2 The Queue API: The Core Building Block of Automation
The Queue API is essential for any reliable automated task.
Principle:
- One business unit = one queue item
- Asynchronous processing
- Retry on failure: an item whose processing throws an exception stays in the queue for a later run
- Controlled execution
$queue = \Drupal::queue('my_custom_import');
$queue->createItem($data);
Advantages:
- Decoupling of ingestion / processing
- Better error tolerance
- Horizontal scalability
- Cron / Drush compatibility
Any business-critical automation that bypasses the queue should be treated as an anti-pattern.
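The consuming side of the my_custom_import queue could look like the sketch below: a QueueWorker plugin that handles one business unit per item and upserts a node by external ID. The item shape, the product bundle, and the field_external_id field are assumptions, and static \Drupal:: calls are used for brevity where dependency injection would normally be preferred. The plugin is declared with the annotation syntax; recent Drupal versions also accept a PHP attribute.

```php
<?php

namespace Drupal\my_module\Plugin\QueueWorker;

use Drupal\Core\Queue\QueueWorkerBase;

/**
 * Processes items of the 'my_custom_import' queue, one business unit each.
 *
 * @QueueWorker(
 *   id = "my_custom_import",
 *   title = @Translation("My custom import"),
 *   cron = {"time" = 30}
 * )
 */
class MyCustomImportWorker extends QueueWorkerBase {

  /**
   * {@inheritdoc}
   */
  public function processItem($data) {
    // Assumed item shape: ['external_id' => string, 'title' => string].
    if (empty($data['external_id']) || empty($data['title'])) {
      // Malformed input will never succeed: log it and return normally so
      // the item is removed instead of being retried forever.
      \Drupal::logger('my_module')->warning('Skipping malformed import item.');
      return;
    }

    $storage = \Drupal::entityTypeManager()->getStorage('node');

    // Upsert: reuse the node that already carries this external ID, if any.
    $matches = $storage->loadByProperties([
      'type' => 'product',
      'field_external_id' => $data['external_id'],
    ]);
    $node = $matches ? reset($matches) : $storage->create([
      'type' => 'product',
      'field_external_id' => $data['external_id'],
    ]);

    $node->set('title', $data['title']);
    // An exception thrown by save() propagates, which leaves the item in
    // the queue so that a later run can retry it.
    $node->save();
  }

}
```

The cron key lets Drupal's cron spend up to 30 seconds per run on this queue; it can be dropped if the queue should only be drained through Drush.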
2.3 Drush as the Single Entry Point
Every automated task must be runnable via Drush:
drush my-module:run-import
Why:
- Scheduling via system cron
- Manual execution possible
- CI/CD integration
- Exit codes that schedulers and CI pipelines can act on
Simple rule:
If an automation cannot be triggered by Drush, it is not production-ready.
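A command such as my-module:run-import might be declared as in the following sketch, which assumes Drush 12 or later (attribute-based commands placed in the module's src/Drush/Commands directory). The class name and the https://example.com/api/records endpoint are hypothetical; the command only fetches and enqueues, and reports the outcome through its exit code.

```php
<?php

namespace Drupal\my_module\Drush\Commands;

use Drush\Attributes as CLI;
use Drush\Commands\DrushCommands;

final class MyModuleCommands extends DrushCommands {

  /**
   * Fetches source records and queues one item per record.
   */
  #[CLI\Command(name: 'my-module:run-import')]
  public function runImport(): int {
    try {
      // Hypothetical endpoint returning a JSON array of records.
      $response = \Drupal::httpClient()->get('https://example.com/api/records');
      $records = json_decode((string) $response->getBody(), TRUE, 512, JSON_THROW_ON_ERROR);
      if (!is_array($records)) {
        throw new \UnexpectedValueException('Expected a JSON array.');
      }
    }
    catch (\Throwable $e) {
      $this->logger()?->error('Import fetch failed: ' . $e->getMessage());
      // A non-zero exit code lets cron wrappers and CI detect the failure.
      return self::EXIT_FAILURE;
    }

    $queue = \Drupal::queue('my_custom_import');
    foreach ($records as $record) {
      $queue->createItem($record);
    }

    $this->logger()?->notice(count($records) . ' item(s) queued.');
    return self::EXIT_SUCCESS;
  }

}
```

A system cron entry can then chain this command with drush queue:run my_custom_import to process the queued items on the same schedule.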
3. Periodic Imports: Recommended Architecture
Standard Technical Pattern
- Source retrieval (API, SFTP, external storage, file)
- Temporary storage (file or intermediate table)
- Splitting into business units
- Sending to queue
- Unit processing
- Mapping to Drupal entities
- Persistence
- Logging and monitoring
This breakdown enables:
- Better observability
- Targeted retries
- Fine-grained error management
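The "splitting into business units" and "sending to queue" steps of the pattern can be sketched in plain PHP as follows. The function name, the comma-separated format, and the header-row convention are assumptions; the enqueue step is passed in as a callable so the splitting logic stays independent of Drupal.

```php
<?php

declare(strict_types=1);

/**
 * Splits a CSV file into business units and hands each one to $enqueue.
 *
 * @return array{queued: int, skipped: int}
 */
function enqueue_csv_rows(string $path, callable $enqueue): array {
  $handle = @fopen($path, 'r');
  if ($handle === false) {
    throw new RuntimeException("Cannot open source file: $path");
  }

  $stats = ['queued' => 0, 'skipped' => 0];
  try {
    // The first line is assumed to hold the column names.
    $header = fgetcsv($handle, null, ',', '"', '\\');
    if (!is_array($header) || $header === [null]) {
      return $stats;
    }
    while (($row = fgetcsv($handle, null, ',', '"', '\\')) !== false) {
      if ($row === [null]) {
        // Blank line: nothing to import.
        continue;
      }
      if (count($row) !== count($header)) {
        // Wrong column count: count it instead of aborting the whole file.
        $stats['skipped']++;
        continue;
      }
      $enqueue(array_combine($header, $row));
      $stats['queued']++;
    }
  }
  finally {
    fclose($handle);
  }
  return $stats;
}
```

Inside Drupal, the callable would typically wrap \Drupal::queue('my_custom_import')->createItem($row), and the returned counts can feed the logging and monitoring step.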
3.1 Why Avoid the Migrate API for Recurring Imports
The Migrate API is well suited for:
- Initial migrations
- One-off catch-up runs
- Structured, fixed imports
But it is poorly suited for:
- Daily synchronizations
- Evolving flows
- Variable volumes
- Complex business logic
- Controlled partial retries
For periodic imports:
➡️ Queue API + dedicated business code
4. Lightweight ETL with Drupal
Drupal can act as a lightweight ETL, provided it stays within a controlled scope.
Typical Transformations
- Format normalization (dates, currencies, codes)
- Business mapping (statuses, categories, taxonomies)
- Conditional rules
- Enrichment via secondary API
- Simple deduplication
Example:
if ($data['status'] === 'ACTIVE' && $data['score'] >= 80) {
  $entity->set('field_level', 'premium');
}
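Keeping such rules in a pure function, separate from entity handling, makes them easy to test. The sketch below combines date normalization, status mapping, and the conditional rule above; the source field names, the d/m/Y input format, and the target status values are assumptions chosen for illustration.

```php
<?php

declare(strict_types=1);

/**
 * Normalizes one raw source record into values ready for entity mapping.
 */
function normalize_record(array $raw): array {
  // Format normalization: source dates are assumed to be d/m/Y and are
  // converted to ISO 8601 (Y-m-d). An unparseable date becomes null.
  $date = DateTimeImmutable::createFromFormat('!d/m/Y', (string) ($raw['created'] ?? ''));

  // Business mapping: source statuses to hypothetical target values.
  $status_map = ['ACTIVE' => 'published', 'INACTIVE' => 'archived'];
  $status = strtoupper(trim((string) ($raw['status'] ?? '')));
  $score = (int) ($raw['score'] ?? 0);

  return [
    'created' => $date ? $date->format('Y-m-d') : null,
    'status' => $status_map[$status] ?? 'draft',
    // Conditional rule: active records scoring 80 or more are premium.
    'level' => ($status === 'ACTIVE' && $score >= 80) ? 'premium' : null,
  ];
}
```

The queue worker can then copy the returned values onto the entity, for example setting field_level only when level is not null.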
Limits not to be exceeded:
- Complex aggregations
- Heavy multi-source joins
- Analytical computations
In those cases:
➡️ Externalize to a dedicated ETL (Airflow, Talend, etc.)
5. CRM / ERP / Marketing Automation Integration
5.1 Common Integration Models
| Model | Description |
| --- | --- |
| Pull | Drupal consumes the data |
| Push | Drupal pushes the data |
| Event-driven | Webhooks |
| Hybrid | Pull + push combined |
The choice depends on:
- Which system is the master (system of record) for the data
- Data volume
- Business criticality
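For the event-driven model, one possible shape is a controller that accepts the webhook, queues the payload, and responds immediately, leaving the actual processing to the queue. The controller name is hypothetical, the queue name is reused from the earlier example, and sender authentication (for instance a signature check) is deliberately left out of this sketch.

```php
<?php

namespace Drupal\my_module\Controller;

use Drupal\Core\Controller\ControllerBase;
use Symfony\Component\HttpFoundation\JsonResponse;
use Symfony\Component\HttpFoundation\Request;

final class CrmWebhookController extends ControllerBase {

  /**
   * Receives a webhook call and defers its processing to the queue.
   */
  public function receive(Request $request): JsonResponse {
    $payload = json_decode($request->getContent(), TRUE);
    if (!is_array($payload)) {
      // Reject bodies that are not a JSON object or array.
      return new JsonResponse(['error' => 'Invalid JSON'], 400);
    }

    // Queue the raw payload; mapping and persistence happen in the worker.
    \Drupal::queue('my_custom_import')->createItem($payload);

    // 202 Accepted: the event is stored but not yet processed.
    return new JsonResponse(['status' => 'queued'], 202);
  }

}
```

The method still needs a route in my_module.routing.yml restricted to POST requests, and responding before processing keeps the sender's timeout from affecting the import.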
5.2 Drupal as an Exposed Business Backend
In many projects:
- Drupal is the functional reference system
- The CRM or ERP is the consumer
Examples of integrated ecosystems:
- HubSpot
- Salesforce
- Proprietary ERPs
Conclusion
Drupal, when properly architected, is a reliable and efficient automation platform. The key is to use the right tools: Queue API for asynchronous processing, Drush for triggering and scheduling, and a clean layered architecture for maintainability. Whether integrating a CRM, running daily data imports, or building a lightweight ETL pipeline, Drupal provides the building blocks to do it well.