Migrate Process HTML
This module provides a Migrate process plugin that requests a page from a URL or link field, such as those commonly found in RSS feeds, and extracts its HTML.
Installation
Download the module with Composer by running composer require drupal/migrate_process_html, then enable it.
JavaScript Redirects
By default, this plugin handles JavaScript redirects such as those
commonly found in RSS feeds served by Google. You can disable this
behaviour in your migration config by setting jsredirect: false. The
Example Usage section below shows the plugin in a full process pipeline.
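A minimal sketch of turning the redirect handling off. It assumes that jsredirect is a configuration key on the migrate_process_html step and that the source field is named link; this README does not show the exact placement of the key, so check it against the plugin source before relying on it.

```yaml
process:
  'body/value':
    -
      plugin: migrate_process_html
      source: link
      # Assumption: the option is set on this step. Setting it to false
      # is meant to skip the JavaScript redirect handling described above.
      jsredirect: false
```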
Example Usage
process:
  'body/value':
    -
      plugin: migrate_process_js_redirect
      source: link
    -
      plugin: migrate_process_html
    -
      plugin: dom
      method: import
    -
      plugin: dom_select
      selector: //meta[@property="og:image"]/@content
    -
      plugin: skip_on_empty
      method: row
      message: 'Field image is missing'
    -
      plugin: extract
      index:
        - 0
    -
      plugin: skip_on_condition
      method: row
      condition:
        plugin: not:matches
        regex: /^(https?:\/\/)[\w\d]/i
      message: 'We only want a string if it starts with http(s)://[\w\d]'
    -
      plugin: file_remote_url
Please note that using skip_on_condition with the 'matches' condition
(written as not:matches in the example above) requires the excellent
migrate_conditions module:
https://www.drupal.org/project/migrate_conditions
Author
- Daniel Lobo (2dareis2do)