
RDF Sync

Description

TL;DR: Synchronizes Drupal entities, as triples, to an RDF backend.

Terminology

Subject

The semantic triple subject as a URI. For background, see https://en.wikipedia.org/wiki/Semantic_triple

Predicate

The semantic triple predicate as a URI. For background, see https://en.wikipedia.org/wiki/Semantic_triple

Object

The semantic triple object. Can be either a resource URI, which may itself be the subject of other triples, or a literal. For background, see https://en.wikipedia.org/wiki/Semantic_triple

Entity URI

Identifies the entity in the RDF triplestore and acts as the subject of all triples representing that entity. Think of this value as a universally unique ID that identifies the entity, very similar to the well-known Drupal entity UUID field, but following the URI pattern. Two entities, even if they are from different entity types and/or bundles, cannot share the same URI.

Mapping

A relation between an entity field property/column and an RDF predicate. This relation can be defined as third-party settings in the entity bundle config entity or in code, by implementing hook_entity_bundle_info_alter(). When an entity is synchronized to RDF, each mapped field property value is represented as a semantic triple having the entity URI as subject, the mapped predicate as predicate, and the field property value as object.

RDF type

An RDF resource URI that identifies the entity bundle in RDF. Normally, this URI is the object of a triple having the entity URI as subject and http://www.w3.org/1999/02/22-rdf-syntax-ns#type as predicate. Taxonomy term entities are a notable exception, a special case in the "RDF world": they also provide a mapping for the entity bundle field, and the RDF type is the object of a triple with the bundle mapping as predicate. Because it identifies a resource, the RDF type is always a URI.
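
To make the terminology concrete, here is a minimal sketch in plain PHP of how a mapping could turn field values into triples, printed as N-Triples. It is not the module's implementation, which builds an EasyRdf graph. The function name example_entity_to_ntriples(), the shape of $values and the default 'en' language tag are assumptions made for illustration. The $fields array follows the layout of the 'fields' key used when defining mappings in code. The sketch ignores the taxonomy term exception described above.

```php
<?php

declare(strict_types=1);

/**
 * Builds N-Triples lines for one entity from a field property mapping.
 *
 * Illustration only: the function name, the $values shape and the default
 * language tag are assumptions, not part of RDF Sync.
 */
function example_entity_to_ntriples(string $entityUri, string $rdfType, array $fields, array $values, string $langcode = 'en'): array {
  $xsd = 'http://www.w3.org/2001/XMLSchema#';
  // The entity URI is the subject of every triple, starting with the RDF type.
  $lines = ["<$entityUri> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <$rdfType> ."];
  foreach ($fields as $fieldName => $properties) {
    foreach ($properties as $property => $mapping) {
      if (!isset($values[$fieldName][$property])) {
        // A mapped property without a value produces no triple.
        continue;
      }
      $value = (string) $values[$fieldName][$property];
      $type = $mapping['type'] ?? NULL;
      if ($type === 'resource') {
        // The object is another resource, identified by its URI.
        $object = "<$value>";
      }
      else {
        // Escape the characters that are special inside an N-Triples literal.
        $escaped = strtr($value, ['\\' => '\\\\', '"' => '\\"', "\n" => '\\n', "\r" => '\\r']);
        $literal = '"' . $escaped . '"';
        // NULL means a translatable string, otherwise a typed literal.
        $object = $type === NULL
          ? "$literal@$langcode"
          : $literal . '^^<' . str_replace('xsd:', $xsd, $type) . '>';
      }
      $lines[] = "<$entityUri> <{$mapping['predicate']}> $object .";
    }
  }
  return $lines;
}

// Usage with hypothetical values for an article node.
$fields = [
  'title' => ['value' => ['predicate' => 'http://example.com/article/title', 'type' => NULL]],
];
$values = ['title' => ['value' => 'Hello']];
print implode("\n", example_entity_to_ntriples('http://example.com/node/1', 'http://example.com/article', $fields, $values)) . "\n";
```

The call yields one rdf:type triple followed by one triple per mapped property that has a value, all sharing the entity URI as subject.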

How it works

The module lets you map entity field properties so that their values are synchronized to an RDF/triplestore backend.

Automatic synchronization

When an entity is inserted, updated or deleted, its RDF representation in the RDF/triplestore is synchronized. Only fields that are mapped will be synchronized.

Manual synchronization

In some circumstances the automatic synchronization can be disabled:

PHP

\Drupal::service('rdf_sync.synchronizer')->disableSynchronization();

CLI

vendor/bin/drush rdf_sync:disable

Run manual synchronization:

PHP

use Drupal\rdf_sync\Model\SyncMethod;
\Drupal::service('rdf_sync.synchronizer')->synchronize(SyncMethod::UPDATE, [$entity1, $entity2, ...]);

CLI

# Synchronize all nodes with mapped node-type.
vendor/bin/drush rdf_sync:synchronize node
# Synchronize all page and article nodes.
vendor/bin/drush rdf_sync:synchronize node --bundle=page,article

Switch back to automated synchronization:

PHP

\Drupal::service('rdf_sync.synchronizer')->enableSynchronization();

CLI

vendor/bin/drush rdf_sync:enable
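
One way to combine these steps, sketched here under assumptions, is to pause automatic synchronization around a bulk operation and always switch it back on, even if the operation throws. The helper name my_module_run_without_sync() is hypothetical. It only calls the disableSynchronization() and enableSynchronization() methods shown above on whatever object is passed in, which in Drupal would be the rdf_sync.synchronizer service.

```php
<?php

declare(strict_types=1);

/**
 * Runs $operation while automatic synchronization is switched off.
 *
 * Hypothetical helper, not part of RDF Sync. $synchronizer is expected to
 * expose disableSynchronization() and enableSynchronization().
 */
function my_module_run_without_sync(object $synchronizer, callable $operation): mixed {
  $synchronizer->disableSynchronization();
  try {
    // Entity saves performed in here are not pushed to the triplestore.
    return $operation();
  }
  finally {
    // Runs on success and on exception, so sync is never left disabled.
    $synchronizer->enableSynchronization();
  }
}

// Possible usage inside Drupal (my_module_import() is a hypothetical callable
// that returns the entities it touched):
//
// $synchronizer = \Drupal::service('rdf_sync.synchronizer');
// $entities = my_module_run_without_sync($synchronizer, fn () => my_module_import());
// $synchronizer->synchronize(SyncMethod::UPDATE, $entities);
```

Note that this sketch re-enables synchronization unconditionally. If synchronization may already be disabled when the helper is called, the caller has to account for that.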

Configuration

Visit /admin/config/system/rdf-sync and configure:

  • The RDF graph URI
  • The endpoint (protocol, host, port, and the query, update and graph store paths)

Defining mappings

Mappings can be either configured, or defined in code.

Configure mappings

Defining mappings in configuration is only possible for entity types that define their bundles as config entities, such as nodes and taxonomy terms. There are two kinds of mappings: bundle level and configurable field level.

Bundle mappings

Visit the administrative bundle edit form (e.g., for article nodes go to /admin/structure/types/manage/article) and fill in the values under the "RDF sync" section.

Configurable field mappings

Visit the field configuration form (e.g., for article body go to /admin/structure/types/manage/article/fields/node.article.body) and fill in the values under the "RDF sync mapping" section.

Define in code

As an alternative to configuration, and as the only option for entity types that do not declare their bundles as config entities, mappings can be defined directly in code by implementing hook_entity_bundle_info_alter(). Here's an example of how to add mappings to the article node bundle:

function my_module_entity_bundle_info_alter(array &$bundles): void {
  if (isset($bundles['node']['article'])) {
    $bundles['node']['article']['rdf_sync'] = [
      // The RDF type.
      'type' => 'http://example.com/article',
      // The name of the field for the entity URI.
      'uri_field_name' => 'rdf_uri',
      // The plugin used to build the URI for new entities.
      // @see \Drupal\rdf_sync\Plugin\rdf_sync\RdfUriGenerator\DefaultRdfUriGenerator()
      'uri_plugin' => 'my_custom_plugin',
      // Field property mappings. Includes (bundle) base & configurable fields.
      'fields' => [
        'title' => [
          'value' => [
            // Mapped predicate.
            'predicate' => 'http://example.com/article/title',
            // NULL for translatable strings or a simple type, such as
            // 'xsd:boolean', 'xsd:string', etc., or 'resource' for entity
            // reference fields.
            'type' => NULL,
          ],
        ],
        'body' => [
          // Note that we're using 'processed', which is a computed property,
          // instead of 'value', in order to benefit from text formatting.
          'processed' => [
            'predicate' => 'http://example.com/article/content',
            'type' => 'xsd:string',
          ],
        ],
      ],
    ];
  }
}

Architecture

The module relies on EasyRdf, a PHP library for manipulating RDF graphs and triples.

At the core of the synchronizer is a specialized normalizer (\Drupal\rdf_sync\Normalizer\RdfSyncNormalizer) that normalizes a mapped entity into an EasyRdf graph (\EasyRdf\Graph).

The encoder (\Drupal\rdf_sync\Encoder\RdfSyncEncoder) serializes a graph into any of the formats supported by EasyRdf. This includes jsonld, n3, ntriples, rdfxml and turtle.

With the normalizer and encoder in place, getting the RDF representation of an entity is straightforward:

PHP

// Represent the entity as JSON-LD
\Drupal::service('serializer')->serialize($node, 'jsonld');
// As Turtle
\Drupal::service('serializer')->serialize($node, 'turtle');

CLI

# Supposing http://example.com/node/123 is the canonical URL of the entity
curl 'http://example.com/node/123?_format=jsonld'
curl 'http://example.com/node/123?_format=turtle'

Events

The module provides a set of events that can be used to alter the data and entities before and after synchronization. The events are:

  • \Drupal\rdf_sync\Event\RdfSyncNormalizeEvent: Allows altering the triples, or adding new ones, before they are serialized.
  • \Drupal\rdf_sync\Event\RdfSyncEvent: Allows altering the array of entities before they are synchronized.

Sub-modules

The rdf_sync_published submodule provides a way to filter out unpublished entities from synchronization. It does so by listening to the \Drupal\rdf_sync\Event\RdfSyncEvent event and filtering out the entities that are not published.

Contributing

Feature requests, bug reports, and merge requests are welcome. Please follow the Drupal coding standards and best practices. Merge requests should contain test coverage.

All development takes place on Drupal.org.

We're using DDEV together with the "DDEV integration for developing Drupal contrib projects" add-on for module development: