How to map JSON to RDF

You want to combine JSON with RDF data.

Problem

Some datasets are available only in JSON. The data they provide may be useful to combine with your RDF data. Integrating the data allows you to query across multiple datasets. However, in order to do that you must first convert the JSON data to RDF.

Moreover, while RDF allows you to merge any data, data integration is not only a matter of syntax alignment. For example, consider this simple JSON document:

{
  "count": 52350
}

Without knowing its context, it is impossible to tell what the document is about. Mapping JSON to RDF thus also involves mapping the data to RDF vocabularies that describe how to interpret it.

Solution

You can map JSON to RDF by converting it to JSON-LD, a standard JSON-based syntax for RDF. If you add a JSON-LD context to the data you can reinterpret it as RDF. Simply put, the context maps JSON attributes to RDF properties.

Have a look at the example JSON above. Without context, the count attribute can mean anything. I can almost hear you say: “52 thousand of what?” It turns out that the count is the number of graves in Brno, as provided in JSON by the municipality's GIS service. You can see live results of this query to the service here. Now that we know how to interpret the JSON, we can describe it in a JSON-LD context:

{
  "@vocab": "http://dbpedia.org/ontology/",
  "count": "numberOfGraves"
}

The context defines http://dbpedia.org/ontology/ (the DBpedia ontology) as the default namespace via the special @vocab attribute. Unless specified otherwise, all attributes are interpreted as local names of properties in this namespace. The count attribute is mapped to the numberOfGraves property from the default namespace.
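To make the mapping concrete, here is a minimal Python sketch of how the context above resolves the count attribute to an absolute IRI. It is a hand-rolled simplification rather than a JSON-LD processor: the expand_keys function is hypothetical and handles only flat objects, @vocab, and term definitions given as plain strings.

```python
import json


def expand_keys(data: dict, context: dict) -> dict:
    """Resolve the keys of a flat JSON object to absolute IRIs.

    Simplified on purpose: supports only @vocab and string-valued
    term definitions, not the full JSON-LD expansion algorithm.
    """
    vocab = context.get("@vocab", "")
    expanded = {}
    for key, value in data.items():
        # Use the term definition if the context has one, else the key itself.
        term = context.get(key, key)
        # Terms that are not already absolute IRIs are local names in @vocab.
        iri = term if "://" in term else vocab + term
        expanded[iri] = value
    return expanded


context = {
    "@vocab": "http://dbpedia.org/ontology/",
    "count": "numberOfGraves",
}
data = {"count": 52350}

expanded = expand_keys(data, context)
print(json.dumps(expanded, indent=2))
```

An attribute without its own entry in the context, such as a hypothetical area attribute, would fall back to the default namespace and resolve to http://dbpedia.org/ontology/area.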

LinkedPipes ETL allows you to enrich JSON with a JSON-LD context via the JSON to JSON-LD component. The component adds the value of the Context object option as the JSON-LD context of its output. The data predicate and root entity type options control how the input JSON is wrapped: the output JSON-LD is described as an instance of the root entity type, which links to the input JSON data via the data predicate. You can stick with the default values of these options unless you want to qualify the component's output. You would typically post-process the generated JSON-LD via SPARQL anyway, so you can discard the wrapping data if you do not need it.

You can also add a reference to the input's file name to the output data by turning on the Add file reference switch and filling in the property that stores the file name. Note that since there is no support for namespaces, all terms in the component's configuration must be referenced by their absolute IRIs.

Given the context above and the default configuration, the component produces the following JSON-LD:

{
  "@context": {
    "@vocab": "http://dbpedia.org/ontology/",
    "count": "numberOfGraves"
  },
  "@type" : "http://localhost/ontology/Data",
  "http://localhost/ontology/data" : {
    "count": 52350
  }
}
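The wrapping can be imitated in a few lines of Python. The sketch below is not the component's implementation: the wrap_json function and its parameter names are hypothetical, and the default IRIs are simply copied from the example output above.

```python
import json

# Defaults copied from the example output; assumed, not read from the component.
DEFAULT_ROOT_TYPE = "http://localhost/ontology/Data"
DEFAULT_DATA_PREDICATE = "http://localhost/ontology/data"


def wrap_json(data, context,
              root_type=DEFAULT_ROOT_TYPE,
              data_predicate=DEFAULT_DATA_PREDICATE):
    """Wrap plain JSON in a typed root node and attach a JSON-LD context."""
    return {
        "@context": context,
        "@type": root_type,
        # The input JSON is linked from the root node via the data predicate.
        data_predicate: data,
    }


context = {
    "@vocab": "http://dbpedia.org/ontology/",
    "count": "numberOfGraves",
}
wrapped = wrap_json({"count": 52350}, context)
print(json.dumps(wrapped, indent=2))
```

Passing a different root_type or data_predicate corresponds to qualifying the output with your own absolute IRIs instead of the defaults.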

Finally, you can turn the produced JSON-LD into RDF by passing it to the Files to RDF single graph component. At this point, you can work with the data from the input JSON as RDF. This is how the example data looks in the Turtle RDF syntax, abbreviated with Turtle's syntactic shorthands:

@prefix :    <http://localhost/ontology/> .
@prefix dbo: <http://dbpedia.org/ontology/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

[] a :Data ;
  :data [
    dbo:numberOfGraves "52350"^^xsd:integer
  ] .

An example pipeline showing the described conversion is available here.

Discussion

Keep in mind that JSON-LD contexts can express only basic mappings, mostly limited to properties and the data types of literals. Should you need a more sophisticated mapping that changes the structure of the input data or reformats it, you can transform the output JSON-LD using the full expressive power of SPARQL. You can see a simple example of mapping via SPARQL in this tutorial.
