DBpedia Development Wiki devilopment bible

Integrating SHACL Tests

Intro

SHACL (Shapes Constraint Language) is a W3C Recommendation which defines a language for validating RDF graphs against a set of conditions. These conditions are provided as shapes and other constructs expressed in the form of an RDF graph. SHACL is used in DBpedia to validate and evaluate the results (i.e. RDF) generated by the extraction framework.

Motivating Example

Recently, the Czech DBpedia community found that disambiguation links were not being extracted for Czech. Because no test covered this case, the testing phase never caught the problem. Once the problem is fixed, a SHACL test should be added so that a missing “disambiguation links” dataset is detected in the future. An example of such a SHACL test follows.

<#Český_(rozcestník)_cs>
	a sh:NodeShape ;
	sh:targetNode <http://cs.dbpedia.org/resource/Český_(rozcestník)> ;
	
	# ensures that the disambiguation extractor for Czech is active
	# (for some languages, e.g. Czech, the disambiguation extractor was found to be inactive)
	sh:property [
		sh:path dbo:wikiPageDisambiguates ;
		sh:hasValue <http://cs.dbpedia.org/resource/Český> ;
	] .

The test above checks whether the RDF contains a triple with:

  • subject <http://cs.dbpedia.org/resource/Český_(rozcestník)>
  • predicate dbo:wikiPageDisambiguates
  • object <http://cs.dbpedia.org/resource/Český>
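To make the check concrete, here is a minimal shell sketch that looks for the same triple in an N-Triples file. It is only an illustration: the actual test is evaluated by the SHACL validation, not by grep. The helper name has_triple and the sample file are hypothetical, and the sketch assumes one triple per line with unescaped IRIs (dbo: is expanded to http://dbpedia.org/ontology/).

```shell
# Illustration only; the real check is performed by the SHACL validation.
# Assumption: the extracted RDF is available as N-Triples, one triple per line.

has_triple() {
  # $1 = N-Triples file, $2 = subject IRI, $3 = predicate IRI, $4 = object IRI
  grep -qF -- "<$2> <$3> <$4> ." "$1"
}

# Hypothetical sample data standing in for the extraction output.
sample="$(mktemp)"
printf '%s\n' '<http://cs.dbpedia.org/resource/Český_(rozcestník)> <http://dbpedia.org/ontology/wikiPageDisambiguates> <http://cs.dbpedia.org/resource/Český> .' > "$sample"

if has_triple "$sample" \
    'http://cs.dbpedia.org/resource/Český_(rozcestník)' \
    'http://dbpedia.org/ontology/wikiPageDisambiguates' \
    'http://cs.dbpedia.org/resource/Český'; then
  echo "conforms"
else
  echo "violation"
fi
```

If the triple is absent, for example because the extractor is inactive for the language, the helper returns non-zero, which corresponds to the situation the SHACL shape is meant to report.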

The test is executed against the RDF extracted from the provided minidump. The minidump must therefore contain the content of the corresponding Wikipedia article, i.e. https://cs.wikipedia.org/wiki/Český_(rozcestník). This URL must also be added to the URLs list.
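Adding the article URL to the list can be sketched as follows. This is a hedged example, not part of the framework: the helper name add_url is hypothetical, and it assumes the URLs list is a plain text file with one URL per line that ends with a newline.

```shell
# Assumption: the URLs list is a plain text file with one URL per line.
add_url() {
  # $1 = URLs list file, $2 = Wikipedia page URL
  touch "$1"
  # Append the URL only if no line matches it exactly.
  grep -qxF -- "$2" "$1" || printf '%s\n' "$2" >> "$1"
}

# Hypothetical usage on a scratch file; point it at the real URLs list instead.
list="$(mktemp)"
add_url "$list" 'https://cs.wikipedia.org/wiki/Český_(rozcestník)'
add_url "$list" 'https://cs.wikipedia.org/wiki/Český_(rozcestník)'   # second call adds nothing
cat "$list"
```

Checking for an exact, whole-line match before appending keeps the list free of duplicates when the same page is needed by several tests.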

Learn more about creating minidumps in the Testing on Minidumps article.

Workflow for integrating SHACL tests

The workflow for writing and integrating custom SHACL tests is as follows:

  1. Add a custom SHACL test in the custom-shacl-tests.ttl document.
  2. In the URLs list file, add the Wikipedia pages to which the test applies.
  3. If you edited the URLs list, re-create the minidump:
    cd dump/src/test/bash # relative to $DIEF_DIR, i.e. $DIEF_DIR/dump/src/test/bash
    ./createMinidump.sh
    
  4. Run and validate your test:
    cd dump/ # relative to $DIEF_DIR, i.e. $DIEF_DIR/dump
    mvn test
    
  5. Commit the changes. Open a pull request that includes all necessary files, such as:
    • the SHACL test custom-shacl-tests.ttl
    • the URLs list file uris.lst
    • the newly created minidump(s) $DIEF_DIR/dump/src/test/resources/minidumps/LANGUAGE-TAG/wiki.xml.bz2
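The LANGUAGE-TAG placeholder in the minidump path stands for the language of the test, e.g. cs for the Czech example. A small sketch of how the path could be built is shown here; the helper name minidump_path and the default value for DIEF_DIR are hypothetical, and DIEF_DIR is assumed to point at the root of the extraction framework checkout.

```shell
# Assumption: DIEF_DIR is the root of the extraction framework checkout.
DIEF_DIR="${DIEF_DIR:-/path/to/extraction-framework}"   # hypothetical default

minidump_path() {
  # $1 = language tag, e.g. cs
  printf '%s\n' "$DIEF_DIR/dump/src/test/resources/minidumps/$1/wiki.xml.bz2"
}

# Print the minidump file that belongs to the Czech test.
minidump_path cs
```

The printed path can then be passed to git add together with the SHACL test file and the URLs list file.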