SHACL (Shapes Constraint Language) is a W3C Recommendation which defines a language for validating RDF graphs against a set of conditions. These conditions are provided as shapes and other constructs expressed in the form of an RDF graph. SHACL is used in DBpedia to validate and evaluate the results (i.e. RDF) generated by the extraction framework.
Recently, the Czech DBpedia community found that the disambiguation links had not been extracted for Czech. Because no test covered this case, the problem was never caught in the testing phase. Once the problem is fixed, a SHACL test should be implemented that will in future detect when the “disambiguation links” dataset is missing. An example of such a SHACL test follows.
<#Český_(rozcestník)_cs>
    a sh:NodeShape ;
    sh:targetNode <http://cs.dbpedia.org/resource/Český_(rozcestník)> ;
    # ensures that the disambiguation extractor for Czech is active
    # (for some languages, e.g. Czech, the disambiguation extractor was found to be inactive)
    sh:property [
        sh:path dbo:wikiPageDisambiguates ;
        sh:hasValue <http://cs.dbpedia.org/resource/Český> ;
    ] .
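The test above checks whether the extracted RDF contains a triple with:
subject <http://cs.dbpedia.org/resource/Český_(rozcestník)>
predicate dbo:wikiPageDisambiguates
object <http://cs.dbpedia.org/resource/Český>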
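As a sketch, the triple the shape looks for can be written in Turtle as shown here. The prefix declaration is an assumption: the shape uses dbo: without declaring it, and the snippet assumes it is bound to the DBpedia ontology namespace.

```turtle
# Assumption: dbo: is bound to the DBpedia ontology namespace.
@prefix dbo: <http://dbpedia.org/ontology/> .

# The triple that must be present in the RDF extracted from the minidump
# for the shape's target node to conform.
<http://cs.dbpedia.org/resource/Český_(rozcestník)>
    dbo:wikiPageDisambiguates <http://cs.dbpedia.org/resource/Český> .
```

Since sh:hasValue only requires that the given value is among the values of the path, other dbo:wikiPageDisambiguates links on the same resource do not affect the result. If this triple is absent, a SHACL validator reports a violation for the target node.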
The test is executed against the RDF extracted from the provided minidump. Thus, the minidump should also provide the content of the corresponding Wikipedia article, i.e. the Czech Wikipedia page for Český_(rozcestník).
The URL of this article also has to be added to the URLs list.
Learn more about creating minidumps in the Testing on Minidumps article.
The workflow for writing and integrating custom SHACL tests is as follows:
In the URLs list file, add the Wikipedia pages for which the test is valid.
If you have updated the URLs list, then re-create the minidump:
cd dump/src/test/bash   # << $DIEF_DIR/dump/src/test/bash
./createMinidump.sh
Then run the tests:
cd dump/   # << $DIEF_DIR/dump
mvn test