Using CSV PropertyTable with Apache Maven
See “Using Jena with Apache Maven” for full details.
<dependency>
  <groupId>org.apache.jena</groupId>
  <artifactId>jena-csv</artifactId>
  <version>X.Y.Z</version>
</dependency>
Using CSV PropertyTable from Java through the API
To enable CSV PropertyTable, LangCSV must be registered with Jena RIOT
through a single method call:
import org.apache.jena.propertytable.lang.CSV2RDF;
...
CSV2RDF.init() ;
This static registration call needs to run only once per application, before CSV PropertyTable is first used (e.g. during the initialization phase).
Once registered, CSV PropertyTable can be used in two ways: through GraphCSV or through RIOT.
GraphCSV
GraphCSV wraps a CSV file as a Graph, which can then back a Model for SPARQL queries:
Model model = ModelFactory.createModelForGraph(new GraphCSV("data.csv")) ;
QueryExecution qExec = QueryExecutionFactory.create(query, model) ;
or for multiple CSV files and/or other RDF data:
Model csv1 = ModelFactory.createModelForGraph(new GraphCSV("data1.csv")) ;
Model csv2 = ModelFactory.createModelForGraph(new GraphCSV("data2.csv")) ;
Model other = ModelFactory.createModelForGraph(otherGraph) ;
Dataset dataset = ... ;
dataset.addNamedModel("http://example/table1", csv1) ;
dataset.addNamedModel("http://example/table2", csv2) ;
dataset.addNamedModel("http://example/other", other) ;
... normal SPARQL execution ...
Full examples are available in GraphCSVTest.
In short, for Jena ARQ a CSV table is simply a Graph (i.e. GraphCSV), and it behaves no differently from other types of Graph when used through the Jena ARQ API.
RIOT
When LangCSV is registered with RIOT, CSV PropertyTable adds a new RDF syntax with the file extension ".csv" and the content type "text/csv". You can then read ".csv" files into a Model using the standard RIOT calls:
// Usage 1: Direct reading through Model
Model model_1 = ModelFactory.createDefaultModel() ;
model_1.read("test.csv") ;
// Usage 2: Reading using RDFDataMgr
Model model_2 = RDFDataMgr.loadModel("test.csv") ;
For more information, see Reading RDF in Apache Jena.
Note that the requirements for CSV files are listed in the Design documentation. CSV PropertyTable only supports single-valued, regularly shaped, table-headed and UTF-8-encoded CSV files (NOT Microsoft Excel files).
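Some of these requirements can be checked before a file is handed to GraphCSV or RIOT. The following is a minimal sketch in plain Java, not part of jena-csv: the class name CsvShapeCheck and its check method are hypothetical, and it assumes fields are separated by plain commas with no quoted commas. It only checks strict UTF-8 decoding, the presence of a header row, and that every row has as many fields as the header; it does not check that cells are single-valued.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical pre-check for CSV input; not part of the jena-csv API. */
final class CsvShapeCheck {

    /** Returns the problems found; an empty list means all checks passed. */
    static List<String> check(byte[] content) {
        List<String> problems = new ArrayList<>();
        String text;
        try {
            // Strict decoding: malformed bytes raise an error instead of being replaced.
            text = StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(content))
                    .toString();
        } catch (CharacterCodingException e) {
            problems.add("not valid UTF-8");
            return problems;
        }

        String[] lines = text.split("\\R");
        if (lines.length == 0 || lines[0].trim().isEmpty()) {
            problems.add("missing header row");
            return problems;
        }

        // Assumption: plain comma-separated fields, no quoted commas.
        int columns = lines[0].split(",", -1).length;
        for (int i = 1; i < lines.length; i++) {
            if (lines[i].isEmpty()) {
                continue; // tolerate blank lines
            }
            int fields = lines[i].split(",", -1).length;
            if (fields != columns) {
                problems.add("line " + (i + 1) + " has " + fields
                        + " fields, expected " + columns);
            }
        }
        return problems;
    }
}
```

A caller could read the file with java.nio.file.Files.readAllBytes and pass the bytes to CsvShapeCheck.check, loading the file into a Model only when the returned list is empty.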
Command Line Tool
csv2rdf is a tool that transforms CSV directly into the N-Triples RDF syntax.
The script calls the csv2rdf Java program in the riotcmdx package in this way:
java -cp ... riotcmdx.csv2rdf inputFile ...
It transforms the CSV inputFile into N-Triples. For example,
java -cp ... riotcmdx.csv2rdf src/test/resources/test.csv
The script reuses the common framework for running RIOT parsers, so it also accepts the same arguments as the RIOT command line tools (type "riot --help" for a command line reminder):
--validate
: Checking mode: same as --strict --sink --check=true
--check=true/false
: Run with checking of literals and IRIs either on or off.
--sink
: No output of triples or quads in the standard output (i.e. System.out).
--time
: Output timing information.
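When the tool has to be launched from another Java program, the same command line can be assembled as an argument list. This is only a sketch: the class Csv2RdfCommand and its build method are hypothetical helpers, the classpath is left to the caller because it is elided above, and placing the options before the input file is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that assembles a csv2rdf command line; not part of Jena. */
final class Csv2RdfCommand {

    /**
     * Builds: java -cp <classpath> riotcmdx.csv2rdf <options...> <inputFile>
     * The classpath must be supplied by the caller.
     */
    static List<String> build(String classpath, List<String> options, String inputFile) {
        List<String> command = new ArrayList<>();
        command.add("java");
        command.add("-cp");
        command.add(classpath);
        command.add("riotcmdx.csv2rdf");
        command.addAll(options);   // e.g. "--time" or "--sink"
        command.add(inputFile);
        return command;
    }
}
```

The resulting list can be passed to new ProcessBuilder(command).inheritIO().start(), so that the N-Triples output appears on the standard output of the calling process.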