Apache Jena - Fuseki: Configuring Fuseki

A Fuseki server is configured by defining the data services (data and actions available on the data). There is also a server configuration, although it is often unnecessary.

The data services configuration can come from:

For Fuseki Full (webapp with UI):

The directory FUSEKI_BASE/configuration/, with one data service assembler per file (each includes the endpoint details and the dataset description).
The system database. This includes uploaded assembler files. It also keeps the state of each data service (whether it’s active or offline).
The server configuration file config.ttl. For compatibility, the server configuration file can also have data services.
The command line, if not running as a web application from a .war file.

FUSEKI_BASE is the location of the Fuseki run area.

For Fuseki Main:

The command line, using --conf to provide a configuration file.
The command line, using arguments (e.g. --mem /ds or --tdb2 --loc DB2 /ds).
Programmatic configuration of the server.

See Fuseki Security for more information on security configuration.

Examples

Example server configuration files can be found at jena-fuseki2/examples.

Security and Access Control

Access control can be configured on any of the server, data service or dataset. See Fuseki Data Access Control.

Separately, Fuseki Full has request-based security filtering provided by Apache Shiro. See Fuseki Full Security.

Fuseki Configuration File

A Fuseki server can be set up using a configuration file. The command-line arguments for publishing a single dataset are a shortcut that internally builds a default configuration based on the dataset name given.

The configuration is an RDF graph. One graph consists of one server description, with a number of services, and each service offers a number of endpoints over a dataset.
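As a rough orientation, the shape described above (one server description, its services, and each service's endpoints over a dataset) can be sketched as a single small file. This is an illustrative skeleton only: the names `<#exampleService>`, `<#exampleDataset>` and "example" are hypothetical and do not appear elsewhere in this document.

```turtle
PREFIX fuseki: <http://jena.apache.org/fuseki#>
PREFIX rdf:    <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX ja:     <http://jena.hpl.hp.com/2005/11/Assembler#>

# Optional server description, listing the services to publish.
[] rdf:type fuseki:Server ;
   fuseki:services ( <#exampleService> ) .

# One data service: a name, its endpoints, and the dataset it exposes.
# "example" is a hypothetical service name.
<#exampleService> rdf:type fuseki:Service ;
    fuseki:name     "example" ;
    fuseki:endpoint [ fuseki:operation fuseki:query ; fuseki:name "sparql" ] ;
    fuseki:dataset  <#exampleDataset> .

# The dataset, given as an assembler description.
<#exampleDataset> rdf:type ja:MemoryDataset .
```

With Fuseki Main, a file like this would be supplied with --conf.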

The example below is all one file (RDF graph in Turtle syntax) split to allow for commentary.

Prefix declarations

Some useful prefix declarations:

PREFIX fuseki:  <http://jena.apache.org/fuseki#>
PREFIX rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
PREFIX tdb2:    <http://jena.apache.org/2016/tdb#>
PREFIX tdb1:    <http://jena.hpl.hp.com/2008/tdb#>
PREFIX ja:      <http://jena.hpl.hp.com/2005/11/Assembler#>
PREFIX :        <#>

Assembler Initialization

All datasets are described by assembler descriptions. Assemblers provide an extensible way of describing many kinds of objects.

Defining the service name and endpoints available

Each data service assembler defines:

The base name
The operations and endpoint names
The dataset for the RDF data.

This example offers SPARQL Query, SPARQL Update and SPARQL Graph Store protocol, as well as file upload.

See Data Service Configuration Syntax for the complete details of the endpoint configuration description. Here, we show some examples.

The original configuration syntax, using, for example, fuseki:serviceQuery, is still supported.

The base name is /ds.

## Updatable in-memory dataset.

<#service1> rdf:type fuseki:Service ;
    fuseki:name   "ds" ;       # http://host:port/ds
    fuseki:endpoint [ 
         # SPARQL query service
        fuseki:operation fuseki:query ; 
        fuseki:name "sparql"
    ] ;
    fuseki:endpoint [ 
         # SPARQL query service (alt name)
        fuseki:operation fuseki:query ; 
        fuseki:name "query" 
    ] ;

    fuseki:endpoint [ 
         # SPARQL update service
        fuseki:operation fuseki:update ; 
        fuseki:name "update" 
    ] ;

    fuseki:endpoint [ 
         # HTML file upload service
        fuseki:operation fuseki:upload ; 
        fuseki:name "upload" 
    ] ;

    fuseki:endpoint [ 
         # SPARQL Graph Store Protocol (read)
        fuseki:operation fuseki:gsp_r ; 
        fuseki:name "get" 
    ] ;
    fuseki:endpoint [ 
        # SPARQL Graph Store Protocol (read and write)
        fuseki:operation fuseki:gsp_rw ; 
        fuseki:name "data" 
    ] ;

    fuseki:dataset  <#dataset> ;
    .

<#dataset> refers to a dataset description in the same file.

HTTP requests will include the service name: http://host:port/ds/sparql?query=....
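For comparison, a service with the same endpoints might be written in the original configuration syntax roughly as follows. This is a sketch rather than a reference: only fuseki:serviceQuery is named in this document, so the other property names should be checked against Data Service Configuration Syntax, and `<#service1_original>` and "ds-original" are hypothetical names chosen to avoid clashing with the service above.

```turtle
PREFIX fuseki: <http://jena.apache.org/fuseki#>
PREFIX rdf:    <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

# Original syntax: one property per kind of endpoint, with the
# endpoint name as the property value.
# <#service1_original> and "ds-original" are hypothetical names.
<#service1_original> rdf:type fuseki:Service ;
    fuseki:name                        "ds-original" ;
    fuseki:serviceQuery                "sparql" ;   # SPARQL query service
    fuseki:serviceQuery                "query" ;    # SPARQL query service (alt name)
    fuseki:serviceUpdate               "update" ;   # SPARQL update service
    fuseki:serviceUpload               "upload" ;   # HTML file upload service
    fuseki:serviceReadGraphStore       "get" ;      # Graph Store Protocol (read)
    fuseki:serviceReadWriteGraphStore  "data" ;     # Graph Store Protocol (read and write)
    fuseki:dataset                     <#dataset> ;
    .
```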

Read-only service

This example offers only read-only endpoints (SPARQL Query and HTTP GET SPARQL Graph Store protocol).

This service offers read-only access to a dataset with a single graph of data.

<#service2> rdf:type fuseki:Service ;
    fuseki:name      "/ds-ro" ;   # http://host:port/ds-ro
    fuseki:endpoint  [ fuseki:operation fuseki:query ; fuseki:name "sparql" ];
    fuseki:endpoint  [ fuseki:operation fuseki:query ; fuseki:name "query" ];
    fuseki:endpoint  [ fuseki:operation fuseki:gsp_r ; fuseki:name "data" ];
    fuseki:dataset           <#dataset> ;
    .

Data services on the dataset

The standard SPARQL operations can also be defined on the dataset URL with no secondary service name:

<#service3> rdf:type fuseki:Service ;
    fuseki:name     "/dataset" ;
    fuseki:endpoint  [ fuseki:operation fuseki:query ];
    fuseki:endpoint  [ fuseki:operation fuseki:gsp_r ];
    fuseki:dataset  <#dataset> ;
    .

HTTP requests use the URL of the dataset.

SPARQL Query: http://host:port/dataset?query=...
Fetch the default graph (SPARQL Graph Store Protocol): http://host:port/dataset?default
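The same pattern can, as far as we understand it, be extended to writable operations on the dataset URL. The sketch below assumes that unnamed update and read-write Graph Store Protocol endpoints are dispatched by request type in the same way as the unnamed query endpoint; `<#service_rw>` and "/dataset-rw" are hypothetical names.

```turtle
PREFIX fuseki: <http://jena.apache.org/fuseki#>
PREFIX rdf:    <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

# Read-write service with every operation on the dataset URL itself.
# <#service_rw> and "/dataset-rw" are hypothetical names.
<#service_rw> rdf:type fuseki:Service ;
    fuseki:name      "/dataset-rw" ;
    fuseki:endpoint  [ fuseki:operation fuseki:query ];    # no fuseki:name: served at /dataset-rw
    fuseki:endpoint  [ fuseki:operation fuseki:update ];
    fuseki:endpoint  [ fuseki:operation fuseki:gsp_rw ];
    fuseki:dataset   <#dataset> ;
    .
```

Leaving update open on the dataset URL is a design choice to weigh against the access control options mentioned earlier.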

Server Configuration

If you need to load additional classes, or set global parameters, then these go in FUSEKI_BASE/config.ttl.

Additional classes cannot be loaded when running as a .war file. You will need to create a custom .war file consisting of the contents of the Fuseki web application and the additional classes.

The server section is optional.

If it is absent, Fuseki is configured by searching the configuration file for descriptions of type fuseki:Service.

Server Section

[] rdf:type fuseki:Server ;
   # Server-wide context parameters can be given here.
   # For example, to set query timeouts on a server-wide basis:
   # Format 1: "1000" -- 1 second timeout
   # Format 2: "10000,60000" -- 10s timeout to first result, then 60s timeout for the rest of the query.
   # See the Javadoc for ARQ.queryTimeout
   # ja:context [ ja:cxtName "arq:queryTimeout" ;  ja:cxtValue "10000" ] ;
   #
   # ARQ.updateTimeout (since jena-5.3.0) follows the same pattern as for the query timeout.
   # ja:context [ ja:cxtName "arq:updateTimeout" ;  ja:cxtValue "60000" ] ;
   #
   # Explicitly choose which services to add to the server.
   # If absent, include all descriptions of type `fuseki:Service`.
   # fuseki:services (<#service1> <#service2>)
   .
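For illustration, here is how the server section might look with two of the commented options switched on. It is a sketch to be used in place of the template above, not alongside it; the timeout values are the ones quoted in the comments, and the service names are those used earlier in this document.

```turtle
PREFIX fuseki: <http://jena.apache.org/fuseki#>
PREFIX rdf:    <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX ja:     <http://jena.hpl.hp.com/2005/11/Assembler#>

[] rdf:type fuseki:Server ;
   # Server-wide query timeout: 10s to the first result,
   # then 60s for the rest of the query.
   ja:context [ ja:cxtName "arq:queryTimeout" ;  ja:cxtValue "10000,60000" ] ;
   # Publish only these two services, even if the file describes more.
   fuseki:services (<#service1> <#service2>) ;
   .
```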

Datasets

In-memory

An in-memory dataset, with data in the default graph taken from a local file.

<#dataset> rdf:type ja:MemoryDataset ;
    ## Optional: load with data on start-up
    ## ja:data <file:Data/books.ttl> 
    .
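With the optional line enabled, the description might read as below. The file name Data/books.ttl is the one from the commented line above; whether it exists, and how the relative file: URL resolves, depends on the installation. `<#booksDataset>` is a hypothetical name used so that this variant does not redefine `<#dataset>`.

```turtle
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX ja:  <http://jena.hpl.hp.com/2005/11/Assembler#>

# In-memory dataset, loaded from a local file at start-up.
# <#booksDataset> is a hypothetical name.
<#booksDataset> rdf:type ja:MemoryDataset ;
    ja:data <file:Data/books.ttl> ;
    .
```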

TDB2

<#dataset> rdf:type      tdb2:DatasetTDB2 ;
    tdb2:location "DB2" ;
    # Query timeout on this dataset (1s, 1000 milliseconds)
    ja:context [ ja:cxtName "arq:queryTimeout" ;  ja:cxtValue "1000" ] ;
    # Make the default graph be the union of all named graphs.
    ## tdb2:unionDefaultGraph true ;
    .
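As a variation, the sketch below enables the union default graph, uses the two-part timeout format described in the server section, and publishes the result through a query-only service. It is meant as an alternative to the description above rather than an addition to the same file; `<#unionDataset>`, `<#serviceUnion>` and "union-ro" are hypothetical names.

```turtle
PREFIX fuseki: <http://jena.apache.org/fuseki#>
PREFIX rdf:    <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX tdb2:   <http://jena.apache.org/2016/tdb#>
PREFIX ja:     <http://jena.hpl.hp.com/2005/11/Assembler#>

# TDB2 dataset whose default graph is the union of all named graphs.
<#unionDataset> rdf:type tdb2:DatasetTDB2 ;
    tdb2:location "DB2" ;
    # 10s to the first result, then 60s for the rest of the query.
    ja:context [ ja:cxtName "arq:queryTimeout" ;  ja:cxtValue "10000,60000" ] ;
    tdb2:unionDefaultGraph true ;
    .

# Query-only service over the dataset (hypothetical service name).
<#serviceUnion> rdf:type fuseki:Service ;
    fuseki:name      "union-ro" ;
    fuseki:endpoint  [ fuseki:operation fuseki:query ; fuseki:name "sparql" ];
    fuseki:dataset   <#unionDataset> ;
    .
```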

TDB1

<#dataset> rdf:type      tdb1:DatasetTDB ;
    tdb1:location "DB" ;
    # Query timeout on this dataset (1s, 1000 milliseconds)
    ja:context [ ja:cxtName "arq:queryTimeout" ;  ja:cxtValue "1000" ] ;
    # Make the default graph be the union of all named graphs.
    ## tdb1:unionDefaultGraph true ;
    .

Inference

An inference reasoner can be layered on top of a dataset as defined above. The type of reasoner must be selected carefully and should not include more reasoning than is required by the application, as extensive reasoning can be detrimental to performance.

You have to build up layers of dataset, inference model, and graph.

<#dataset> rdf:type ja:RDFDataset;
     ja:defaultGraph <#inferenceModel>
     .
     
<#inferenceModel> rdf:type      ja:InfModel;
     ja:reasoner [ ja:reasonerURL <http://example/someReasonerURLHere> ];
     ja:baseModel <#baseModel>;
     .

<#baseModel> rdf:type tdb2:GraphTDB2;  # for example.
     tdb2:location "/some/path/to/store/data/to";
     # etc
     .

where http://example/someReasonerURLHere is one of the URLs below.

Possible reasoners:

Details are in the main documentation for inference.

Generic Rule Reasoner: http://jena.hpl.hp.com/2003/GenericRuleReasoner

The specific rule set and mode configuration can be set through parameters in the configuration Model.

Transitive Reasoner: http://jena.hpl.hp.com/2003/TransitiveReasoner

A simple “reasoner” used to help with API development.

This reasoner caches a transitive closure of the subClass and subProperty graphs. The generated infGraph allows both the direct and closed versions of these properties to be retrieved. The cache is built when the tbox is bound in but if the final data graph contains additional subProperty/subClass declarations then the cache has to be rebuilt.

The triples in the tbox (if present) will also be included in any query. Any of tbox or data graph are allowed to be null.

RDFS Rule Reasoner: http://jena.hpl.hp.com/2003/RDFSExptRuleReasoner

A full implementation of RDFS reasoning using a hybrid rule system, together with optimized subclass/subproperty closure using the transitive graph caches. Implements the container membership property rules using an optional data scanning hook. Implements datatype range validation.

Full OWL Reasoner: http://jena.hpl.hp.com/2003/OWLFBRuleReasoner

A hybrid forward/backward implementation of the OWL closure rules.

Mini OWL Reasoner: http://jena.hpl.hp.com/2003/OWLMiniFBRuleReasoner

Key limitations over the normal OWL configuration are:
- omits the someValuesFrom => bNode entailments
- avoids any guard clauses which would break the find() contract
- omits inheritance of range implications for XSD datatype ranges

Micro OWL Reasoner: http://jena.hpl.hp.com/2003/OWLMicroFBRuleReasoner

This only supports:

- RDFS entailments
- basic OWL axioms like ObjectProperty subClassOf Property
- intersectionOf, equivalentClass and forward implication of unionOf sufficient for traversal of explicit class hierarchies
- Property axioms (inverseOf, SymmetricProperty, TransitiveProperty, equivalentProperty)

There is some experimental support for the cheaper class restriction handling which should not be relied on at this point.
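To make the layering concrete, the placeholder reasoner URL can be replaced with one of the URLs listed above. The sketch below picks the Micro OWL reasoner purely as an example; the resource names `<#infDataset>`, `<#microOwlModel>` and `<#storedGraph>`, and the location "DB2", are illustrative assumptions.

```turtle
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX tdb2: <http://jena.apache.org/2016/tdb#>
PREFIX ja:   <http://jena.hpl.hp.com/2005/11/Assembler#>

# Layer 1: the dataset a service refers to via fuseki:dataset.
<#infDataset> rdf:type ja:RDFDataset ;
    ja:defaultGraph <#microOwlModel> ;
    .

# Layer 2: the inference model, here using the Micro OWL reasoner.
<#microOwlModel> rdf:type ja:InfModel ;
    ja:reasoner [ ja:reasonerURL <http://jena.hpl.hp.com/2003/OWLMicroFBRuleReasoner> ] ;
    ja:baseModel <#storedGraph> ;
    .

# Layer 3: the stored graph that holds the asserted data.
<#storedGraph> rdf:type tdb2:GraphTDB2 ;
    tdb2:location "DB2" ;
    .
```

Starting with the cheapest reasoner that covers the application's needs fits the performance advice given at the start of this section.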