Jena RDF/XML Input How-To (ARP)

Legacy Documentation : not up-to-date

The original ARP parser will be removed from Jena.

The current RDF/XML parser is RRX.


This is a guide to the RDF/XML legacy ARP input subsystem of Jena.

The ARP RDF/XML parser is designed for use with RIOT and to have the same handling of errors, IRI resolution, and treatment of base IRIs as other RIOT readers.

The ARP0 parser is the original standalone parser.

RDF/XML Input

The usual way to access the RDF/XML parser is via RDFDataMgr or RDFParser.

Model model = RDFDataMgr.loadModel("data.arp");

or

Model model = RDFParser.source("data.arp").toModel();

Note that the file extension is arp.

Legacy ARP RDF/XML parser

RIOT integrated ARP parser

To access the parser from Java code, use the constant RRX.RDFXML_ARP1.

The syntax name is arp or arp1.

The file extension is arp or arp1.

Original ARP0 parser

To access the parser from Java code, use the constant RRX.RDFXML_ARP0.

The syntax name is arp0.

The file extension is arp0.

Details of the original Jena RDF/XML parser, ARP.


Advanced RDF/XML Input

For access to these advanced features, first get an RDFReader object that is an instance of an ARP parser by calling the getReader() method on any Model. The reader is then configured using the [setProperty](/documentation/javadoc/jena/org.apache.jena.core/org/apache/jena/rdfxml/xmlinput0/JenaReader.html#setProperty(java.lang.String, java.lang.Object))(String, Object) method, which changes the properties used for parsing RDF/XML. Many of the properties change the RDF parser; some change the XML parser. (The Jena RDF/XML parser, ARP, implements the RDF grammar over a Xerces2-J XML parser.) Changing the features and properties of the XML parser is unlikely to be useful, but the ability was easy to implement.

[setProperty](/documentation/javadoc/jena/org.apache.jena.core/org/apache/jena/rdfxml/xmlinput0/JenaReader.html#setProperty(java.lang.String, java.lang.Object))(String, Object) can be used to set and get:

  • ARP properties
    These allow fine-grained control over the extensive error reporting capabilities of ARP, and are detailed directly below.
  • SAX2 features
    See Xerces features. Value should be given as a String "true" or "false" or a Boolean.
  • SAX2 properties
    See Xerces properties.
  • Xerces features
    See Xerces features. Value should be given as a String "true" or "false" or a Boolean.
  • Xerces properties
    See Xerces properties.
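
The accepted value forms for SAX2 and Xerces features can be illustrated with a small stand-alone sketch. The class FeatureValues and its method asBoolean are hypothetical and not part of Jena or Xerces; the sketch only models the rule stated above, and it assumes an exact match on the Strings "true" and "false".

```java
// Hypothetical helper, not part of Jena or Xerces: models the rule that a
// feature value is either a Boolean or one of the Strings "true" / "false".
class FeatureValues {
    static boolean asBoolean(Object value) {
        if (value instanceof Boolean) {
            return (Boolean) value;
        }
        // Assumption of this sketch: the String forms are matched exactly.
        if ("true".equals(value)) {
            return true;
        }
        if ("false".equals(value)) {
            return false;
        }
        throw new IllegalArgumentException("Not a feature value: " + value);
    }

    public static void main(String[] args) {
        System.out.println(asBoolean("true"));
        System.out.println(asBoolean(Boolean.FALSE));
    }
}
```

Any other value class, such as an Integer, is rejected by this sketch; how the real reader treats such values is not covered here.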

ARP properties

An ARP property is referred to either by its property name (see below) or by an absolute URL of the form http://jena.hpl.hp.com/arp/properties/<PropertyName>. The value should be a String, an Integer, or a Boolean, depending on the property.

ARP property names and string values are case insensitive.
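
The relationship between the two naming forms can be shown with a short stand-alone sketch. The class ArpPropertyNames and its method canonicalName are hypothetical and not part of Jena; upper-casing is used here only as one possible canonical form for case-insensitive comparison.

```java
import java.util.Locale;

// Hypothetical helper, not part of Jena: models how the two ways of
// naming an ARP property relate to each other.
class ArpPropertyNames {
    // URL prefix given in the text above.
    static final String PREFIX = "http://jena.hpl.hp.com/arp/properties/";

    // Reduce either form to an upper-case property name, so that names
    // differing only in case compare as equal.
    static String canonicalName(String nameOrUrl) {
        String name = nameOrUrl;
        if (name.startsWith(PREFIX)) {
            name = name.substring(PREFIX.length());
        }
        return name.toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(canonicalName("error-mode"));
        System.out.println(canonicalName(PREFIX + "warn_bad_name"));
    }
}
```

Under these assumptions, "error-mode", "ERROR-MODE" and the URL form ending in /error-mode all name the same property.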

Each property is listed with its description, value class, and legal values.

  • iri-rules
    Set the engine for checking and resolving IRIs. "strict" sets the IRI engine with rules for valid IRIs, XLink and RDF; it does not permit spaces in IRIs. "iri" sets the IRI engine to IRI (RFC 3986, RFC 3987). The default is "lax" (for backwards compatibility), the rules for RDF URI references only, which does permit spaces, although the use of spaces is not good practice.
    Value class: String
    Legal values: lax, strict, iri
  • error-mode
    See ARPOptions.setDefaultErrorMode(), ARPOptions.setLaxErrorMode(), ARPOptions.setStrictErrorMode(), ARPOptions.setStrictErrorMode(int).
    This allows a coarse-grained approach to control of error handling. Setting this property is equivalent to setting many of the fine-grained error handling properties.
    Value class: String
    Legal values: default, lax, strict, strict-ignore, strict-warning, strict-error, strict-fatal
  • embedding
    See ARPOptions.setEmbedding(boolean).
    This sets ARP to look for RDF embedded within an enclosing XML document.
    Value class: String or Boolean
    Legal values: true, false
  • ERR_<XXX>, WARN_<XXX>, IGN_<XXX>
    See ARPErrorNumbers for a complete list of the error conditions detected. Setting one of these properties is equivalent to calling the method ARPOptions.setErrorMode(int, int). Thus fine-grained control over the behaviour in response to specific error conditions is possible.
    Value class: String or Integer
    Legal values: EM_IGNORE, EM_WARNING, EM_ERROR, EM_FATAL

To set ARP properties, create a map of the values to be set and put it in the parser context:

    Map<String, Object> properties = new HashMap<>();
    // See class ARPErrorNumbers for the possible ARP properties.
    properties.put("WARN_BAD_NAME", "EM_IGNORE");

    // Build and run a parser
    Model model = RDFParser.create()
        .lang(Lang.RDFXML)
        .source(...)
        .set(SysRIOT.sysRdfReaderProperties, properties)
        .base("http://base/")
        .toModel();
    System.out.println("== Parsed data output in Turtle");
    RDFDataMgr.write(System.out, model, Lang.TURTLE);

See example ExRIOT_RDFXML_ReaderProperties.java.

Legacy Example

As an example, if you are working in an environment with legacy RDF data that uses unqualified RDF attributes such as “about” instead of “rdf:about”, then the following code is appropriate:

Model m = ModelFactory.createDefaultModel();
RDFReader arp = m.getReader();
// m is kept: it is the target model passed to arp.read(...) below.
// initialize arp
// Do not warn on use of unqualified RDF attributes.
arp.setProperty("WARN_UNQUALIFIED_RDF_ATTRIBUTE","EM_IGNORE");

…

// try-with-resources closes the stream even if parsing fails.
try (InputStream in = new FileInputStream(fname)) {
    arp.read(m, in, url);
}

As a second example, suppose you wish to work in strict mode but allow "daml:collection". The following works:

 …
 arp.setProperty("error-mode", "strict" );
 arp.setProperty("IGN_DAML_COLLECTION","EM_IGNORE");
 …

The other way round does not work:

 …
 arp.setProperty("IGN_DAML_COLLECTION","EM_IGNORE");
 arp.setProperty("error-mode", "strict" );
 …

This is because in strict mode IGN_DAML_COLLECTION is treated as an error, and so the second call to setProperty overwrites the effect of the first.
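
The ordering effect can be reproduced with a small stand-alone model. The class ErrorModeOrdering below is hypothetical and is not Jena code; it assumes only what the text states, namely that strict mode sets IGN_DAML_COLLECTION to an error and that a later call replaces the effect of an earlier one.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model, not Jena code: a coarse-grained mode resets the
// fine-grained settings, so the last call made for a condition wins.
class ErrorModeOrdering {
    private final Map<String, String> modes = new HashMap<>();

    // Models setProperty("error-mode", "strict"). Only the one condition
    // discussed in the text is modelled here.
    void setStrictErrorMode() {
        modes.put("IGN_DAML_COLLECTION", "EM_ERROR");
    }

    // Models setProperty(condition, mode) for a single error condition.
    void setErrorMode(String condition, String mode) {
        modes.put(condition, mode);
    }

    String modeOf(String condition) {
        return modes.get(condition);
    }

    // Strict mode first, then the specific override: the override survives.
    static String strictThenIgnore() {
        ErrorModeOrdering reader = new ErrorModeOrdering();
        reader.setStrictErrorMode();
        reader.setErrorMode("IGN_DAML_COLLECTION", "EM_IGNORE");
        return reader.modeOf("IGN_DAML_COLLECTION");
    }

    // Specific override first, then strict mode: strict mode replaces it.
    static String ignoreThenStrict() {
        ErrorModeOrdering reader = new ErrorModeOrdering();
        reader.setErrorMode("IGN_DAML_COLLECTION", "EM_IGNORE");
        reader.setStrictErrorMode();
        return reader.modeOf("IGN_DAML_COLLECTION");
    }

    public static void main(String[] args) {
        System.out.println("strict, then ignore: " + strictThenIgnore());
        System.out.println("ignore, then strict: " + ignoreThenStrict());
    }
}
```

The practical rule that follows is to set error-mode first and apply any per-condition settings afterwards.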

The IRI rules and resolver can be set on a per-reader basis:

InputStream in = ... ;
String baseURI = ... ;
Model model = ModelFactory.createDefaultModel();
RDFReader r = model.getReader("RDF/XML");
r.setProperty("iri-rules", "strict") ;
r.setProperty("error-mode", "strict") ; // Warnings will be errors.

// Alternative to the above "error-mode": set specific warning to be an error.
//r.setProperty( "WARN_MALFORMED_URI", ARPErrorNumbers.EM_ERROR) ;
r.read(model, in, baseURI) ;
in.close();

The global default IRI engine can be set with:

ARPOptions.setIRIFactoryGlobal(IRIFactory.iriImplementation()) ;

or with another IRI rule engine from IRIFactory.