Jena RDF/XML Input How-To

Legacy Documentation: may not be up to date

Original RDF/XML HowTo.

This is a guide to the RDF/XML legacy input subsystem of Jena, ARP.

Advanced RDF/XML Input

For access to these advanced features, first get an RDFReader object that is an instance of an ARP parser by calling the getReader() method on any Model. Then configure it with the [setProperty](/documentation/javadoc/jena/org.apache.jena.core/org/apache/jena/rdfxml/xmlinput0/JenaReader.html#setProperty(java.lang.String, java.lang.Object))(String, Object) method, which changes the properties used for parsing RDF/XML. Many of the properties affect the RDF parser; some affect the XML parser. (The Jena RDF/XML parser, ARP, implements the RDF grammar over a Xerces2-J XML parser.) Changing the features and properties of the XML parser is unlikely to be useful, but it was easy to implement.

[setProperty](/documentation/javadoc/jena/org.apache.jena.core/org/apache/jena/rdfxml/xmlinput0/JenaReader.html#setProperty(java.lang.String, java.lang.Object))(String, Object) can be used to set and get:

  • ARP properties
    These allow fine-grained control over the extensive error reporting capabilities of ARP, and are detailed directly below.
  • SAX2 features
    See Xerces features. Value should be given as a String "true" or "false" or a Boolean.
  • SAX2 properties
    See Xerces properties.
  • Xerces features
    See Xerces features. Value should be given as a String "true" or "false" or a Boolean.
  • Xerces properties
    See Xerces properties.
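
To make the SAX2 feature entries above more concrete, here is a minimal sketch that uses only the XML parser bundled with the JDK, not Jena. The class name SaxFeatureSketch and its two helper methods are hypothetical names chosen for illustration; the feature URI is the standard SAX2 namespaces feature. The helper accepts the value either as a String ("true" or "false") or as a Boolean, mirroring the two forms described in the list. It is not a description of how ARP forwards such settings internally.

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.XMLReader;

final class SaxFeatureSketch {
    // Standard SAX2 feature identifier for namespace processing.
    static final String NAMESPACES = "http://xml.org/sax/features/namespaces";

    // Accept either a Boolean or the strings "true" / "false" (hypothetical helper).
    static boolean toBoolean(Object value) {
        if (value instanceof Boolean) {
            return (Boolean) value;
        }
        if ("true".equals(value)) {
            return true;
        }
        if ("false".equals(value)) {
            return false;
        }
        throw new IllegalArgumentException("Expected \"true\", \"false\" or a Boolean: " + value);
    }

    // Set a feature on a fresh JDK SAX reader and read the setting back.
    static boolean setAndReadBack(String featureUri, Object value) {
        boolean flag = toBoolean(value);
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setFeature(featureUri, flag);
            return reader.getFeature(featureUri);
        } catch (Exception e) {
            // Unknown or unsupported features surface here as SAX exceptions.
            throw new IllegalStateException("Could not set feature " + featureUri, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(setAndReadBack(NAMESPACES, "true"));
        System.out.println(setAndReadBack(NAMESPACES, Boolean.FALSE));
    }
}
```

With an ARP reader, the same feature URI and value would be passed to setProperty rather than to the XMLReader directly.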

ARP properties

An ARP property is referred to either by its property name (see below) or by an absolute URL that ends in <PropertyName>. The value should be a String, an Integer or a Boolean, depending on the property.

ARP property names and string values are case insensitive.
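
As a rough illustration of what case-insensitive naming means for callers, the sketch below treats two spellings as the same property when they match after case folding. It is a toy model, not ARP's own lookup code: the class ArpPropertyNameSketch, its methods, and in particular the URL prefix are hypothetical and stand in for the absolute-URL form mentioned above.

```java
import java.util.Locale;

final class ArpPropertyNameSketch {
    // Hypothetical prefix, used only to illustrate the absolute-URL form.
    static final String HYPOTHETICAL_PREFIX = "http://example.org/arp/properties/";

    // Strip the (hypothetical) URL prefix if present, then fold case.
    static String canonical(String nameOrUrl) {
        String name = nameOrUrl;
        if (name.startsWith(HYPOTHETICAL_PREFIX)) {
            name = name.substring(HYPOTHETICAL_PREFIX.length());
        }
        return name.toUpperCase(Locale.ROOT);
    }

    // Two spellings refer to the same property if their canonical forms match.
    static boolean sameProperty(String a, String b) {
        return canonical(a).equals(canonical(b));
    }

    public static void main(String[] args) {
        System.out.println(sameProperty("Error-Mode", "error-mode"));
        System.out.println(sameProperty(HYPOTHETICAL_PREFIX + "WARN_BAD_NAME", "warn_bad_name"));
    }
}
```

In practice this means that "error-mode" and "ERROR-MODE", or "EM_IGNORE" and "em_ignore", can be used interchangeably when calling setProperty.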

| Property Name | Description | Value class | Legal Values |
|---|---|---|---|
| iri-rules | Sets the engine for checking and resolving IRIs. "strict" sets the IRI engine with rules for valid IRIs, XLink and RDF; it does not permit spaces in IRIs. "iri" sets the IRI engine to IRI (RFC 3986, RFC 3987). The default is "lax" (for backwards compatibility): the rules for RDF URI references only, which do permit spaces, although the use of spaces is not good practice. | String | lax |
| error-mode | ARPOptions.setDefaultErrorMode(). This allows a coarse-grained approach to control of error handling. Setting this property is equivalent to setting many of the fine-grained error handling properties. | String | default |
| embedding | ARPOptions.setEmbedding(boolean). This sets ARP to look for RDF embedded within an enclosing XML document. | String or Boolean | true |
| Error condition names (for example WARN_BAD_NAME) | See ARPErrorNumbers for a complete list of the error conditions detected. Setting one of these properties is equivalent to calling the method ARPOptions.setErrorMode(int, int), so fine-grained control over the behaviour in response to specific error conditions is possible. | String or Integer | EM_IGNORE |

To set ARP properties, create a map of the values to be set and put it in the parser context:

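    // Classes used here come from java.util, org.apache.jena.rdf.model and org.apache.jena.riot.
    Map<String, Object> properties = new HashMap<>();
    // See class ARPErrorNumbers for the possible ARP properties.
    properties.put("WARN_BAD_NAME", "EM_IGNORE");

    // Build and run a parser
    Model model = RDFParser.create()
        .lang(Lang.RDFXML)
        .source("data.rdf")   // placeholder: the RDF/XML file or URL to read
        .set(SysRIOT.sysRdfReaderProperties, properties)
        .toModel();
    System.out.println("== Parsed data output in Turtle");
    RDFDataMgr.write(System.out, model, Lang.TURTLE);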

See example

Legacy Example

As an example, if you are working in an environment with legacy RDF data that uses unqualified RDF attributes such as “about” instead of “rdf:about”, then the following code is appropriate:

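    Model m = ModelFactory.createDefaultModel();
    RDFReader arp = m.getReader();
    // initialize arp
    // Do not warn on use of unqualified RDF attributes.
    // (Property name assumed to be the matching ARPErrorNumbers constant.)
    arp.setProperty("WARN_UNQUALIFIED_RDF_ATTRIBUTE", "EM_IGNORE");

    // fname and url are the caller's file name and base URL.
    InputStream in = new FileInputStream(fname);
    arp.read(m, in, url);
    in.close();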

As a second example, suppose you wish to work in strict mode but still allow "daml:collection". The following works:

    arp.setProperty("error-mode", "strict");
    arp.setProperty("IGN_DAML_COLLECTION", "EM_IGNORE");

The other way round does not work:

    arp.setProperty("IGN_DAML_COLLECTION", "EM_IGNORE");
    arp.setProperty("error-mode", "strict");

This is because in strict mode IGN_DAML_COLLECTION is treated as an error, so the second call to setProperty overwrites the effect of the first.
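
The order dependence can be pictured with a small toy model, sketched here under stated assumptions. The class ErrorModeOrderSketch is hypothetical and does not use Jena; it only imitates the behaviour described above, in which setting "error-mode" resets the per-condition settings and a later fine-grained call then overrides a single condition. The assumption that a condition defaults to EM_IGNORE outside strict mode is made for the purposes of the sketch.

```java
import java.util.HashMap;
import java.util.Map;

final class ErrorModeOrderSketch {
    // Toy per-condition settings; not ARP's real data structure.
    private final Map<String, String> actions = new HashMap<>();

    void setProperty(String name, String value) {
        if (name.equals("error-mode")) {
            // Coarse-grained: replaces every fine-grained setting made so far.
            actions.clear();
            if (value.equals("strict")) {
                // In strict mode this condition is treated as an error.
                actions.put("IGN_DAML_COLLECTION", "EM_ERROR");
            }
        } else {
            // Fine-grained: changes one condition only.
            actions.put(name, value);
        }
    }

    // Assumed default for this sketch: conditions not set explicitly are ignored.
    String actionFor(String condition) {
        return actions.getOrDefault(condition, "EM_IGNORE");
    }

    public static void main(String[] args) {
        ErrorModeOrderSketch strictFirst = new ErrorModeOrderSketch();
        strictFirst.setProperty("error-mode", "strict");
        strictFirst.setProperty("IGN_DAML_COLLECTION", "EM_IGNORE");
        System.out.println("strict first: " + strictFirst.actionFor("IGN_DAML_COLLECTION"));

        ErrorModeOrderSketch strictLast = new ErrorModeOrderSketch();
        strictLast.setProperty("IGN_DAML_COLLECTION", "EM_IGNORE");
        strictLast.setProperty("error-mode", "strict");
        System.out.println("strict last: " + strictLast.actionFor("IGN_DAML_COLLECTION"));
    }
}
```

The practical rule is to set the coarse-grained error-mode first and apply any per-condition overrides afterwards.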

The IRI rules and resolver can be set on a per-reader basis:

    InputStream in = ... ;
    String baseURI = ... ;
    Model model = ModelFactory.createDefaultModel();
    RDFReader r = model.getReader("RDF/XML");
    r.setProperty("iri-rules", "strict") ;
    r.setProperty("error-mode", "strict") ; // Warnings will be errors.

    // Alternative to the above "error-mode": set a specific warning to be an error.
    //r.setProperty( "WARN_MALFORMED_URI", ARPErrorNumbers.EM_ERROR) ;
    r.read(model, in, baseURI) ;

The global default IRI engine can be set with:

ARPOptions.setIRIFactoryGlobal(IRIFactory.iriImplementation()) ;

or with another IRI rule engine from IRIFactory.

Further details

Details of ARP, the Jena RDF/XML parser