Legacy Documentation : may not be up-to-date
Original RDF/XML HowTo.
This is a guide to the RDF/XML legacy input subsystem of Jena, ARP.
Advanced RDF/XML Input
For access to these advanced features, first get an RDFReader object that is an instance of an ARP parser by calling the getReader() method on any Model. The reader is then configured using the [setProperty](/documentation/javadoc/jena/org.apache.jena.core/org/apache/jena/rdfxml/xmlinput0/JenaReader.html#setProperty(java.lang.String, java.lang.Object))(String, Object) method, which changes the properties for parsing RDF/XML. Many of the properties change the RDF parser; some change the XML parser. (The Jena RDF/XML parser, ARP, implements the RDF grammar over a Xerces2-J XML parser.) Changing the features and properties of the XML parser is unlikely to be useful, but support for it was easy to implement.
[setProperty](/documentation/javadoc/jena/org.apache.jena.core/org/apache/jena/rdfxml/xmlinput0/JenaReader.html#setProperty(java.lang.String, java.lang.Object))(String, Object) can be used to set and get:
- ARP properties: These allow fine-grained control over the extensive error reporting capabilities of ARP, and are detailed directly below.
- SAX2 features: See Xerces features. The value should be given as a String "true" or "false", or as a Boolean.
- SAX2 properties: See Xerces properties.
- Xerces features: See Xerces features. The value should be given as a String "true" or "false", or as a Boolean.
- Xerces properties: See Xerces properties.
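To make the notion of a SAX2 feature concrete, here is a minimal sketch that uses the JDK's built-in JAXP parser as a stand-in; it does not go through ARP or setProperty. The class name `Sax2FeatureSketch` and both helper methods are hypothetical, and accepting "true"/"false" in any letter case is an assumption of this sketch rather than documented ARP behaviour. The feature URI is the standard SAX2 namespaces feature.

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.XMLReader;

public class Sax2FeatureSketch {
    // Standard SAX2 feature identifier (not specific to ARP).
    static final String NAMESPACES = "http://xml.org/sax/features/namespaces";

    /** Hypothetical helper: accept a Boolean, or the String "true"/"false". */
    static boolean toBoolean(Object value) {
        if (value instanceof Boolean) {
            return (Boolean) value;
        }
        if (value instanceof String) {
            String s = (String) value;
            // Assumption: letter case is ignored.
            if (s.equalsIgnoreCase("true")) return true;
            if (s.equalsIgnoreCase("false")) return false;
        }
        throw new IllegalArgumentException("Expected \"true\", \"false\" or a Boolean: " + value);
    }

    /** Sets a SAX2 feature on a fresh JDK XMLReader and reads it back. */
    static boolean setAndReadBack(String feature, Object value) {
        boolean flag = toBoolean(value);
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setFeature(feature, flag);
            return reader.getFeature(feature);
        } catch (Exception e) {
            // Wrap the checked parser exceptions to keep the sketch short.
            throw new IllegalStateException("Feature not accepted: " + feature, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(setAndReadBack(NAMESPACES, "true"));
        System.out.println(setAndReadBack(NAMESPACES, Boolean.FALSE));
    }
}
```

With an ARP reader, the same feature URI and value would instead be passed to setProperty, which hands them on to the underlying XML parser.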
ARP properties
An ARP property is referred to either by its property name (see below) or by an absolute URL of the form http://jena.hpl.hp.com/arp/properties/<PropertyName>. The value should be a String, an Integer or a Boolean, depending on the property.
ARP property names and string values are case insensitive.
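The two ways of naming a property can be illustrated with a small, self-contained sketch. It is not Jena's implementation: the class `ArpPropertyNameSketch` and the method `shortName` are hypothetical, and the sketch assumes that the URL prefix itself is matched exactly while only the property name after it is case insensitive.

```java
import java.util.Locale;

public class ArpPropertyNameSketch {
    // URL prefix as given in the text above.
    static final String ARP_PREFIX = "http://jena.hpl.hp.com/arp/properties/";

    /**
     * Hypothetical helper: reduce a short name or an absolute property URL
     * to one canonical lower-case short name.
     */
    static String shortName(String nameOrUrl) {
        String s = nameOrUrl.trim();
        // Assumption: the prefix is compared case-sensitively.
        if (s.startsWith(ARP_PREFIX)) {
            s = s.substring(ARP_PREFIX.length());
        }
        // Property names are case insensitive, so fold to one case.
        return s.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(shortName("error-mode"));
        System.out.println(shortName("http://jena.hpl.hp.com/arp/properties/ERROR-MODE"));
        System.out.println(shortName("WARN_BAD_NAME"));
    }
}
```

Under these assumptions, the first two calls in `main` resolve to the same property.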
Property Name | Description | Value class | Legal Values |
---|---|---|---|
iri-rules | Set the engine for checking and resolving. "strict" sets the IRI engine with rules for valid IRIs, XLink and RDF; it does not permit spaces in IRIs. "iri" sets the IRI engine to IRI (RFC 3986, RFC 3987). The default is "lax" (for backwards compatibility), the rules for RDF URI references only, which does permit spaces although the use of spaces is not good practice. | String | lax, strict, iri |
error-mode | See ARPOptions.setDefaultErrorMode(), ARPOptions.setLaxErrorMode(), ARPOptions.setStrictErrorMode(), ARPOptions.setStrictErrorMode(int). This allows a coarse-grained approach to control of error handling. Setting this property is equivalent to setting many of the fine-grained error handling properties. | String | default, lax, strict, strict-ignore, strict-warning, strict-error, strict-fatal |
embedding | See ARPOptions.setEmbedding(boolean). This sets ARP to look for RDF embedded within an enclosing XML document. | String or Boolean | true, false |
ERR_<XXX>, WARN_<XXX>, IGN_<XXX> | See ARPErrorNumbers for a complete list of the error conditions detected. Setting one of these properties is equivalent to the method ARPOptions.setErrorMode(int, int). Thus fine-grained control over the behaviour in response to specific error conditions is possible. | String or Integer | EM_IGNORE, EM_WARNING, EM_ERROR, EM_FATAL |
To set ARP properties, create a map of the values to be set and put it in the parser context:
Map<String, Object> properties = new HashMap<>();
// See class ARPErrorNumbers for the possible ARP properties.
properties.put("WARN_BAD_NAME", "EM_IGNORE");
// Build and run a parser
Model model = RDFParser.create()
.lang(Lang.RDFXML)
.source(...)
.set(SysRIOT.sysRdfReaderProperties, properties)
.base("http://base/")
.toModel();
System.out.println("== Parsed data output in Turtle");
RDFDataMgr.write(System.out, model, Lang.TURTLE);
See example ExRIOT_RDFXML_ReaderProperties.java.
Legacy Example
As an example, if you are working in an environment with legacy RDF data that uses unqualified RDF attributes such as “about” instead of “rdf:about”, then the following code is appropriate:
Model m = ModelFactory.createDefaultModel();
RDFReader arp = m.getReader();
// m is reused below as the target of the parse, so keep the reference.
// initialize arp
// Do not warn on use of unqualified RDF attributes.
arp.setProperty("WARN_UNQUALIFIED_RDF_ATTRIBUTE","EM_IGNORE");
…
InputStream in = new FileInputStream(fname);
arp.read(m,in,url);
in.close();
As a second example, suppose you wish to work in strict mode but allow "daml:collection". The following works:
…
arp.setProperty("error-mode", "strict" );
arp.setProperty("IGN_DAML_COLLECTION","EM_IGNORE");
…
The other way round does not work.
…
arp.setProperty("IGN_DAML_COLLECTION","EM_IGNORE");
arp.setProperty("error-mode", "strict" );
…
This is because in strict mode IGN_DAML_COLLECTION is treated as an error, and so the second call to setProperty overwrites the effect of the first.
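The order dependence can be seen in a toy model, sketched here without Jena. The class `ErrorModeOrderSketch` is hypothetical and tracks only the one condition discussed above. It relies on two statements from this page: setting error-mode is equivalent to setting many fine-grained properties, and strict mode treats IGN_DAML_COLLECTION as an error.

```java
import java.util.HashMap;
import java.util.Map;

public class ErrorModeOrderSketch {
    // Current mode per error condition; a toy stand-in for the parser's state.
    private final Map<String, String> modes = new HashMap<>();

    /** Toy stand-in for setProperty: later calls overwrite earlier ones. */
    void setProperty(String name, String value) {
        if (name.equals("error-mode") && value.equals("strict")) {
            // The coarse-grained setting rewrites the fine-grained ones.
            // Only the condition from the text is modelled here.
            modes.put("IGN_DAML_COLLECTION", "EM_ERROR");
        } else {
            modes.put(name, value);
        }
    }

    /** Applies name/value pairs in order and reports the resulting mode. */
    static String damlModeAfter(String... nameValuePairs) {
        ErrorModeOrderSketch reader = new ErrorModeOrderSketch();
        for (int i = 0; i + 1 < nameValuePairs.length; i += 2) {
            reader.setProperty(nameValuePairs[i], nameValuePairs[i + 1]);
        }
        return reader.modes.get("IGN_DAML_COLLECTION");
    }

    public static void main(String[] args) {
        // Strict first, then the exception: the exception survives.
        System.out.println(damlModeAfter(
            "error-mode", "strict", "IGN_DAML_COLLECTION", "EM_IGNORE"));
        // Exception first, then strict: strict overwrites it.
        System.out.println(damlModeAfter(
            "IGN_DAML_COLLECTION", "EM_IGNORE", "error-mode", "strict"));
    }
}
```

In this model the first ordering leaves the condition at EM_IGNORE and the second leaves it at EM_ERROR, which mirrors the behaviour described for the two snippets.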
The IRI rules and resolver can be set on a per-reader basis:
InputStream in = ... ;
String baseURI = ... ;
Model model = ModelFactory.createDefaultModel();
RDFReader r = model.getReader("RDF/XML");
r.setProperty("iri-rules", "strict") ;
r.setProperty("error-mode", "strict") ; // Warnings will be errors.
// Alternative to the above "error-mode": set specific warning to be an error.
//r.setProperty( "WARN_MALFORMED_URI", ARPErrorNumbers.EM_ERROR) ;
r.read(model, in, baseURI) ;
in.close();
The global default IRI engine can be set with:
ARPOptions.setIRIFactoryGlobal(IRIFactory.iriImplementation()) ;
or with another IRI rule engine from IRIFactory.