RDF/XML Input in Jena

Legacy Documentation: not up-to-date

The original ARP parser will be removed from Jena.

The current RDF/XML parser is RRX.


This section describes ARP, the original Jena RDF/XML parser. ARP is the parsing subsystem in Jena for handling the RDF/XML syntax.

ARP Features

  • Java-based RDF parser.
  • Compliant with the RDF Syntax and RDF Test Cases Recommendations.
  • Compliant with the following standards and recommendations:
    • xml:lang
      xml:lang is fully supported, both in RDF/XML and in any document embedding RDF/XML. In addition, language tags are checked against RFC 1766, RFC 3066, ISO 639-1, and ISO 3166.
    • xml:base
      xml:base is fully supported, both in RDF/XML and in any document embedding RDF/XML.
    • URI
      All URI references are checked against RFC 2396. The treatment of international URIs implements the concept of RDF URI Reference.
    • XML Names
      All rdf:ID values are checked against the XML Names specification.
    • Unicode Normal Form C
      String literals are checked for conformance with an early uniform normalization processing model.
    • XML Literals
      rdf:parseType='Literal' is processed respecting namespaces, processing instructions, and XML comments. This follows the Exclusive XML Canonicalization recommendation, with comments.
    • Relative Namespace URI references
      Namespace URI references are checked in light of the W3C XML Plenary decision.
  • Command-line RDF/XML error checking.
  • Can be used independently of Jena, with a customizable StatementHandler.
  • Highly configurable error processing.
  • Xerces-based XML parsing.
  • Processes both standalone and embedded RDF/XML.
  • Streaming parser, suitable for large files.
  • Supports SAX and DOM, for integration with non-file XML sources.
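
The xml:lang support listed above depends on scoping: a language tag declared on one element applies to its descendants until another xml:lang overrides it, and an empty value clears it. The following is a minimal sketch of how a streaming SAX handler can track that scope with a stack. It uses only the JDK's SAX API and is not ARP's implementation; the names XmlLangScopeSketch, LangHandler, and langsInScope are hypothetical and chosen for illustration only.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Illustrative sketch only: tracks the in-scope xml:lang value per element.
class XmlLangScopeSketch {

    // Namespace URI that the reserved "xml" prefix is always bound to.
    private static final String XML_NS = "http://www.w3.org/XML/1998/namespace";

    /** Records "localName@lang" for every element as it is opened. */
    static class LangHandler extends DefaultHandler {
        // One entry per open element; "" means no language is in scope.
        private final Deque<String> langStack = new ArrayDeque<>();
        final List<String> seen = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            // An xml:lang on this element overrides the inherited value;
            // xml:lang="" resets the scope to "no language".
            String declared = atts.getValue(XML_NS, "lang");
            String inherited = langStack.isEmpty() ? "" : langStack.peek();
            String lang = (declared != null) ? declared : inherited;
            langStack.push(lang);
            seen.add(localName + "@" + lang);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            // Leaving the element restores the parent's language scope.
            langStack.pop();
        }
    }

    /** Parses the XML text as a stream and returns the recorded entries. */
    static List<String> langsInScope(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // needed for namespace-based attribute lookup
            LangHandler handler = new LangHandler();
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
            return handler.seen;
        } catch (Exception e) {
            throw new IllegalStateException("XML could not be parsed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<a xml:lang=\"en\"><b/><c xml:lang=\"fr\"><d/></c><e xml:lang=\"\"/></a>";
        for (String entry : langsInScope(xml)) {
            System.out.println(entry);
        }
    }
}
```

Because the handler keeps only one stack entry per open element, memory use grows with nesting depth rather than document size, which is the property that makes a SAX-style design suitable for large files. The same stack approach would apply to xml:base, with the extra step of resolving each declared value against the inherited base.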