PropertyTable Implementations
There are two implementations of PropertyTable. Their pros and cons are summarised in the following table:
PropertyTable Implementation | Description | Supported Indexes | Advantages | Disadvantages |
---|---|---|---|---|
PropertyTableArrayImpl | implemented by a two-dimensional Java array of Nodes | SPO, PSO | compact memory usage, fast for querying with S and P, fast for querying a whole Row | slow for querying with O, table Row/Column size must be provided |
PropertyTableHashMapImpl | implemented by several Java HashMaps | PSO, POS | fast for querying with O, table Row/Column size not required | more memory usage for HashMaps |
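
To make the trade-off in the table concrete, here is a minimal sketch. It is not Jena's actual code: the class names ArrayBackedTable and HashIndexedTable and all of their methods are hypothetical, cells are plain strings rather than Nodes, and only a POS-style lookup is shown for the hash layout. It illustrates why a fixed array needs its size up front and has to scan a column to answer a query by object, while a hash index answers the same query with a lookup and grows on demand.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: hypothetical types, not the Jena PropertyTable classes.
class PropertyTableIndexSketch {

    /** Array layout: cells[row][column]. Row and column counts must be known up front. */
    static final class ArrayBackedTable {
        private final String[][] cells;

        ArrayBackedTable(int rows, int columns) {
            cells = new String[rows][columns];
        }

        void put(int row, int column, String value) {
            cells[row][column] = value;
        }

        /** Subject (row) and predicate (column) known: direct array access. */
        String get(int row, int column) {
            return cells[row][column];
        }

        /** A whole row is one contiguous inner array. */
        String[] row(int row) {
            return cells[row];
        }

        /** Object known: every row of the column has to be scanned. */
        List<Integer> rowsWithValue(int column, String value) {
            List<Integer> hits = new ArrayList<>();
            for (int r = 0; r < cells.length; r++) {
                if (value.equals(cells[r][column])) {
                    hits.add(r);
                }
            }
            return hits;
        }
    }

    /** Hash layout: column -> value -> rows (a POS-style index). No size needed up front. */
    static final class HashIndexedTable {
        private final Map<Integer, Map<String, List<Integer>>> pos = new HashMap<>();

        void put(int row, int column, String value) {
            pos.computeIfAbsent(column, c -> new HashMap<>())
               .computeIfAbsent(value, v -> new ArrayList<>())
               .add(row);
        }

        /** Object known: two hash lookups, no scan. */
        List<Integer> rowsWithValue(int column, String value) {
            return pos.getOrDefault(column, Map.of()).getOrDefault(value, List.of());
        }
    }

    public static void main(String[] args) {
        ArrayBackedTable array = new ArrayBackedTable(2, 2);
        HashIndexedTable hash = new HashIndexedTable();
        String[][] data = { { "London", "UK" }, { "Paris", "France" } };
        for (int r = 0; r < data.length; r++) {
            for (int c = 0; c < data[r].length; c++) {
                array.put(r, c, data[r][c]);
                hash.put(r, c, data[r][c]);
            }
        }
        System.out.println(array.get(1, 0));
        System.out.println(array.rowsWithValue(1, "France"));
        System.out.println(hash.rowsWithValue(1, "France"));
    }
}
```

Both layouts return the same rows for a query by object. The difference is the work done to find them and the extra memory the hash maps hold.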
By default, [PropertyTableArrayImpl](https://github.com/apache/jena/tree/main/jena-csv/src/main/java/org/apache/jena/propertytable/impl/PropertyTableArrayImpl.java) is used as the PropertyTable implementation held by GraphCSV.
To switch to PropertyTableHashMapImpl, call the static method GraphCSV.createHashMapImpl() instead of using the default new GraphCSV() constructor.
Here is an example:
Model model_csv_array_impl = ModelFactory.createModelForGraph(new GraphCSV(file)); // PropertyTableArrayImpl
Model model_csv_hashmap_impl = ModelFactory.createModelForGraph(GraphCSV.createHashMapImpl(file)); // PropertyTableHashMapImpl
StageGenerator Optimization for GraphPropertyTable
Accessing the table from SPARQL via Graph.find() will work, but it’s not ideal. Some optimizations can be made when processing a SPARQL basic graph pattern. More explicitly, in the method OpExecutor.execute(OpBGP, ...), when the target of the query is a GraphPropertyTable, it can fetch a whole Row, or several Rows, of the table data and match the pattern against the bindings.
The optimization of querying a whole Row in the PropertyTable is supported now.
The following query pattern can be transformed into a Row query, without generating triples:
?x :prop1 ?v .
?x :prop2 ?w .
...
This is done through the StageGenerator extension point, because the optimization is concerned only with BasicPattern.
The detailed workflow goes in this way:
- Split the incoming BasicPattern by subject, i.e. it becomes multiple sub-BasicPatterns, each grouped by the same subject (see QueryIterPropertyTable).
- For each sub-BasicPattern, if the number of Triples within it is greater than 1 (i.e. at least 2 Triples), it is turned into a Row query and processed by QueryIterPropertyTableRow; otherwise, if it contains only 1 Triple, it goes through the traditional Triple query via graph.graphBaseFind().
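
The two steps above can be sketched in plain Java. This is an illustration under stated assumptions, not the code in QueryIterPropertyTable: TriplePattern, splitBySubject and strategyFor are hypothetical names, and subjects, predicates and objects are plain strings instead of Jena Nodes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: hypothetical stand-ins, not Jena's Triple or BasicPattern.
class SubjectGroupingSketch {

    /** A triple pattern whose three parts are written as plain strings. */
    static final class TriplePattern {
        final String subject;
        final String predicate;
        final String object;

        TriplePattern(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    /** Step 1: split a basic pattern into sub-patterns that share a subject, keeping encounter order. */
    static Map<String, List<TriplePattern>> splitBySubject(List<TriplePattern> pattern) {
        Map<String, List<TriplePattern>> groups = new LinkedHashMap<>();
        for (TriplePattern t : pattern) {
            groups.computeIfAbsent(t.subject, s -> new ArrayList<>()).add(t);
        }
        return groups;
    }

    /** Step 2: two or more triples on one subject become a row query, a single triple stays a triple query. */
    static String strategyFor(List<TriplePattern> subPattern) {
        return subPattern.size() > 1 ? "row" : "triple";
    }

    public static void main(String[] args) {
        List<TriplePattern> bgp = List.of(
                new TriplePattern("?x", ":prop1", "?v"),
                new TriplePattern("?x", ":prop2", "?w"),
                new TriplePattern("?y", ":prop3", "?z"));

        splitBySubject(bgp).forEach((subject, sub) ->
                System.out.println(subject + " -> " + strategyFor(sub) + " query, "
                        + sub.size() + " triple pattern(s)"));
    }
}
```

In this sketch the two patterns on ?x form one group that would be answered by a single row lookup, while the lone pattern on ?y falls back to an ordinary triple lookup.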
To turn on this optimization, register the StageGeneratorPropertyTable in the ARQ context before running any SPARQL queries:
StageGenerator orig = (StageGenerator)ARQ.getContext().get(ARQ.stageGenerator) ;
StageGenerator stageGenerator = new StageGeneratorPropertyTable(orig) ;
StageBuilder.setGenerator(ARQ.getContext(), stageGenerator) ;
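
The registration code passes the previously registered generator to the new one. One plausible reading, which is an assumption here rather than something stated above, is that the new generator handles property-table graphs itself and hands every other graph back to the original. The following sketch shows that chaining pattern with hypothetical types (Stage, TableGraph, TableAwareStage); none of them are ARQ classes.

```java
import java.util.List;

// Illustrative only: a simplified chain of responsibility, not the ARQ StageGenerator API.
class StageChainSketch {

    /** Hypothetical stand-in for a stage generator: evaluates a pattern against a graph. */
    interface Stage {
        String execute(List<String> pattern, Object graph);
    }

    /** Marker type standing in for a graph backed by a property table. */
    static final class TableGraph {
    }

    /** Handles TableGraph itself and delegates every other graph to the wrapped generator. */
    static final class TableAwareStage implements Stage {
        private final Stage fallback;

        TableAwareStage(Stage fallback) {
            this.fallback = fallback;
        }

        @Override
        public String execute(List<String> pattern, Object graph) {
            if (!(graph instanceof TableGraph)) {
                return fallback.execute(pattern, graph);
            }
            return "table-optimized";
        }
    }

    public static void main(String[] args) {
        Stage original = (pattern, graph) -> "default";
        Stage chained = new TableAwareStage(original);

        List<String> pattern = List.of("?x :prop1 ?v", "?x :prop2 ?w");
        System.out.println(chained.execute(pattern, new TableGraph()));
        System.out.println(chained.execute(pattern, new Object()));
    }
}
```

Under this reading, wrapping rather than replacing the original generator means queries against other graphs behave as before.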