Class GraphMem2Roaring

All Implemented Interfaces:
org.apache.jena.atlas.lib.Copyable<GraphMem2>, Graph, GraphWithPerform

public class GraphMem2Roaring extends GraphMem2
A graph that stores triples in memory. This class is not thread-safe. This in-memory graph supports different indexing strategies to balance RAM usage and performance for various operations. See IndexingStrategy for details on the available strategies.

As long as the index has not been initialized, the memory consumption is very low and the following operations are extremely fast:

One could start without the index, add all triples, and then initialize the index using initializeIndexParallel() for maximum performance.
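The "load first, index later" workflow described above can be illustrated with a small self-contained model. This is only a sketch, not Jena's implementation: the class `DeferredIndexSketch`, its `Triple` type, and the subject-only index are hypothetical simplifications, and only the method names `initializeIndexParallel`, `clearIndex`, and `isIndexInitialized` mirror the ones documented on this page.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Toy model of "load first, index later". Not Jena code: all names are hypothetical.
class DeferredIndexSketch {
    static final class Triple {
        final String s, p, o;
        Triple(String s, String p, String o) { this.s = s; this.p = p; this.o = o; }
    }

    private final List<Triple> triples = new ArrayList<>();
    // null while the index is uninitialized; subject-only index for brevity.
    private Map<String, List<Integer>> bySubject;

    boolean isIndexInitialized() { return bySubject != null; }

    void add(String s, String p, String o) {
        triples.add(new Triple(s, p, o));
        if (bySubject != null) {
            // With an index present, every add also pays for index maintenance.
            indexInto(bySubject, triples.size() - 1);
        }
    }

    // Builds the index from the triples already stored, using a parallel stream.
    void initializeIndexParallel() {
        Map<String, List<Integer>> index = new ConcurrentHashMap<>();
        IntStream.range(0, triples.size()).parallel().forEach(i -> indexInto(index, i));
        bySubject = index;
    }

    void clearIndex() { bySubject = null; }

    private void indexInto(Map<String, List<Integer>> index, int position) {
        // Thread-safe map and thread-safe lists, so parallel indexing is safe.
        index.computeIfAbsent(triples.get(position).s,
                k -> Collections.synchronizedList(new ArrayList<Integer>())).add(position);
    }

    // Without an index this is a full scan; with one it is a single map lookup.
    List<Integer> positionsOfSubject(String s) {
        if (bySubject == null) {
            return IntStream.range(0, triples.size())
                    .filter(i -> triples.get(i).s.equals(s))
                    .boxed().collect(Collectors.toList());
        }
        return bySubject.getOrDefault(s, Collections.<Integer>emptyList())
                .stream().sorted().collect(Collectors.toList());
    }
}
```

In this model, adding triples before the index exists costs only a list append, which is the reason bulk loading first and indexing afterwards can pay off.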

Purpose: GraphMem2Roaring is designed for extremely large graphs and is a strong choice if you work with them regularly. For such graphs, do not use IndexingStrategy.MINIMAL, as it is not suitable for large graphs. With the indexing strategies, this graph also works well for very small graphs where pattern matching is not needed.

  • Graph#contains is faster than in GraphMem2Fast.
  • Removing triples is a bit slower than in GraphMem2Legacy when the index is initialized.
  • Triple matches for the patterns S_O, SP_, and _PO perform better than in GraphMem2Fast on large graphs, because bit operations are used to find intersecting triples.
  • Memory consumption is about 7-99% higher than in GraphMem2Legacy when the index is initialized.
  • Suitable for really large graphs like bsbm-5m.nt.gz, bsbm-25m.nt.gz, and possibly even larger.

Internal structure:

  • One indexed hash set (same as GraphMem2Fast uses) that holds all triples.
  • The index for pattern matching consists of three hash maps indexed by subjects, predicates, and objects with RoaringBitmaps as values.
  • The bitmaps contain the indices of the triples in the central hash set.
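
The structure listed above can be sketched in a few lines of plain Java. This is an illustrative model under stated assumptions, not the actual implementation: `java.util.BitSet` stands in for RoaringBitmap, a plain list stands in for the indexed hash set, nodes are plain strings, and the class and method names (`BitmapIndexSketch`, `matchSubjectObject`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of the three-map bitmap index. java.util.BitSet stands in for
// RoaringBitmap, and a plain list stands in for the indexed hash set.
class BitmapIndexSketch {
    private final List<String[]> triples = new ArrayList<>();   // list position = triple index
    private final Map<String, BitSet> bySubject = new HashMap<>();
    private final Map<String, BitSet> byPredicate = new HashMap<>();
    private final Map<String, BitSet> byObject = new HashMap<>();

    void add(String s, String p, String o) {
        int position = triples.size();
        triples.add(new String[] {s, p, o});
        // Each map records, per node, the positions of the triples that use it.
        bySubject.computeIfAbsent(s, k -> new BitSet()).set(position);
        byPredicate.computeIfAbsent(p, k -> new BitSet()).set(position);
        byObject.computeIfAbsent(o, k -> new BitSet()).set(position);
    }

    // Matches the pattern S_O: subject and object fixed, any predicate.
    List<String[]> matchSubjectObject(String s, String o) {
        List<String[]> result = new ArrayList<>();
        BitSet subjectBits = bySubject.get(s);
        BitSet objectBits = byObject.get(o);
        if (subjectBits == null || objectBits == null) {
            return result;                               // one side unknown: no matches
        }
        BitSet both = (BitSet) subjectBits.clone();      // keep the index bitmaps untouched
        both.and(objectBits);                            // bitwise AND = set intersection
        for (int i = both.nextSetBit(0); i >= 0; i = both.nextSetBit(i + 1)) {
            result.add(triples.get(i));
        }
        return result;
    }
}
```

The intersection step is what the performance notes refer to as bit operations: a two-node pattern such as S_O is answered by combining two bitmaps rather than by scanning candidate triples one by one.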
  • Constructor Details

    • GraphMem2Roaring

      public GraphMem2Roaring()
      Constructs a new GraphMem2Roaring with a default RoaringTripleStore. This constructor initializes the graph with an empty triple store.

      The default strategy is EAGER, for backwards compatibility. This is not necessarily the best strategy for every use case, but it matches the behavior from before the indexing strategies were introduced.

    • GraphMem2Roaring

      public GraphMem2Roaring(IndexingStrategy indexingStrategy)
      Constructs a new GraphMem2Roaring with the specified indexing strategy.
      Parameters:
      indexingStrategy - the indexing strategy to use for this graph
  • Method Details

    • copy

      public GraphMem2Roaring copy()
      Description copied from class: GraphMem2
      Creates a copy of this graph. Since the triples and nodes are immutable, the copy contains the same triples and nodes as this graph. Modifications to the copy will not affect this graph.
      Specified by:
      copy in interface org.apache.jena.atlas.lib.Copyable<GraphMem2>
      Overrides:
      copy in class GraphMem2
      Returns:
      independent copy of the current graph
    • getIndexingStrategy

      public IndexingStrategy getIndexingStrategy()
      Returns the indexing strategy used by this graph.
      Returns:
      the indexing strategy
    • clearIndex

      public void clearIndex()
      Clear the index of this graph. This will remove all triples from the index and reset the current strategy to the initial one.
    • initializeIndex

      public void initializeIndex()
      Initialize the index of this graph. This will build the index based on the current set of triples. After this call, the graph will behave like an EAGER indexed graph.
    • initializeIndexParallel

      public void initializeIndexParallel()
      Initialize the index of this graph in parallel. This will build the index based on the current set of triples using parallel processing. After this call, the graph will behave like an EAGER indexed graph.
    • isIndexInitialized

      public boolean isIndexInitialized()
      Check if the index of this graph is initialized. This method returns true if the index has been initialized and is ready for use.
      Returns:
      true if the index is initialized, false otherwise