Class PipedRDFIterator<T>

java.lang.Object
org.apache.jena.riot.lang.PipedRDFIterator<T>
Type Parameters:
T - The type of the RDF primitive; should be one of Triple, Quad, or Tuple<Node>
All Implemented Interfaces:
Iterator<T>, org.apache.jena.atlas.lib.Closeable

@Deprecated public class PipedRDFIterator<T> extends Object implements Iterator<T>, org.apache.jena.atlas.lib.Closeable
Deprecated.
To be removed - use AsyncParser.

A PipedRDFIterator should be connected to a PipedRDFStream implementation; the piped iterator then provides whatever RDF primitives are written to the PipedRDFStream.

Typically, data is read from a PipedRDFIterator by one thread (the consumer) and data is written to the corresponding PipedRDFStream by some other thread (the producer). Attempting to use both objects from a single thread is not recommended, as it may deadlock the thread. The PipedRDFIterator contains a buffer, decoupling read operations from write operations, within limits.

Inspired by Java's PipedInputStream and PipedOutputStream.
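
Because the design follows Java's piped streams, the producer/consumer arrangement described above can be sketched with the standard library alone. The example is an analogy built on PipedInputStream and PipedOutputStream, not on PipedRDFIterator itself; the class name PipedStreamSketch and the method roundTrip are hypothetical names chosen for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.nio.charset.StandardCharsets;

public class PipedStreamSketch {

    // Hypothetical helper: a producer thread writes the message into the pipe
    // while the calling thread acts as the consumer and reads it back.
    static String roundTrip(String message) throws IOException, InterruptedException {
        // Small pipe buffer, so the producer blocks whenever it is full.
        PipedInputStream in = new PipedInputStream(16);
        PipedOutputStream out = new PipedOutputStream(in);

        Thread producer = new Thread(() -> {
            try {
                try {
                    out.write(message.getBytes(StandardCharsets.UTF_8));
                } finally {
                    out.close(); // signals end of data to the consumer
                }
            } catch (IOException e) {
                throw new RuntimeException(e);
            }
        });
        producer.start();

        // Consumer side: read until the producer closes its end of the pipe.
        ByteArrayOutputStream received = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1) {
            received.write(b);
        }
        producer.join();
        in.close();
        return new String(received.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip("<http://example/s> <http://example/p> <http://example/o> ."));
    }
}
```

Running the producer on its own thread is what keeps the consumer from waiting on data that only it could have written, which is the single-thread deadlock the description warns about.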

  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    static final int
    DEFAULT_BUFFER_SIZE
    Deprecated.
    Constant for default buffer size
    static final int
    DEFAULT_MAX_POLLS
    Deprecated.
    Constant for max number of failed poll attempts before the producer will be declared as dead
    static final int
    DEFAULT_POLL_TIMEOUT
    Deprecated.
    Constant for default poll timeout in milliseconds, used to stop the consumer deadlocking in certain circumstances
  • Constructor Summary

    Constructors
    Constructor
    Description
    PipedRDFIterator()
    Deprecated.
    Creates a new piped RDF iterator with the default buffer size of DEFAULT_BUFFER_SIZE.
    PipedRDFIterator(int bufferSize)
    Deprecated.
    Creates a new piped RDF iterator
    PipedRDFIterator(int bufferSize, boolean fair)
    Deprecated.
    Creates a new piped RDF iterator
    PipedRDFIterator(int bufferSize, boolean fair, int pollTimeout, int maxPolls)
    Deprecated.
    Creates a new piped RDF iterator
  • Method Summary

    Modifier and Type
    Method
    Description
    void
    close()
    Deprecated.
    May be called by the consumer when it has finished reading from the iterator. If the producer thread has not finished, it will receive an error the next time it tries to write to the iterator.
    String
    getBaseIri()
    Deprecated.
    Gets the most recently seen Base IRI
    PrefixMap
    getPrefixes()
    Deprecated.
    Gets the prefix map which contains the prefixes seen so far in the stream
    boolean
    hasNext()
    Deprecated.
    T
    next()
    Deprecated.
    void
    remove()
    Deprecated.

    Methods inherited from class java.lang.Object

    equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

    Methods inherited from interface java.util.Iterator

    forEachRemaining
  • Field Details

    • DEFAULT_BUFFER_SIZE

      public static final int DEFAULT_BUFFER_SIZE
      Deprecated.
      Constant for default buffer size
    • DEFAULT_POLL_TIMEOUT

      public static final int DEFAULT_POLL_TIMEOUT
      Deprecated.
      Constant for default poll timeout in milliseconds, used to stop the consumer deadlocking in certain circumstances
    • DEFAULT_MAX_POLLS

      public static final int DEFAULT_MAX_POLLS
      Deprecated.
      Constant for max number of failed poll attempts before the producer will be declared as dead
  • Constructor Details

    • PipedRDFIterator

      public PipedRDFIterator()
      Deprecated.
      Creates a new piped RDF iterator with the default buffer size of DEFAULT_BUFFER_SIZE.

      Buffer size must be chosen carefully in order to avoid performance problems. If you set the buffer size too low, you will experience many blocked calls, so it will take longer to consume the data from the iterator. For best performance the buffer size should be at least 10% of the expected input size, though you may need to tune this depending on how fast your consumer thread is.

    • PipedRDFIterator

      public PipedRDFIterator(int bufferSize)
      Deprecated.
      Creates a new piped RDF iterator

      Buffer size must be chosen carefully in order to avoid performance problems. If you set the buffer size too low, you will experience many blocked calls, so it will take longer to consume the data from the iterator. For best performance the buffer size should be roughly 10% of the expected input size, though you may need to tune this depending on how fast your consumer thread is.

      Parameters:
      bufferSize - Buffer size
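
To illustrate why a small buffer leads to blocked calls, the sketch below uses a plain java.util.concurrent.ArrayBlockingQueue as a stand-in for the iterator's buffer. This is an assumption made for illustration, not a statement about how PipedRDFIterator is implemented. The class BufferSizeSketch and its methods are hypothetical; suggestedBufferSize simply applies the 10% guideline above.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class BufferSizeSketch {

    // Hypothetical helper: roughly 10% of the expected number of items, never less than 1.
    static int suggestedBufferSize(long expectedItems) {
        long tenPercent = expectedItems / 10;
        return (int) Math.max(1, Math.min(Integer.MAX_VALUE, tenPercent));
    }

    // Counts how many non-blocking writes are rejected because the buffer is
    // full when no consumer is draining it.
    static int rejectedWrites(int bufferSize, int itemsToWrite) {
        BlockingQueue<Integer> buffer = new ArrayBlockingQueue<>(bufferSize);
        int rejected = 0;
        for (int i = 0; i < itemsToWrite; i++) {
            if (!buffer.offer(i)) { // offer() returns false instead of blocking
                rejected++;
            }
        }
        return rejected;
    }

    public static void main(String[] args) {
        System.out.println(suggestedBufferSize(50_000)); // prints 5000
        System.out.println(rejectedWrites(2, 10));       // prints 8
    }
}
```

With a blocking write in place of offer(), each rejected write would instead be a producer stall until the consumer catches up.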
    • PipedRDFIterator

      public PipedRDFIterator(int bufferSize, boolean fair)
      Deprecated.
      Creates a new piped RDF iterator

      Buffer size must be chosen carefully in order to avoid performance problems. If you set the buffer size too low, you will experience many blocked calls, so it will take longer to consume the data from the iterator. For best performance the buffer size should be roughly 10% of the expected input size, though you may need to tune this depending on how fast your consumer thread is.

      The fair parameter controls whether the locking policy used for the buffer is fair. When enabled, this reduces throughput but also reduces the chance of thread starvation. It likely only needs to be set to true if there will be multiple consumers.

      Parameters:
      bufferSize - Buffer size
      fair - Whether the buffer should use a fair locking policy
    • PipedRDFIterator

      public PipedRDFIterator(int bufferSize, boolean fair, int pollTimeout, int maxPolls)
      Deprecated.
      Creates a new piped RDF iterator

      Buffer size must be chosen carefully in order to avoid performance problems. If you set the buffer size too low, you will experience many blocked calls, so it will take longer to consume the data from the iterator. For best performance the buffer size should be roughly 10% of the expected input size, though you may need to tune this depending on how fast your consumer thread is.

      The fair parameter controls whether the locking policy used for the buffer is fair. When enabled, this reduces throughput but also reduces the chance of thread starvation. It likely only needs to be set to true if there will be multiple consumers.

      The pollTimeout parameter controls how long each poll attempt waits for data to be produced. This prevents the consumer thread from blocking indefinitely and allows it to detect potential deadlock conditions (for example, a dead producer thread or another consumer having closed the iterator) and to error out accordingly. It is unlikely that you will ever need to adjust this from the default value provided by DEFAULT_POLL_TIMEOUT.

      The maxPolls parameter controls how many poll attempts will be made by a single consumer thread within the context of a single call to hasNext() before the iterator declares the producer to be dead and errors out accordingly. You may need to adjust this if you have a slow producer thread or many consumer threads.

      Parameters:
      bufferSize - Buffer size
      fair - Whether the buffer should use a fair locking policy
      pollTimeout - Poll timeout in milliseconds
      maxPolls - Max poll attempts
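
The interaction of pollTimeout and maxPolls can be sketched as a bounded polling loop. The code below is a minimal illustration under stated assumptions, not Jena's implementation: it uses java.util.concurrent.ArrayBlockingQueue (whose second constructor argument selects a fair policy) as the buffer, and PollLoopSketch and pollWithLimit are hypothetical names.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

public class PollLoopSketch {

    // Hypothetical helper: waits up to pollTimeoutMs per attempt and gives up
    // after maxPolls consecutive empty polls.
    static <T> T pollWithLimit(BlockingQueue<T> buffer, int pollTimeoutMs, int maxPolls)
            throws InterruptedException {
        for (int attempt = 0; attempt < maxPolls; attempt++) {
            T item = buffer.poll(pollTimeoutMs, TimeUnit.MILLISECONDS);
            if (item != null) {
                return item;
            }
        }
        throw new IllegalStateException(
                "No data after " + maxPolls + " polls; producer presumed dead");
    }

    public static void main(String[] args) throws InterruptedException {
        // Capacity 4, fair = true: waiting threads are served in FIFO order.
        BlockingQueue<String> buffer = new ArrayBlockingQueue<>(4, true);
        buffer.put("triple-1");
        System.out.println(pollWithLimit(buffer, 50, 3)); // prints "triple-1"
        try {
            pollWithLimit(buffer, 50, 3); // buffer is now empty, so this gives up
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this sketch the longest a single call can wait is pollTimeoutMs multiplied by maxPolls, which is why a slow producer may call for a larger maxPolls value.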
  • Method Details

    • hasNext

      public boolean hasNext()
      Deprecated.
      Specified by:
      hasNext in interface Iterator<T>
    • next

      public T next()
      Deprecated.
      Specified by:
      next in interface Iterator<T>
    • remove

      public void remove()
      Deprecated.
      Specified by:
      remove in interface Iterator<T>
    • getBaseIri

      public String getBaseIri()
      Deprecated.
      Gets the most recently seen Base IRI
      Returns:
      Base IRI
    • getPrefixes

      public PrefixMap getPrefixes()
      Deprecated.
      Gets the prefix map which contains the prefixes seen so far in the stream
      Returns:
      Prefix Map
    • close

      public void close()
      Deprecated.
      May be called by the consumer when it has finished reading from the iterator. If the producer thread has not finished, it will receive an error the next time it tries to write to the iterator.
      Specified by:
      close in interface org.apache.jena.atlas.lib.Closeable
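
The effect of close() on a producer that is still writing has a direct analogue in Java's piped streams, which the class description names as its inspiration. The sketch below shows that analogue only; it does not use PipedRDFIterator, and CloseSketch and writeFailsAfterReaderClose are hypothetical names.

```java
import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;

public class CloseSketch {

    // Returns true if a write made after the read side was closed fails
    // with an IOException.
    static boolean writeFailsAfterReaderClose() throws IOException {
        PipedInputStream in = new PipedInputStream();
        PipedOutputStream out = new PipedOutputStream(in);
        in.close(); // the consumer finishes early
        try {
            out.write(1); // the producer's next write
            return false;
        } catch (IOException e) {
            return true; // the write is rejected because the pipe is closed
        } finally {
            out.close();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(writeFailsAfterReaderClose()); // prints true
    }
}
```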