From Apache Jena version 2.7.x onwards, TDB is installed as part of a single integrated Jena
package. There is no longer a need to install a separate TDB package to run the TDB command line
tools, or to use TDB in your Java programs. See the downloads page for details on getting the latest Jena release.
The directory bin/ contains shell scripts to run the commands
from the command line. The scripts are bash scripts which should work
on Linux systems, Windows systems using Cygwin and
Mac/OS systems. The directory bat/ contains Windows batch files which
provide the same functionality for Windows systems that are not using Cygwin.
Set the environment variable JENAROOT to the root of the Jena installation.
Then set the PATH to include the bin/ directory.
This can be done in
.bashrc, or its equivalent on Mac OS/X, to ensure that the environment
variables are always available.
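A minimal sketch of the bash setup described above. The install location is a hypothetical placeholder, so substitute the directory where Jena was actually unpacked.

```shell
# Hypothetical install location; replace with wherever Jena was unpacked.
export JENAROOT="$HOME/apache-jena"
# Put the Jena scripts in bin/ ahead of the other entries on the search path.
export PATH="$JENAROOT/bin:$PATH"
```

Adding these two lines to .bashrc makes the settings available in every new shell.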
Setting environment variables in Windows is slightly involved. You can set them each time you start a command prompt:
SET JENAROOT=\Users\somebody\dev\apache-jena
SET PATH=%PATH%;%JENAROOT%\bin
or you can follow this guide or one like it to set the environment variables so that they are available every time you launch the command prompt.
Each command then has command-specific arguments described below.
All commands support --help to give details of named and positional arguments.
There are two equivalent forms of named argument syntax:
--arg=val
--arg val
TDB has a number of configuration options which can be set from the command line using:
Using tdb: is really shorthand for the URI prefix http://jena.hpl.hp.com/TDB# so the full URI form is
TDB commands use an assembler description for the persistent store
or a direct reference to the directory with the index and node files:
The assembler description follows the form for a dataset given in the TDB assembler description page.
If neither assembler file nor location is given,
Bulk loader and index builder. Performs bulk load operations more efficiently than simply reading RDF into a TDB-backed model.
Bulk loader and index builder. Faster than
tdbloader but only works
on Linux and Mac OS/X since it relies on some Unix system utilities.
This bulk loader can only be used to create a database. It may
overwrite existing data. It accepts the
--loc argument and a
list of files to load e.g.
> tdbloader2 --loc /path/for/database input1.ttl input2.ttl ...
There are various other advanced options available to customise the
behaviour of the bulk loader. Run with
--help to see the full usage
It is possible to do builds in phases by using the
tdbloader2index scripts separately though this should only be used
by advanced users. You can also do this by passing the
argument to the
tdbloader2 script and specifying
The indexing phase of the build uses the sort utility to prepare the raw
data for indexing. This can require large amounts of disk space,
and the scripts automatically check the available space and warn or abort
if it looks to be, or is, insufficient.
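To illustrate the kind of check involved, the sketch below tests free space on a directory before a large sort. It is not the logic used by the tdbloader2 scripts: the function name has_free_space, the 1 KB threshold and the choice of the temporary directory are all assumptions made for the example.

```shell
# Sketch only: not the check performed by the tdbloader2 scripts.
# has_free_space DIR REQUIRED_KB
# Succeeds if the filesystem holding DIR has at least REQUIRED_KB kilobytes free.
has_free_space() {
  dir=$1
  required_kb=$2
  # With -P, df prints one header line and one data line;
  # the 4th field of the data line is the available space in KB.
  avail_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
  [ "$avail_kb" -ge "$required_kb" ]
}

# Hypothetical usage: require at least 1 KB free in the temporary directory.
if has_free_space "${TMPDIR:-/tmp}" 1; then
  echo "enough free space for sorting"
else
  echo "warning: free space looks insufficient" >&2
fi
```

A real pre-flight check would derive the required size from the input files rather than use a fixed number.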
If you are building a large dataset (i.e. gigabytes of input data) you may wish to have the PipeViewer tool installed on your system as this will provide extra progress information during the indexing phase of the build.
Invoke a SPARQL query on a store. Use
--time for timing
information. The store is attached on each run of this command so
timing includes some overhead not present in a running system.
Details about query execution can be obtained; see the notes on the TDB Optimizer.
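As an illustration, a timed query might be run as in the sketch below. The database path and the query are placeholders, and it is assumed that the query string can be passed directly on the command line. The guard only reports a missing script rather than failing.

```shell
# Placeholder store location; point this at a real TDB database directory.
DB=/path/for/database
# Illustrative query: count the triples in the default graph.
QUERY='SELECT (count(*) AS ?n) WHERE { ?s ?p ?o }'

if command -v tdbquery >/dev/null 2>&1; then
  # --time prints timing information, including the cost of attaching the store.
  tdbquery --loc "$DB" --time "$QUERY"
else
  echo "tdbquery not found: check JENAROOT and PATH" >&2
fi
```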
Dump the store in N-Quads format.
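A hedged example of taking a dump, assuming the command is tdbdump and that it writes the N-Quads to standard output. The database path and the output file name are placeholders.

```shell
# Placeholder store location and a hypothetical output file name.
DB=/path/for/database
OUT=dump.nq

if command -v tdbdump >/dev/null 2>&1; then
  # Redirect the N-Quads written to standard output into a file.
  tdbdump --loc "$DB" > "$OUT"
else
  echo "tdbdump not found: check JENAROOT and PATH" >&2
fi
```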
Produce statistics for the dataset. See the TDB Optimizer description.