jOOQ: Stream processing PostgreSQL query results

· 10 min read
Pablo González Doval   postgresql, jooq, java

When Java 8 came out, it proved a great effort to cope with the great demand of stream processing. Streams and lambdas allow you to process the data in a list asynchronously, so that you don’t have to wait until the last element of step 1 is calculated in order to start step 2 (apart from the evident code cleanness).

With great amounts of data, this reduces a very painful bottleneck, so why wouldn’t we want this huge data-related improvement in our ORMs? Of course, the query planner needs to do all the calculations, but as soon as it starts to pop data out, we want to start processing them.

This is one of the advantages jOOQ provides. So what’s jOOQ? In a world in which a great share of the ORM market is taken by magical annotation-based solutions, jOOQ takes a totally different approach: it provides a strongly typed way to access your data by autogenerating the necessary classes.

It’s not among my intentions to decide which approach is best, but in this post I’ll try to convince you that, although it’s not the most famous solution, jOOQ can be a face-to-face contender with any other ORM in the world.

Understanding the basics

The most important part of jOOQ’s functioning is that it autogenerates classes that represent your database objects. This way, it makes it very easy for you to write expressive SQL-like java code. This task of generating classes is a separate process, that is not at a code-generation stage. As such, it should be run in a Maven (mvn) task.

So, first of all we need to import jOOQ. Typically, something like this is enough:

<dependency>
   <groupId>org.jooq</groupId>
   <artifactId>jooq</artifactId>
   <version>${jooq.version}</version>
</dependency>

And then, create the task for code generation. Here we’ll post a simple example based on our own experience, but jOOQ’s doc has much more information on the topic, for example:

But as a simple example, this one may be good:

<plugin>
   <groupId>org.jooq</groupId>
   <artifactId>jooq-codegen-maven</artifactId>
   <version>3.11.4</version>

   <dependencies>
       <dependency>
           <groupId>org.postgresql</groupId>
           <artifactId>postgresql</artifactId>
           <version>42.2.4</version>
       </dependency>

       <dependency>
           <groupId>org.jooq</groupId>
           <artifactId>jooq</artifactId>
           <version>3.11.4</version>
       </dependency>
   </dependencies>

   <executions>
       <execution>
           <id>jooq-codegen-offline</id>
           <phase>generate-sources</phase>
           <goals>
               <goal>generate</goal>
           </goals>
           <configuration>
               <jdbc>
                   <url>jdbc:postgresql://url_to_your_database</url>
               </jdbc>
               <generator>
                   <database>
                       <schemata>
                           <schema>
                               <inputSchema>params</inputSchema>
                           </schema>
                       </schemata>
                   </database>
                   <target>
                       <packageName>com.ongres.persistence.jooq.generated.offline</packageName>
                   </target>
               </generator>
           </configuration>
       </execution>
   </executions>

   <configuration>

       <jdbc>
           <driver>org.postgresql.Driver</driver>
           <user>your username</user>
           <password>your password</password>
       </jdbc>

       <generator>
           <database>
               <name>org.jooq.util.postgres.PostgresDatabase</name>
               <includes>.*</includes>

               <forcedTypes>
                   <forcedType>
                       <name>UUID</name>
                       <expression>.*</expression>
                       <types>(?i:UUID)</types>
                   </forcedType>
               </forcedTypes>
           </database>
           <target>
               <directory>target/generated-sources/jooq</directory>
           </target>
       </generator>
   </configuration>
</plugin>

As you can see, in this example we specify:

Also, we can see how easy it is to tell jOOQ to map all fields that have UUID as database type with the Java class UUID. What happens if you add new tables or modify the DDL of the existing ones? Nothing, just run the maven task to reimport all the classes, and you’re done with the problem.

Tables and Records

Among all the classes that jOOQ generates, we’ll focus on 2 types:

Basic example

Let’s start with a first example, taken from the source code used for the development of postgresqlCO.NF, a tool that makes PostgreSQL configuration easy for human beings.

In this case, we wanted to obtain the info of an exact postgresql.conf parameter, in the given language. The query is simple enough to see the way a basic jOOQ query typically looks:

dslContext
   .selectFrom(ProcessedFields.PROCESSED_FIELDS)
   .where(ProcessedFields.PROCESSED_FIELDS.NAME.eq(parameter))
   .and(ProcessedFields.PROCESSED_FIELDS.LANG.eq(language.getAbbrev()))
   .fetchOne());

The first thing we can see is that everything is typed. Needless to say, the less queries or parts of a query that you’re forced to write as a String, the better. It becomes more battle proof, more DB-independent, easier to migrate… Never trust Strings in this context.

In jOOQ, fields in the select clause are typed, tables in the from clause are typed, conditions in the where clause are typed… or more accurately speaking, all of them can be typed (you can also write them as Strings, if you want).

Also, there are several methods to return the result. The one used here, fetchOne(), is good for returning a single result, but we’ll see more in other examples. Of course, in this example the fetchOne() method returns an instance on ProcessedFieldsRecord, as it is the record associated with the queried table.

DslContext

In order to run a query, we need an instance of DslContext. This class contains all the necessary information to access a database, and serves as the starting point for our queries (it’s the class that contains the select/delete/update/insertInto methods).

In case you need to connect alternatively to different databases, or to use different connections(read-only connections vs write connection), I’d recommend you to wrap them in a simple DslContextSupplier.

For example, in a part of postgresqlCO.NF, we have two databases, one in a context we call online and another one in a context called offline. Well, the solution looks pretty simple:

public enum Context {
 OFFLINE,
 ONLINE
}
public interface DslContextSupplier {
 <T> T executeFunction(Context context, Function<DSLContext, T> function);
 void executeConsumer(Context context, Consumer<DSLContext> consumer);
 <T, IT> T executeFunctionWithStream(Context context, Function<DSLContext, Stream<IT>> producer, Function<Stream<IT>, T> consumer);
 <IT> void executeConsumerWithStream(Context context, Function<DSLContext, Stream<IT>> producer, Consumer<Stream<IT>> consumer);
}

Taking care of closing resources in jOOQ is as important as it always is in DB access. That’s the reason why we have so many methods in our DslContextSupplier:

Advanced queries: jOOQ magic

One of the goals in postgresqlCO.NF is to allow you to upload your config files, update parameter values and comments as many times as you want, and download them in whatever format you want. In order to do this, we need to calculate which is the last version of each parameter. Yes, we’re going to use typed window functions in jOOQ. More concretely, we’ll use a row_number() and then filter those with row_number=1. The SQL code would be:

SELECT [all the fields but row_number]
FROM (
    SELECT *,
row_number() OVER(PARTITION BY param ORDER BY ts) row_number
FROM postgresql_conf_edit
WHERE row_number=1
AND configuration = ?
) a;

The whole point of this article is: can we do THAT in Java with jOOQ? Yes, we can:

public <T> T getAggregatedFileChanges(UUID configurationId,
   Function<Stream<PostgresqlConfFileChange>, T> callback) {

 Function<DSLContext, Stream<Record>> producer = (DSLContext dslContext) -> {
   List<SelectField<?>> nestedFields =
       new ArrayList<>(Arrays.asList(PostgresqlconfEdit.POSTGRESQLCONF_EDIT.fields()));
   Field<Integer> rowNumber = DSL.rowNumber().over()
       .partitionBy(PostgresqlconfEdit.POSTGRESQLCONF_EDIT.PARAM)
       .orderBy(PostgresqlconfEdit.POSTGRESQLCONF_EDIT.TS.desc())
       .as("row_number");
   nestedFields.add(rowNumber);

   return dslContext
       .select()
       .from(dslContext.select(nestedFields)
           .from(PostgresqlconfEdit.POSTGRESQLCONF_EDIT)
           .where(PostgresqlconfEdit.POSTGRESQLCONF_EDIT.CONFIGURATION.eq(configurationId))
       )
       .where(rowNumber.eq(1))
       .fetchStream();
 };

 Function<Stream<Record>, T> consumer = stream -> callback.apply(stream
     .map(record -> new PostgresqlConfFileChangeImpl(
         record.into(PostgresqlconfEdit.POSTGRESQLCONF_EDIT)))
 );

 return supplier.executeFunctionWithStream(Context.ONLINE, producer, consumer);
}

In order to understand what this does, please keep in mind that the producer function creates the jOOQ-generated Stream<IntermediateType> in a try-with-resources block, and that the consumer turns it into whatever type T we want to return from the DAO method. As you see, the method receives a callback that is applied to the obtained data, in order to return this wildcard type T. That said, let’s get down to the great advantages jOOQ gives in this case:

Not only queries

You’re not constrained to querying, jOOQ also allows you to do any kind of DML (create, insert, update). Tip: don’t forget to call the easy-to-forget execute() method as the last step of your statement! Otherwise, no operation will be actually executed (by the way, in this example we needed the id of the inserted tuple, so we don’t have the need of running execute() but returning() and fetchOne():

RawFieldsRecord rawFieldsRecord = dslContext.insertInto(RawFields.RAW_FIELDS)
   .columns(RawFields.RAW_FIELDS.CATEGORY_ID,      RawFields.RAW_FIELDS.DEFAULT_VALUE,
       RawFields.RAW_FIELDS.CONTEXT, RawFields.RAW_FIELDS.ENUMVALS,
       RawFields.RAW_FIELDS.MAX_VAL, RawFields.RAW_FIELDS.MIN_VAL,
       RawFields.RAW_FIELDS.NAME, RawFields.RAW_FIELDS.NEEDS_RESTART,
       RawFields.RAW_FIELDS.UNIT, RawFields.RAW_FIELDS.VARTYPE,
       RawFields.RAW_FIELDS.VERSION)
   .values(categoriesRecord.getId(), pgSettingsRecord.getBootVal(),
       pgSettingsRecord.getContext(), pgSettingsRecord.getEnumvals(),
       pgSettingsRecord.getMaxVal(), pgSettingsRecord.getMinVal(),
       pgSettingsRecord.getName(), needsRestart, pgSettingsRecord.getUnit(),
       pgSettingsRecord.getVartype(), pgVersion.getDotted())
   .returning(RawFields.RAW_FIELDS.ID).fetchOne();

And that’s all for now, I hope this post helped you understand how jOOQ can make querying PostgreSQL from Java natural, easier and safer, while allowing you to create Stream of the results which you can process with regular Java8+ or reactive APIs. Could it be better?

comments powered by Disqus