HowTo? Series: JOLT - Part 1

Data comes in different shapes and sizes. Sometimes, you want to reshape the structure of this data into one that fits your needs. With JSON data, you can achieve this using JOLT. In this series, I'll explain what JOLT is and how it can help you when working with JSON data.

JOLT

JOLT stands for JsOn Language for Transform. It is a tool for JSON-to-JSON transformations: it lets you convert the overall structure of a JSON document into another structure that better suits your needs.

JOLT is a Java library at its core, but it is also readily available within other tools. My first encounter with JOLT was in Apache NiFi, a data ingestion and distribution system. NiFi allows us to design and manage data flows in which data can be routed, transformed and distributed. One of the many possible transformations within NiFi is a 'JoltTransform'.

How to use JOLT

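When working with JOLT, we use a base structure in JSON format that is composed of an 'operation' and a 'spec'. The 'operation' names the kind of transformation, and the 'spec' describes what that transformation should do.

Interestingly, this structure is an element of an array, which means we can chain multiple JOLT transformations in one go. The transformations are applied in order, each one working on the output of the previous one.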
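
As a minimal sketch of this base structure, the specification below chains two operations. The field names ('name', 'customer') are made up for illustration and are not taken from any particular dataset.

```json
[
  {
    "operation": "shift",
    "spec": {
      "name": "customer.name"
    }
  },
  {
    "operation": "sort"
  }
]
```

Given the input {"name": "Alice"}, the first entry moves the value to produce {"customer": {"name": "Alice"}}, and the second entry then sorts the keys of that result.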

In this part of the series I will focus on the different operations available within JOLT.

Operations

In general, five operations are used when defining a JOLT specification: shift, default, remove, sort and cardinality. Most of them let us navigate to a desired level of the document and apply a specific operation to the data at that level.

Shift

You can use the 'shift'-operation to move values within a JSON-document from one location to another. It allows you to navigate to a field or object and choose where to place its value in the desired output.

Some tools fall back to the 'shift'-operation when no operation is specified in the JOLT-specification. To avoid surprises, it is safer to always state the operation explicitly.
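
A shift could look like the sketch below. The keys of the spec navigate the input, and the string values give the dot-separated output path. The field names ('id', 'name', 'customer', 'fullName') are hypothetical.

```json
[
  {
    "operation": "shift",
    "spec": {
      "id": "customer.id",
      "name": "customer.fullName"
    }
  }
]
```

With the input {"id": 7, "name": "Alice"}, this produces {"customer": {"id": 7, "fullName": "Alice"}}. Input fields that are not mentioned in a shift spec do not appear in the output.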

[Figure: JOLT shift example]

Default

The 'default'-operation inserts new fields or objects with a default value if they do not already exist. If the field/object does exist, this operation has no effect on it.
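
As an illustration, the sketch below adds a 'status' field only when it is missing. The field name and its default value are assumptions chosen for the example.

```json
[
  {
    "operation": "default",
    "spec": {
      "status": "unknown"
    }
  }
]
```

The input {"name": "Alice"} becomes {"name": "Alice", "status": "unknown"}, while the input {"name": "Alice", "status": "active"} passes through unchanged.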

Remove

The 'remove'-operation removes a specific field or object. In the spec, you navigate to the field/object and assign it an empty string. If you don't assign the empty string, the operation will fail.
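
A small sketch of a removal, assuming a hypothetical 'password' field that should not be passed on:

```json
[
  {
    "operation": "remove",
    "spec": {
      "password": ""
    }
  }
]
```

The input {"user": "alice", "password": "secret"} becomes {"user": "alice"}.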

Sort

With the 'sort'-operation, you can sort all fields and objects within a JSON-document in alphabetical order. Important note: sorting only applies to the field- and object-names, not their values.

The 'sort'-operation is the only operation that does not need the definition of a spec. Defining the operation is enough.
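
Since no spec is needed, a complete sort specification can be as short as this:

```json
[
  {
    "operation": "sort"
  }
]
```

With the made-up input {"b": 1, "a": {"d": 2, "c": 3}}, the keys are sorted at every level, giving {"a": {"c": 3, "d": 2}, "b": 1}. The values themselves are left untouched.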

Cardinality

With the 'cardinality'-operation, you can turn a single field or object into a list and vice versa. In the spec, you mark a field as 'ONE' (a single value) or 'MANY' (a list).
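
A sketch of both directions, using the hypothetical fields 'tags' and 'owner':

```json
[
  {
    "operation": "cardinality",
    "spec": {
      "tags": "MANY",
      "owner": "ONE"
    }
  }
]
```

With the input {"tags": "blue", "owner": ["alice", "bob"]}, the single value of 'tags' is wrapped in a list and the 'owner' list is reduced to its first element, giving {"tags": ["blue"], "owner": "alice"}.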

Documentation/hands-on

Definitely have a look at the following webpage, where the different JOLT operations are described in more detail and with multiple examples:

https://intercom.help/godigibee/en/articles/4044359-transformer-getting-to-know-jolt

Even better would be some hands-on experience. The link underneath will send you to a playground that allows you to apply JOLT to any JSON input that you provide:

https://jolt-demo.appspot.com/#inception

A JOLT playground is also implemented within Apache NiFi.

Hopefully you've learned something new that might come in handy next time you are struggling with the structure of a JSON-document. In part 2 of this series, I'll talk more about the difference between LHS and RHS and we'll have a first look at the use of 'operators'.
