Big Industries Academy
HowTo? Series: JOLT - Part 1
Data comes in different shapes and sizes. Sometimes we want to reshape that data into a structure that better fits our needs. With JSON data, you can achieve this using JOLT. In this series, I'll explain what JOLT is and how it can help you when working with JSON data.
JOLT stands for JsOn Language for Transform. It is a tool for JSON-to-JSON transformations, where you rearrange the overall structure of a JSON document into another structure that better suits your needs.
JOLT is a Java library at its core, but it is also readily available within other tools. My first encounter with JOLT was in Apache NiFi, a data ingestion and distribution system. NiFi allows us to design and manage data flows where data can be routed, transformed and distributed. One of the many transformations available within NiFi is the 'JoltTransformJSON' processor.
How to use JOLT
When working with JOLT, we use a base structure in a JSON format that is composed of an 'operation' and a 'spec'.
Interestingly, this structure is an element of an array, which means we can chain multiple JOLT transformations in one go: they are applied in order, each one working on the output of the previous one.
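As a minimal sketch of that base structure, a specification that chains two operations could look like this. The field names 'name' and 'customer' are made up for illustration:

```json
[
  {
    "operation": "shift",
    "spec": {
      "name": "customer.name"
    }
  },
  {
    "operation": "sort"
  }
]
```

Given the input {"name": "Alice"}, the first entry moves the value to customer.name, which yields {"customer": {"name": "Alice"}}. The second entry then sorts the keys of that result.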
In this part of the series I will focus on the different operations available within JOLT.
JOLT offers five core operations for defining a specification: 'shift', 'default', 'remove', 'sort' and 'cardinality'. Each of them lets us navigate to a desired level in the document and apply the operation to the data at that level.
You can use the 'shift'-operation to move values within a JSON-document from one location to another. It allows us to navigate to a field or object and choose where to place this value in the desired output.
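Within a chain, every entry states its operation explicitly. Some tools, such as the JOLT processor in NiFi, also let you select a single operation (for example 'shift') in the configuration, in which case you supply only the spec.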
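To illustrate how 'shift' moves values, here is a small sketch. The input fields 'id' and 'name' and the output paths are hypothetical:

```json
[
  {
    "operation": "shift",
    "spec": {
      "id": "customer.id",
      "name": "customer.fullName"
    }
  }
]
```

With the input {"id": 1, "name": "Alice", "age": 30}, this produces {"customer": {"id": 1, "fullName": "Alice"}}. The left side of each spec entry matches a field in the input, and the right side is the path where its value is written in the output. Note that 'age' is not mentioned in the spec, so it does not appear in the output.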
The 'default'-operation inserts new fields or objects with a default value if they do not already exist. If the field or object does exist, this operation has no effect on it.
The 'remove'-operation removes a specific field or object. In the spec, you navigate to the field or object and assign it an empty string as its value; without that empty string, the operation will fail.
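A short sketch of such a spec, assuming an input that may or may not contain the hypothetical fields 'name' and 'country':

```json
[
  {
    "operation": "default",
    "spec": {
      "name": "unknown",
      "country": "BE"
    }
  }
]
```

With the input {"name": "Alice"}, the result is {"name": "Alice", "country": "BE"}. The existing 'name' is left untouched, while the missing 'country' is added with its default value.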
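As an example of the 'remove'-operation, the sketch below drops a hypothetical 'password' field:

```json
[
  {
    "operation": "remove",
    "spec": {
      "password": ""
    }
  }
]
```

With the input {"id": 1, "name": "Alice", "password": "secret"}, the output is {"id": 1, "name": "Alice"}. To remove a nested field, the spec mirrors the nesting of the input, for example {"account": {"password": ""}}.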
With the 'sort'-operation, you can sort all fields and objects within a JSON-document in alphabetical order. Important note: sorting only applies to the field- and object-names, not their values.
The 'sort'-operation is the only operation that does not need the definition of a spec. Defining the operation is enough.
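Because no spec is needed, a complete 'sort' specification can be as short as this:

```json
[
  {
    "operation": "sort"
  }
]
```

Applied to the illustrative input {"b": 1, "a": {"d": 2, "c": 3}}, it returns {"a": {"c": 3, "d": 2}, "b": 1}. The names are sorted at every level, while the values stay attached to their original names.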
With the 'cardinality'-operation, you can change whether a field holds a single value or a list: a single value can be wrapped in a list ('MANY'), and a list can be reduced to its first element ('ONE').
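A small sketch of a 'cardinality' spec, using the hypothetical fields 'tags' and 'owner':

```json
[
  {
    "operation": "cardinality",
    "spec": {
      "tags": "MANY",
      "owner": "ONE"
    }
  }
]
```

With the input {"tags": "big-data", "owner": ["Alice", "Bob"]}, the output is {"tags": ["big-data"], "owner": "Alice"}. 'MANY' wraps the single value in a list, and 'ONE' keeps only the first element of the list. This is handy when a source sometimes delivers a single value and sometimes a list for the same field.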
Definitely have a look at the next webpage where the different JOLT operations are described in more detail and with multiple examples:
Even better would be some hands-on experience. The link underneath will send you to a playground that allows you to apply JOLT to any JSON input that you provide:
A JOLT playground is also implemented within Apache NiFi:
Hopefully you've learned something new that might come in handy the next time you are struggling with the structure of a JSON document. In part 2 of this series, I'll talk more about the difference between the left-hand side (LHS) and the right-hand side (RHS) of a spec, and we'll have a first look at the use of 'operators'.
Morad Aoulad Abdenabi
Morad is known as an enthusiastic and reliable person and graduated as a professional Bachelor in Electronics-ICT. He holds a big interest in Big Data which only grew even more during his internship at Big Industries. Developing himself and learning new skills will always be of importance to him. Outside IT, he likes to travel and fly his drone to capture cool sceneries.