
Database Data Transformation for Data Engineers


Advanced techniques for beginners

AI generated image using Kandinsky

In this story, I would like to raise a discussion on how we transform data. Whether it’s a database, data warehouse or reporting solution, we run data transformations based on data models, but how do we organise them? I would like to talk about the modern data transformation tools you use. We’ll touch on some nuances of the modular approach, scheduling and data transformation tests. Later in this article, I’ll provide a sample application to run data modelling tasks with data lineage and self-documenting features. I’m very keen to know what you think of it.

I have seen dozens of different ways to run data transformations. Throughout my more than fifteen-year career in big data and analytics, I have built data pipelines with different design patterns, and I’m sure there are more. That’s why I like the technology world so much. The multitude of possibilities it offers is simply amazing.

Which operating system do you use for your data warehouse?

Modern data transformation tools

Modern data transformation tools, also known as data modelling tools or data warehouse (DWH) operating systems, were designed to simplify the SQL data manipulation tasks that create datasets, views and tables. They often use a SQL-like dialect to run any data definition (DDL) and data manipulation (DML) statements we might need, including data transformation tests and custom dataset creation in development mode.
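To make the DDL, DML and test vocabulary concrete, here is a minimal sketch in Python using the standard-library sqlite3 module as a stand-in for a real data warehouse. The table name raw_orders, the view name paid_orders and the helper failing_rows are all hypothetical. The tests follow the convention dbt uses for its data tests, where a test is a query that selects the rows violating a rule and passes when it returns nothing.

```python
import sqlite3

# Each test is a query that returns the rows breaking a rule;
# an empty result means the test passes.
TESTS = {
    "order_id is not null": (
        "SELECT * FROM paid_orders WHERE order_id IS NULL"
    ),
    "order_id is unique": (
        "SELECT order_id, COUNT(*) FROM paid_orders "
        "GROUP BY order_id HAVING COUNT(*) > 1"
    ),
}


def build_model(conn):
    """Run the DDL and DML that produce the (hypothetical) paid_orders view."""
    # DDL: define the source table.
    conn.execute(
        "CREATE TABLE raw_orders (order_id INTEGER, amount REAL, status TEXT)"
    )
    # DML: load a few sample rows.
    conn.executemany(
        "INSERT INTO raw_orders VALUES (?, ?, ?)",
        [(1, 10.0, "paid"), (2, 25.5, "paid"), (3, 7.0, "refunded")],
    )
    # DDL: the transformation itself, expressed as a view.
    conn.execute(
        "CREATE VIEW paid_orders AS "
        "SELECT order_id, amount FROM raw_orders WHERE status = 'paid'"
    )


def failing_rows(conn, test_sql):
    """Return the rows that violate a test query."""
    return conn.execute(test_sql).fetchall()


conn = sqlite3.connect(":memory:")
build_model(conn)

results = {name: failing_rows(conn, sql) for name, sql in TESTS.items()}
for name, bad in results.items():
    print(name, "passed" if not bad else f"failed: {bad}")
```

A real tool adds dependency resolution, environments and documentation on top of this, but the core loop of building a model and then querying for rule violations stays the same.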

The abundance of ANSI-SQL data warehouse solutions on the market makes these tools extremely useful. For example, consider the list of dbt adapters below. All the market leaders are present there.

Creating a new connection using dbt. Image by author.

dbt stands for data build tool, and it is essentially a scheduler application that can be run locally or on a server to run data transformation tasks. For instance, consider the simple model below. It creates a view in our database, and we can materialise it, let’s say, every 5 minutes to keep the data up to date for analytics. At the top of the file we have…
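As a rough illustration of the difference between a view and a periodically materialised table (this is not the model referred to above, and it does not use dbt itself), here is a sketch in Python with the standard-library sqlite3 module. The names raw_events, user_events_v, user_events_t and both helper functions are assumptions made for the example.

```python
import sqlite3

# The transformation logic: one SELECT statement, reused for both outputs.
MODEL_SQL = (
    "SELECT user_id, COUNT(*) AS events FROM raw_events GROUP BY user_id"
)


def create_view(conn, name, select_sql):
    """(Re)create a view: the query runs every time the view is read."""
    conn.execute(f"DROP VIEW IF EXISTS {name}")
    conn.execute(f"CREATE VIEW {name} AS {select_sql}")


def materialise_table(conn, name, select_sql):
    """(Re)build a table: the result is stored until the next rebuild."""
    conn.execute(f"DROP TABLE IF EXISTS {name}")
    conn.execute(f"CREATE TABLE {name} AS {select_sql}")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_events (user_id INTEGER, event TEXT)")
conn.executemany(
    "INSERT INTO raw_events VALUES (?, ?)",
    [(1, "click"), (1, "view"), (2, "click")],
)

create_view(conn, "user_events_v", MODEL_SQL)
materialise_table(conn, "user_events_t", MODEL_SQL)

# New data arrives after the model was built.
conn.execute("INSERT INTO raw_events VALUES (2, 'view')")

# The view reflects the new row; the table is stale until rebuilt.
view_rows = conn.execute(
    "SELECT * FROM user_events_v ORDER BY user_id"
).fetchall()
table_rows = conn.execute(
    "SELECT * FROM user_events_t ORDER BY user_id"
).fetchall()

# This rebuild is the step a scheduler would trigger, e.g. every 5 minutes.
materialise_table(conn, "user_events_t", MODEL_SQL)
refreshed_rows = conn.execute(
    "SELECT * FROM user_events_t ORDER BY user_id"
).fetchall()
```

The trade-off is the usual one: a view is always current but pays the query cost on every read, while a materialised table is cheap to read but only as fresh as its last scheduled rebuild.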
