data engineering

Anatomy of a Parquet File

In recent years, Parquet has become a standard format for data storage in Big Data ecosystems. Its column-oriented format offers several benefits: faster query execution when only a subset of columns is being processed, quick calculation...

7 Powerful DBeaver Tips and Tricks to Improve Your SQL Workflow

DBeaver is probably the most powerful open-source SQL IDE, but it has several features people don’t know about. In this post, I'll share several features that speed up your workflow, with zero...

Practical SQL Puzzles That Will Level Up Your Skill

There are some SQL patterns that, once you know them, you start seeing everywhere. The solutions to the puzzles that I'll show you today are actually quite simple SQL queries, but...

Don’t Let Conda Eat Your Hard Drive

If you’re an Anaconda user, you know that conda environments make it easier to manage package dependencies, avoid compatibility conflicts, and share your projects with others. Unfortunately, they can also take over your computer’s hard drive. I write plenty of...

Why Data Scientists Should Care about Containers — and Stand Out with This Knowledge

“I train models, analyze data and create dashboards — why should I care about Containers?” Many people who are new to the world of data science ask themselves this question. But imagine you will...

ML Feature Management: A Practical Evolution Guide

In the world of machine learning, we obsess over model architectures, training pipelines, and hyper-parameter tuning, yet often overlook a fundamental aspect: how our features live and breathe throughout their lifecycle. From in-memory calculations...

Data-Centric AI: The Importance of Systematically Engineering Training Data

Over the past decade, Artificial Intelligence (AI) has made significant advancements, resulting in transformative changes across various industries, including healthcare and finance. Traditionally, AI research and development have focused on refining models, enhancing algorithms,...
