Data Engineering — ORM and ODM with Python


Photo by David Clode on Unsplash

Manipulate database data leveraging an object-oriented programming paradigm

When working on data science projects, one fundamental pipeline to establish is the one for data collection. Real-world Machine Learning differs from Kaggle-like problems mainly because the data isn't static. We want to scrape websites, gather data from APIs, and so forth. This way of collecting data might look chaotic, and it is! That's why we want to structure our code following best practices, to bring some order to all this mess.

Once you have identified the sources from which you need to gather your data, you will want to collect the data in a structured way so that you can store it in your database. For instance, you might decide that, in order to train your LLM, what you need are data sources that contain 3 fields: writer, content, and link.
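To make the three-field structure concrete, here is a minimal sketch of such a record as a plain Python dataclass. The class name `Document` and the sample values are assumptions made for illustration; only the field names writer, content, and link come from the text.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Document:
    """Hypothetical record holding the three fields named in the text."""
    writer: str
    content: str
    link: str


# Sample values are made up for illustration.
doc = Document(
    writer="Jane Doe",
    content="Some scraped text.",
    link="https://example.com/post",
)

# A plain dict is a convenient shape to hand to a database layer.
record = asdict(doc)
```

Fixing the shape of a record up front like this means every source, whatever its origin, is normalized to the same fields before it reaches storage.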

What you would normally do is download the data and then write SQL queries to store it in, and retrieve it from, your database. More generally, you need to implement all the queries required to perform CRUD operations. CRUD stands for create, read, update, and delete. These are the 4 basic functions of persistent storage.
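As a rough illustration of the hand-written approach described above, the four CRUD operations could look like this with Python's built-in sqlite3 module. This is only a sketch: the table name `documents`, the helper function names, and the sample values are assumptions, and an in-memory SQLite database stands in for whatever database you actually use.

```python
import sqlite3


def create_table(conn: sqlite3.Connection) -> None:
    # Hypothetical schema with the three fields from the text.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS documents ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "writer TEXT NOT NULL, "
        "content TEXT NOT NULL, "
        "link TEXT NOT NULL)"
    )


def create_document(conn, writer, content, link):
    # Create: insert one row and return its generated id.
    cur = conn.execute(
        "INSERT INTO documents (writer, content, link) VALUES (?, ?, ?)",
        (writer, content, link),
    )
    conn.commit()
    return cur.lastrowid


def read_document(conn, doc_id):
    # Read: fetch one row as a tuple, or None if it does not exist.
    return conn.execute(
        "SELECT id, writer, content, link FROM documents WHERE id = ?",
        (doc_id,),
    ).fetchone()


def update_content(conn, doc_id, content):
    # Update: change the content and return the number of rows affected.
    cur = conn.execute(
        "UPDATE documents SET content = ? WHERE id = ?", (content, doc_id)
    )
    conn.commit()
    return cur.rowcount


def delete_document(conn, doc_id):
    # Delete: remove the row and return the number of rows affected.
    cur = conn.execute("DELETE FROM documents WHERE id = ?", (doc_id,))
    conn.commit()
    return cur.rowcount


# Usage with an in-memory database and made-up sample values.
conn = sqlite3.connect(":memory:")
create_table(conn)
doc_id = create_document(conn, "Jane Doe", "First draft.", "https://example.com/post")
update_content(conn, doc_id, "Second draft.")
row = read_document(conn, doc_id)
deleted = delete_document(conn, doc_id)
```

Even for a single table, each operation needs its own query string and parameter handling, and that boilerplate grows with every new kind of record you store.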
