admin

May 2, 2023

How to Build Popularity-Based Recommenders with Polars
Initial Thoughts
Most Popular Across All Customers
Most Popular Per Customer
Conclusion

Basic recommenders that are easy to understand and implement, as well as fast to train

Recommender systems are algorithms designed to give users recommendations based on their past behavior, preferences, and interactions. Having become integral to numerous industries, including e-commerce, entertainment, and advertising, recommender systems improve user experience, increase customer retention, and drive sales.

While various advanced recommender systems exist, today I want to show you one of the most straightforward, yet often difficult to beat, recommenders: the popularity-based recommender. It is an excellent baseline that you should always try out alongside a more advanced model, such as matrix factorization.

We’ll create two different flavors of popularity-based recommenders using polars in this article. Don’t worry if you have not used polars, the fast pandas alternative, before; this article is a great place to learn it along the way. Let’s start!

Initial Thoughts

Popularity-based recommenders work by suggesting the most frequently purchased products to customers. This vague idea can be turned into at least two concrete implementations:

Check which articles are bought most often across all customers. Recommend these articles to every customer.
Check which articles are bought most often per customer. Recommend these per-customer articles to their corresponding customer.
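
To make the two flavors tangible, here is a minimal sketch in plain Python on a hand-made toy list of purchases. The data and the names `purchases`, `most_popular_overall`, and `most_popular_per_customer` are hypothetical and serve only this illustration; this is not the polars implementation the article builds.

```python
from collections import Counter, defaultdict

# Toy (customer_id, article_id) purchases, made up for illustration only.
purchases = [
    (0, "A"), (0, "A"), (0, "B"),
    (1, "B"), (1, "C"), (1, "C"),
    (2, "A"), (2, "A"), (2, "C"),
]


def most_popular_overall(purchases, k):
    """Flavor 1: the k articles bought most often across all customers."""
    counts = Counter(article for _, article in purchases)
    return [article for article, _ in counts.most_common(k)]


def most_popular_per_customer(purchases, k):
    """Flavor 2: for each customer, the k articles that customer bought most often."""
    per_customer = defaultdict(Counter)
    for customer, article in purchases:
        per_customer[customer][article] += 1
    return {
        customer: [article for article, _ in counts.most_common(k)]
        for customer, counts in per_customer.items()
    }


print(most_popular_overall(purchases, k=2))
print(most_popular_per_customer(purchases, k=1))
```

The first function ignores the customer entirely, while the second keeps one counter per customer. That is the only structural difference between the two flavors.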

We’ll now show how to implement these concretely using our own custom-created dataset.

If you want to follow along with a real-life dataset, the H&M Personalized Fashion Recommendations challenge on Kaggle provides you with an excellent example. Due to copyright reasons, I will not use this lovely dataset for this article.

The Data

First, we’ll create our own dataset. Make sure to install polars if you haven’t done so already:

pip install polars

Then, let us create random data consisting of (customer_id, article_id) pairs that you should interpret as “The customer with this ID bought the article with that ID.” We’ll use 1,000,000 customers who can purchase 50,000 products.

import numpy as np
from tqdm import tqdm  # progress bar for the customer loop

np.random.seed(0)  # make the random data reproducible

N_CUSTOMERS = 1_000_000
N_PRODUCTS = 50_000
N_PURCHASES_MEAN = 100  # customers buy 100 articles on average

with open("transactions.csv", "w") as file:
    file.write("customer_id,article_id\n")  # header
    for customer_id in tqdm(range(N_CUSTOMERS)):
        # number of purchases of this customer, Poisson-distributed
        n_purchases = np.random.poisson(lam=N_PURCHASES_MEAN)
        # article IDs drawn uniformly at random (repeats are possible)
        articles = np.random.randint(low=0, high=N_PRODUCTS, size=n_purchases)
        for article_id in articles:
            file.write(f"{customer_id},{article_id}\n")  # transaction as a row

This medium-sized dataset has about 100 million rows (1,000,000 customers with 100 purchases each on average), an amount you could find in a business context.
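
As a rough sanity check of the dataset size, the expected number of rows is simply N_CUSTOMERS times N_PURCHASES_MEAN. The sketch below computes that figure and repeats the Poisson sampling for a scaled-down number of customers. The constant `N_CUSTOMERS_SMALL` is an assumption made to keep the check fast. It uses NumPy's `default_rng` rather than the seeded global generator above, so the draws differ from those in transactions.csv.

```python
import numpy as np

N_CUSTOMERS = 1_000_000
N_PURCHASES_MEAN = 100

# Expected number of data rows in the full transactions.csv (header excluded).
expected_rows_full = N_CUSTOMERS * N_PURCHASES_MEAN
print(expected_rows_full)

# Scaled-down simulation (assumption: 1,000 customers suffice for a rough check).
N_CUSTOMERS_SMALL = 1_000
rng = np.random.default_rng(0)
purchases_per_customer = rng.poisson(lam=N_PURCHASES_MEAN, size=N_CUSTOMERS_SMALL)

simulated_rows = int(purchases_per_customer.sum())
expected_rows_small = N_CUSTOMERS_SMALL * N_PURCHASES_MEAN
print(simulated_rows, expected_rows_small)
```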

The Task

We now want to build recommender systems that scan this dataset in order to recommend popular items in some sense. We’ll shed light on two ways to interpret this:

most popular across all customers
most popular per customer

Our recommenders should recommend ten articles to each customer.

We won’t assess the quality of the recommenders here. Drop me a message if you are interested in this topic, though, because it’s worth having a separate article about this.

Most Popular Across All Customers

In this recommender, we don’t even care who bought the articles; all the information we need is in the article_id column alone.

At a high level, it works like this:

Load the data.
Count how often each article appears in the article_id column.
Return the ten most frequent products as the recommendation for every customer.