Constructing a Comment Toxicity Ranker Using Hugging Face’s Transformer Models

admin

August 7, 2023

Catching up on NLP and LLM (Part I)

As a data scientist, I have never had the chance to properly explore the latest progress in Natural Language Processing. With the summer and the new boom of Large Language Models since the beginning of the year, I decided it was time to dive deep into the field and embark on some mini-projects. After all, there is no better way to learn than by practicing.

As my journey began, I realized it was difficult to find content that takes the reader by the hand and goes, one step at a time, towards a deep understanding of recent NLP models through concrete projects. This is how I decided to start this new series of articles.

Constructing a Comment Toxicity Ranker Using Hugging Face’s Transformer Models

In this first article, we are going to take a deep dive into building a comment toxicity ranker. This project is inspired by the “Jigsaw Rate Severity of Toxic Comments” competition, which took place on Kaggle last year.

The goal of the competition was to build a model able to determine which of two comments given as input is the more toxic.

To do so, the model assigns a score to each comment passed as input, and that score determines the comment's relative toxicity.
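To make the scoring idea concrete, here is a minimal sketch in plain Python. The `toy_score` function is a hypothetical stand-in for a trained model (it just counts a few flagged words), and `more_toxic` and `pairwise_agreement` are illustrative helper names, not part of the competition or of any library. The point is only to show how one score per comment is enough to rank a pair.

```python
from typing import Callable, List, Tuple


def toy_score(comment: str) -> float:
    """Hypothetical stand-in for a trained model: counts flagged words."""
    flagged = {"idiot", "stupid", "hate"}
    words = comment.lower().split()
    return float(sum(word.strip(".,!?") in flagged for word in words))


def more_toxic(a: str, b: str, score: Callable[[str], float] = toy_score) -> str:
    """Return whichever comment receives the higher score (ties go to `a`)."""
    return a if score(a) >= score(b) else b


def pairwise_agreement(
    pairs: List[Tuple[str, str]], score: Callable[[str], float]
) -> float:
    """Fraction of (less_toxic, more_toxic) pairs that the scorer orders correctly."""
    correct = sum(score(less) < score(more) for less, more in pairs)
    return correct / len(pairs)


# Each tuple is (less toxic, more toxic), as a human annotator might label it.
pairs = [
    ("Thanks for the help.", "You are an idiot."),
    ("I disagree with this edit.", "I hate you, stupid idiot!"),
]

print(more_toxic(*pairs[0]))                 # You are an idiot.
print(pairwise_agreement(pairs, toy_score))  # 1.0
```

Only the ordering of the scores matters here, not their absolute values, which is why the task is framed as ranking rather than as predicting a fixed toxicity label.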

What this article will cover

In this article, we are going to train our first NLP classifier using PyTorch and Hugging Face transformers. I won’t go into the details of how transformers work, but rather into practical details and implementations, and I will introduce some concepts that will be useful for the next articles of the series.

In particular, we will see:

How to download a model from the Hugging Face Hub
How to customize and use an encoder
How to build and train a PyTorch ranker from one of the Hugging Face models

This article is aimed directly at data scientists who would like to step up their game in NLP from a practical point of view. I won’t do much…
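As a rough preview of how these three steps fit together, here is a minimal sketch, not the implementation this series ends up with. Several choices are assumptions made for illustration: the class name `ToxicityRanker`, mean pooling over tokens, a single linear scoring head, and `nn.MarginRankingLoss` as the pairwise objective. To stay runnable offline it builds a tiny, randomly initialised `BertModel` from a `BertConfig`; with network access, `AutoModel.from_pretrained(...)` with a checkpoint name of your choice would load pretrained weights from the Hub instead. The token ids are random placeholders for real tokenizer output.

```python
import torch
from torch import nn
from transformers import BertConfig, BertModel


class ToxicityRanker(nn.Module):
    """Encoder plus a linear head that maps each comment to a single score."""

    def __init__(self, encoder: nn.Module, hidden_size: int):
        super().__init__()
        self.encoder = encoder
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, input_ids, attention_mask):
        hidden = self.encoder(
            input_ids=input_ids, attention_mask=attention_mask
        ).last_hidden_state
        # Mean-pool token embeddings, ignoring padded positions (an assumption).
        mask = attention_mask.unsqueeze(-1).to(hidden.dtype)
        pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1.0)
        return self.head(pooled).squeeze(-1)  # shape: (batch,)


# Tiny random encoder so the sketch runs without downloading anything.
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=64,
)
model = ToxicityRanker(BertModel(config), config.hidden_size)

# Placeholder batches: 4 pairs of comments, 16 token ids each.
batch_size, seq_len = 4, 16
mask = torch.ones(batch_size, seq_len, dtype=torch.long)
more_toxic_ids = torch.randint(1, config.vocab_size, (batch_size, seq_len))
less_toxic_ids = torch.randint(1, config.vocab_size, (batch_size, seq_len))

loss_fn = nn.MarginRankingLoss(margin=0.5)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

# One training step: push the more toxic comment's score above the other's.
score_more = model(more_toxic_ids, mask)
score_less = model(less_toxic_ids, mask)
target = torch.ones_like(score_more)  # +1 means "first input should rank higher"
loss = loss_fn(score_more, score_less, target)

optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Passing the encoder in as an argument keeps the ranker independent of any particular checkpoint, so swapping the toy encoder for a pretrained one does not change the rest of the code.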
