
Automated profile picture moderation at BlaBlaCar using Deep Learning
Profile picture moderation at BlaBlaCar
Designing an automatic pipeline
Focus on the BlaBlaCar-specific classification task
Key takeaways

A story of how we drastically reduced manual profile picture moderation at BlaBlaCar.

In 2022, profile pictures were uploaded to the BlaBlaCar carpooling platform at a rate of roughly 53k pictures per day.

These profile pictures play an important role in our carpooling community. They help carpoolers know who they travel with, and recognize one another at the meeting point.

Profile pictures help you know who you carpool with, and easily recognize them at the meeting point.

Our Community Relations team has been moderating pictures for years, since BlaBlaCar was created. They help maintain the highest level of quality on the platform by ensuring profile pictures respect a set of rules. Out of the 53k pictures moderated daily, roughly 25% are refused.

Profile picture moderation is done using the set of rules presented in this table.

The amount of work was significant (roughly 12 FTE), and led to substantial delays in the approval of profile pictures during peak times on the platform. For instance, the delay approached 3 days during the French strikes in December 2019, versus only a few hours usually.

The Data Services team proposed designing an automatic pipeline for profile picture moderation.

The pipeline takes a picture as input, and returns the following:

  • A decision: either “refuse”, “accept”, or “send to manual labeling”
  • A crop: in the case of an accepted picture, we crop the picture around the face.

We split the task into sub-tasks, based on whether they were generic or BlaBlaCar-specific:

  1. If there is not exactly one face, refuse the picture. Otherwise, crop the picture around the face coordinates. To perform this task, we used an open-source library called face-detection. It enabled us to locate faces on pictures with a one-liner of Python code, with an estimated accuracy greater than 99%.
  2. Decide whether the cropped picture should be accepted. This requires training a classifier to learn the moderation rules of BlaBlaCar. We will focus on this task in the next section.
  3. In some countries, uploading a celebrity's picture as a profile picture is a common practice. In such countries, we need to detect celebrity pictures and refuse them. This is a complex task: detecting faces is one thing, recognizing faces is another. Luckily, it is not specific to our use-case. For this task, we decided to use AWS Rekognition, a paid API thanks to which fewer than 5 lines of code were sufficient to detect whether a picture was that of a celebrity.
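
To make the control flow concrete, here is a minimal sketch of how the three sub-tasks could be chained. The function and parameter names (`moderate`, `classify`, `check_celebrities`) are ours, not BlaBlaCar's; the detector, classifier, and celebrity check are injected as callables standing in for the face-detection library, our classifier, and an AWS Rekognition-based check respectively.

```python
from typing import Callable, List, Optional, Tuple

# (top, right, bottom, left) bounding box; one common convention among
# face-detection libraries -- adapt to whichever library you use.
Box = Tuple[int, int, int, int]

def moderate(
    picture: object,
    detect_faces: Callable[[object], List[Box]],   # sub-task 1 (generic)
    classify: Callable[[object, Box], str],        # sub-task 2 (BlaBlaCar-specific)
    is_celebrity: Callable[[object], bool],        # sub-task 3 (generic, paid API)
    check_celebrities: bool = False,               # only enabled in some countries
) -> Tuple[str, Optional[Box]]:
    """Return a decision ("accept", "refuse", or "manual") and the face crop."""
    faces = detect_faces(picture)
    if len(faces) != 1:          # rule: exactly one face, otherwise refuse
        return "refuse", None
    face = faces[0]
    if check_celebrities and is_celebrity(picture):
        return "refuse", None
    # The classifier decides among "accept", "refuse", or "manual".
    return classify(picture, face), face
```

In production, `detect_faces` would wrap the face-detection one-liner, and `is_celebrity` would wrap a Rekognition call; here they are left abstract so the orchestration logic stands on its own.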

We built a classifier whose purpose is to predict whether a picture cropped around exactly one face should be accepted or not, given the moderation rules of BlaBlaCar.

Building the dataset

We collected the data from manual profile picture moderation logs, which gave us access to both the raw picture (before cropping) uploaded by the user, and the associated label (“accept” or “refuse”).

Then, to obtain our final dataset, we applied the same preprocessing that would be applied in production:

  1. Detect faces on all pictures, using the open-source Python library mentioned above;
  2. If there is not exactly one face, exclude the picture from the training dataset. Otherwise, crop the picture around the face coordinates.

Training the model

To train the model, we used a technique called Transfer Learning. According to Wikipedia, this technique consists of “storing knowledge gained while solving one problem and applying it to a different but related problem.”

Concretely, we followed these steps:

  1. We took an open-source deep learning model pre-trained to perform generic classification tasks on ImageNet, a large open-source image database.
  2. We froze all layers of the network, except two fully-connected layers at the very end. They constitute what we call the “specialization layer.” When fitting the model to our profile picture dataset, these layers adapt the general knowledge held by the pre-trained model to the specifics of the BlaBlaCar profile picture moderation task.
We used a pre-trained model and specialized it on BlaBlaCar pictures.

Using this approach, we reached a ROC AUC above 0.98, which basically means that the predictive power of the model was excellent.
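As a reminder of what this metric measures: ROC AUC is the probability that a randomly chosen accepted picture receives a higher score than a randomly chosen refused one (ties counting half). A minimal pure-Python sketch (the function name is ours):

```python
def roc_auc(labels, scores):
    """ROC AUC via the Mann-Whitney U view: the fraction of
    (positive, negative) pairs where the positive outscores the negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

A score of 0.5 means the model is no better than chance; 1.0 means it ranks every accepted picture above every refused one, so 0.98 is very close to perfect separation.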

Partial automation of profile picture moderation using the model

The distribution below shows that the model is able to separate the data very well:

The data is well-separated by our classifier.

However, in the middle of the distribution plot, one can observe an area with roughly the same number of refused and accepted pictures. We call this the “gray zone.” It is the region where making a decision based on the model's prediction is most difficult, since the prediction is close to neither 0% nor 100%. It is also the region where the model's prediction is the least reliable.

In order to maintain the highest quality standards on the platform, we decided to send the pictures in the gray zone to manual moderation, as the following graph illustrates:

The gray zone is sent to manual moderation. The rest is automatically accepted or refused.
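
This routing amounts to two thresholds on the model's score. A minimal sketch follows; the threshold values are illustrative, since the article does not disclose the actual gray-zone boundaries:

```python
def route(p_accept: float, refuse_below: float = 0.2, accept_above: float = 0.8) -> str:
    """Map the classifier's acceptance probability to a decision.
    Scores strictly between the two thresholds fall in the "gray zone"
    and are sent to manual moderation."""
    if p_accept >= accept_above:
        return "accept"
    if p_accept <= refuse_below:
        return "refuse"
    return "manual"
```

Widening the gray zone trades automation rate for precision: the more scores you route to humans, the fewer automatic mistakes you make.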

Moreover, we wanted to continuously monitor the model's quality. Thus, we decided to collect unbiased labels by sending a random sample of 5% of all pictures to manual moderation, regardless of the model's prediction. This enables us to continuously measure the model's performance, notably its precision and recall. The collected labels (either “accept” or “refuse”) also help retrain the model.
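
This unbiased sampling can be sketched as a final routing step applied before the model's decision takes effect. The function name is hypothetical; the key property is that the 5% audit sample is drawn independently of the prediction, so the labels it produces are unbiased:

```python
import random

def final_route(model_decision: str, rng: random.Random, audit_rate: float = 0.05) -> str:
    """Send a fixed random fraction of all pictures to manual moderation,
    regardless of the model's decision, to collect unbiased labels for
    monitoring precision/recall and for retraining."""
    if rng.random() < audit_rate:
        return "manual"
    return model_decision
```

Sampling before looking at the decision is what makes the resulting labels usable as an unbiased estimate of the model's live performance.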

Using this approach, 80% of pictures are moderated automatically, with high precision (vs. 93% when manually moderated). The remaining 20% are sent to manual moderation. The collected labels will also be used to retrain the model, and will help it get better at predicting complex cases that are currently close to the decision boundary.

The model has been in production for a year at the time of writing. It requires very little maintenance, since we do not observe drift on this task.

If you had to remember 3 things from this read, here is what they might be:

  1. To build an automatic machine learning or deep learning pipeline, split the work into generic and use-case-specific sub-tasks. If a task is generic, a package likely already exists that does exactly what you are looking for. This way you can focus your attention and energy on the tasks that are specific to your use-case.
  2. Leverage Transfer Learning, by using pre-trained models and adapting them to your specific tasks.
  3. Keep humans in the loop, because manual moderation provides the raw material for training the model: labels. By automating the easy decisions, you can focus human attention precisely where it has the most added value: on the hard cases, closer to the decision boundary, and on quality monitoring.


