Recent and improved content moderation tooling

admin

March 9, 2023

To help developers protect their applications against possible misuse, we’re introducing a faster and more accurate Moderation endpoint. This endpoint gives OpenAI API developers free access to GPT-based classifiers that detect undesired content, an instance of using AI systems to assist with human supervision of those systems. We have also released both a technical paper describing our methodology and the dataset used for evaluation.

When given a text input, the Moderation endpoint assesses whether the content is sexual, hateful, violent, or promotes self-harm, all of which our content policy prohibits. The endpoint has been trained to be quick, accurate, and to perform robustly across a range of applications. Importantly, this reduces the chances of products “saying” the wrong thing, even when deployed to users at scale. As a consequence, AI can unlock benefits in sensitive settings, like education, where it could not otherwise be used with confidence.
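
To make the flow concrete, here is a minimal Python sketch of how an application might send text to the endpoint and read back which categories were flagged. It is an illustration under assumptions, not an official client: the URL, the `input` request field, and the `results`, `flagged`, and `categories` response fields reflect our understanding of the public API and are not spelled out in this post. The helper names (`build_request`, `flagged_categories`, `moderate`) and the sample response are made up for the example.

```python
import json
import urllib.request

# Assumed endpoint URL; this post does not state it.
MODERATION_URL = "https://api.openai.com/v1/moderations"


def build_request(text, api_key):
    """Build (but do not send) a POST request carrying the text to classify."""
    body = json.dumps({"input": text}).encode("utf-8")
    return urllib.request.Request(
        MODERATION_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def flagged_categories(response):
    """Return the sorted names of categories marked true in the first result.

    Assumes a response shaped like {"results": [{"categories": {...}}]}.
    """
    result = response["results"][0]
    return sorted(name for name, hit in result["categories"].items() if hit)


def moderate(text, api_key):
    """Send the text to the endpoint and return the flagged category names."""
    with urllib.request.urlopen(build_request(text, api_key)) as resp:
        return flagged_categories(json.load(resp))


# Hand-written sample in the assumed response shape, not real API output.
SAMPLE_RESPONSE = {
    "results": [
        {
            "flagged": True,
            "categories": {
                "sexual": False,
                "hate": False,
                "violence": True,
                "self-harm": False,
            },
        }
    ]
}

if __name__ == "__main__":
    # Parsing can be exercised offline against the sample response.
    print(flagged_categories(SAMPLE_RESPONSE))
```

An application would typically call something like `moderate(user_text, api_key)` before displaying or acting on the text, and block or escalate to human review when the returned list is non-empty. Keeping the parsing separate from the network call makes the decision logic easy to test without an API key.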
