New and improved content moderation tooling

To help developers protect their applications against possible misuse, we're introducing a faster and more accurate Moderation endpoint. This endpoint gives OpenAI API developers free access to GPT-based classifiers that detect undesired content, an instance of using AI systems to assist human supervision of those same systems. We have also released both a technical paper describing our methodology and the dataset used for evaluation.

When given a text input, the Moderation endpoint assesses whether the content is sexual, hateful, violent, or promotes self-harm, all of which our content policy prohibits. The endpoint has been trained to be quick, accurate, and robust across a range of applications. Importantly, this reduces the chances of products “saying” the wrong thing, even when deployed to users at scale. As a consequence, AI can unlock benefits in sensitive settings, like education, where it could not otherwise be used with confidence.
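
To make the request flow concrete, here is a minimal Python sketch of sending text to the endpoint and reading back the flagged categories. The article gives neither the URL nor the response format, so the endpoint path (`/v1/moderations`), the `input` field, and the `results[0].categories` layout are assumptions taken from OpenAI's public API reference. The helper names are hypothetical and chosen only for illustration.

```python
import json
import urllib.request

# Assumption: URL and JSON shape follow OpenAI's public API reference;
# neither is spelled out in the article itself.
MODERATION_URL = "https://api.openai.com/v1/moderations"


def build_moderation_request(text, api_key):
    """Build (but do not send) a POST request carrying the text to classify."""
    body = json.dumps({"input": text}).encode("utf-8")
    return urllib.request.Request(
        MODERATION_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def flagged_categories(response):
    """Return the sorted names of categories marked true in the first result."""
    result = response["results"][0]
    return sorted(name for name, hit in result["categories"].items() if hit)


def moderate(text, api_key):
    """Send the text to the endpoint and return the flagged category names."""
    request = build_moderation_request(text, api_key)
    with urllib.request.urlopen(request) as reply:
        return flagged_categories(json.load(reply))
```

An application could call `moderate(user_text, api_key)` before showing or storing user-supplied text and hold back anything that returns a non-empty list. Whether to block, queue for human review, or simply log is a product decision this sketch does not make.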
