Recent and improved content moderation tooling

To assist developers protect their applications against possible misuse, we’re introducing the faster and more accurate Moderation endpoint. This endpoint provides OpenAI API developers with free access to GPT-based classifiers that detect undesired content—an instance of using AI systems to help with human supervision of those systems. We have now also released each a technical paper describing our methodology and the dataset used for evaluation.

When given a text input, the Moderation endpoint assesses whether the content is sexual, hateful, violent, or promotes self-harm—content prohibited by our content policy. The endpoint has been trained to be quick, accurate, and to perform robustly across a spread of applications. Importantly, this reduces the probabilities of products “saying” the fallacious thing, even when deployed to users at-scale. As a consequence, AI can unlock advantages in sensitive settings, like education, where it couldn’t otherwise be used with confidence.

Recent and improved content moderation tooling

What are your thoughts on this topic?
Let us know in the comments below.

1 COMMENT

Share this article

Recent posts

AI’s Growing Power Needs: Tech Industry’s Move Towards Nuclear Power

“Human Intelligence Created”… Human Intelligence Challenge Spreads Against ‘Made by AI’

What We Still Don’t Understand About Machine Learning

OpenAI Unveils SearchGPT: A Recent AI-Powered Search Engine

Public Release: Kling AI Video Generator

Recent and improved content moderation tooling

What are your thoughts on this topic? Let us know in the comments below.

1 COMMENT

Share this article

Recent posts

What are your thoughts on this topic?
Let us know in the comments below.