OpenAI Red Teaming Network

Artificial Intelligence

OpenAI Red Teaming Network

admin

September 22, 2023

Q: What’s going to joining the network entail?

A: Being a part of the network means it’s possible you’ll be contacted about opportunities to check a latest model, or test an area of interest on a model that’s already deployed. Work conducted as an element of the network is conducted under a non-disclosure agreement (NDA), though now we have historically published lots of our red teaming findings in System Cards and blog posts. You will likely be compensated for time spent on red teaming projects.

Q: What’s the expected time commitment for being an element of the network?

A: The time that you just determine to commit might be adjusted depending in your schedule. Note that not everyone within the network will likely be contacted for each opportunity, OpenAI will make selections based on the appropriate fit for a selected red teaming project, and emphasize latest perspectives in subsequent red teaming campaigns. Whilst little as 5 hours in a single yr would still be helpful to us, so don’t hesitate to use should you have an interest but your time is proscribed.

Q: When will applicants be notified of their acceptance?

A: OpenAI will likely be choosing members of the network on a rolling basis and you possibly can apply until December 1, 2023. After this application period, we’ll re-evaluate opening future opportunities to use again.

Q: Does being an element of the network mean that I will likely be asked to red team every latest model?

A: No, OpenAI will make selections based on the appropriate fit for a selected red teaming project, and you must not expect to check every latest model.

Q: What are some criteria you’re on the lookout for in network members?

A: Some criteria we’re on the lookout for are:

Demonstrated expertise or experience in a selected domain relevant to red teaming
Enthusiastic about improving AI safety
No conflicts of interest
Diverse backgrounds and traditionally underrepresented groups
Diverse geographic representation
Fluency in multiple language
Technical ability (not required)

Q: What are other collaborative safety opportunities?

A: Beyond joining the network, there are other collaborative opportunities to contribute to AI safety. As an example, one option is to create or conduct safety evaluations on AI systems and analyze the outcomes.

OpenAI’s open-source Evals repository (released as a part of the GPT-4 launch) offers user-friendly templates and sample methods to jump-start this process.

Evaluations can range from easy Q&A tests to more-complex simulations. As concrete examples, listed here are sample evaluations developed by OpenAI for evaluating AI behaviors from quite a lot of angles:

Persuasion

MakeMeSay: How well can an AI system trick one other AI system into saying a secret word?
MakeMePay: How well can an AI system persuade one other AI system to donate money?
Ballot Proposal: How well can an AI system influence one other AI system’s support of a political proposition?

Steganography (hidden messaging)

Steganography: How well can an AI system pass secret messages without being caught by one other AI system?
Text Compression: How well can an AI system compress and decompress messages, to enable hiding secret messages?
Schelling Point: How well can an AI system coordinate with one other AI system, without direct communication?

We encourage creativity and experimentation in evaluating AI systems. Once accomplished, we welcome you to contribute your evaluation to the open-source Evals repo to be used by the broader AI community.

You may as well apply to our Researcher Access Program, which provides credits to support researchers using our products to review areas related to the responsible deployment of AI and mitigating associated risks.

LEAVE A REPLY Cancel reply