Scaling laws for reward model overoptimization

In reinforcement learning from human feedback, it is common practice to optimize against a reward model trained to predict human preferences. Because the reward model is an imperfect proxy, optimizing its value too much can hinder ground-truth performance, in accordance with Goodhart's law.
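As a purely illustrative aside (this is not the paper's experimental setup, which used actual language models and a held-out "gold" reward model), a minimal synthetic sketch of Goodhart-style overoptimization under best-of-n sampling might look like the following. All names and parameters here are hypothetical; the key assumption is a light-tailed true reward and a heavy-tailed proxy error, so that selecting hard on the proxy eventually selects for error rather than quality.

```python
# Toy sketch of reward overoptimization under best-of-n sampling.
# Assumptions (not from the paper): each candidate's true reward is
# standard normal, and the proxy reward adds heavy-tailed (Student-t)
# error. Optimizing the proxy harder then eventually hurts true reward.
import numpy as np

rng = np.random.default_rng(0)

def mean_true_reward_of_pick(n: int, trials: int = 5000) -> float:
    """Average true reward of the candidate that maximizes the proxy."""
    true = rng.standard_normal((trials, n))  # true reward per candidate
    # Proxy = true reward + heavy-tailed error (Student-t with 2 dof).
    proxy = true + 0.5 * rng.standard_t(2, size=(trials, n))
    pick = proxy.argmax(axis=1)  # best-of-n selection by proxy score
    return float(true[np.arange(trials), pick].mean())

for n in (1, 2, 4, 16, 64, 256, 1024):
    print(f"n={n:4d}  mean true reward of best-of-n pick: "
          f"{mean_true_reward_of_pick(n):+.3f}")

# Typical behavior: true reward rises with n at first, then declines as
# the argmax under the proxy becomes dominated by proxy error rather
# than genuine quality: the overoptimization effect described above.
```

Heavy-tailed error is just one way to make a proxy miscalibrated in the tail; the paper itself characterizes the effect empirically, measuring gold reward-model score as a function of how far optimization moves the policy from its initialization.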
