Improving Mathematical Reasoning with Process Supervision

admin

May 31, 2023

We have trained a model to achieve a new state of the art in mathematical problem solving by rewarding each correct step of reasoning (“process supervision”) instead of simply rewarding the correct final answer (“outcome supervision”). In addition to boosting performance relative to outcome supervision, process supervision also has an important alignment benefit: it directly trains the model to produce a chain-of-thought that is endorsed by humans.
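To make the contrast concrete, here is a minimal sketch of how the two schemes might assign reward to a single multi-step solution. It is an illustration only, not the training setup described above: the function names, the 0/1 reward values, and the example step labels are all assumptions made for this sketch.

```python
from typing import List


def outcome_rewards(num_steps: int, final_answer_correct: bool) -> List[float]:
    """Outcome supervision (illustrative): one signal, attached to the last step.

    Intermediate steps receive no feedback, whether or not they are correct.
    """
    rewards = [0.0] * num_steps
    if num_steps > 0:
        rewards[-1] = 1.0 if final_answer_correct else 0.0
    return rewards


def process_rewards(step_is_correct: List[bool]) -> List[float]:
    """Process supervision (illustrative): one signal per reasoning step."""
    return [1.0 if ok else 0.0 for ok in step_is_correct]


# Hypothetical solution with three steps: the second step contains an error,
# but the final answer happens to come out right anyway.
step_labels = [True, False, True]

print(outcome_rewards(len(step_labels), final_answer_correct=True))
print(process_rewards(step_labels))
```

In this hypothetical case the outcome-based signal rewards the solution as a whole and never sees the flawed second step, while the step-level signal marks that step as incorrect. That difference is what lets process supervision shape the chain-of-thought itself rather than only the final answer.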
