We built a neural theorem prover for Lean that learned to solve a variety of challenging high-school olympiad problems, including problems from the AMC12 and AIME competitions, as well as two problems adapted from the IMO.
Multi-Armed Bandits with delayed rewards in successive trials

This trend, however, does not generalize to grids with fewer batches. For the case where M=2, the number of samples in the first batch of the geometric...