Archive
Tag: RLHF
Ai2’s RewardBench 2 Is a Tougher Benchmark for Testing How Well AI Models Reflect Human Judgment
Ai2 has released an update to its RewardBench benchmark that makes it better at evaluating reward models. The next-generation test uses new, more complex examples to assess how well AI models can produce answers as accurate as a human's. In its first round of testing, RewardBench 2 ranks Google’s […]
Ai2 Releases Tulu 3 to Unlock the Open-Source Post-Training Black Box
Ai2 has introduced an addition to its Tulu suite of models, aiming to level the playing field between open-source and proprietary models in post-training performance. Coming nearly a year after its predecessor, Tulu 3 aims to help models undergoing specialized training avoid forgetting core skills, such as following instructions, coding, doing math, having knowledge […]
