AI research is moving fast, and the Laude Institute wants to help it move faster with its new 'Slingshots' grants.
Announced on November 6, 2025, the initiative is designed to advance both the science and the practical application of AI. The Slingshots program works as an accelerator for AI researchers, offering resources that are often hard to come by in academic settings: funding, compute, and product and engineering support. In return, recipients commit to delivering a tangible outcome, such as a startup, an open-source codebase, or another artifact.
The inaugural cohort comprises 15 diverse projects, with a strong emphasis on the complex challenge of AI evaluation. Several of these projects will be familiar to those who follow TechCrunch, including the command-line coding benchmark Terminal Bench and the latest iteration of the long-standing ARC-AGI project.
Other projects bring fresh perspectives to established evaluation problems. Formula Code, developed by researchers from Caltech and UT Austin, aims to measure how well AI agents can optimize existing code, while BizBench, based out of Columbia University, proposes a comprehensive benchmark for 'white-collar AI agents.' Additional grants fund work on new structures for reinforcement learning and model compression.
John Boda Yang, co-founder of SWE-Bench, is also part of the cohort, leading the new CodeClash project. Building on SWE-Bench's success, CodeClash will assess code through a dynamic, competition-based framework, an approach Yang hopes will sustain progress. He told TechCrunch, "I do think people continuing to evaluate on core third-party benchmarks drives progress. I’m a little bit worried about a future where benchmarks just become specific to companies."
Whether AI evaluation will continue to be shaped by independent benchmarks, or drift toward company-specific standards, remains an open question.