Large Language Models » Ken Yeung

Artificial Intelligence July 10, 2025

Ai2’s FlexOlmo Lets You Co-Develop AI Language Models Without Sharing Data

Ai2 has unveiled FlexOlmo, a new framework that lets multiple organizations jointly develop language models without requiring centralized data pooling. Ideal for those in regulated industries, it provides developers with granular control over how and when their data influences the model, addressing long-standing concerns regarding privacy, IP protection, and data sovereignty in collaborative AI projects. […]

Artificial Intelligence July 1, 2025

Ai2’s SciArena Benchmarks AI for Science, Inspired by ChatBot Arena

Nonprofit AI lab Ai2 has launched a new platform to help researchers evaluate which AI models perform best on scientific literature tasks. Called SciArena, it’s an open, collaborative service that enables head-to-head comparisons of large language models. Think ChatBot Arena—but built for the scientific research community. “Measuring progress in using AI agents for literature-grounded scientific […]

Artificial Intelligence June 3, 2025

Ai2’s RewardBench 2 Is a Tougher Benchmark for Testing How Well AI Models Reflect Human Judgment

Ai2 has released an update to its RewardBench benchmark, making it more capable of evaluating reward models. The next-generation test is built using new, more complex examples to assess how well AI models can produce answers that are as accurate as those of a human. In its first round of testing, RewardBench 2 ranks Google’s […]

Apps May 19, 2025

Microsoft Adds Model Tuning to M365 Copilot to Build Domain-Specific AI

Microsoft is adding a new feature to its Microsoft 365 (M365) Copilot assistant that transforms it into a domain expert. Announced at Microsoft Build 2025, the feature is called Copilot Tuning and is a low-code service allowing developers to train models on an organization’s own data without needing the assistance of data scientists. Imagine a […]

Artificial Intelligence May 6, 2025

ServiceNow and Nvidia Debut Apriel Nemotron 15B, an Open-Source Reasoning Model Built for Faster, Cheaper Agentic AI

ServiceNow and Nvidia have a long-standing partnership building generative AI solutions for the enterprise. This week, at ServiceNow’s Knowledge customer conference, the two are introducing the latest fruits of their labor, a new large language model called Apriel Nemotron 15B with reasoning capabilities. The companies believe it performs as well as OpenAI’s o1-mini, Alibaba’s […]

Artificial Intelligence May 1, 2025

Microsoft’s New Phi-4 Variants Show Just How Far Small AI Can Go

Microsoft is doubling down on small language models with new Phi-4 variants that aim to prove a bold idea: small AI can think big. The new Phi-4-reasoning, Phi-4-reasoning-plus, and Phi-4-mini-reasoning models are optimized for complex tasks, such as math and coding, and outperform much larger models while running on devices with limited resources. While we […]

Artificial Intelligence April 30, 2025

AWS Releases Amazon Nova Premier, Its ‘Most Capable Model’ for Complex Tasks

Months after Amazon Chief Executive Andy Jassy first introduced the world to the Nova foundation model family, the company has released what might be its top-of-the-line variant, Amazon Nova Premier. Billed as Amazon’s “most capable model,” it can process text, images, and video (though not audio yet) and tackle complex, multi-step tasks that demand precise […]

Artificial Intelligence April 19, 2025

ServiceNow Expands Its AI Ambitions With New Apriel Small Language Models

ServiceNow, the enterprise automation platform, has released a family of small language models called Apriel, available in two variants: a 5B base model and an instruct-tuned version. The company says they are designed to handle general-purpose tasks, from answering questions and retrieving information to content generation, code assistance, reasoning, and creative writing. Both models […]

Artificial Intelligence April 9, 2025

Ai2’s OLMoTrace Tool Reveals the Origins of AI Model Training Data

Ai2 has launched a tool to help developers answer a critical question: Where do AI models get their data? Called OLMoTrace, it’s an open-source application offering fact-checking of information provided in prompt replies. This data traceability is vital for those interested in governance, regulation, and auditing. This feature is currently available on Ai2’s flagship model, […]

Artificial Intelligence April 5, 2025

Meta Launches Llama 4 Scout and Maverick, Open-Weight Multimodal Models That Outperform GPT-4 and Gemini

Meta has released the first two models in its new Llama 4 family, which it claims will power “more personalized multimodal experiences.” On Saturday, the company introduced Llama 4 Scout and Maverick, both with 17 billion active parameters. Their debut follows a report that Meta delayed Llama 4’s launch multiple times after failing to meet […]