Monday, September 15, 2025
FeeOnlyNews.com

AI keeps getting more powerful, making it harder to judge how smart models actually are

by FeeOnlyNews.com
2 months ago
in Business
Reading Time: 4 mins read



How do you judge an AI model when it’s already starting to perform better than human beings? That’s the challenge faced by researchers like Russell Wald, executive director of the Stanford Institute for Human-Centered Artificial Intelligence (HAI). 

“As of 2024, there are very few task categories where human ability surpasses AI, and even in these areas, the performance gap between AI and humans is shrinking rapidly,” Wald said last week in a presentation hosted at the Fortune Brainstorm AI Singapore conference. “AI is exceeding human capabilities and it’s becoming increasingly harder for us to benchmark.”

HAI releases the AI Index each year, which aims to provide a comprehensive, data-driven snapshot of where AI stands today. At Fortune Brainstorm AI Singapore, Wald shared a few highlights from the 2025 edition of the AI Index, such as the increasing power of today’s models, the growing dominance of industry on the AI frontier, and how China is poised to overtake the U.S.

The following transcript has been lightly edited for conciseness and clarity.

I’m Russell Wald, the executive director of the Stanford Institute for Human-Centered Artificial Intelligence, or what we call “HAI”. 

We are Stanford University’s globally recognized interdisciplinary research institute at the forefront of shaping AI development for the public good. HAI was established in 2019 with the goal of advancing AI research, education, policy and practice. And, through our convening role and rigorous study of AI, we have become the trusted partner on AI governance for decision makers in industry, government and civil society. 

I’m going to talk about what we produce at HAI, which is the AI Index, an annual data-driven analysis of trends in AI that tracks research, development, deployment and the socio-economic impact of AI across academia, government and industry.

We see AI performance consistently improve year over year. As an illustration, we used Midjourney, a text-to-image generator, and asked it for a hyper-realistic image of Harry Potter. From February 2022 to July 2024, we see rapidly increasing quality in these generated images. 

In 2022, the model produced cartoonish, inaccurate renderings of Harry Potter, but by 2024, it could create startlingly realistic depictions. We have gone from what mirrors a Picasso painting to an uncanny rendering of Daniel Radcliffe, the actor who played Harry Potter in the movies. 

Because of this consistent performance growth, we are increasingly challenged when it comes to benchmarking these models. As of 2024, there are very few task categories where human ability surpasses AI, and even in these areas, the performance gap between AI and humans is shrinking rapidly. From image recognition to competition-level mathematics to PhD-level science questions, AI is exceeding human capabilities and it’s becoming increasingly harder for us to benchmark.

From healthcare to transportation, AI is rapidly moving from the lab to our daily life. In 2023, the U.S. Food and Drug Administration approved 223 AI-enabled medical devices, up from just six in 2015. 

On the roads, self-driving cars are no longer experimental. For example, Waymo, which I regularly take while living in San Francisco, is one of the largest U.S. operators and provides over 150,000 autonomous rides each week, while Baidu’s affordable Apollo Go robotaxi has a fleet now that serves numerous cities across China. 

Business use of AI increased significantly after stagnating from 2017 to 2023. The latest McKinsey report reveals that 78% of surveyed respondents say their organizations have begun to use AI in at least one business function, marking a significant increase from 55% in 2023. 

Driven by increasingly capable small models, the inference cost for a system performing at the level of [GPT-3.5] dropped over 280-fold between November 2022 and October 2024. Hardware costs have declined 30% annually, while energy efficiency has improved by 40% each year. 
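
As a rough illustration of the arithmetic behind these figures, the sketch below converts the 280-fold drop into an implied average monthly rate and compounds the annual hardware and energy figures over a few years. The function names are hypothetical, the 23-month span is an assumption based on counting from November 2022 to October 2024, and a constant rate of change is assumed for simplicity, which the talk does not claim.

```python
def implied_periodic_factor(total_factor: float, periods: int) -> float:
    """Constant per-period factor that compounds to total_factor over `periods`."""
    return total_factor ** (1.0 / periods)


def compounded(annual_change: float, years: int) -> float:
    """Multiplier after compounding a fractional annual change (e.g. -0.30) for `years`."""
    return (1.0 + annual_change) ** years


# Assumption: November 2022 to October 2024 is treated as 23 monthly steps.
MONTHS = 23

# "Over 280-fold" is treated as exactly 280 for illustration.
monthly_factor = implied_periodic_factor(280.0, MONTHS)
monthly_decline = 1.0 - 1.0 / monthly_factor

print(f"Implied average cost decline per month: {monthly_decline:.1%}")

for years in (1, 2, 3):
    hardware_cost = compounded(-0.30, years)      # 30% cheaper each year
    energy_efficiency = compounded(0.40, years)   # 40% more efficient each year
    print(
        f"After {years} year(s): hardware cost x{hardware_cost:.2f}, "
        f"energy efficiency x{energy_efficiency:.2f}"
    )
```

Under these assumptions, a 280-fold drop over 23 months corresponds to inference costs falling by a little over a fifth every month on average.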

Open-weight models are also closing the gap with closed models, reducing the performance [gap] from 8% to just 1.7% on some benchmarks in a single year. Together, these trends are rapidly lowering the barriers to advanced AI. 

However, even with inference and hardware costs going down, training costs remain out of reach for academia and most small players. Nearly 90% of notable AI models in 2024 came from industry, which is up from 60% in 2023. And while academia remains a top source of highly cited research, it does struggle at this point to stay as advanced at the frontier level. 

Model scale continues to grow rapidly. Training compute doubles every five months, datasets every eight, and power use annually. Yet performance gaps are shrinking. The score difference between the top and 10th ranked models fell from 11.9% to 5.4% in a year, and the top two models are now separated by just 0.7%. The frontier is increasingly competitive and increasingly crowded. 
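
To put the doubling times on a common footing, the sketch below converts each one into an approximate growth factor per year. It assumes steady exponential growth, and the function and dictionary names are hypothetical; only the doubling times themselves come from the talk.

```python
def annual_growth_factor(doubling_months: float) -> float:
    """Growth factor over 12 months for a quantity that doubles every `doubling_months`."""
    return 2.0 ** (12.0 / doubling_months)


# Doubling times quoted in the talk, in months ("annually" taken as 12 months).
DOUBLING_MONTHS = {
    "training compute": 5,
    "dataset size": 8,
    "power use": 12,
}

annual = {name: annual_growth_factor(m) for name, m in DOUBLING_MONTHS.items()}

for name, factor in annual.items():
    print(f"{name}: about x{factor:.1f} per year")
```

Under this assumption, a five-month doubling time works out to slightly more than a fivefold increase per year, compared with a doubling for power use.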

In recent years, AI model performance at the frontier has converged, with multiple providers now offering highly capable models. This marks a shift from late 2022, when ChatGPT’s launch, widely seen as AI’s breakthrough into the public consciousness, coincided with the landscape dominated by just two players: OpenAI and Google. 

One of the most important things to note is that the transformer model, which is the T in GPT and the baseline level of architecture, cost $930 for Google to train in 2017, and now today we’re at $200 million to train Gemini Ultra. 

Last year’s AI Index was among the first publications to highlight the lack of standard benchmarks for AI safety and responsibility evaluations. The Index has also been analyzing global public opinion. If you are from a non-Western industrialized nation, you are more likely to view AI positively than not. China has an 83% positive view, Indonesia 80%, and Thailand 77%, whereas Canada is at 40%, the U.S. 39%, and the Netherlands 36%. 

I’ll close with the geopolitical situation. The U.S. still maintains a lead in AI, followed closely by China. However, this gap is tightening. My intention is not to exacerbate the idea of an AI arms race between China and the U.S., but instead to highlight the different approaches between the most advanced frontier AI model developers. 

Over the last several years, the U.S. has relied on a few proprietary model providers. Meanwhile, China has deeply invested in its talent base and, more importantly, in an open-source environment. If this trend continues at this rate, then when I appear next year, China would have surpassed the U.S. in terms of model performance. 



Copyright © 2022-2024 All Rights Reserved
See articles for original source and related links to external sites.
