AI 'gold rush' for chatbot training data could run out of human-written text

MATT O'BRIEN
June 06, 2024

Artificial intelligence systems like ChatGPT could soon run out of what keeps making them smarter -- the tens of trillions of words people have written and shared online.

A new study released Thursday by research group Epoch AI projects that tech companies will exhaust the supply of publicly available training data for AI language models by roughly the turn of the decade -- sometime between 2026 and 2032.

Comparing it to a "literal gold rush" that depletes finite natural resources, Tamay Besiroglu, an author of the study, said the AI field might face challenges in maintaining its current pace of progress once it drains the reserves of human-generated writing.

In the short term, tech companies like ChatGPT-maker OpenAI and Google are racing to secure and sometimes pay for high-quality data sources to train their AI large language models -- for instance, by signing deals to tap into the steady flow of sentences coming out of Reddit forums and news media outlets.

In the longer term, there won't be enough new blogs, news articles and social media commentary to sustain the current trajectory of AI development, putting pressure on companies to tap into sensitive data now considered private -- such as emails or text messages -- or to rely on less-reliable "synthetic data" spit out by the chatbots themselves.

"There is a serious bottleneck here," Besiroglu said. "If you start hitting those constraints about how much data you have, then you can't really scale up your models efficiently anymore. And scaling up models has been probably the most important way of expanding their capabilities and improving the quality of their output."

The researchers first made their projections two years ago -- shortly before ChatGPT's debut -- in a working paper that forecast a more imminent 2026 cutoff of high-quality text data. Much has changed since then, including new techniques that enabled AI researchers to make better use of the data they already have and sometimes "overtrain" on the same sources multiple times.

But there are limits, and after further research, Epoch now foresees running out of public text data sometime in the next two to eight years.

The team's latest study is peer-reviewed and due to be presented at this summer's International Conference on Machine Learning in Vienna, Austria. Epoch is a nonprofit institute hosted by San Francisco-based Rethink Priorities and funded by proponents of effective altruism -- a philanthropic movement that has poured money into mitigating AI's worst-case risks.

Besiroglu said AI researchers realized more than a decade ago that aggressively expanding two key ingredients -- computing power and vast stores of internet data -- could significantly improve the performance of AI systems.

The amount of text data fed into AI language models has been growing about 2.5 times per year, while computing has grown about 4 times per year, according to the Epoch study. Facebook parent company Meta Platforms recently claimed the largest version of its upcoming Llama 3 model -- which has not yet been released -- has been trained on up to 15 trillion tokens, each of which can represent a piece of a word.
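
The arithmetic behind such a projection can be sketched in a few lines of Python. This is a rough illustration of compound growth, not Epoch's method: the 15 trillion tokens and the 2.5-times yearly growth come from the figures above, while the function name and the three stock sizes are hypothetical values chosen only to show how the timeline shifts.

```python
import math


def years_until_exhaustion(current_tokens, stock_tokens, growth_per_year):
    """Years of compound growth before a training set matches a fixed stock."""
    if current_tokens >= stock_tokens:
        return 0.0
    # Solve current * growth**t = stock for t.
    return math.log(stock_tokens / current_tokens) / math.log(growth_per_year)


CURRENT_TOKENS = 15e12  # reported Llama 3 training set size
GROWTH_PER_YEAR = 2.5   # reported yearly growth in training data

# ASSUMPTION: illustrative stock sizes, not figures taken from the study.
HYPOTHETICAL_STOCKS = [100e12, 300e12, 1000e12]

for stock in HYPOTHETICAL_STOCKS:
    years = years_until_exhaustion(CURRENT_TOKENS, stock, GROWTH_PER_YEAR)
    print(f"{stock / 1e12:.0f} trillion tokens: about {years:.1f} years")
```

Because the growth compounds, even a stock 20 times larger than today's biggest training sets would be matched in a little over three years at that pace, which is why the projected window is measured in years rather than decades.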

But how much it's worth worrying about the data bottleneck is debatable.

"I think it's important to keep in mind that we don't necessarily need to train larger and larger models," said Nicolas Papernot, an assistant professor of computer engineering at the University of Toronto and researcher at the nonprofit Vector Institute for Artificial Intelligence.

Papernot, who was not involved in the Epoch study, said building more skilled AI systems can also come from training models that are more specialized for specific tasks. But he has concerns about training generative AI systems on the same outputs they're producing, leading to degraded performance known as "model collapse."

Training on AI-generated data is "like what happens when you photocopy a piece of paper and then you photocopy the photocopy. You lose some of the information," Papernot said. Not only that, but Papernot's research has also found it can further encode the mistakes, bias and unfairness that are already baked into the information ecosystem.

If real human-crafted sentences remain a critical AI data source, those who are stewards of the most sought-after troves -- websites like Reddit and Wikipedia, as well as news and book publishers -- have been forced to think hard about how they're being used.

"Maybe you don't lop off the tops of every mountain," jokes Selena Deckelmann, chief product and technology officer at the Wikimedia Foundation, which runs Wikipedia. "It's an interesting problem right now that we're having natural resource conversations about human-created data. I shouldn't laugh about it, but I do find it kind of amazing."

While some have sought to close off their data from AI training -- often after it's already been taken without compensation -- Wikipedia has placed few restrictions on how AI companies use its volunteer-written entries. Still, Deckelmann said she hopes there continue to be incentives for people to keep contributing, especially as a flood of cheap and automatically generated "garbage content" starts polluting the internet.

AI companies should be "concerned about how human-generated content continues to exist and continues to be accessible," she said.

From the perspective of AI developers, Epoch's study says paying millions of humans to generate the text that AI models will need "is unlikely to be an economical way" to drive better technical performance.

As OpenAI begins work on training the next generation of its GPT large language models, CEO Sam Altman told the audience at a United Nations event last month that the company has already experimented with "generating lots of synthetic data" for training.

"I think what you need is high-quality data. There is low-quality synthetic data. There's low-quality human data," Altman said. But he also expressed reservations about relying too heavily on synthetic data over other technical methods to improve AI models.

"There'd be something very strange if the best way to train a model was to just generate, like, a quadrillion tokens of synthetic data and feed that back in," Altman said. "Somehow that seems inefficient."

------------

The Associated Press and OpenAI have a licensing and technology agreement that allows OpenAI access to part of AP's text archives.
