AI 'gold rush' for chatbot training data could run out of human-written text

MATT O'BRIEN
June 06, 2024

Artificial intelligence systems like ChatGPT could soon run out of what keeps making them smarter -- the tens of trillions of words people have written and shared online.

A new study released Thursday by research group Epoch AI projects that tech companies will exhaust the supply of publicly available training data for AI language models by roughly the turn of the decade -- sometime between 2026 and 2032.

Comparing it to a "literal gold rush" that depletes finite natural resources, Tamay Besiroglu, an author of the study, said the AI field might face challenges in maintaining its current pace of progress once it drains the reserves of human-generated writing.

In the short term, tech companies like ChatGPT-maker OpenAI and Google are racing to secure and sometimes pay for high-quality data sources to train their AI large language models -- for instance, by signing deals to tap into the steady flow of sentences coming out of Reddit forums and news media outlets.

In the longer term, there won't be enough new blogs, news articles and social media commentary to sustain the current trajectory of AI development, putting pressure on companies to tap into sensitive data now considered private -- such as emails or text messages -- or relying on less-reliable "synthetic data" spit out by the chatbots themselves.

"There is a serious bottleneck here," Besiroglu said. "If you start hitting those constraints about how much data you have, then you can't really scale up your models efficiently anymore. And scaling up models has been probably the most important way of expanding their capabilities and improving the quality of their output."

The researchers first made their projections two years ago -- shortly before ChatGPT's debut -- in a working paper that forecast a more imminent 2026 cutoff of high-quality text data. Much has changed since then, including new techniques that enabled AI researchers to make better use of the data they already have and sometimes "overtrain" on the same sources multiple times.

But there are limits, and after further research, Epoch now foresees running out of public text data sometime in the next two to eight years.

The team's latest study is peer-reviewed and due to be presented at this summer's International Conference on Machine Learning in Vienna, Austria. Epoch is a nonprofit institute hosted by San Francisco-based Rethink Priorities and funded by proponents of effective altruism -- a philanthropic movement that has poured money into mitigating AI's worst-case risks.

Besiroglu said AI researchers realized more than a decade ago that aggressively expanding two key ingredients -- computing power and vast stores of internet data -- could significantly improve the performance of AI systems.

The amount of text data fed into AI language models has been growing about 2.5 times per year, while computing has grown about 4 times per year, according to the Epoch study. Facebook parent company Meta Platforms recently claimed the largest version of its upcoming Llama 3 model -- which has not yet been released -- has been trained on up to 15 trillion tokens, each of which can represent a piece of a word.
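
To make those growth rates concrete, here is a minimal Python sketch of the compounding arithmetic. The 2.5x and 4x annual factors and the 15 trillion token figure come from the reporting above; the total stock of 300 trillion tokens is a hypothetical placeholder chosen only for illustration, and the function names are likewise made up for this example. It is not Epoch's methodology, just a back-of-the-envelope calculation.

```python
import math

DATA_GROWTH_PER_YEAR = 2.5      # annual growth in training text, per the article
COMPUTE_GROWTH_PER_YEAR = 4.0   # annual growth in computing, per the article
CURRENT_TOKENS = 15e12          # Meta's stated Llama 3 training-set size

# Assumption for illustration only; this number is NOT from the article.
HYPOTHETICAL_STOCK_TOKENS = 300e12


def growth_after(years: float, factor_per_year: float) -> float:
    """Total multiplier after compounding factor_per_year for the given years."""
    return factor_per_year ** years


def years_to_reach(current: float, target: float, factor_per_year: float) -> float:
    """Years of compounding needed for current to grow to target."""
    return math.log(target / current) / math.log(factor_per_year)


if __name__ == "__main__":
    five_year_data = growth_after(5, DATA_GROWTH_PER_YEAR)
    five_year_compute = growth_after(5, COMPUTE_GROWTH_PER_YEAR)
    print(f"Data demand after 5 years: about {five_year_data:.0f}x")
    print(f"Compute after 5 years: about {five_year_compute:.0f}x")

    years = years_to_reach(
        CURRENT_TOKENS, HYPOTHETICAL_STOCK_TOKENS, DATA_GROWTH_PER_YEAR
    )
    print(f"Years until the hypothetical stock is matched: {years:.1f}")
```

Under these assumptions, data demand grows nearly a hundredfold in five years, so even a stock 20 times larger than today's biggest training set would be matched in a little over three years. Changing the hypothetical stock shifts the date, but only logarithmically.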

But how much it's worth worrying about the data bottleneck is debatable.

"I think it's important to keep in mind that we don't necessarily need to train larger and larger models," said Nicolas Papernot, an assistant professor of computer engineering at the University of Toronto and researcher at the nonprofit Vector Institute for Artificial Intelligence.

Papernot, who was not involved in the Epoch study, said building more skilled AI systems can also come from training models that are more specialized for specific tasks. But he has concerns about training generative AI systems on the same outputs they're producing, leading to degraded performance known as "model collapse."

Training on AI-generated data is "like what happens when you photocopy a piece of paper and then you photocopy the photocopy. You lose some of the information," Papernot said. Not only that, but Papernot's research has also found it can further encode the mistakes, bias and unfairness that are already baked into the information ecosystem.

If real human-crafted sentences remain a critical AI data source, those who are stewards of the most sought-after troves -- websites like Reddit and Wikipedia, as well as news and book publishers -- have been forced to think hard about how they're being used.

"Maybe you don't lop off the tops of every mountain," jokes Selena Deckelmann, chief product and technology officer at the Wikimedia Foundation, which runs Wikipedia. "It's an interesting problem right now that we're having natural resource conversations about human-created data. I shouldn't laugh about it, but I do find it kind of amazing."

While some have sought to close off their data from AI training -- often after it's already been taken without compensation -- Wikipedia has placed few restrictions on how AI companies use its volunteer-written entries. Still, Deckelmann said she hopes there continue to be incentives for people to keep contributing, especially as a flood of cheap and automatically generated "garbage content" starts polluting the internet.

AI companies should be "concerned about how human-generated content continues to exist and continues to be accessible," she said.

From the perspective of AI developers, Epoch's study says paying millions of humans to generate the text that AI models will need "is unlikely to be an economical way" to drive better technical performance.

As OpenAI begins work on training the next generation of its GPT large language models, CEO Sam Altman told the audience at a United Nations event last month that the company has already experimented with "generating lots of synthetic data" for training.

"I think what you need is high-quality data. There is low-quality synthetic data. There's low-quality human data," Altman said. But he also expressed reservations about relying too heavily on synthetic data over other technical methods to improve AI models.

"There'd be something very strange if the best way to train a model was to just generate, like, a quadrillion tokens of synthetic data and feed that back in," Altman said. "Somehow that seems inefficient."

------------

The Associated Press and OpenAI have a licensing and technology agreement that allows OpenAI access to part of AP's text archives.
