AI 'gold rush' for chatbot training data could run out of human-written text

MATT O'BRIEN
June 06, 2024

Artificial intelligence systems like ChatGPT could soon run out of what keeps making them smarter -- the tens of trillions of words people have written and shared online.

A new study released Thursday by research group Epoch AI projects that tech companies will exhaust the supply of publicly available training data for AI language models by roughly the turn of the decade -- sometime between 2026 and 2032.

Comparing it to a "literal gold rush" that depletes finite natural resources, Tamay Besiroglu, an author of the study, said the AI field might face challenges in maintaining its current pace of progress once it drains the reserves of human-generated writing.

In the short term, tech companies like ChatGPT-maker OpenAI and Google are racing to secure and sometimes pay for high-quality data sources to train their AI large language models -- for instance, by signing deals to tap into the steady flow of sentences coming out of Reddit forums and news media outlets.

In the longer term, there won't be enough new blogs, news articles and social media commentary to sustain the current trajectory of AI development, putting pressure on companies to tap into sensitive data now considered private -- such as emails or text messages -- or to rely on less-reliable "synthetic data" spit out by the chatbots themselves.

"There is a serious bottleneck here," Besiroglu said. "If you start hitting those constraints about how much data you have, then you can't really scale up your models efficiently anymore. And scaling up models has been probably the most important way of expanding their capabilities and improving the quality of their output."

The researchers first made their projections two years ago -- shortly before ChatGPT's debut -- in a working paper that forecast a more imminent 2026 cutoff of high-quality text data. Much has changed since then, including new techniques that enabled AI researchers to make better use of the data they already have and sometimes "overtrain" on the same sources multiple times.

But there are limits, and after further research, Epoch now foresees running out of public text data sometime in the next two to eight years.

The team's latest study is peer-reviewed and due to be presented at this summer's International Conference on Machine Learning in Vienna, Austria. Epoch is a nonprofit institute hosted by San Francisco-based Rethink Priorities and funded by proponents of effective altruism -- a philanthropic movement that has poured money into mitigating AI's worst-case risks.

Besiroglu said AI researchers realized more than a decade ago that aggressively expanding two key ingredients -- computing power and vast stores of internet data -- could significantly improve the performance of AI systems.

The amount of text data fed into AI language models has been growing about 2.5 times per year, while computing has grown about 4 times per year, according to the Epoch study. Facebook parent company Meta Platforms recently claimed the largest version of its upcoming Llama 3 model -- which has not yet been released -- has been trained on up to 15 trillion tokens, each of which can represent a piece of a word.
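To see why a growth rate of that size leads to a deadline only a few years out, here is a rough back-of-the-envelope sketch in Python. It is an illustration, not Epoch's method: the function name is made up for this example, and the 300 trillion token stock is a placeholder assumption, not a figure reported in this article. Only the 2.5 times yearly growth and the 15 trillion token training set come from the reporting above.

```python
import math


def years_until_exhausted(current_tokens: float,
                          stock_tokens: float,
                          growth_per_year: float) -> float:
    """Years until a training set growing by a fixed yearly factor
    reaches the total stock of available text.

    Solves current_tokens * growth_per_year ** t = stock_tokens for t.
    """
    if current_tokens <= 0 or stock_tokens <= 0:
        raise ValueError("token counts must be positive")
    if growth_per_year <= 1:
        raise ValueError("growth_per_year must be greater than 1")
    if current_tokens >= stock_tokens:
        return 0.0  # the stock is already used up
    return math.log(stock_tokens / current_tokens) / math.log(growth_per_year)


# From the article: about 15 trillion tokens, growing about 2.5x per year.
CURRENT_TOKENS = 15e12
GROWTH_PER_YEAR = 2.5

# ASSUMPTION: hypothetical stock of public text, chosen only for illustration.
ASSUMED_STOCK_TOKENS = 300e12

years = years_until_exhausted(CURRENT_TOKENS, ASSUMED_STOCK_TOKENS, GROWTH_PER_YEAR)
print(f"Stock reached in about {years:.1f} years")
```

Under these assumptions the stock runs out in a little over three years. Because the growth is exponential, even a stock ten times larger would add only about two and a half years, which is one way to understand why the projected window is so narrow.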

But how much it's worth worrying about the data bottleneck is debatable.

"I think it's important to keep in mind that we don't necessarily need to train larger and larger models," said Nicolas Papernot, an assistant professor of computer engineering at the University of Toronto and researcher at the nonprofit Vector Institute for Artificial Intelligence.

Papernot, who was not involved in the Epoch study, said building more skilled AI systems can also come from training models that are more specialized for specific tasks. But he has concerns about training generative AI systems on the same outputs they're producing, leading to degraded performance known as "model collapse."

Training on AI-generated data is "like what happens when you photocopy a piece of paper and then you photocopy the photocopy. You lose some of the information," Papernot said. Not only that, but Papernot's research has also found it can further encode the mistakes, bias and unfairness that are already baked into the information ecosystem.

If real human-crafted sentences remain a critical AI data source, those who are stewards of the most sought-after troves -- websites like Reddit and Wikipedia, as well as news and book publishers -- have been forced to think hard about how they're being used.

"Maybe you don't lop off the tops of every mountain," joked Selena Deckelmann, chief product and technology officer at the Wikimedia Foundation, which runs Wikipedia. "It's an interesting problem right now that we're having natural resource conversations about human-created data. I shouldn't laugh about it, but I do find it kind of amazing."

While some have sought to close off their data from AI training -- often after it's already been taken without compensation -- Wikipedia has placed few restrictions on how AI companies use its volunteer-written entries. Still, Deckelmann said she hopes there continue to be incentives for people to keep contributing, especially as a flood of cheap and automatically generated "garbage content" starts polluting the internet.

AI companies should be "concerned about how human-generated content continues to exist and continues to be accessible," she said.

From the perspective of AI developers, Epoch's study says paying millions of humans to generate the text that AI models will need "is unlikely to be an economical way" to drive better technical performance.

As OpenAI begins work on training the next generation of its GPT large language models, CEO Sam Altman told the audience at a United Nations event last month that the company has already experimented with "generating lots of synthetic data" for training.

"I think what you need is high-quality data. There is low-quality synthetic data. There's low-quality human data," Altman said. But he also expressed reservations about relying too heavily on synthetic data over other technical methods to improve AI models.

"There'd be something very strange if the best way to train a model was to just generate, like, a quadrillion tokens of synthetic data and feed that back in," Altman said. "Somehow that seems inefficient."

------------

The Associated Press and OpenAI have a licensing and technology agreement that allows OpenAI access to part of AP's text archives.


Copyright markethundred.com