AI 'gold rush' for chatbot training data could run out of human-written text

MATT O'BRIEN
June 06, 2024

Artificial intelligence systems like ChatGPT could soon run out of what keeps making them smarter -- the tens of trillions of words people have written and shared online.

A new study released Thursday by research group Epoch AI projects that tech companies will exhaust the supply of publicly available training data for AI language models by roughly the turn of the decade -- sometime between 2026 and 2032.

Comparing it to a "literal gold rush" that depletes finite natural resources, Tamay Besiroglu, an author of the study, said the AI field might face challenges in maintaining its current pace of progress once it drains the reserves of human-generated writing.

In the short term, tech companies like ChatGPT-maker OpenAI and Google are racing to secure and sometimes pay for high-quality data sources to train their AI large language models -- for instance, by signing deals to tap into the steady flow of sentences coming out of Reddit forums and news media outlets.

In the longer term, there won't be enough new blogs, news articles and social media commentary to sustain the current trajectory of AI development, putting pressure on companies to tap into sensitive data now considered private -- such as emails or text messages -- or to rely on less-reliable "synthetic data" spit out by the chatbots themselves.

"There is a serious bottleneck here," Besiroglu said. "If you start hitting those constraints about how much data you have, then you can't really scale up your models efficiently anymore. And scaling up models has been probably the most important way of expanding their capabilities and improving the quality of their output."

The researchers first made their projections two years ago -- shortly before ChatGPT's debut -- in a working paper that forecast a more imminent 2026 cutoff of high-quality text data. Much has changed since then, including new techniques that enabled AI researchers to make better use of the data they already have and sometimes "overtrain" on the same sources multiple times.

But there are limits, and after further research, Epoch now foresees running out of public text data sometime in the next two to eight years.

The team's latest study is peer-reviewed and due to be presented at this summer's International Conference on Machine Learning in Vienna, Austria. Epoch is a nonprofit institute hosted by San Francisco-based Rethink Priorities and funded by proponents of effective altruism -- a philanthropic movement that has poured money into mitigating AI's worst-case risks.

Besiroglu said AI researchers realized more than a decade ago that aggressively expanding two key ingredients -- computing power and vast stores of internet data -- could significantly improve the performance of AI systems.

The amount of text data fed into AI language models has been growing about 2.5 times per year, while computing has grown about 4 times per year, according to the Epoch study. Facebook parent company Meta Platforms recently claimed the largest version of its upcoming Llama 3 model -- which has not yet been released -- has been trained on up to 15 trillion tokens, each of which can represent a piece of a word.
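
To make those growth figures concrete, here is a rough back-of-the-envelope sketch, not a calculation taken from the Epoch study. It assumes training sets keep growing 2.5 times per year from a 15 trillion token starting point. The total stock of usable public text is a hypothetical input: the 300 trillion figure below is an illustrative placeholder, not a number reported in this article, and the function name is invented for the example.

```python
import math


def years_until_exhausted(current_tokens, stock_tokens, growth_per_year):
    """Years until a dataset growing geometrically reaches a fixed stock.

    Solves current_tokens * growth_per_year ** t = stock_tokens for t.
    """
    if growth_per_year <= 1:
        raise ValueError("growth_per_year must be greater than 1")
    if current_tokens >= stock_tokens:
        return 0.0
    return math.log(stock_tokens / current_tokens) / math.log(growth_per_year)


# Figures quoted in the article.
CURRENT_TOKENS = 15e12   # reported upper bound for Llama 3 training data
GROWTH_PER_YEAR = 2.5    # yearly growth in training text, per Epoch

# ASSUMPTION: hypothetical size of the usable public text stock.
ASSUMED_STOCK_TOKENS = 300e12

years = years_until_exhausted(CURRENT_TOKENS, ASSUMED_STOCK_TOKENS, GROWTH_PER_YEAR)
print(f"Years until the assumed stock is used up: {years:.1f}")
```

Because the growth is exponential, the answer is not very sensitive to the assumed stock: making the stock 10 times larger adds only about 2.5 years at this growth rate.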

But how much it's worth worrying about the data bottleneck is debatable.

"I think it's important to keep in mind that we don't necessarily need to train larger and larger models," said Nicolas Papernot, an assistant professor of computer engineering at the University of Toronto and researcher at the nonprofit Vector Institute for Artificial Intelligence.

Papernot, who was not involved in the Epoch study, said building more skilled AI systems can also come from training models that are more specialized for specific tasks. But he has concerns about training generative AI systems on the same outputs they're producing, leading to degraded performance known as "model collapse."

Training on AI-generated data is "like what happens when you photocopy a piece of paper and then you photocopy the photocopy. You lose some of the information," Papernot said. Not only that, but Papernot's research has also found it can further encode the mistakes, bias and unfairness that are already baked into the information ecosystem.

If real human-crafted sentences remain a critical AI data source, those who are stewards of the most sought-after troves -- websites like Reddit and Wikipedia, as well as news and book publishers -- have been forced to think hard about how they're being used.

"Maybe you don't lop off the tops of every mountain," jokes Selena Deckelmann, chief product and technology officer at the Wikimedia Foundation, which runs Wikipedia. "It's an interesting problem right now that we're having natural resource conversations about human-created data. I shouldn't laugh about it, but I do find it kind of amazing."

While some have sought to close off their data from AI training -- often after it's already been taken without compensation -- Wikipedia has placed few restrictions on how AI companies use its volunteer-written entries. Still, Deckelmann said she hopes there continue to be incentives for people to keep contributing, especially as a flood of cheap and automatically generated "garbage content" starts polluting the internet.

AI companies should be "concerned about how human-generated content continues to exist and continues to be accessible," she said.

From the perspective of AI developers, Epoch's study says paying millions of humans to generate the text that AI models will need "is unlikely to be an economical way" to drive better technical performance.

As OpenAI begins work on training the next generation of its GPT large language models, CEO Sam Altman told the audience at a United Nations event last month that the company has already experimented with "generating lots of synthetic data" for training.

"I think what you need is high-quality data. There is low-quality synthetic data. There's low-quality human data," Altman said. But he also expressed reservations about relying too heavily on synthetic data over other technical methods to improve AI models.

"There'd be something very strange if the best way to train a model was to just generate, like, a quadrillion tokens of synthetic data and feed that back in," Altman said. "Somehow that seems inefficient."

------------

The Associated Press and OpenAI have a licensing and technology agreement that allows OpenAI access to part of AP's text archives.

Copyright markethundred.com