Someone should author an ELI5 (or slightly older) guide to how LLMs, RL, Agents, CoT, etc., all work and what all these acronyms mean. And then add to it, daily or weekly, as new developments arise. I don't want to keep reading dozens of articles, white papers, tweets, etc., as new developments happen. I want to go back to the same knowledge base, authored by the same person (or people), that maintains a consistent reading and comprehension level and builds on prior points.
It seems like the AI space is moving impossibly fast, and it's just ridiculously hard to keep up unless 1) you work in this space, or 2) you're very comfortable with the technology behind it, so you can jump in at any point and understand it.
Just ask an Internet-enabled LLM like You.com to do it. This is what they are good at. Wikipedia satisfies your repository requirement.
haha just have gpt operator do it
>people re-creating R1 (some claim for $30)
R1 or the R1 finetunes? Not the same thing...
HF is busy recreating R1 itself, but that seems to be a pretty big endeavour, not a $30 thing.
This is indeed a massive exaggeration, I'm pretty sure the $30 experiment is this one: https://threadreaderapp.com/thread/1882839370505621655.html (github: https://github.com/Jiayi-Pan/TinyZero).
And while it's true that this experiment shows you can reproduce the concept of direct reinforcement learning on an existing LLM, in a way that makes it develop reasoning in the same fashion DeepSeek-R1 did, it's very far from a re-creation of R1!
Maybe they mistake recreation for the cp command
Most important, R1 shut down some very complex ideas (like DPO & MCTS) and showed that the path forward is simple, basic RL.
This isn't quite true. R1 used a mix of RL and supervised fine-tuning. The data used for supervised fine-tuning may have been model-generated, but the paper implies it was human-curated: they kept only the 'correct' answers.

I think what you're saying is consistent with the quote: human curation of SFT data is indeed not complex. There might be extra work on top of that RL, but it's the same work that's been done throughout LLM development.
Additionally, in the following days, I've seen evidence suggesting that the SFT part might not even be necessary. I'd argue that work wouldn't have happened if R1 wasn't released in the open.
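To make "simple, basic RL" concrete: the R1-Zero recipe boils down to sampling rollouts, scoring them with a purely rule-based reward (no learned reward model, no human preference data), and reinforcing the high-scoring ones. Here's a minimal sketch of what such a reward function can look like in Python; the tag names and weights are my own assumptions for illustration, not DeepSeek's actual code:

    import re

    def rule_based_reward(completion: str, ground_truth: str) -> float:
        """Toy reward: a small bonus for following the format, a big one for a verifiably correct answer."""
        score = 0.0
        # Format reward: reasoning wrapped in <think> tags, final answer in <answer> tags.
        match = re.search(r"<think>.*?</think>\s*<answer>(.*?)</answer>", completion, re.DOTALL)
        if match:
            score += 0.1
            # Accuracy reward: exact match against something checkable
            # (a math result, a passing unit test, etc.).
            if match.group(1).strip() == ground_truth.strip():
                score += 1.0
        return score

Because the reward is just a verifiable check, anyone with a base model and a pile of math/coding problems can reproduce the setup, which is exactly what the $30-scale experiments above are doing.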
So the conclusion is AI is about to "increase in abilities at an exponential rate", with the only data point being that R1 was successfully able to achieve o1 levels as an open-source model? In other words, two extremely unrelated themes?
Does this guy know people were writing verbatim the same thing in like... 2021? It's still always incredible to me how the same repeated hype rises to the surface over and over. Oh well... old man gonna old man.
> Does this guy know people were writing verbatim the same thing in like... 2021?
Given how far gen AI has come since 2021, those people were quite spot on.
People keep saying that DeepSeek R1's training cost is just $5.6M. Where is the source?
I'm not asking for the proof. Just the source, even a self-claimed statement. I've read the R1's paper and it doesn't say the number of $5.6M. Is it somewhere in DeepSeek's press release?
this is a pretty hype-laden/twitter-laden article, i would not trust it to explain things to you
Sure. But perhaps some hype is justified? Here's what a senior research scientist from nvidia says:
> We are living in a timeline where a non-US company is keeping the original mission of OpenAI alive - truly open, frontier research that empowers all. It makes no sense. The most entertaining outcome is the most likely.
> DeepSeek-R1 not only open-sources a barrage of models but also spills all the training secrets. They are perhaps the first OSS project that shows major, sustained growth of an RL flywheel. (…)
The benchmarks for the different models focus on math and coding accuracy. I have a use-case for a model where those two functions are completely irrelevant and I’m only interested in writing (chat, stories, etc). I guess you can’t really benchmark ‘concepts’ as easily as logic.
With distillation, can a model be made that strips out most of the math and coding stuff?
> completely irrelevant and I’m only interested in writing (chat, stories, etc)
There's a person keeping track of a few writing prompts and the evolution of the quality of text with each new shiny model. They shared this link somewhere, can't find the source but I had it bookmarked for further reading. Have a look at it and see if it's something you'd like.
https://eqbench.com/results/creative-writing-v2/deepseek-ai_...
Here's a better link: https://eqbench.com/creative_writing.html
The R1 sample reads way better than anything else on the leaderboard to me. Quite a jump.
Why is the main character named Rhys in most (?) of them? The Llama[1], Claude[3], Mistral[4] & DeepSeek-R1[5] samples all named the main character Rhys, even though that was nowhere specified in the prompt. GPT-4o gives the character a different name[6]. Gemini[2] names the bookshop person Rhys instead! Am I just missing something really obvious? I feel like I'm missing something big that's right in front of me.
[1] https://eqbench.com/results/creative-writing-v2/meta-llama__... [2] https://eqbench.com/results/creative-writing-v2/gemini-1.5-f... [3] https://eqbench.com/results/creative-writing-v2/claude-3-opu... [4] https://eqbench.com/results/creative-writing-v2/mistralai__M... [5] https://eqbench.com/results/creative-writing-v2/deepseek-ai_... [6] https://eqbench.com/results/creative-writing-v2/gpt-4o-2024-...
Completely agree.
The only measurable flaw I could find was the errant use of an opening quote (‘) in
> He huffed a laugh. "Lucky you." His gaze drifted to the stained-glass window, where rain blurred the world into watercolors. "I bombed my first audition. Hamlet, uni production. Forgot ‘to be or not to be,' panicked, and quoted Toy Story."
It's pretty amazing I can find no fault with the actual text. No grammar errors, I like the writing, it competes with the quality and engagingness of a large swath of written fiction (yikes), I wanna read the next chapter.
> It's pretty amazing I can find no fault with the actual text. No grammar errors, I like the writing, it competes with the quality and engagingness of a large swath of written fiction (yikes), I wanna read the next chapter.
The lack of "gpt-isms" is really impressive IMO.
Those outputs are really good and come from deepseek-R1 (I assume the full version, not a distilled version).
R1 is quite large (685B params). I’m wondering if you can make a distilled R1 without the coding and math content. 7B works well for me locally. When I go up to 32B I seem to get worse results - I assume it’s just timing out in its think mode… I haven’t had time to really investigate though.
Yes, you can create a writing-focused model through distillation, but it's tricky. *Complete removal* of math/coding abilities is challenging because language models' knowledge is interconnected - the logical thinking that helps solve equations also helps structure coherent stories.
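For the mechanics, here is a minimal sketch of classic logit distillation restricted to a writing-only corpus (PyTorch-style; the variable names and the temperature/alpha hyperparameters are my own assumptions). Note that the released R1 "distill" models were reportedly produced by plain SFT on R1-generated outputs, which is simpler still:

    import torch
    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
        # Soft targets: push the student toward the teacher's temperature-smoothed
        # next-token distribution on the writing batch.
        soft = F.kl_div(
            F.log_softmax(student_logits / T, dim=-1),
            F.softmax(teacher_logits / T, dim=-1),
            reduction="batchmean",
        ) * (T * T)
        # Hard targets: ordinary next-token cross-entropy on the same writing data.
        vocab = student_logits.size(-1)
        hard = F.cross_entropy(student_logits.view(-1, vocab), labels.view(-1))
        return alpha * soft + (1 - alpha) * hard

Training only on prose means the math/code behaviour is never reinforced, but as noted above it won't cleanly disappear either, because the same weights serve both.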
I understood that at least some of these big models (Llama?) are basically bootstrapped with code. Is there truth to that?
Yes, code is a key training component. Open-Llama explicitly used programming data as one of seven training components. However, newer models like Llama 3.1 405B have shifted to using synthetic data instead. Code helps develop structured reasoning patterns but isn't the sole foundation - models combine it with general web text, books, etc.
Nice explainer. R1 hit sensational mainstream news, which has resulted in some confusion and alarm among family and friends. It's hard to succinctly explain that this doesn't mean China is destroying us, that Americans immediately started working with the breakthrough, that cost optimization is inevitable in computing, etc.
T or F?
Nobody really saw the LLM leap coming
Nobody really saw R1 coming
We don’t know what’s coming
So, is AI already reasoning or not?
Depends on your definition of reasoning. Creating valid chains of thought? Yes. Sentient? No.
No. The AI learns to predict reasons, and generating those reasons before the answer improves its accuracy at predicting the answer.
In summary, even though they are called "reasoning" models, they are still based on prediction and pattern matching, not true logical reasoning. The improvement in accuracy is likely due to better leveraging of the model's statistical knowledge, rather than any deeper understanding of the problem's logic. And the reasons you see it output have nothing to do with the actual reasons it used to determine the answer.
In fact, R1-Zero hints that it might be even better to let the AI follow a chain of thought that doesn't actually make logical sense or isn't even understandable, and that doing so could further improve its ability to accurately predict solutions to code, math, and logic problems.
Yes, that's what OpenAI o1 does, and DeepSeek R1. Also Google Gemini 2.0 Thinking models. It's a way to significantly improve benchmark scores, especially in math.
It's funny to watch too. I played with Gemini 2.0 on Google AI Studio and asked it to "come up with your favorite song as you take a long walk to really think this through".
The reasoning can then be shown, and it talked to itself, saying things like "since I'm an AI, I can't take walks, but with a request like this, the user seems to imply that I should choose something that's introspective and meaningful", and went on with how it picked candidates.
I just tried that prompt with gemini-2.0-flash-thinking-exp-01-21
In the reasoning process it concludes on: From the brainstormed genres/artists, select a specific song. It's better to be concrete than vague. For this request, "Nuvole Bianche" by Ludovico Einaudi emerges as a strong candidate. Craft the Explanation and Scenario: Now, build the response around "Nuvole Bianche."
Then in the actual answer it proposes: "Holocene" by Bon Iver.
=)
Yes. ARC AGI benchmark was supposed to last years and is already saturated. The authors are currently creating the second version.
From that article:
> ARC-AGI is a benchmark that’s designed to be simple for humans but excruciatingly difficult for AI. In other words, when AI crushes this benchmark, it’s able to do what humans do.
That's a misunderstanding of what ARC-AGI means. Here's what ARC-AGI creator François Chollet has to say: https://bsky.app/profile/fchollet.bsky.social/post/3les3izgd...
> I don't think people really appreciate how simple ARC-AGI-1 was, and what solving it really means.
> It was designed as the simplest, most basic assessment of fluid intelligence possible. Failure to pass signifies a near-total inability to adapt or problem-solve in unfamiliar situations.
> Passing it means your system exhibits non-zero fluid intelligence -- you're finally looking at something that isn't pure memorized skill. But it says rather little about how intelligent your system is, or how close to human intelligence it is.
Ah! My bad, I edited the article to simply quote Francois. Thanks for catching this, Simon.
> That's a misunderstanding of what ARC-AGI means
Misunderstanding benchmarks seems to be the first step to claiming human level intelligence.
Additionally:
> > ARC-AGI is a benchmark that’s designed to be simple for humans but excruciatingly difficult for AI. In other words, when AI crushes this benchmark, it’s able to do what humans do.
Doesn’t even make logical sense.
This feels like a generalized extension of the classic mis-reasoned response to 'A computer can now play chess.'
Common non-technical chain of thought after learning this: 'Previously, only humans could play chess. Now, computers can play chess. Therefore, computers can now do other things that previously only humans could do.'
The error is assuming that problems can only be solved via levels of human-style general intelligence.
Obviously, this is false from the way that computers calculate arithmetic, optimize via gradient descent, and innumerable other examples, but it does seem to be a common lay misunderstanding.
Probably why IBM abused it with their Watson marketing.
In reality, for reliable capabilities reasoning, the how matters very much.
> Misunderstanding benchmarks seems to be the first step to claiming human level intelligence.
It's known as "hallucination" a.k.a. "guessing or making stuff up", and is a major challenge for human intelligence. Attempts to eradicate it have met with limited success. Some say that human intelligence will never reach AGI because of it.
Thankfully nobody is trying to sell humans as a service in an attempt to replace the existing AIs in the workplace (yet).
I’m sure such a product would be met with ridicule considering how often humans hallucinate. Especially since, as we all know, the only use for humans is getting responses given some prompt.
> Thankfully nobody is trying to sell humans as a service
That’s a description of the entire service economy.
Doesn’t that turn the entire premise on its head? If passing the benchmark means crossing the lower, not the upper threshold, that invalidates most claims derived from it.
Correct. Hence many people constantly bemoaning the hype driven narratives that dominate many AI discussions.
Interesting article, but the flourish ending """AI will soon (if not already) increase in abilities at an exponential rate.""" is not at all substantiated. Would be nice to know how the author gets to that conclusion.
Author here. I do believe it's going to be exponential (not yet), but that's out of scope for the article. However, if someone has a good explainer link for that, please put it here and I'll link it into the post.
All past data shows is exponential growth in the cost of AI systems, not an exponential growth in capability. Capabilities have certainly expanded, but that is hard to measure. The growth curve is just as likely to be sigmoid-shaped. Just a phase transition from "computers process information strictly procedurally" to "computers use fuzzy logic sometimes too". And if we've exhausted all the easy wins, that explains the increased interest in alternative scaling paths.
Obviously predicting the future is hard, and we won't know where this stops till we get there. But I think a degree of skepticism is warranted.
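For what it's worth, the reason the data can't settle this yet: a logistic (sigmoid) curve is numerically indistinguishable from an exponential until it starts approaching its ceiling. A toy illustration with arbitrary, made-up parameters:

    import math

    def exponential(t, x0=1.0, r=0.5):
        return x0 * math.exp(r * t)

    def logistic(t, x0=1.0, r=0.5, K=1000.0):
        # Solution of dx/dt = r*x*(1 - x/K) with x(0) = x0: looks exponential while x << K.
        return K / (1 + (K - x0) / x0 * math.exp(-r * t))

    for t in range(0, 21, 5):
        print(t, round(exponential(t), 1), round(logistic(t), 1))

The two columns track each other closely early on and only diverge once the logistic curve nears its ceiling, so early measurements can't tell you which curve you're actually on.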
Once AI becomes self-improving, using its intelligence to make itself more intelligent, exponential progress seems like the logical consequence. Any lack of exponential progress before it becomes self-improving doesn't have much bearing on that.
It certainly will be sigmoid-shaped in the end, but the top of the sigmoid could be way beyond human intelligence.
I'm not completely convinced of this, even in the presence of AGI that is peak-human intelligence in all ways (let's say on par with the top 1% of researchers at top AGI labs, with agency and online learning fully solved). One reason for this is what the sibling comment argues:
> Exponentially smarter AI meets exponentially more difficult wins.
Another is that it doesn't seem like intelligence is the main/only bottleneck to producing better AIs right now. OpenAI seems to think building a $100-500B data center is necessary to stay ahead*, and it seems like most progress thus far has been from scaling compute (not to trivialize architectures and systems optimizations that make that possible). But if GPT-N decides that GPT-N+1 needs another OOM increase in compute, it seems like progress will mostly be limited by how fast increasingly enormous data centers and power plants can be built.
That said, if smart-human-level AGI is reached, I don't think it needs to be exponentially improving to change almost everything. I think AGI is possibly (probably?) coming in the near future, and believing that it won't improve exponentially doesn't ease my anxiety about potential bad outcomes.
*Though admittedly DeepSeek _may_ have proven this wrong. Some people seem to think their stated training budget is misleading and/or that they trained on OpenAI outputs (though I'm not sure how this would work for the o models given that they don't provide their thinking trace). I'd be nervous if it was my money going towards Stargate right now.
Well we do have an existence proof that human-level intelligence can be trained and run on a few thousand calories per day. We just haven't figured out how to build something that efficient yet.
The inference and on-line fine tuning stage can run on a few thousand calories a day. The training stage has taken roughly 100 TW * 1bn years ≈ 10²⁸ calories.
Hmm I'm not convinced that human brains have all that much preprogrammed at birth. Babies don't even start out with object permanence. All of human DNA is only six billion bits, which wouldn't be much even if it encoded neural weights instead of protein structures.
Human babies are born significantly premature as a compromise between our upright gait and large head-to-body ratio. A whole lot of neurological development that happens in the first couple of years is innate in humans just like in other mammals, the other mammals just develop them before being born. E.g. a foal can walk within hours of being born.
Babies are born with a fully functioning image recognition stack complete with a segmentation model, facial recognition, gaze estimator, motion tracker and more. Likewise, most of the language model is pre-trained and language acquisition is in large part a pruning process to coalesce unused phonemes, specialize general syntax rules etc. Compare with other animals that lack such a pre-trained model - no matter how much you fine-tune a dog, it's not going to recite Shakespeare. Several other subsystems come online in the first few years with or without training; one example that humans share with other great apes is universal gesture production and recognition models. You can stretch out your arm towards just about any human or chimpanzee on the planet and motion your hand towards your chest and they will understand that you want them to come over. Babies also ship with a highly sophisticated stereophonic audio source segmentation model that can easily isolate speaking voices from background noise. Even when you limit yourself to just I/O related functions, the list goes on from reflexively blinking in response to rapidly approaching objects to complicated balance sensor fusion.
If you're claiming that humans are born with more data than the six gigabits of data encoded in DNA, then how do you think the extra data is passed to the next generation?
I'm not claiming that humans are somehow born with way more than a few billion parameters, no. I'm agreeing that we have an existence proof for the possibility of an efficient model encoding that only requires a few thousand calories to run inference. What we don't have is an existence proof that finding such an encoding can be done with similar efficiency because the one example we have took billions of years of the Earth being irradiated with terawatts of power.
Can we do better than evolution? Probably; evolution is a fairly brute force search approach and we are pretty clever monkeys. After all, we have made multiple orders of magnitude improvements in the state of the art of computations per watt in just a few decades. Can we do MUCH better than evolution at finding efficient intelligences? Maybe, maybe not.
I agree with your take and would slightly refine it: bearing in mind how protein folding / production works in our bodies, I'd say our genome is heavily compressed, and we can witness the decompression with an electron microscope (how RNA serves as a command sequence determining the resulting protein).
The human genome has 6 billion bases, not 6 billion bits. Each base can take one of 4 values, so significantly more data than binary. But maybe not enough of a difference to affect your point.
Looks like actually three billion base pairs in human DNA: https://www.genome.gov/genetics-glossary/Base-Pair#:~:text=O...
So six billion bits since two bits can represent four values. Base pairs and bases are effectively the same because (from the link) "the identity of one of the bases in the pair determines the other member of the pair."
It's 6 billion because you have 2 copies of each chromosome. So 12 billion bits right? But I do think your original point stands. I'm mostly being pedantic.
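For scale, here's the back-of-the-envelope arithmetic (a rough sketch that ignores compression and the fact that the second copy of each chromosome is nearly identical):

    # Information content of the genome, crudely counted.
    base_pairs = 3_000_000_000          # ~3 billion base pairs in the haploid human genome
    bits_per_base = 2                   # 4 possible bases -> log2(4) = 2 bits each
    haploid_bits = base_pairs * bits_per_base   # 6e9 bits
    print(haploid_bits / 8 / 1e6, "MB")         # ~750 MB uncompressed

Roughly 750 MB, i.e. tiny next to the hundreds of gigabytes of weights in a frontier model; that's the original point: whatever is "pre-programmed" at birth has to fit in a very small budget.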
Self-improving only works when it knows how to test itself. If the test is a predictable outcome defined by humans, most companies are going to fine-tune to pass the self-improvement test, but what happens next? Improvement is vague in terms of who reaps the benefit, and it may not play out the way humans have come to expect over millions of years of evolution.
I think we are already way past single-human intelligence. No one person understands (or could possibly understand) the whole system from the silicon up. Even if you had one AI "person" 100x smarter than their coworkers, who can solve hard problems at many levels of the stack, what could they come up with that generations of tens of thousands of humans working together haven't? Something, surely, but it could wind up being marginal. Exponentially smarter AI meets exponentially more difficult wins.
>No one person understands (or could possibly understand) the whole system from the silicon up.
I'm not a fan of this meme that seems to be very popular on HN. Someone with knowledge in EE and drivers can easily acquire enough programming knowledge in the higher layers of programming, at which point they can fill the gaps and understand the entire stack. The only real barrier is that hardware today is largely proprietary, meaning you need to actually work at the company that makes it to have access to the details.
Good point. I agree actually, many people do put the work in to understand the whole stack. But one person could not have built the whole thing themselves obviously. All I was trying to say is we already live with superhuman intelligences every day, they are called "teams".
Your argument is that no one person can build a whole cargo container ship, hence cargo container ships are intelligent? The whole of humanity cannot build from scratch a working human digestive tract, hence the human digestive tract is more intelligent than all of humanity?
Things can be complex without being intelligent.
Nope, not my point. My point was that even if we get superhuman AGI, the effect of self-improvement may not be that large.
Care to justify those beliefs, or are we just supposed to trust your intuition? Why exponential and not merely quadratic (or some other polynomial)? How do you even quantify "it"? I'm teasing, somewhat, because I don't actually expect you're able to answer. Yours isn't a reasoned argument, merely religious fervor dressed up in techy garb. Prove me wrong!
Not necessarily 'exponential' (more superlinear) in capabilities (yet), but rather in parameters/training data/compute/costs, which may sometimes be confused for one another.
[0]: https://ourworldindata.org/grapher/exponential-growth-of-par...
[1]: https://ourworldindata.org/grapher/exponential-growth-of-dat...
[2]: https://epoch.ai/blog/trends-in-training-dataset-sizes
[3]: https://ourworldindata.org/grapher/exponential-growth-of-com...
If you read the article, he explains that there are multiple scaling paths now, whereas before it was just parameter scaling. I think it's reasonable to estimate faster progress as a result of that observation.
I like that the HN crowd wants to believe AI is hype (as do I), but it's starting to look like wishful thinking. What is useful to consider is that once we do get AGI, the entirety of society will be upended. Not just programming jobs or other niches, but everything all at once. As such, it's pointless to resist the reality that AGI is a near term possibility.
It would be wise from a fulfillment perspective to make shorter term plans and make sure to get the most out of each day, rather than make 30-40 year plans by sacrificing your daily tranquility. We could be entering a very dark era for humanity, from which there is no escape. There is also a small chance that we could get the tech utopia our billionaire overlords constantly harp on about, but I wouldn't bet on it.
>There is also a small chance that we could get the tech utopia our billionaire overlords constantly harp on about, but I wouldn't bet on it.
Mr. Musk's excitement knew no bounds. Like, if they are the ones in control of a near-AGI computer system, we are so screwed.
This outcome is exactly what I fear most. Paul Graham described Altman as the type of individual who would become the chief of a cannibal tribe after he was parachuted onto their island. I call this type the inverse of the effective altruist: the efficient psychopath. This is the type of person that would have first access to an AGI. I don't think I'm being an alarmist when I say that this type of individual having sole access to AGI would likely produce hell on earth for the rest of us. All wrapped up in very altruistic language of "safety" and "flourishing" of course.
Unfortunately, we seem to be on this exact trajectory. If open source AGI does not keep up with the billionaires, we risk sliding into an inescapable hellscape.
Ye. Altman, Musk. Which Sam was the exploding slave head bracelet guy, was that Sam Fridman?
Dunno about Zuckerberg. By standing still, he has somehow slid into the saner end of the tech-lord spectrum. Nightmare fuel...
"FOSS"-ish LLMs is like. We need those.
that seems a bit harsh, don't you think? besides, you're the one making the assertion, you kinda need to do the proving ;)
No, I don't think it's overly harsh. This hype is out of control and it's important to push back on breathless "exponential" nonsense. That's a term with well defined easily demonstrated mathematical meaning. If you're going to claim growth in some quantity x is exponential, show me that measurements of that quantity fit an exponential function (as opposed to some other function) or provide me a falsifiable theory predicting said fit.
I believe they are using 'exponential' as a colloquialism rather than a strict mathematical definition.
That aside, we would need to see some evidence of AI development being bootstrapped by the previous SOTA model as a key part of building the next model.
For now, it's still human researchers pushing the SOTA models forwards.
When people use the term exponential, I feel that what they really mean is 'making something so _good_ that it can be used to make the N+1 iteration _more good_ than the last.'
Well, any shift from "not able to do X" to "possibly able to do X sometimes" is at least exponential. 0.0001% is at least exponentially greater than 0%.
I believe we call that a "step change". It's only really two data points at most so you can't fit a continuous function to it with any confidence.
> It's a bit crazy to think AI capabilities will improve exponentially. I am a very reasonable person, so I just think they'll improve some amount proportional to their current level.
https://www.lesswrong.com/posts/qLe4PPginLZxZg5dP/almost-all...
>No, I don't think it's overly harsh.
Where's the falsifiable framework that demonstrates your conclusion? Or are we just supposed to trust your intuition?
Why is it “important to push back”? XKCD 386?
The key "ability" that will grow exponentially is AIs ability to convert investment dollars into silicon+electricity and then further reduce those into heat energy. Such schemes only seem wasteful to outsiders, those whose salaries are not tied to their ability to convert money into heat. A fun startup would be one that generates useful electricity from the AI investment cycle. If we put the Ai machine under a pot of water, we might then use the resulting steam to drive a turbine.
Due to Carnot's law, you can't get much electricity that way without a big temperature difference. Think about it: the AI machine would have to run at at least 100 degrees Celsius to boil the water, and that's the bare minimum.
But if we can make computers that run at, say, 2000 degrees, without using several times more electricity, then we can capture their waste heat and turn a big portion of it back into electricity to re-feed the computers. It doesn't violate thermodynamics, it's just an alternative possibility to make more computers that use less electricity overall (an alternative to directly trying to reduce the energy usage of silicon logic gates) as long as we're still well above Landauer's limit.
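Rough numbers, assuming a ~20 °C ambient sink (and remembering Carnot is only an upper bound; real turbines recover well under it):

    def carnot_efficiency(t_hot_c, t_cold_c=20.0):
        # Carnot limit: 1 - T_cold/T_hot, with temperatures in kelvin.
        t_hot, t_cold = t_hot_c + 273.15, t_cold_c + 273.15
        return 1 - t_cold / t_hot

    print(carnot_efficiency(100))    # ~0.21 -> at most ~21% at boiling-water temperatures
    print(carnot_efficiency(2000))   # ~0.87 -> ~87% ceiling for a hypothetical 2000 deg C machine

So a kettle-temperature data centre caps out around a fifth of its heat turned back into electricity, while the (entirely hypothetical) 2000 °C machine could in principle recycle most of it.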
At sea level. Put the machine in a vacuum chamber, or atop a big mountain, and we will boil the AI kettles at less than 100 °C.
Also, you don’t have to necessarily use water. You can use alcohol, ammonia or something else with a different boiling point.
It doesn't matter - the fraction of energy you can get is the fraction you decrease the temperature relative to absolute zero.
Try liquid sodium, it vaporizes at 883 °C.
Some datacentres do in fact recover the heat for things like municipal heating. It's tricky though because being near population centres that can use the heat is often (not always) inversely related to things that are good for datacentres like cheap land, power and lack of neighbours to moan about things like construction and cooling system noise.
There was also a startup selling/renting bitcoin miners that doubled as electrical heaters.
The problem is that computers are fundamentally resistors, so at most you can get 100% of the energy back as heat. But a heat pump can give you 2-4 times the energy back. So your AI work (or bitcoin mining) plus the capital outlay of the expensive computers has to be worth the difference.
Orbital Materials is designing wafer substrates that capture carbon and reuse excess heat.
It’s basically the line for all the AI-hype people: “all the problems are going away!”, “soon it’ll all magically make things exponentially good-er-er!”
Alternatively, it’s a restatement of the obvious empirical truth that technology tends in improve on an exponential and not linear curve. Seems like a simpler explanation that doesn’t even require insulting people.
The premise would be better supported if it could be shown that 10x-ing the speed at which matrix multiplication is performed conferred a linear or better increase in performance post-GPT-4. As it stands, that would just seem to give us current results faster, not better results.
Efficiency matters but it took semiconductors decades to care about it. Why would it be different this time around?
I would argue that any given technology tends to improve on an S curve, so exponentially at first and then flattening out. See Moore’s law as a great example.
Or more on topic see the improvements in LLMs since they were invented. At first each release was an order of magnitude better than the last (see GPT 2 vs 3 vs 4), now they’re getting better but at a much slower rate.
Certainly feels like being at the top of an S curve to me, at least until an entirely new architecture is invented to supersede transformers.
That's why airplanes are so much faster than they were 20 years ago.
The drumbeat of AI progress has been fairly steady, on log scales.
that doesn't mean ai is improving itself though
My point was that it already was on an exponential trajectory. RL/self-play and the like remove some of the human inputs that were previously required for that growth.
Take the trajectory of chess. handcrafted rules -> policies based on human game statistics -> self-play bootstrapped from human games -> random-initialized self-play.
Chess is a game with complete information. Not a good analogy with the real world.
AI will improve at an exponential rate once it can independently improve AI performance. For example, once AI can organically identify, test, confirm, and deploy an improvement like R1 vs o1 (in terms of perf/watt), then we'll see exponential improvement. Honestly though, that still seems possible within 5 years or less, maybe 3.
Only if the AI can do it faster than humans.
And if the improvements it makes are not asymptotically diminishing.
>Honestly though, that still seems possible within 5 years or less, maybe 3.
If that is a typical human estimate, I would guess in reality it is more likely to be 6-10 years. Which is still good if we get it in 2030-2035.
For futurism on things that promise economic rewards, exponential increases are not uncommon.
Currently AI is getting better at sorting the data that already exists, but if enough Reddit and wiki posts are wrong, its answer is inevitably wrong. Without being able to experiment to test its theories against reality, the AI curve will likely not lead to super-intelligence without humans to assist. That's my 5 cents.
The exponential part may be iffy, but it is self improving.
And this same RL is also creating improvements in small model performance.
So, more LLMs are about to rise in quality.
It's self-improving? So, we can ask AI how to improve AI, and the suggestions actually work?
It's more like Intel in early days using their CPUs to compute layout for bigger CPUs.
Effectively: is the limiting factor to improvement addressable by the thing being improved?
If yes, then you get exponential increases very trivially. If no, then something external continues to bottleneck progress.
[flagged]
Perhaps you are not the intended audience for this article.
Why are you so angry? I thought it was a wonderful overview. An even if not, insults are hardly necessary.
[flagged]
What a useless comment. Point out your qualms with it, explain how you would have done it better or clarify inaccuracies you find. It helps promote discussion and opens the door for more collective information to be considered.
> Please don't post shallow dismissals, especially of other people's work. A good critical comment teaches us something.
I know that I'll get a lot of hate and downvotes for this comment.
I want to say that I have all the respect and admiration for these Chinese people, their ingenuity, and their way of doing innovation, even if they achieve this through technological theft and circumventing embargoes imposed by the US (we all know how GPUs find their way into their hands).
We are living in a time of multi-faceted war between the US, China, the EU, Russia, and others. One of the battlegrounds is AI supremacy. This war (as any war) isn't about ethics; it's about survival, and anything goes.
Finally, as someone from Europe, I confess that it's well known here that the "US innovates while the EU regulates", and that's a shame IMO. I have the impression that the EU is doing everything possible to keep us, European citizens, behind, just mere spectators in this tech war. We are already irrelevant, niche players.
"We are living a time with a multi-faceted war between the US, China, EU, Russia and others."
The only way to win this war is to deescalate. Everybody wins.
And AI competition is a good thing for Europe especially when it lags behind technologically.
I don't follow this: "And AI competition is a good thing for Europe especially when it lags behind technologically." How so? Please explain your rationale.
Not the person you’re replying to, but my interpretation is that Europe is destined to be a consumer of AI, not a producer. As a consumer, you want a multitude of suppliers because fewer suppliers means slower progress and higher prices.
Couldn't have said it better.
Europe now has access to a model, R1, which is as good as the best US model, o1, but for free. This is because of competition.
> The only way to win this war is to deescalate
But there is absolutely no way that will happen, so the pragmatic question is which horse to bet on.
Right now it's looking like China. As R1 exemplifies (but also, say, their EV and general manufacturing industries), they are the ones who are actually scrambling to produce more and better stuff (doesn't matter whether you like AI; the point is they're apparently at the forefront of many fields), while countries like the USA are only scrambling to see who gets to own the less and worse stuff they produce. I don't live there, so I don't know how that's achieved or what insane human rights violations they have, but from the perspective of only predicting who wins, it doesn't really matter how they win.
You said a lot of controversial things. I'll just zoom in on the last bit:
> I have the impression that the EU is doing everything possible to keep us, European citizens, behind, just mere spectators in this tech war. We are already irrelevant, niche players.
Citizens of the US are just as irrelevant, if not more, since none of the productivity gains trickles down to them. Their real wage growth has stagnated since the 1970s, and each year that goes by, their actual power to purchase more goods or services goes down.
The victories in AI only matter to those who will profit from it.
When it comes to the citizens of a country benefiting from AI or not, being the leader in AI tech is not very important. It is more a matter of whether AI benefits them or not. That their country has the leading AI tech can just as easily result in them having fewer jobs and being paid worse as it could the opposite, depending on the policies of that country.
But given that, it can very much be better to live in the EU, with second-rate open-source models, but where the productivity benefits of AI benefit the general citizens, than to live in the US, where the productivity benefits of AI benefit only the few.
as an american, i never voted for this war, would like it to end, and think now is the time when international coordination is most critical
The three largest supercomputers in the world are located at Lawrence Livermore, Oak Ridge, and Argonne National Labs. Each of them provides an architecture that's ideal for running AI.
One can't help wondering what kinds of classified AI results the US military is getting when running on El Capitan.
Is El Capitan really ideal arch for AI? It’s massively parallel of course.
no of course not
But Anthropic is a French company, isn't that in the EU?
Anthropic is not French. You probably meant Mistral, that is a French company, but a niche player in this AI game.
No, it's American; you're probably confusing them with Mistral AI.
Oh wow thank you yes I did
Short version: It's Hype.
Long version: It's marketing efforts stirring up hype around incremental software updates. If this was software being patched in 2005 we'd call it "ChatGPT V1.115"
>Patch notes:
>Added bells.
>Added whistles.