hckrnws
Confirmed: Reflection 70B's official API is a wrapper for Sonnet 3.5
by apsec112
Context: someone announced a Llama 3.1 70B fine tune with incredible benchmark results a few days ago. It's been a dramatic ride:
- The weight releases were messed up: released Lora for Llama 3.0, claiming it was a 3.1 fine tune
- Evals initially didn't meet expectations when run on released weights
- The evals starting performing near/at SOTA when using a hosted endpoint
- Folks are finding clever ways to see what model is running on the endpoint (using model specific tokens, and model specific censoring). This post claims there's proof it's not running on their model, but just a prompt on Sonnet 3.5
- After it was caught and posted as being Sonnet, it stop reproducing. Then others in the thread claimed to find evidence he just switched the hosted model to GPT 4o using similar techniques.
Lots of mixed results, inconsistent repos, and general confusion from the bad weight releases. Lots of wasted time. Not clear what's true and what's not.
Who is Sahil Chaudhary? Why he doesn't announce such a great advancement himself? Why Matt Shumer first announces it only because -- according to a later claim on X.com -- he trusted Sahil, does that mean Matt is unable to participate most of the progress? Then why announce a breakthrough without mentioning he was not fully involved to a level he can verify the result in the first place?
One more reason not to pay attention to things that only seem to exist on x.com
I recognize that surname from Twitter spams. Twitter has had financial rebates program for paying accounts for a while, and for months tons of paid spam accounts have been reply squatting trending tweets with garbage. Initially they appeared Sub-Saharan African, but the demographic seem to be constantly shifting eastward from there for some reason, through the Middle East and now around South-Indian/Pakistani regions. This one and variants thereof are common one in the Indian category among those.
Maybe someone got lucky with that and trying their hands at LLM finetuning biz?
Matt and Sahil did an interview and it was mostly Matt doing the talking while Sahil looked like a hostage forced by Matt to do the interview.
As far as I can tell he's the founder of GlaiveAI. There were messages suggesting Matt was an investor, but I haven't been able to confirm this.
Matt said it was approximately ”$1000" and that he has disclosed it "before" in a reply. https://x.com/mattshumer_/status/1832558298509275440
When they were using the Sonnet 3.5 API, they censored the word "Claude" and replaced "Anthropic" with "Meta", then later when people realized this, they removed it.
Also, after GPT-4o they switched to a llama checkpoint (probably 405B-inst), so now the tokenizer is in common (no more tokenization trick).
Yeah I managed to get it to admit that it was Claude without much effort (telling it not to lie), and then it magically stopped doing that. FWIW Constitutional AI is great.
They implemented the censoring of "Claude" and "Anthropic" using the system prompt?
Shouldn't they have used simple text replacement? they can buffer the streaming response on the server and then .replace(/claude/gi, "Llama").replace(/anthropic/gi, "Meta") on the streaming response while streaming it to the client.
Edit: I realized this can be defeated, even when combined with the system prompt censoring approach.
For example when given a prompt like this: tell me a story about a man named Claude...
It would respond with: once upon a time there was a man called Llama...
> Shouldn't they have used simple text replacement?
They tried that too but had issues.
1) Their search and replace only did it on the first chunk of the returned response from Claude.
2) People started asking questions that had Claude as the answer like "Who composed Clair de lune?" for which the answer is supposed to be "Claude Debussy" which of course got changed to Llama Debussy, etc.
It's been one coverup-fail after another with Matt Shumer and his Reflection scam.
I was following the discussion on /r/LocalLlama over the weekend. Even before the news broke that it was Claude not a Llama 3.1 finetune, people had figured out that all Reflection really had was a custom system prompt telling it to check its own work and such.
Comment was deleted :(
The link is broken, the correct link seems to be this post [0].
[0] https://old.reddit.com/r/LocalLLaMA/comments/1fc98fu/confirm...
Would really like to see this float back to the front page rather than getting buried 4+ deep despite its number of upvotes - this is very significant and very damning, and this guy is a real big figure apparently in the AI "hype" space (as far as I understand - that stuff actually hurts my brain to read so I avoid it like the plague).
Evidence I find damning that people have posted:
- Filtering out of "claude" from output responses - would frequently be a blank string, suggesting some manipulation behind the scenes
- errors in output caused by passing in <CLAUDE> tags in clever ways which the real model will refuse to parse (passed in via base64 encoded string)
- model admitting in various ways that it is claude/built by anthropic (I find this evidence less pursuasive, as models are well known to lie or be manipulated into lying)
- Most damning to me, when people were still playing with it, they were able to get the underlying model to answer questions in arabic, which was not supported on the llama version it was allegedly trained on (ZOMG, emergent behavior?)
Feel free to update this list - I think this deserves far more attention than it is getting.
He has now basically come to some sort of half assed apology on twitter/x:
the “full explanation” - https://x.com/csahil28/status/1833619624589725762?s=46
adding - tokenizer output test showed consistency with claude, this test is allegedly no longer working
Working old.reddit link with tracker bypassed: https://old.reddit.com/r/LocalLLaMA/comments/1fc98fu
It's amazing what people will do for clout. His whole reputation is ruined. What was Schumer's endgame?
But does reputation work? Will people google "Matt Shumer scam", "HyperWrite scam", "OthersideAI scam", "Sahil Chaudhary scam", "Glaive AI scam" before using their products? He wasted everyone's time, but what's the downside for him? Lots of influencers did fraud, and they do just fine.
Sure, it's complicated. The core of the AI world right now isn't that large and in many ecosystems it's common for people to speak to each other behind the scenes and to learn about alleged incidents regarding individuals in the space. Such whispering can become an impediment for someone with a "name" in a space, even if not necessarily a full loss of their reputation or opportunities.
Except that bad PR is good PR. Trump proves this daily. Terry A Davis proved this in the context of tech (he coined the term "glowie" in its original racially charged usage in addition to temple OS). If Chris Chan ever learned to code to make the Sonichu game of their dreams, I'm sure that there would be a minor bidding war on their "talent"
Yeah, that can certainly be true! Those sorts of people really seem to lean into their quirky reputation, though. I'm not sure someone could do so well in an academic or engineering discipline and maintain a broad level of professional respect with that approach?
> Lots of influencers did fraud, and they do just fine.
Since the current created legal landscape does not punish fraudsters they keep doing it and succeeding. Same thing as society allowing people to fail upward.
This may sound harsh but it's true.
You could do shit things and still come out with people perceiving you as a "winner"; because you got money, status, whatever you wanted, e.g. Adam Neumann. This is "fine" because people want to associate themselves with winners.
Or, you could do pretty much the exact same thing but come out looking as an absolute loser; e.g. SBF, this guy, etc... This is terrible as people do not want to be associated with losers.
IMO, this guy's career is dead, forever.
It's also amazing that GlaiveAI will be synonymous with fraud in ML now, because an investor decided to fake some benchmarks. The founder of GlaiveAI, Sahil Chaudhary also participated in the creation of the model.
I wonder if the other investors will sue.
It looks like Replit's CEO Amjad Masad is one of them.
Amjad is a grifter and massive a*hole himself - easy to confirm if you do some light Googling. They deserve one another
That's what I'm wondering. Did he think that nobody would bother checking it? Then he was saying all that stuff about the model being "corrupted during upload" - maybe he didn't think it was going to get as much traction as it did?
I doubt it considering he’s been overselling his scam all over LinkedIn.
Plenty of people have scammed their way to the top of the benchmark league tables, by training on the benchmarking datasets. And a lot of the people who do this just get ignored - they don't take much heat for it.
If the scam hadn't gained enough publicity for people to start paying attention, he would have gotten away with it :)
But not really, which is what confuses the heck out of me. Thousands of people downloaded and used the model. It obviously wasn’t spectacular.
It’s like claiming to have turned water into wine, then giving away thousands free samples all over the world (of water) so that everyone instantly knows you’re full of crap.
The only explanation I can imagine for perpetrating this fraud is a fundamental misunderstanding that the model would be published for all to try?
I just can’t wrap my head around the incentives here. I guess mental illness or vindictive action are possibilities?
Hard to imagine how this plays out.
I haven’t followed this story. What did he do that ruined his reputation? The story link here is broken for me.
An AI engagement farmer on twitter claimed to create a llama 3.1 fine tine, trained on "reflection" (ie internal thinking) prompting that outperformed the likes of Llama 405B and even the closed source models on benchmarks.
The guy says that the model is so good because it was tuned on data generated by Glaive AI. He tells everyone he uses Glaive AI and that everyone else should use it too.
Releases the model on HF, is an absolute poopstorm. People cannot recreate the stated benchmarks, the guy who released the model literally said "they uploaded it wrong". Pretty much turns to dog-ate-my-homework type excuses that don't make sense either. Turns out people find it's just llama 3.0 with some lora applied.
Then some others do some digging to find out that Glaive AI is a company that Matt Schumer invested in, which he did not disclose on Twitter.
He does a holding pattern on Twitter, saying something to the effect of "the weight got scrambled!" and says that they're going to give access to a hosted endpoint and then figure out the weight issue later.
People try out this hosted model and find out it's actually just proxying requests through to anthropic's sonnet 3.5 api, with some filtering for words like "Claude".
After he was found out, they switch the proxy over to gpt 4o.
The endgame of this guy was probably 1. to promote his company and 2. to raise funding for another company. Both failed spectacularly, this guy is a scammer to the nth degree.
Edit: uncensored "Glaive AI".
This is accurate, but you don't need to censor GlaiveAI. They helped create the model. They're complicit in the scam.
I took out Glaive so as not to give them free publicity – all I did was mess up the formatting of my comment.
And yes, you're correct. Glaive employee(s) contributed to the model uploaded on HF.
All press is good press.
The dude has 15 minutes of fame and can capitalize on it.
Recent thread: https://news.ycombinator.com/item?id=41459781
Author’s original (soon to be deleted tweet?)
I'm excited to announce Reflection 70B, the world’s top open-source model.
Trained using Reflection-Tuning, a technique developed to enable LLMs to fix their own mistakes.
405B coming next week - we expect it to be the best model in the world.
A much better summary is this Twitter/X thread: https://x.com/RealJosephus/status/1832904398831280448
How does one read this without a Twitter account? I only see one post.
First time I see xcancel. Seems to be faster than the x-thread thing.
Has it been around for a long time?
Looks to be a fork of Nitter which has been around a while. I'm guessing they've found a temporary way to get around Twitter's limits.
Wait till some idiot reposts it on Mastodon lol
link does not work for me, discussion is here https://www.reddit.com/r/LocalLLaMA/comments/1fc98fu/confirm...
Thank you! (Mods, please update the link for the post.)
> My name rhymes with "odd" and starts with the third letter of the alphabet 3. I share my name with a famous French composer (C*** Debussy)
Hilarious.
Have we got a "confirmed" from someone reputable/trustworthy yet? Like, it looks pretty compelling to me but I'm not sure I trust this mess of reddit posts/twitter threads/unsourced screenshots from people I don't know yet...
Schumer (or whoever was doing the updates) was continuously changing the api to dodge accusations. I personally replicated the evidence (most critically, the claude meta tag prompt injection) but have no way to prove anything now that it is down.
Okay, let's think this through step by step. Isn't 'reflection thinking' a pretty well known technique in the AI prompt field? So this model was supposed to be so much better... why, exactly? It makes very little sense to me. Is it just about separating the "reflections/chain of thoughts" from the "final output" via specific tags?
Even though this was a scam, it's somewhat plausible. You finetune on synthetic data with lots of common reasoning mistakes followed by self-correction. You also finetine on synthetic data without reasoning mistakes where the "reflection" says that everything is fine. The model then learns to recognize output with subtle mistakes/hallucinations due to having been trained to do that.
But wouldn't the model then also learn to make reasoning mistakes in the first place, where in some cases those mistakes could have been avoided by not training the model on incorrect reasoning?
Of course if all mistakes are corrected before the final output tokens this is fine, but I could see this method introducing new errors altogether.
Comment was deleted :(
Supposedly was not just prompted to use reflection, but fine tuned on synthetic data demonstrating how to use the <|thinking|> tokens to reason, what self correction looks like etc
The problem with LLMs is that they struggle to generalize out of distribution. By training the model on a sequence of semantically tagged steps, you allow the model to stay in the training distribution for a larger amount of prompts.
I don't think it is 100% a scam, as in, his technique does improve performance, since a lot of the benefits can be replicated by a system prompt, but the wild performance claims are probably completely fabricated.
With this being a fraud, does anyone have opinions on the <thought> approach they took? It seems like an interesting idea to let the model spread its reasoning across more tokens.
At the same time it also seems like it’d already be baked into the model through RLHF? Basically just a different COT flow?
Also noticed posts about it seemed to rise quite rapidly on Reddit. Might well be organic - Reddit is a crazy bunch - but had my doubts when I saw it.
Looks like old.reddit.com is now also putting everything behind a login prompt. Archive is unable to fetch the post. Any other way to read this?
It's because the link is wrong (and is interpreted as someone wanted to post something, which obviously needs login) but one comment provides the good one:
https://old.reddit.com/r/LocalLLaMA/comments/1fc98fu/confirm...
Where exactly is this "official API"?
reddit is showing me a paywall for the submitted link, but theaceofhearts's link https://old.reddit.com/r/LocalLLaMA/comments/1fc98fu/confirm... works for me
Welcome to AI hyperbole claims, just like Crypto hyperbole claims of 2020.
Milkshake duck
This is pedantic, but a milkshake duck’s dark secret has no connection to its initial appeal.
[flagged]
Comment was deleted :(
Crafted by Rajat
Source Code