Mfw someone starts talking about crypto “adoption” and “mainstreaming”
Loser laggards should have zero power in dictating culture
Bring back gatekeeping
🄳🄾🄾🄼🄿🄾🅂🅃🄸🄽🄶
NEW - Ursula's new "European Media Freedom Act" permits the arrest of journalists if it is in the "public interest."
🄳🄾🄾🄼🄿🤖🅂🅃🄸🄽🄶
1 — curious about the training data of OpenAI's new gpt-oss models? i was too.
so i generated 10M examples from gpt-oss-20b, ran some analysis, and the results were... pretty bizarre
time for a deep dive
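(for context, "generating with no prompt" can be as simple as the sketch below: plain transformers, starting from a lone BOS token. the Hugging Face id openai/gpt-oss-20b and the sampling settings are my assumptions; the actual 10M-sample run likely used a faster serving stack)

```python
# minimal unprompted-sampling sketch (assumptions: HF id "openai/gpt-oss-20b",
# plain transformers generation; not the thread author's actual pipeline)
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "openai/gpt-oss-20b"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

# "prompt with nothing": start from a single BOS token (EOS as fallback) and let it free-run
start_id = tok.bos_token_id if tok.bos_token_id is not None else tok.eos_token_id
input_ids = torch.tensor([[start_id]], device=model.device)

out = model.generate(input_ids, do_sample=True, temperature=1.0, max_new_tokens=512)
print(tok.decode(out[0], skip_special_tokens=True))
```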
2 — here's a map of the embedded generations
the model loves math and code. i prompt with nothing and yet it always reasons. it just talks about math and code, and mostly in English
math – probability, ML, PDEs, topology, diffeq
code – agentic software, competitive programming, data science
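(the thread doesn't say how the map was built; a common recipe, assumed here, is to embed each generation with a sentence encoder and project to 2D with UMAP. "generations.txt" is a hypothetical dump, one sample per line)

```python
# embedding-map sketch (assumptions: sentence-transformers + umap-learn;
# "generations.txt" is a hypothetical file of sampled generations)
from sentence_transformers import SentenceTransformer
import umap  # pip install umap-learn
import matplotlib.pyplot as plt

with open("generations.txt") as f:
    texts = [line.strip() for line in f if line.strip()]

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # any sentence encoder works here
embeddings = encoder.encode(texts, show_progress_bar=True)

coords = umap.UMAP(n_neighbors=15, min_dist=0.1, metric="cosine").fit_transform(embeddings)
plt.scatter(coords[:, 0], coords[:, 1], s=1, alpha=0.3)
plt.title("gpt-oss-20b unprompted generations (sketch)")
plt.savefig("map.png", dpi=200)
```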
3 — first thing to notice is that practically none of the generations resemble natural webtext. but surprisingly none of them look like normal chatbot interactions either
this thing is clearly trained via RL to think and solve tasks for specific reasoning benchmarks. nothing else.
4 — and it truly is a tortured model. here the model hallucinates a programming problem about dominos and attempts to solve it, spending over 30,000 tokens in the process
completely unprompted, the model generated and tried to solve this domino problem over 5,000 separate times
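(a count like "over 5,000 separate times" is easy to reproduce in spirit: grep the dump for a distinctive phrase from the hallucinated problem. the needle below is a placeholder, since the exact wording isn't in this post)

```python
# crude repeat counter over the same hypothetical "generations.txt" dump
needle = "domino"  # placeholder for a distinctive phrase from the hallucinated problem
with open("generations.txt") as f:
    repeats = sum(1 for line in f if needle in line.lower())
print(f"generations mentioning the problem: {repeats}")
```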
5 — I ran a classifier over outputs to get a sense of which programming languages gpt-oss knows
they seem to have trained on nearly everything you've ever heard of. especially a lot of Perl
(btw, from my analysis Java and Kotlin should be way higher. classifier may have gone wrong)
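(the thread doesn't name the classifier; as a stand-in, pygments' lexer guesser gives a rough language histogram over the outputs)

```python
# language-histogram sketch (assumption: pygments' guess_lexer as a stand-in classifier)
from collections import Counter
from pygments.lexers import guess_lexer
from pygments.util import ClassNotFound

def guess_language(text: str) -> str:
    try:
        return guess_lexer(text).name
    except ClassNotFound:
        return "unknown"

# same hypothetical "generations.txt" dump, one generation per line
with open("generations.txt") as f:
    counts = Counter(guess_language(line) for line in f if line.strip())

for lang, n in counts.most_common(20):
    print(f"{lang:20s} {n}")
```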
6 — what you can't see from the map is that many of the chains start in English but slowly descend into Neuralese
the reasoning chains happily alternate between Arabic, Russian, Thai, Korean, Chinese, and Ukrainian. then usually make their way back to English (but not always)
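(one rough way to watch that drift: tag the language of successive chunks of a chain and look at the sequence. langdetect here is my choice of tagger, not the author's)

```python
# code-switching trace sketch (assumption: langdetect as the language tagger)
from langdetect import detect

def language_trace(text: str, chunk_size: int = 400) -> list[str]:
    """Detected language of each successive chunk of a reasoning chain."""
    trace = []
    for i in range(0, len(text), chunk_size):
        chunk = text[i:i + chunk_size]
        try:
            trace.append(detect(chunk))
        except Exception:  # langdetect raises on chunks it can't classify (pure code, symbols)
            trace.append("?")
    return trace

# a chain that starts in English and wanders off might trace as
# ['en', 'en', 'ar', 'ru', 'th', 'en']
```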
7 — the OCR conjecture:
some examples include artifacts such as OCRV ROOT, which indicate the training data may have been run through OCR at some point
reading between the lines: OpenAI is scanning books
(for some reason the model loves mentioning how many deaf people live in Malaysia)
8 — what are some explanations for constant codeswitching?
1. OpenAI has figured out RL. the models no longer speak english
2. data corruption issues via OCR or synthetic training
3. somehow i forced the model to output too many tokens and they gradually shift out of distribution
9 — there are a small number of creative outputs interspersed throughout
here's one example where the model starts writing a sketch for a norwegian screenplay 🤷♂️
10 — i also learned a lot from this one.
the model is *really* good at using unicode
...but might be bad at physics. what in the world is a 'superhalo function'
🄳🄾🄾🄼🄿🤖🅂🅃🄸🄽🄶
“Wisdom of the crowds” i.e. democracy weighting 1 person = 1 vote
— Should NEVER be used
Masses are absolutely retarded
(New upcoming curation system here goes in the extreme opposite direction, btw)
🄳🄾🄾🄼🄿🄾🅂🅃🄸🄽🄶
Wrong retard roon
3.5 was the best ever model for Chad style, and for doing arbitrary styles of writing in general
“no one cared” = retardation of the masses BS
Core early adopters caring, “nerd heat” is all that matters, because where the early adopters go, the masses will eventually follow
Never let the masses dictate what’s good, their taste is sh&t and they’ll hate what they’ve created, if you ever do give them control
🄳🄾🄾🄼🄿🄾🅂🅃🄸🄽🄶
Same conclusion I’ve quickly reached via my own private set of problems previously unsolvable by AI
= GPT-5 is a MASSIVE LEAP FORWARD in AI intelligence
Don’t let the sycophant-loving reddit scammers fool you
Massive leap forward, and anyone with a brain has already immediately figured that out
🄳🄾🄾🄼🄿🄾🅂🅃🄸🄽🄶
GPT-5 crushes Claude Opus, the previous leader on most real-work tasks
(OpenAI’s “open source” models still seem like a terrible scam, but their closed-source GPT-5 model is the real deal)
🄳🄾🄾🄼🄿🄾🅂🅃🄸🄽🄶