maybe thinking is a bicycle built for two 🤔
If we were designed to think solo, monologue would be easier than dialogue.
Dialogue involves INCREDIBLY complex acts of prediction, coordination, task-switching and mind-reading--yet we find it MUCH easier than monologue.
Why? Maybe thinking is a bicycle built for 2.
https://x.com/AgnesCallard/status/1882866517077114888
X (formerly Twitter)
Agnes Callard (@AgnesCallard) on X
so most deepseek rants are noise as always but this is real shit: complex prod code editing example, with all prompts
https://github.com/ggerganov/llama.cpp/pull/11453
🔥3
“alignment” is everywhere but either nobody (including EA, doomers etc) seriously believes they will reach AGI/ASI, or they believe “agents” will forever be caged superintelligent toys
https://x.com/richardmcngo/status/1882994095599370554?s=46
🥱1👀1
Forwarded from χаотичні нотатки
in case you were wondering why so many viruses were distributed as PDF files: PDFs can contain a lot of things, including multiple ways to execute code.
Here is a PDF file running Linux (open in a Chrome-based browser)
https://linux.doompdf.dev/linux.pdf
👍3🔥1👾1
A pet rock is an agent; an incredibly successful and persistent one.
https://arxiv.org/abs/2502.04403v1
^ i'm joking, but really that's great food for thought and a way to resolve paradoxes about AI -- in a lot of discussions on "what's an agent" we were missing a dimension and simplifying the model a bit too much
arXiv.org
Agency Is Frame-Dependent
Agency is a system's capacity to steer outcomes toward a goal, and is a central topic of study across biology, philosophy, cognitive science, and artificial intelligence. Determining if a system...
🗿4
look, zero trust, provably secure LLM inference! if you use TinfoilAI you don't have to trust them to keep your data safe, just check the code yourself!
https://tinfoil.sh/chat
I'm a bit annoyed by their reuse of the Tinfoil Chat name, which is another privacy tech, but oh well) at least both are cool)
i wonder when we'll get to training over binary ops (and if this is fundamentally possible) bc 1-bit as shown in the article is more like dictionary training, the matrices still contain real numbers
https://arxiv.org/abs/2502.05003
arXiv.org
QuEST: Stable Training of LLMs with 1-Bit Weights and Activations
One approach to reducing the massive costs of large language models (LLMs) is the use of quantized or sparse representations for training or deployment. While post-training compression methods are...
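to make the "matrices still contain real numbers" point concrete, here is a toy sketch of the textbook setup: sign binarization with one real-valued scale, trained through real-valued latent weights with a straight-through estimator. this is an assumption-laden illustration, not the QuEST algorithm; the function names, the learning rate and the example numbers are all made up.

```python
# Toy illustration (NOT the QuEST algorithm): sign binarization with a
# per-tensor scale, trained through real-valued "latent" weights.

def binarize(weights):
    """Return (signs, scale): signs are +1/-1, scale is the mean |w|."""
    scale = sum(abs(w) for w in weights) / len(weights)
    signs = [1 if w >= 0 else -1 for w in weights]
    return signs, scale

def forward(latent, x):
    """Dot product that uses only the binarized weights and one real scale."""
    signs, scale = binarize(latent)
    return scale * sum(s * xi for s, xi in zip(signs, x))

def sgd_step(latent, x, target, lr=0.1):
    """One squared-error step with a straight-through estimator:
    the gradient is computed as if binarize() were the identity and
    is applied to the real-valued latent weights."""
    err = forward(latent, x) - target
    return [w - lr * err * xi for w, xi in zip(latent, x)]

latent = [0.3, -0.7, 0.1, -0.2]   # made-up starting weights
x = [1.0, 2.0, -1.0, 0.5]         # made-up input
signs, scale = binarize(latent)
latent_after = sgd_step(latent, x, target=1.0)
```

the forward pass only needs the signs plus one scale, but the update lands on `latent`, which stays real-valued the whole time. that is the gap between "1-bit weights" and "training over binary ops".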
verifiable reasoning datasets. not a lot tho.
https://github.com/open-thought/reasoning-gym/
GitHub
[NeurIPS 2025 Spotlight] Reasoning Environments for Reinforcement Learning with Verifiable Rewards - open-thought/reasoning-gym
🔥2
science says we will remain smart as long as we want
https://www.science.org/doi/full/10.1126/sciadv.ads1560?af=R
Science Advances
Age and cognitive skills: Use it or lose it
Cognitive skills do not decline with age for those who use math and reading throughout their life.
🤓4
we're coming to explicit reasoning about uncertainty and unknowns, at last!
https://arxiv.org/abs/2502.05244
arXiv.org
Probabilistic Artificial Intelligence
Artificial intelligence commonly refers to the science and engineering of artificial systems that can carry out tasks generally associated with requiring aspects of human intelligence, such as...
👍1
complex/convoluted systems are always insecure because one gets tired and glosses over the details, instead of checking everything end-to-end.
https://github.blog/security/sign-in-as-anyone-bypassing-saml-sso-authentication-with-parser-differentials/
The GitHub Blog
Sign in as anyone: Bypassing SAML SSO authentication with parser differentials
Critical authentication bypass vulnerabilities were discovered in ruby-saml up to version 1.17.0. See how they were uncovered.
👍1😱1
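a parser differential in miniature, as a hedged sketch: the message format, the key and every function name here are hypothetical, and this is not the ruby-saml bug itself, only the same shape. one parser's view gets the signature check, a second parser's view supplies the identity.

```python
# Toy parser differential (hypothetical format, not SAML or ruby-saml).
# A message is "key=value;key=value"; a duplicated key is where the two
# parsers disagree.
import hashlib
import hmac

SECRET = b"demo-secret"  # assumption: shared signing key for the toy

def parse_first(msg):
    """Parser A: the first occurrence of a key wins."""
    out = {}
    for pair in msg.split(";"):
        k, v = pair.split("=", 1)
        out.setdefault(k, v)
    return out

def parse_last(msg):
    """Parser B: the last occurrence of a key wins."""
    return dict(pair.split("=", 1) for pair in msg.split(";"))

def sign(user):
    return hmac.new(SECRET, user.encode(), hashlib.sha256).hexdigest()

def login_vulnerable(msg):
    """Verifies the signature on parser A's view, then trusts parser B's."""
    checked = parse_first(msg)
    if not hmac.compare_digest(sign(checked["user"]), checked["sig"]):
        return None
    return parse_last(msg)["user"]

def login_fixed(msg):
    """Uses a single parse result for both the check and the identity."""
    checked = parse_first(msg)
    if not hmac.compare_digest(sign(checked["user"]), checked["sig"]):
        return None
    return checked["user"]

# One validly signed message for alice, plus an extra "user" field.
forged = f"user=alice;sig={sign('alice')};user=admin"
```

`login_vulnerable(forged)` returns `"admin"` while `login_fixed(forged)` returns `"alice"`: each parser is fine on its own, and the hole only exists in the seam between them.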
Oops Tencent guys beat me to distilling agent behaviors from LLMs into "regular" code 😅
https://zhongwen.one/projects/portal/
We present PORTAL, a novel framework for developing artificial intelligence agents capable of playing thousands of 3D video games through language-guided policy generation.
👍1
^ that was the original idea that led me to write a custom benchmark
https://news.1rj.ru/str/notatky/1292
Telegram
χаотичні нотатки
ok chat i've made a reasoning LLM benchmark that can't be saturated (inspired by AidanBench), what models should I test?
currently I test on 200 easiest tasks solvable with pen and paper in seconds but the problem is NP complete and the number of tasks is…
https://ezyang.github.io/ai-blindspots/
regardless of whether you use AI, almost all of these are simply good coding practices to adhere to anyway :) and there's quite a list, so take a look
AI Blindspots – Blindspots in LLMs I've noticed while AI coding
🔥2
https://thingofthings.wordpress.com/2017/01/10/meditation-for-people-who-hate-meditating/
interesting.
Thing of Things
Meditation For People Who Hate Meditating
[content warning: exercise] I hate mindfulness. Hate it, hate it, hate it. The ten minutes I spend meditating is easily the least pleasant ten minutes of my day. Unfortunately, I am also a borderli…
👍1🤯1
if you add an x-middleware-subrequest header to your request to a next.js website, it's treated as if it has already passed through the auth wall middleware. cozy php vibes
https://security.snyk.io/vuln/SNYK-JS-NEXT-9508709
Learn more about npm with Snyk Open Source Vulnerability Database
Improper Authorization in next | CVE-2025-29927 | Snyk
Critical severity (9.3) Improper Authorization in next | CVE-2025-29927
🤣2😁1
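the bug class in a few lines, as a hedged sketch: this is a toy model with hypothetical function names and request shape, not Next.js source code. only the header name comes from the post; the header value is an arbitrary placeholder. the hardened variant assumes you can strip the header at the edge before anything consults it.

```python
# Toy model of the bug class (hypothetical names; NOT Next.js source code).
INTERNAL_HEADER = "x-middleware-subrequest"  # header name from the post

def auth_middleware(request):
    """Returns True when the request may reach the handler."""
    return request.get("cookies", {}).get("session") == "valid"

def handle_vulnerable(request):
    # Flaw: any client can set this header, so its presence proves nothing.
    if INTERNAL_HEADER in request.get("headers", {}):
        return 200
    return 200 if auth_middleware(request) else 401

def handle_hardened(request):
    # Drop the client-supplied header before it is ever consulted.
    headers = {k: v for k, v in request.get("headers", {}).items()
               if k.lower() != INTERNAL_HEADER}
    return handle_vulnerable({**request, "headers": headers})

anonymous = {"headers": {}, "cookies": {}}
# "middleware" is a placeholder value, not a claim about the real exploit.
spoofed = {"headers": {INTERNAL_HEADER: "middleware"}, "cookies": {}}
logged_in = {"headers": {}, "cookies": {"session": "valid"}}
```

`handle_vulnerable(spoofed)` lets the request through with no session at all, while `handle_hardened(spoofed)` falls back to the normal auth check and rejects it.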
that box is something.
https://x.com/austinbv/status/1903276706699546958
X (formerly Twitter)
Austin Vance (@austinbv) on X
DAAUUUMMMM! Deep Seek R1 - 4bit on a single Mac Studio 512gb.
18.26 Tokens per second with MLX.
Took over a minute to load the model but I sped that up. Generation was great!
thanks @awnihannun mlx is the future.
👍1
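back-of-the-envelope for why it fits, as a rough sketch: the 671B total parameter count for DeepSeek R1 is my assumption (it is not stated in the tweet), and the estimate ignores quantization scales, KV cache and runtime overhead.

```python
# Rough memory estimate; ignores quantization metadata and KV cache.
PARAMS = 671e9            # assumption: DeepSeek R1 total parameter count
BITS_PER_WEIGHT = 4       # "4bit" from the tweet
UNIFIED_MEMORY_GB = 512   # Mac Studio configuration from the tweet

weights_gb = PARAMS * BITS_PER_WEIGHT / 8 / 1e9
headroom_gb = UNIFIED_MEMORY_GB - weights_gb
print(f"weights ~{weights_gb:.1f} GB, headroom ~{headroom_gb:.1f} GB")
```

so the weights alone come to roughly 335 GB, which leaves real room for context on a single machine.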
instantly wishlisted. a portable Raman spectrometer is like a sixth sense, except it's real
Also, "Minimal ignition risk" is an absolutely lovely detail 😂
https://fixupx.com/jwt0625/status/1904562738833367531?s=46
🧵 Thread • FixupX
outside five sigma (@jwt0625)
Portable handheld standoff Raman spectrometer from Pendar.
First time seeing one that could do it from this far.
⚡1❤1