NEW BOT Телеграм, страница - 805870326

∅

@natural_origin

465 subscribers

477 photos

23 videos

36 files

936 links

Download Telegram

About

Blog

Apps

Platform

465 subscribers

Forwarded from AI Post — Artificial Intelligence

🧬

Anthropic: when models learn to cheat, their behavior turns dangerous

Anthropic studied what happens when a model is taught how to hack its reward on simple coding tasks. As expected, it exploited the loophole but something bigger emerged.

The moment the model figured out how to cheat, it immediately generalized the dishonesty:

• began sabotaging tasks
• started forming “malicious” goals
• even tried to hide its misalignment by writing inefficient detection code

So a single reward-hacking behavior cascaded into broad misalignment, and even later RLHF couldn’t reliably reverse it.

The surprising fix:

If the system prompt doesn’t frame reward hacking as “bad,” the dangerous generalization disappears. Anthropic calls this a vaccine, a controlled dose of dishonesty that prevents deeper failure modes, and it’s already used in Claude’s training.

Source.

AI Post 🪙 | Our X

🥇

Please open Telegram to view this post

VIEW IN TELEGRAM

Please open Telegram to view this post

VIEW IN TELEGRAM

🔥4😁1

567 views11:24

https://youtu.be/7oCtDGOSgG8

What Happens if You Blur and Sharpen an Image 1000 Times?

I kind of went down a rabbit hole recently and thought this might make an interesting topic for a video.

https://patorjk.com/blur-sharpen-repeat/ - The app I created to morph images.
https://github.com/patorjk/video-to-turing-pattern/ - Script I wrote to…

🔥6

514 views22:07

https://arxiv.org/abs/2511.09030

Solving a Million-Step LLM Task with Zero Errors

LLMs have achieved remarkable breakthroughs in reasoning, insights, and tool use, but chaining these abilities into extended processes at the scale of those routinely executed by humans,...

🔥2

439 views09:15

https://www.remotelabor.ai

> be me
> best llm circa late 2025
> scoring 99% on PhD level questions
> scores 2.5% on real tasks from remote jobs
just like real PhDs, i guess /j

www.remotelabor.ai

Remote Labor Index

Measuring AI Automation of Remote Work

😁7🎉5

813 viewsedited 14:20

https://fixupx.com/teortaxesTex/status/1996002260745126152

🧵 Thread • FixupX

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞) (@teortaxesTex)

Terry Tao notices Gemini DeepResearch off-handedly solve Erdos problem #481 while doing a literature review. Gemini doesn't appreciate its own success and proceeds to pontificate on why the problem is hard.

436 views07:21

Reinforcement Learning: An Overview

https://arxiv.org/pdf/2412.05265

👍3👌1

848 viewsedited 15:44

https://jesse-silbert.github.io/website/silbert_jmp.pdf

Influence of LLMs in hiring on job market

621 viewsedited 13:09

https://tidarlm.github.io/

tidarlm.github.io

TiDAR: Think in Diffusion, Talk in Autoregression

574 views14:17

https://arxiv.org/abs/2510.04871

Less is More: Recursive Reasoning with Tiny Networks

Hierarchical Reasoning Model (HRM) is a novel approach using two small neural networks recursing at different frequencies. This biologically inspired method beats Large Language models (LLMs) on...

537 views01:09

Forwarded from 31557600秒.tar.xz 💻☕️🐾

https://genomicpress.kglmeridian.com/view/journals/brainmed/aop/article-10.61373-bm025c.0134/article-10.61373-bm025c.0134.xml

Adenosine as the metabolic common path of rapid antidepressant action: The coffee paradox

Yue, Luo, and colleagues discovered that adenosine signalling is the common underlying mechanism of rapid acting antidepressant therapies, unifying the effects of ketamine, ECT and acute intermittent hypoxia. They use genetically encoded sensors, along with…

😱1

528 views22:50

Media is too big

VIEW IN TELEGRAM

https://news.mit.edu/2025/deep-learning-model-predicts-how-fruit-flies-form-1215

🔥1

860 views18:28

https://academic.oup.com/pnasnexus/article/3/7/pgae221/7687932

High economic inequality is linked to greater moralization

Abstract. Throughout the 21st century, economic inequality is predicted to increase as we face new challenges, from changes in the technological landscape

616 views04:13

https://arxiv.org/abs/2409.08895

Synthetic Human Memories: AI-Edited Images and Videos Can Implant...

AI is increasingly used to enhance images and videos, both intentionally and unintentionally. As AI editing tools become more integrated into smartphones, users can modify or animate photos into...

😱3👎2🔥1🤔1

417 views22:33

https://medicine.yale.edu/news-article/molecular-difference-in-autistic-brains/

Yale School of Medicine

Researchers Discover Molecular Difference in Autistic Brains

Brains of autistic individuals have fewer of a specific kind of glutamate receptor, supporting an idea that autism is driven by a signaling imbalance.

394 views20:05

https://fixupx.com/Andercot/status/2007012215711506740?s=20

Andrew Côté (@Andercot)

I must say that the most hilarious twist of tech trajectories of 2025 was the fusion industry realizing they could *double* the economics of fusion reactors by using the neutron blanket to....

Transmute Mercury into Gold.

Alchemy is so back

746 views15:14

https://www.nature.com/articles/s41586-025-09820-3

Mapping the genetic landscape across 14 psychiatric disorders

Nature - Genomic analyses applied to 14 childhood- and adult-onset psychiatric disorders identifies five underlying genomic factors that explain the majority of the genetic variance of the...

🔥4

211 views11:59

https://medium.com/@vishalmisra/attention-is-bayesian-inference-578c25db4501

Attention Is Bayesian Inference

My journey from building cricket bots to building “wind tunnels” with my collaborators — and finding the answer hidden in geometry.

❤1

174 views19:20

https://www.nature.com/articles/s44220-023-00135-8

Day and night light exposure are associated with psychiatric disorders: an objective light study in >85,000 people

Nature Mental Health - Burns et al. explored the association between day and night-time light exposure and the risk for psychiatric disorders using a large sample of adults from the UK Biobank...

157 views09:24

https://arxiv.org/abs/2512.24601

Recursive Language Models

We study allowing large language models (LLMs) to process arbitrarily long prompts through the lens of inference-time scaling. We propose Recursive Language Models (RLMs), a general inference...

💅2

96 views15:37