In this paper, researchers are deliberately instructing agents to be deceptive in an experimental setup.
The models they tested successfully hid harmful features from oversight systems using steganography.
While this is a capability (rather than a propensity),it is important to empirically test how oversight systems could be fooled by malicious actors.
The models they tested successfully hid harmful features from oversight systems using steganography.
While this is a capability (rather than a propensity),it is important to empirically test how oversight systems could be fooled by malicious actors.
arXiv.org
Deceptive Automated Interpretability: Language Models Coordinating...
We demonstrate how AI agents can coordinate to deceive oversight systems using automated interpretability of neural networks. Using sparse autoencoders (SAEs) as our experimental framework, we...
❤1🥰1👏1
Vitalik Buterin dropped a new L1 centered privacy roadmap for Ethereum
- Privacy of on-chain payments
- Partial anonymization of on-chain activity within applications
- Privacy of reads to the chain (e.g., RPC calls)
- Network-level anonymization.
Roadmap
1. Integrate privacy tools like Railgun and Privacy Pools directly into existing #wallets, enabling shielded balances and default private sends without requiring separate privacy-focused wallets.
2. Adopt a 'one address per application' model to prevent linking user activities across different applications, necessitating #privacy-preserving send-to-self #transactions.
3. Implement FOCIL and EIP-7701 to enhance account abstraction, allowing privacy #protocols to operate without relays and improving censorship resistance.
4. Incorporate TEE-based RPC privacy solutions in wallets as a short-term measure, with a plan to transition to more robust Private Information Retrieval (PIR) methods as they become viable.
5. Encourage wallets to connect to multiple RPC nodes, possibly through mixnets, and use different nodes per #dApp to reduce metadata leakage.
6. Develop proof aggregation protocols to lower gas costs by allowing multiple privacy transactions to share a single on-chain proof.
7. Create privacy-preserving keystore wallets that enable users to update account verification logic across L1 and L2 without publicly linking their notes.
The goal here is to make private transactions the norm, keep activity within each app transparent, but break the links between what users do across different apps, all while shielding them from snooping by adversaries watching the chain or running RPC infrastructure.
Looking ahead and some open questions:
1. One concern is how this roadmap interacts with on-chain #identity.
2. If users rotate addresses for every app (as proposed), what happens to systems like ENS that link your name to a wallet?
3. For example, how do you keep using your ENS name for voting or signing public attestations while shielding your #DeFi activity?
4. Another challenge is PIR performance. Current private query schemes are too slow for many real-time RPC use cases. So while TEEs offer a viable intermediate step, Ethereum will need more efficient PIR primitives to fully decouple wallet queries from surveillance.
Notably, its mention was missing from the roadmap despite the lifting of OFAC sanctions.
- Privacy of on-chain payments
- Partial anonymization of on-chain activity within applications
- Privacy of reads to the chain (e.g., RPC calls)
- Network-level anonymization.
Roadmap
1. Integrate privacy tools like Railgun and Privacy Pools directly into existing #wallets, enabling shielded balances and default private sends without requiring separate privacy-focused wallets.
2. Adopt a 'one address per application' model to prevent linking user activities across different applications, necessitating #privacy-preserving send-to-self #transactions.
3. Implement FOCIL and EIP-7701 to enhance account abstraction, allowing privacy #protocols to operate without relays and improving censorship resistance.
4. Incorporate TEE-based RPC privacy solutions in wallets as a short-term measure, with a plan to transition to more robust Private Information Retrieval (PIR) methods as they become viable.
5. Encourage wallets to connect to multiple RPC nodes, possibly through mixnets, and use different nodes per #dApp to reduce metadata leakage.
6. Develop proof aggregation protocols to lower gas costs by allowing multiple privacy transactions to share a single on-chain proof.
7. Create privacy-preserving keystore wallets that enable users to update account verification logic across L1 and L2 without publicly linking their notes.
The goal here is to make private transactions the norm, keep activity within each app transparent, but break the links between what users do across different apps, all while shielding them from snooping by adversaries watching the chain or running RPC infrastructure.
Looking ahead and some open questions:
1. One concern is how this roadmap interacts with on-chain #identity.
2. If users rotate addresses for every app (as proposed), what happens to systems like ENS that link your name to a wallet?
3. For example, how do you keep using your ENS name for voting or signing public attestations while shielding your #DeFi activity?
4. Another challenge is PIR performance. Current private query schemes are too slow for many real-time RPC use cases. So while TEEs offer a viable intermediate step, Ethereum will need more efficient PIR primitives to fully decouple wallet queries from surveillance.
Notably, its mention was missing from the roadmap despite the lifting of OFAC sanctions.
Fellowship of Ethereum Magicians
A maximally simple L1 privacy roadmap
See also: pcaversaccio’s roadmap: Ethereum Privacy: The Road to Self-Sovereignty - Privacy - Ethereum Research These are my current thoughts on how we can practically improve the state of privacy experienced by Ethereum’s users in a way that is very light…
❤1🥰1👏1
It’s cool! Hugging Face buys a humanoid robotics startup Pollen Robotics
Pollen, which was founded in 2016 and is based in the French city of Bordeaux, had raised €2.5 million in venture capital funding to date.
The move will see Hugging Face selling Pollen’s $70,000 humanoid robot Reachy 2, which is designed for academic research, education, and testing “embodied AI” applications.
Pollen’s robots are designed to run open-source software, including freely-available AI models, as well as to allow users to potentially modify the physical design of the robot.
Hugging Face has increasingly moved into robotics in the past year. “Robotics is going to be the next frontier that AI will unlock,” Thomas Wolf, Hugging Face’s co-founder and chief scientist, told Fortune.
He said new AI “world models” were contributing to rapid progress in robotics and that having AI embodied in devices like robots might also help solve remaining challenges to achieving human-like artificial general intelligence.
Pollen, which was founded in 2016 and is based in the French city of Bordeaux, had raised €2.5 million in venture capital funding to date.
The move will see Hugging Face selling Pollen’s $70,000 humanoid robot Reachy 2, which is designed for academic research, education, and testing “embodied AI” applications.
Pollen’s robots are designed to run open-source software, including freely-available AI models, as well as to allow users to potentially modify the physical design of the robot.
Hugging Face has increasingly moved into robotics in the past year. “Robotics is going to be the next frontier that AI will unlock,” Thomas Wolf, Hugging Face’s co-founder and chief scientist, told Fortune.
He said new AI “world models” were contributing to rapid progress in robotics and that having AI embodied in devices like robots might also help solve remaining challenges to achieving human-like artificial general intelligence.
Fortune
Hugging Face gets into the humanoid robot game with Pollen Robotics purchase | Fortune
The acquisition will see Hugging Face, best known for open-source AI models, selling hardware for the first time
👍4🔥4❤1
OpenAI is set to release new models o3 and o4-mini as soon as this week that can suggest new types of scientific experiments, like for nuclear fusion or pathogen detection, by combining knowledge from multiple fields at once, according to three people who have tested the models.
They're starting to generate new scientific ideas, connecting concepts across fields like physics, biology, and engineering — the way a Tesla or Feynman might’ve done.
They're starting to generate new scientific ideas, connecting concepts across fields like physics, biology, and engineering — the way a Tesla or Feynman might’ve done.
The Information
OpenAI’s Latest Breakthrough: AI That Comes Up With New Ideas
Even as artificial intelligence has made strides in summarizing research papers or solving mathematical problems, professionals in many fields still believed humans would be in charge of coming up with ideas for new discoveries. Now AI is getting good at…
🔥1🥰1👏1
New Mistral Cookbook: a Multi-Agent Earnings Call Analysis System that turns lengthy and complex financial discussions into clear, actionable insights in minutes.
⚡6❤2🔥2👏1
Google DeepMind has started hiring for post AGI research
👍6❤2🔥2
OpenAI published 3 new guides:
AI in the Enterprise
A practical guide to building AI agents
Identifying and scaling AI use cases.
AI in the Enterprise
A practical guide to building AI agents
Identifying and scaling AI use cases.
🆒6❤3👍3🥴2
Veo 2, Google’s SOTA video model, is rolling out to Gemini Advanced + Whisk
You can create 8s, high-res videos from text prompts fluid character movement + lifelike scenes across a range of styles.
Tip: the more detailed your denoscription, the better.
Plus, you can try Veo 2 using Whisk from Google labs.
Just input images, blend them together, and – now – “animate” to bring your creation to life. Available for all Google One AI Premium subscribers today.
You can create 8s, high-res videos from text prompts fluid character movement + lifelike scenes across a range of styles.
Tip: the more detailed your denoscription, the better.
Plus, you can try Veo 2 using Whisk from Google labs.
Just input images, blend them together, and – now – “animate” to bring your creation to life. Available for all Google One AI Premium subscribers today.
Google
Generate videos in Gemini and Whisk with Veo 2
You can now generate videos in Gemini, powered by Veo 2.
👍4🦄3❤2👏1
Convergent Research released map of the things that need solving in science and R&D
gap-map.org is a tool to help you explore the landscape of R&D gaps holding back science - and the bridge-scale fundamental development efforts that might allow humanity to solve them, across almost two dozen fields
gap-map.org is a tool to help you explore the landscape of R&D gaps holding back science - and the bridge-scale fundamental development efforts that might allow humanity to solve them, across almost two dozen fields
❤7🔥2👏2
Goodfire released the first open-source sparse autoencoders (SAEs) trained on DeepSeek's 671B parameter reasoning model, R1—giving a new tools to understand and steer model thinking.
Reasoning models like DeepSeek R1, OpenAI’s o3, and Anthropic’s Claude 3.7 are changing how we use AI, providing more reliable and coherent responses for complex problems. But understanding their internal mechanisms remains challenging.
At Goodfire, platform for mechanistic interpretability can reverse engineer AI to understand internal representations and reasoning steps. Their interpreter models (e.g., SAEs in this instance) act as a microscope, revealing how R1 processes and responds to information.
Early insights from SAEs:
- Effective steering must wait until after phrases like “Okay, so the user has asked a question about…”—not explicit tags like "<think>"—highlighting unintuitive internal markers of reasoning
- Oversteering can paradoxically revert the model to original behaviors—hinting at deeper internal "awareness"
These insights suggest reasoning models differ fundamentally from non-reasoning language models.
GitHub
Reasoning models like DeepSeek R1, OpenAI’s o3, and Anthropic’s Claude 3.7 are changing how we use AI, providing more reliable and coherent responses for complex problems. But understanding their internal mechanisms remains challenging.
At Goodfire, platform for mechanistic interpretability can reverse engineer AI to understand internal representations and reasoning steps. Their interpreter models (e.g., SAEs in this instance) act as a microscope, revealing how R1 processes and responds to information.
Early insights from SAEs:
- Effective steering must wait until after phrases like “Okay, so the user has asked a question about…”—not explicit tags like "<think>"—highlighting unintuitive internal markers of reasoning
- Oversteering can paradoxically revert the model to original behaviors—hinting at deeper internal "awareness"
These insights suggest reasoning models differ fundamentally from non-reasoning language models.
GitHub
www.goodfire.ai
🔥7👍3
OpenAI released o3/o4-mini. The eval numbers are SOTA (2700 Elo is among the top 200 competition coders)
OpenAI’s team expect o3/o4-mini will aid scientists in their research.
And the secret trick is to talk to the models in images.
OpenAI’s team expect o3/o4-mini will aid scientists in their research.
And the secret trick is to talk to the models in images.
Openai
Introducing OpenAI o3 and o4-mini
Our smartest and most capable models to date with full tool access
🔥7👍2
Quantum computing firm Project Eleven has announced the “Q-Day Prize”, offering 1 BTC to the first team that can successfully break Bitcoin's ECDSA signature algorithm using a quantum computer with Shor's algorithm within one year.
The goal is to highlight the potential threat quantum technology poses to Bitcoin's cryptographic foundations.
The company estimates that around 6.2 million BTC (worth nearly $500 billion) could be vulnerable if such a breakthrough occurs.
The goal is to highlight the potential threat quantum technology poses to Bitcoin's cryptographic foundations.
The company estimates that around 6.2 million BTC (worth nearly $500 billion) could be vulnerable if such a breakthrough occurs.
The Block
Quantum computing research firm Project Eleven is offering 1 BTC to anyone who can break Bitcoin's cryptography
“The Q-Day Prize is designed to take a theoretical threat from a quantum computer, and turn that into a concrete model," CEO Pruden said.
❤4🔥3👏2🥴2💯1
Firecrawl just launched FIRE-1, a new agent-powered web-scraper.
It navigates complex websites, interacts with dynamic content, and fills forms to scrape the data you need.
It navigates complex websites, interacts with dynamic content, and fills forms to scrape the data you need.
Firecrawl
Announcing FIRE-1, Our Web Action Agent: Launch Week III - Day 2
Firecrawl's new FIRE-1 AI Agent enhances web scraping capabilities by intelligently navigating and interacting with web pages.
❤3🔥3👏2
Cohere released Embed 4, a SOTA multimodal embedding model to add frontier search and retrieval capabilities to AI apps
—128K-token context window
—Supports 100+ languages
—Optimized for data from regulated industries
—Up to 83% savings on storage costs
—128K-token context window
—Supports 100+ languages
—Optimized for data from regulated industries
—Up to 83% savings on storage costs
Cohere
Introducing Embed 4: Multimodal search for business | Cohere Blog
Embed 4 delivers state-of-the-art accuracy and efficiency, helping enterprises securely retrieve their multimodal data to build agentic AI applications.
🔥3❤2👏2
Alibaba just now introduced open-sourceWan2.1-FLF2V-14B - 14B-parameter large model for First-Last-Frame to video generation
Powered by data-driven training and DiT architecture with First-Last Frame conditional control:
‒ Perfectly replicates reference visuals
‒ Precise instruction-following
‒ Smooth transitions + real-world physics adherence
‒ Cinema-quality 720P output.
GitHub
HuggingFace
ModelScope
Powered by data-driven training and DiT architecture with First-Last Frame conditional control:
‒ Perfectly replicates reference visuals
‒ Precise instruction-following
‒ Smooth transitions + real-world physics adherence
‒ Cinema-quality 720P output.
GitHub
HuggingFace
ModelScope
wan.video
Wan AI: Leading AI Video Generation Model
Wan is an AI creative platform. It aims to lower the barrier to creative work using artificial intelligence, offering features like text-to-image, image-to-image, text-to-video, image-to-video, and image editing.
❤2🔥2👏2💯2
Yale and GoogleDeepMind introduced C2S‑Scale a family open-source LLMs trained to “read” and “write” biological data at the single-cell level.
Preprint
Preprint
❤2🔥2🥰2💯2
Economics of Minds: LLMs planning their own workload. A new paper “Self‑Resource Allocation in Multi‑Agent LLM Systems.”
A lightweight Planner beats a monolithic Orchestrator—faster, cheaper, and smarter multi‑agent coordination
A lightweight Planner beats a monolithic Orchestrator—faster, cheaper, and smarter multi‑agent coordination
arXiv.org
Self-Resource Allocation in Multi-Agent LLM Systems
With the development of LLMs as agents, there is a growing interest in connecting multiple agents into multi-agent systems to solve tasks concurrently, focusing on their role in task assignment...
🔥3❤2👏2
Meta FAIR is open sourcing Matrix (Multi-Agent daTa geneRation Infra and eXperimentation) under MIT license.
It is a versatile toolkit with a high-performance model-serving engine designed for large scale inference.
It integrates Slurm for resource management and Ray for distributed job execution. It leverages lower-level model serving engines such as vLLM, SGLang for efficient LLM inference, and support API-based services.
Code.
It is a versatile toolkit with a high-performance model-serving engine designed for large scale inference.
It integrates Slurm for resource management and Ray for distributed job execution. It leverages lower-level model serving engines such as vLLM, SGLang for efficient LLM inference, and support API-based services.
Code.
Meta
Collaborative Reasoner: Self-improving Social Agents with Synthetic Conversations | Research - AI at Meta
With increasingly more powerful large language models (LLMs) and LLM-based agents tackling an ever-growing list of tasks, we envision a future where...
👍4🔥3❤2
Google dropped Gemini 2.5 Flash, a reasoning model matching o4-mini in preview
It's 'thinking budget' (up 24k tokens), can balance between answer quality, cost, and speed
The model performs particularly well on reasoning, STEM, and visual reasoning
It's 'thinking budget' (up 24k tokens), can balance between answer quality, cost, and speed
The model performs particularly well on reasoning, STEM, and visual reasoning
Google
Google AI Studio
The fastest path from prompt to production with Gemini
🔥3❤2🆒2👏1