Trends in AI Supercomputers
Epoch AI dropped a map of the world’s 500+ AI supercomputers.
• Performance doubling every 9 months.
• Hardware cost & power use doubling every year.
• xAI’s Colossus already gulps 300 MW — equal to 250 k homes — and that’s only 2025.
If the trendline holds, the 2030 front-runner will burn 9 GW, pack 2 M chips, and sport a $200 B price tag.
The U.S. owns 75 % of today’s compute muscle while China trails at 15 %.
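As a quick arithmetic check, the 2030 figures follow directly from the stated doubling times. A back-of-the-envelope sketch, assuming Colossus's ~300 MW as the 2025 starting point:

```python
# Back-of-the-envelope check of the 2030 projection: power doubles every
# 12 months from ~300 MW (0.3 GW) in 2025; performance doubles every 9 months.

def extrapolate(value: float, doubling_months: float, months_ahead: float) -> float:
    """Project a quantity forward given its doubling time."""
    return value * 2 ** (months_ahead / doubling_months)

months = 5 * 12  # 2025 -> 2030
print(f"power:       {extrapolate(0.3, 12, months):.1f} GW")  # 0.3 * 2^5 = 9.6 GW
print(f"performance: {extrapolate(1.0, 9, months):.0f}x")     # 2^(60/9) ≈ 102x
```

The ~9.6 GW result matches the "9 GW front-runner" claim to within rounding.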
MIT researchers have developed a "periodic table" for machine learning — a groundbreaking framework that maps the connections between 20+ classical ML algorithms.
By revealing how these methods relate and overlap, the table opens up new ways for scientists to hybridize techniques, improving existing models or even inventing entirely new ones.
As proof of concept, the team fused two distinct algorithms using this framework and created a novel image classification method — outperforming current state of the art models by 8%.
Tech Xplore
'Periodic table of machine learning' framework unifies AI models to accelerate innovation
MIT researchers have created a periodic table that shows how more than 20 classical machine-learning algorithms are connected. The new framework sheds light on how scientists could fuse strategies from ...
OpenAI's announcement of the new lightweight ChatGPT deep research model (powered by a version of o4-mini) and the updated limits is confusing.
In the end, it's a combination of tasks using "standard" deep research plus additional tasks using the lightweight version:
- Free - 5 tasks/month using the lightweight version
- Plus & Team - 10 tasks/month, plus an additional 15 tasks/month using the lightweight version
- Pro - 125 tasks/month, plus an additional 125 tasks/month using the lightweight version
- Enterprise - 10 tasks/month
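The tiers can be summarized as total tasks per month. A small sketch; the post doesn't state a lightweight allotment for Enterprise, so it's assumed to be zero here:

```python
# Monthly deep-research quota per tier, split into "standard" deep research
# and the lightweight (o4-mini-powered) version, per the announcement.
# Assumption: Enterprise gets no separate lightweight allotment (unstated).
QUOTAS = {
    "Free":       {"standard": 0,   "lightweight": 5},
    "Plus/Team":  {"standard": 10,  "lightweight": 15},
    "Pro":        {"standard": 125, "lightweight": 125},
    "Enterprise": {"standard": 10,  "lightweight": 0},
}

for tier, q in QUOTAS.items():
    print(f"{tier}: {q['standard'] + q['lightweight']} tasks/month total")
```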
OpenAI
Introducing deep research
An agent that uses reasoning to synthesize large amounts of online information and complete multi-step research tasks for you. Available to Pro users today, Plus and Team next.
Liquid AI introduced an architecture called Hyena Edge, a convolution-based multi-hybrid model that not only matches but outperforms strong Transformer-based baselines in computational efficiency and model quality on edge hardware, benchmarked on the Samsung S24 Ultra smartphone.
To design Hyena Edge, the researchers used STAR, an end-to-end automated model-design framework.
www.liquid.ai
Convolutional Multi-Hybrids for Edge Devices | Liquid AI
Today, we introduce a Liquid architecture called Hyena Edge, a convolution-based multi-hybrid model that not only matches but outperforms strong Transformer-based baselines in computational efficiency and model quality on edge hardware, benchmarked on the…
Why Do Multi-Agent LLM Systems Fail? Despite the growing excitement around Multi-Agent Systems (MAS), they often struggle to outperform single-agent approaches.
Berkeley’s researchers analyzed 7 popular MAS frameworks across 200+ tasks, identifying 14 failure modes that hinder their effectiveness.
Paper.
Code.
Google
MAST
In a Nutshell
Despite the increasing adoption of Multi-Agent Systems (MAS), their performance gains often remain minimal compared to single-agent frameworks. Why do MAS fail?
We have conducted a systematic evaluation of MAS execution traces using Grounded…
Stripe is building a NEW stablecoin product, powered by Bridge
If your company is:
- Based outside of the US, EU, or UK
- Interested in dollar access
Send a quick note about your company to stablecoins@stripe.com
Perplexity released an agentic Voice Assistant
It uses web browsing and multi-app actions to book reservations, send emails and calendar invites, play podcasts/videos, and more
Currently available in the Perplexity app, but only on iOS
⚡️ Viral rumors about DeepSeek R2 have leaked:
—1.2T params, 78B active, hybrid MoE
—97.3% cheaper than GPT-4o ($0.07/M in, $0.27/M out)
—5.2PB training data; 89.7% on C-Eval 2.0
—Better vision: 92.4% on COCO
—82% utilization on Huawei Ascend 910B
If true, a big shift away from the US supply chain.
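For what it's worth, the "97.3% cheaper" figure is arithmetically consistent with GPT-4o's list prices, assuming $2.50/M input and $10.00/M output tokens; the R2 numbers themselves are unverified rumor:

```python
# Rough check of the rumored "97.3% cheaper than GPT-4o" claim.
# Assumes GPT-4o list prices of $2.50/M input and $10.00/M output tokens;
# the R2 figures come from the (unverified) rumor.
r2_in, r2_out = 0.07, 0.27         # $/M tokens, rumored
gpt4o_in, gpt4o_out = 2.50, 10.00  # $/M tokens, assumed list prices

savings_in = 1 - r2_in / gpt4o_in
savings_out = 1 - r2_out / gpt4o_out
print(f"input:  {savings_in:.1%} cheaper")   # 97.2%
print(f"output: {savings_out:.1%} cheaper")  # 97.3%
```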
A new paper from Google DeepMind shows how Reinforcement Learning Fine-Tuning (RLFT) on self-generated Chain-of-Thought (CoT) can improve exploration and decision-making.
RLFT Implementation:
1. Set up the LLM to interact with a decision environment (e.g., bandit, Tic-Tac-Toe).
2. Prompt the LLM to generate a thinking process (CoT) and an action.
3. Extract and execute the action in the environment to get a reward.
4. Use the reward to fine-tune the LLM (via RL) based on the generated CoT and action.
5. Repeat interactions and fine-tuning to improve the LLM's decision policy.
Insights:
- LLMs act greedily, sticking to early successful actions and failing to explore potentially better options.
- Smaller LLMs tend to repeat actions common in the prompt history.
- LLMs can often articulate or calculate the correct strategy, but fail to execute it.
- Simple reward bonuses (e.g., +1 for exploring a new action) or penalties (e.g., -5 for an invalid action format) guide the LLM toward the desired behaviour.
- Thinking time matters: allowing more tokens for generation improves performance.
- Larger models suffer less frequency bias but are still prone to greediness.
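The five implementation steps can be sketched as a toy training loop. This is a minimal illustration on a multi-armed bandit; `llm_generate` and `rl_update` are hypothetical placeholders for the real model call and the RL fine-tuning step:

```python
import random

# Toy sketch of the RLFT loop on a 3-armed bandit. `llm_generate` and
# `rl_update` stand in for the real LLM call and RL fine-tuning step.

ARM_MEANS = [0.2, 0.5, 0.8]  # expected reward of each bandit arm

def pull(arm: int) -> float:
    """Execute the chosen action in the environment (step 3)."""
    return 1.0 if random.random() < ARM_MEANS[arm] else 0.0

def llm_generate(history):
    """Placeholder for step 2: the LLM emits a CoT and an action."""
    arm = random.randrange(len(ARM_MEANS))
    return f"I will try arm {arm}.", arm

def rl_update(cot: str, arm: int, reward: float) -> None:
    """Placeholder for step 4: fine-tune on (CoT, action, reward)."""
    pass

history = []
for step in range(100):                                 # step 5: repeat
    cot, arm = llm_generate(history)                    # step 2
    reward = pull(arm)                                  # step 3
    if all(past_arm != arm for _, past_arm, _ in history):
        reward += 1.0                                   # exploration bonus (insight 4)
    rl_update(cot, arm, reward)                         # step 4
    history.append((cot, arm, reward))
```

The +1.0 bonus for first-time actions mirrors the paper's finding that simple reward shaping counteracts the models' greediness.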
Hugging Face introduced SO-101, the new version of the hugely popular SO-100 low-cost robot arm:
- easier to assemble
- more robust in daily use
- still 100% open-source
- still ultra low-cost
Wowrobot shop
Seeedstudio shop
Partabot shop
Mastercard announced the launch of a global stablecoin payment system, covering wallet enablement, card issuance, merchant settlement, and on-chain remittances, and will partner with OKX to issue the OKX Card, linking crypto trading with everyday spending.
Mastercard is also collaborating with Circle, Nuvei, and Paxos to enable direct merchant settlement in stablecoins.
Coindesk
Mastercard Moves To Make Stablecoins Easier To Spend, Launches Crypto Card With OKX
Mastercard's new global system aims to make stablecoin transactions as seamless as traditional payments.
OpenAI is introducing shopping features in ChatGPT today, powered by GPT-4o, for ChatGPT Pro, Plus, and Free users, as well as logged-out users worldwide.
TechCrunch
OpenAI upgrades ChatGPT search with shopping features | TechCrunch
OpenAI is updating ChatGPT Search to give users an improved shopping experience, the company announced in a blog post.
Alibaba Introduced Qwen3
Open-weight Qwen3, the latest family of large language models, includes 2 MoE models and 6 dense models ranging from 0.6B to 235B parameters.
The flagship, Qwen3-235B-A22B, achieves competitive results on coding, math, and general-capability benchmarks when compared to other top-tier models such as DeepSeek-R1, o1, o3-mini, Grok-3, and Gemini-2.5-Pro.
The small MoE model, Qwen3-30B-A3B, outcompetes QwQ-32B, which has 10 times as many activated parameters, and even a tiny model like Qwen3-4B can rival the performance of Qwen2.5-72B-Instruct.
Trained on 36T tokens covering 119 languages, with data extracted from PDFs, synthetic data, etc.
Thinking and non-thinking modes
Improved agentic and coding capabilities, with support for MCP
Training pipeline similar to DeepSeek R1
Small distilled models, from Qwen3-4B down to Qwen3-0.6B
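The "A22B"/"A3B" suffixes denote activated parameters per token, which is what makes the QwQ-32B comparison work. A small sketch using the parameter counts from the announcement:

```python
# "A22B"/"A3B" = activated parameters per token in the MoE models.
# Dense models activate all parameters on every token.
models = {
    "Qwen3-235B-A22B": {"total_b": 235, "active_b": 22},
    "Qwen3-30B-A3B":   {"total_b": 30,  "active_b": 3},
    "QwQ-32B (dense)": {"total_b": 32,  "active_b": 32},
}

ratio = models["QwQ-32B (dense)"]["active_b"] / models["Qwen3-30B-A3B"]["active_b"]
print(f"QwQ-32B activates {ratio:.1f}x more parameters per token")  # ≈ 10.7, the "10x" in the post
```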
GitHub
HuggingFace
Modelscope
Qwen
Qwen3: Think Deeper, Act Faster
Introduction Today, we are excited to announce the release of Qwen3, the latest addition to the Qwen family of large language models. Our flagship model, Qwen3-235B-A22B, achieves competitive results…
New work on automated prompt engineering for personalized text-to-image generation:
PRISM: Automated Black-box Prompt Engineering for Personalized Text-to-Image Generation
Paper + Code
Prompt engineering for personalized image generation is labor-intensive or requires model-specific tuning, limiting generalization.
Key Idea: PRISM uses VLMs and iterative in-context learning to automatically generate effective, human-readable prompts using only black-box access to image generation models.
This approach shows strong generalization and versatility in generating accurate prompts for objects, styles and images across multiple T2I models, including Stable Diffusion, DALL-E, and Midjourney. It also enables easy editing and multi-concept prompt generation.
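The iterative loop can be sketched as follows. This is purely schematic: every function here is a hypothetical placeholder, and PRISM's actual scoring and feedback mechanics are in the paper.

```python
# Schematic of PRISM-style black-box prompt refinement. All functions are
# hypothetical stand-ins: a VLM proposes and judges prompts, while the
# text-to-image model is queried only through generate() (black-box access).

def vlm_propose(reference_image, feedback):
    """Placeholder: VLM writes a candidate prompt via in-context learning."""
    return "a photo of the target concept"

def t2i_generate(prompt):
    """Placeholder: black-box image generator (e.g. SD, DALL-E, Midjourney)."""
    return object()

def vlm_score(reference_image, candidate_image):
    """Placeholder: VLM judges similarity, returning a score in [0, 1]."""
    return 0.5

def prism_refine(reference_image, n_iters=5):
    best_prompt, best_score, feedback = None, -1.0, None
    for _ in range(n_iters):
        prompt = vlm_propose(reference_image, feedback)  # propose in context
        image = t2i_generate(prompt)                     # query black-box T2I
        score = vlm_score(reference_image, image)        # VLM as judge
        if score > best_score:
            best_prompt, best_score = prompt, score
        feedback = (prompt, score)  # fed back for the next in-context round
    return best_prompt

print(prism_refine("ref.png"))
```

Because the loop only ever calls generate(), the same refined prompt transfers across T2I back-ends, which is the generalization claim above.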
kellyyutonghe.github.io
PRISM: Automated Black-box Prompt Engineering for Personalized Text-to-Image Generation
We propose an algorithm that automatically identifies human-interpretable and transferable prompts that can effectively generate desired concepts given only black-box access to T2I models.
BCG_AI_Agents_MCP_1745919815.pdf
22.8 MB
BCG dropped their latest POV on AI Agents and the Model Context Protocol (MCP)
Here are key takeaways:
1. Autonomous Agents Are Moving From Concept to Reality:
➜ Early deployments are already delivering 30–90% improvements in speed, productivity, and cost across coding, compliance, and supply chain domains.
2. MCP Is Becoming the Backbone of Scalable Agents:
➜ The Model Context Protocol (MCP) is the new open standard adopted by Anthropic, OpenAI, Microsoft, Google, and Amazon to expose tools, prompts, and resources reliably.
3. Agent Intelligence Is Progressing Rapidly:
➜ Agents today can automate tasks up to one hour long — and this limit is doubling every seven months, pushing toward multi-day autonomous workflows by the end of the decade.
4. Agent Architectures Must Be Security-First:
➜ Security challenges grow as agents gain system access. OAuth, RBAC, permission isolation, eval-driven development, and real-time monitoring are mandatory to deploy agents safely.
5. The Rise of Agent-Orchestration Platforms:
➜ Platforms like Azure Foundry, Vertex AI, Bedrock Agents, and Lindy are positioning themselves as the orchestration layer to create, manage, and scale enterprise agent ecosystems.
6. From Workflows to Fully Autonomous Agents:
➜ Enterprises are shifting from prompt chaining (rigid workflows) to fully autonomous agents capable of observing, reasoning, and acting dynamically based on real-world feedback.
7. MCP and A2A Will Define the Agent Economy:
➜ MCP connects agents to tools and data. A2A (Agent-to-Agent communication) will enable agents to negotiate, collaborate, and coordinate across systems — forming true multi-agent networks.
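Takeaway 3's extrapolation checks out arithmetically. A sketch assuming the 1-hour starting point and 7-month doubling time stated in the deck:

```python
# Checking takeaway 3: ~1-hour tasks today, with the automatable task
# length doubling every 7 months (the trend cited in the deck).
def task_hours(months_ahead: float, start_hours: float = 1.0,
               doubling_months: float = 7.0) -> float:
    return start_hours * 2 ** (months_ahead / doubling_months)

# By the end of the decade (~60 months out):
hours = task_hours(60)
print(f"~{hours:.0f} hours ≈ {hours / 24:.0f} days")  # multi-day workflows
```

Sixty months is ~8.6 doublings, landing in the hundreds of hours, i.e. the multi-day range the report describes.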
U.S. Secretary of Commerce Howard Lutnick said the U.S. will accelerate bitcoin mining, support the construction of its own power infrastructure, and reduce reliance on the public power grid.
The U.S. will allow miners to build power plants and data centers near natural gas fields and consider incorporating bitcoin into the national economic account.
Bitcoin Magazine
U.S. Secretary Of Commerce Howard Lutnick Has A Bitcoin Vision For America
Secretary Lutnick encourages Bitcoin businesses to set up shop in the United States, as he claims that the Trump administration is doing everything in its power to welcome such companies to the U.S. in the wake of the Biden administration’s hostile treatment…
The U.K. government released consultation papers on crypto legislation.
It proposes the creation of new regulated activities, such as operating a crypto exchange and issuing stablecoins.
GOV.UK
Regulatory regime for cryptoassets (regulated activities) – Draft SI and Policy Note
A draft of statutory provisions to create new regulated activities for cryptoassets, and an explainer document detailing the intended policy outcomes of these provisions. The government laid the final legislation in Parliament on 15 December 2025.
Meta is releasing a standalone mobile app for its ChatGPT competitor, Meta AI.
There's a Discover feed that shows interactions that others (including your IG/FB friends) are having with the assistant. Meta says the idea is to demystify AI and show people "what they can do with it."
OpenAI is working on a similar feed for ChatGPT.
The Verge
Meta’s ChatGPT competitor shows how your friends use AI
What if Instagram only showed people talking with AI?
Stanford and Google DeepMind released SWiRL: A synthetic data generation and multi-step RL approach for reasoning and tool use!
With SWiRL, the model’s capability generalizes to new tasks and tools. For example, a model trained to use a retrieval tool to solve multi-hop knowledge-intensive question answering tasks becomes significantly better at using Python to solve math problems (and vice versa).
As they scale the synthetic data size, the generalization gains continue to improve.
This suggests new possibilities for self-improvement, where researchers use the model to synthetically generate data on multi-step tasks in more accessible (or affordable) domains and improve it in other domains.