Anthropic: Claude Sonnet 4 now supports 1 million tokens of context on the Anthropic API—a 5x increase.
Process over 75,000 lines of code or hundreds of documents in a single request.
Long context support is in public beta for API users with Tier 4 and custom rate limits.
Broader availability will be rolling out over the coming weeks. Available in Amazon Bedrock, and coming soon to Google Cloud's Vertex AI.
Claude
Claude Sonnet 4 now supports 1M tokens of context | Claude
Claude Sonnet 4 now supports up to 1 million tokens of context—a 5x increase that lets you process entire codebases, synthesize extensive document sets, and build agents that maintain coherence across hundreds of tool calls.
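The headline numbers imply a budget of roughly 13 tokens per line of code (1,000,000 / 75,000). A minimal pre-flight check built only on those announced figures; the function name and output reserve are hypothetical, not part of the Anthropic API:

```python
# Rough check of whether a codebase fits in Claude Sonnet 4's 1M-token window,
# using the announcement's own numbers (1,000,000 tokens ~= 75,000 lines of code).
CONTEXT_TOKENS = 1_000_000
TOKENS_PER_LINE = CONTEXT_TOKENS / 75_000  # ~13.3 tokens per line of code

def fits_in_context(total_lines: int, reserve_for_output: int = 8_000) -> bool:
    """Estimate whether `total_lines` of code fit, leaving room for the reply."""
    estimated = total_lines * TOKENS_PER_LINE
    return estimated + reserve_for_output <= CONTEXT_TOKENS

print(fits_in_context(60_000))   # comfortably inside the window
print(fits_in_context(80_000))   # over the advertised 75k-line figure
```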
🔥6🥰3👏2
Microsoft introduced Dion, a new AI model optimization method that boosts scalability and performance over existing leading methods by orthonormalizing only a top-rank subset of singular vectors, enabling more efficient training of large models such as LLaMA-3 with reduced overhead.
Orthonormal updates appear to roughly double transformer training convergence, and Dion makes them tractable at the largest scales.
Code.
Paper.
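As a rough illustration of the core idea, orthonormalizing only the top-rank singular directions of an update, here is a dense-SVD toy in numpy. The real Dion uses distributed low-rank approximation with error feedback rather than a full SVD, so treat this as a sketch of the concept, not Microsoft's algorithm:

```python
import numpy as np

def dion_style_update(grad: np.ndarray, rank: int) -> np.ndarray:
    """Simplified sketch: orthonormalize only the top-`rank` singular
    directions of an update matrix, dropping the rest of the spectrum."""
    u, s, vt = np.linalg.svd(grad, full_matrices=False)
    # Top-rank singular values are replaced with 1 (orthonormal directions).
    return u[:, :rank] @ vt[:rank, :]

rng = np.random.default_rng(0)
g = rng.standard_normal((6, 4))           # stand-in for a gradient matrix
update = dion_style_update(g, rank=2)
# The update's nonzero singular values are all 1 (an orthonormal factor).
print(np.linalg.svd(update, compute_uv=False))
```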
Microsoft Research
Dion: Distributed orthonormal update revolution
Dion is a new AI model optimization method that boosts scalability and performance over existing leading methods by orthonormalizing only a top rank subset of singular vectors, enabling more efficient training of large models such as LLaMA-3 with reduced…
🔥3🆒3❤2👏2
Matrix-Game 2.0 — the first open-source, real-time, long-sequence interactive world model
Last week, DeepMind's Genie 3 shook the AI world with real-time interactive world models.
But... it wasn't open-sourced.
Matrix-Game 2.0 is Skywork's next-gen interactive world model:
- Real-time: 25FPS generation
- Long-sequence: Minutes of continuous video
- Interactive: Move, rotate, explore
- Multi-scene: city, wilderness, Temple Run, GTA.
It's the foundation for:
- Game engines
- Embodied AI
- Virtual humans
- Spatial intelligence.
The Tech Stack:
- Data: 1,350 hrs of interactive videos from Unreal Engine + GTA5
- Control: Frame-level keyboard & mouse input
- Model: 1.3B autoregressive diffusion with action control
- Speed: Single GPU → 25FPS
- 3D Causal VAE for space-time compression
- Diffusion Transformer with action conditioning
- KV-Cache for infinite video generation
- DMD training to avoid error accumulation
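The pipeline above, autoregressive generation conditioned on frame-level actions with a bounded cache, can be caricatured in a few lines. Everything here is a toy stand-in, not Skywork's code:

```python
from collections import deque

def toy_rollout(actions, cache_size=8):
    """Toy autoregressive world-model loop: each frame is produced from the
    previous latent plus a frame-level action. The bounded deque stands in
    for the KV-cache that keeps memory constant for arbitrarily long videos."""
    cache = deque(maxlen=cache_size)   # sliding context, like a KV-cache
    latent = 0.0
    frames = []
    for action in actions:             # frame-level keyboard/mouse input
        context = sum(cache) / len(cache) if cache else 0.0
        latent = 0.9 * latent + 0.1 * context + action  # stand-in for the diffusion step
        cache.append(latent)
        frames.append(latent)
    return frames

frames = toy_rollout([1, 0, 0, -1, 0] * 5)  # 25 actions -> 25 frames, one second at 25 FPS
print(len(frames))
```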
huggingface.co
Skywork/Matrix-Game-2.0 · Hugging Face
We’re on a journey to advance and democratize artificial intelligence through open source and open science.
🔥4🥰3👏2
Now you can run and benchmark evolutionary coding agents on 100+ algorithm optimization tasks from algotune.io
👍4🔥2🥰2
Google is rolling out their version of memory for Gemini today. It is called 'personal context.'
If you want to disable this, toggle off Personal Context in settings.
This works for 2.5 Pro only, not Flash.
It will be interesting to see what effect Gemini's monster context window will have on the implementation.
Google
Gemini adds Temporary Chats and new personalization features
Today, we are updating the Gemini app so that it learns about your preferences the more you use it.
🔥3❤2🥰2🍌1
Anthropic acquired the co-founders and most of the team of Humanloop, the startup behind a platform for prompt management, LLM evaluation, and observability
The move marks a major push from Anthropic to strengthen its AI safety strategy.
TechCrunch
Anthropic nabs Humanloop team as competition for enterprise AI talent heats up | TechCrunch
While an Anthropic spokesperson confirmed that the AI firm did not acquire Humanloop or its IP, that’s a moot point in an industry where IP lives in the brain. And what Humanloop’s team is bringing to Anthropic is experience developing the tools that help…
The revenue from just the AI Labs (publicly reported figures from OpenAI and Anthropic), along with the public AI infrastructure companies, has already eclipsed all public SaaS revenue in 2024 (Nvidia's datacenter revenue drives most of the growth).
It will almost double public SaaS on a net new revenue basis this year. And these figures don’t include private AI companies, which would widen the gap even further.
It’s clear that the current set of 100+ public SaaS companies is not yet seeing revenue growth in their AI offerings, and for the most part, AI demand is happening where they are not.
🔥4🥰2👏2
ByteDance & Tsinghua University unveiled ASearcher
An agentic search framework that enables long-horizon reasoning with large-scale asynchronous RL.
Goes beyond typical turn limits for complex, knowledge-intensive tasks.
Achieves SOTA performance, with significant gains of up to +46.7% on xBench and GAIA after RL training.
Models & data.
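The "asynchronous" part means trajectory collection is decoupled from training: the learner consumes whichever long-horizon search rollouts have finished instead of blocking on the slowest episode. A minimal asyncio sketch of that scheduling idea (all names and numbers here are invented, not ASearcher's code):

```python
import asyncio
import random

async def search_episode(agent_id: int, turns: int) -> dict:
    """Stand-in for one long-horizon search rollout (tool calls, reasoning)."""
    await asyncio.sleep(0)  # yield control, like awaiting a search/tool call
    return {"agent": agent_id, "turns": turns}

async def collect(num_agents: int = 4):
    random.seed(0)
    # Episodes are not capped at ~10 turns; lengths vary widely per task.
    tasks = [asyncio.create_task(search_episode(i, random.randint(5, 128)))
             for i in range(num_agents)]
    finished = []
    for task in asyncio.as_completed(tasks):  # learner takes rollouts as they land
        finished.append(await task)
    return finished

rollouts = asyncio.run(collect())
print(len(rollouts))
```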
huggingface.co
Paper page - Beyond Ten Turns: Unlocking Long-Horizon Agentic Search with Large-Scale Asynchronous RL
Join the discussion on this paper page
🔥4❤3🥰2
Meta introduced DINOv3, a major release that raises the bar for self-supervised vision foundation models.
DINOv3 is open source. Researchers scaled model size and training data, but here's what makes it special.
What’s in DINOv3?
• 7B ViT foundation model + smaller distilled models
• Trained on 1.7B curated images with no annotations
• Gram anchoring fixes feature map degradation when training very large models for too long
• High-resolution adaptation with relative spatial coords and 2D RoPE.
Training+evaluation code, adapters and notebooks
Collection of pre-trained backbones in HF Transformers
Paper.
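Gram anchoring regularizes the Gram matrix of patch features, their pairwise similarity structure, toward that of an earlier checkpoint, so dense feature maps stop degrading late in long runs. A simplified numpy sketch of the idea, not the paper's exact loss:

```python
import numpy as np

def gram(features: np.ndarray) -> np.ndarray:
    """Gram matrix of patch features (num_patches x dim): pairwise similarities."""
    return features @ features.T

def gram_anchor_loss(student: np.ndarray, anchor: np.ndarray) -> float:
    """Penalize drift of the student's patch-similarity structure away from
    an earlier 'anchor' checkpoint (a simplified stand-in for DINOv3's loss)."""
    diff = gram(student) - gram(anchor)
    return float(np.mean(diff ** 2))

rng = np.random.default_rng(0)
anchor_feats = rng.standard_normal((16, 8))          # earlier checkpoint's features
drifted = anchor_feats + 0.5 * rng.standard_normal((16, 8))
print(gram_anchor_loss(anchor_feats, anchor_feats))  # zero at the anchor itself
print(gram_anchor_loss(drifted, anchor_feats) > 0)   # drift is penalized
```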
Meta AI
DINOv3: Self-supervised learning for vision at unprecedented scale
DINOv3 scales self-supervised learning for images to create universal vision backbones that achieve absolute state-of-the-art performance across diverse domains, including web and satellite imagery.
🔥4👏3🥰2
Google just dropped a new tiny LLM with outstanding performance: Gemma 3 270M.
Now available on KerasHub. Try the new presets gemma3_270m and gemma3_instruct_270m.
Googleblog
Google for Developers Blog - News about Web, Mobile, AI and Cloud
Explore Gemma 3 270M, a compact, energy-efficient AI model for task-specific fine-tuning, offering strong instruction-following and production-ready quantization.
🔥5👍4🥰2
Tencent dropped China's version of Google Genie 3
Yan is a world model that generates 1080p worlds at 60fps (!) with no game engine, pure AI inference, at 0.11s latency and infinite video length. It's trained on ~150 days of video gameplay.
The specs are better, but the results aren't quite as good. They do publish their research though.
alphaXiv
Yan: Foundational Interactive Video Generation | alphaXiv
View 5 comments: Yan-Gen is the second core module of the Yan framework, focused on **real-time, interactive world generation**. It aims to solve the long-term semantic drift inherent in autoregressive video models, a major challenge for generating coherent, long video sequences. To that end, Yan-Gen adopts a multi-stage training pipeline, as shown in Figure 8.
The main features and workings of Yan-Gen:
* **Multi-stage training pipeline:**
* **Foundational world-generation model training:** Yan-Gen first...
🔥4🥰3👏2
OpenAI is preparing ChatGPT to be used in its upcoming AI Browser
The new ChatGPT web app version adds a hidden option to "Use cloud browser" when enabling Agent mode.
However, interestingly, this option is enabled only if the user agent matches "ChatGPT.+Macintosh;.+ Chrome" (likely the new browser from OpenAI) - hinting at the possibility that ChatGPT in Agent mode might be able to control either your browser or the cloud browser.
"Aura" will be able to run ChatGPT Agent natively and will likely arrive on macOS first.
Watch out for Edge to get it first.
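The gating described above is just a regex test against the browser's user-agent string. A quick check with the pattern from the report; the sample UA strings below are invented for illustration:

```python
import re

# Pattern reported in the ChatGPT web app: a ChatGPT-branded Chromium
# browser on macOS.
PATTERN = re.compile(r"ChatGPT.+Macintosh;.+ Chrome")

# Hypothetical UA for OpenAI's browser vs. a plain Chrome-on-Mac UA.
aura_ua = ("Mozilla/5.0 ChatGPT/1.0 (Macintosh; Intel Mac OS X 14_5) "
           "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0 Safari/537.36")
plain_ua = ("Mozilla/5.0 (Macintosh; Intel Mac OS X 14_5) "
            "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0 Safari/537.36")

print(bool(PATTERN.search(aura_ua)))   # cloud-browser option would appear
print(bool(PATTERN.search(plain_ua)))  # hidden for ordinary Chrome
```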
🔥4👏3🥰2
Inworld AI released an AI runtime built to auto-scale consumer apps from prototype to millions of users, automate MLOps, and launch one-click AI experiments
Researchers delivered:
- Adaptive Graphs: Auto-scale from 10 to 10M users. No rework.
- Automated MLOps: Ops, telemetry, optimizations automatically.
- Live Experiments: Instant A/B tests, no code changes.
inworld.ai
Inworld AI | Powering the Next Wave of Realtime AI Apps
#1 ranked TTS with under 200ms latency, voice cloning, and 25x lower cost. Model-agnostic LLM orchestration with smart routing and sub-second latency.
👍4❤2🔥2
Nvidia announced Cosmos Reason 7B, an open-source VLM to enable robots to see, reason, and act in the physical world, solving multistep tasks
The company also made Isaac Sim 5.0 and Isaac Lab 2.2 generally available
NVIDIA Technical Blog
Maximize Robotics Performance by Post-Training NVIDIA Cosmos Reason
First unveiled at NVIDIA GTC 2025, NVIDIA Cosmos Reason is an open and fully customizable reasoning vision language model (VLM) for physical AI and robotics. The VLM enables robots and vision AI…
❤3👍3🔥3
DatologyAI Team introduced BeyondWeb, a synthetic data generation framework
BeyondWeb significantly extends the capabilities of traditional web-scale datasets, outperforming SOTA synthetic pretraining datasets such as Cosmopedia and Nemotron-CC's high-quality synthetic subset (Nemotron-Synth) by up to 5.1 percentage points (pp) and 2.6pp, respectively, when averaged across a suite of 14 benchmark evaluations. It delivers up to 7.7x faster training than open web data and 2.7x faster than Nemotron-Synth.
Remarkably, a 3B model trained for 180B tokens on BeyondWeb outperforms an 8B model trained for the same token budget on Cosmopedia.
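The 3B-vs-8B result is notable because, under the standard rough estimate of ~6·N·D FLOPs for dense transformer training, the smaller model wins while spending about 2.7x less compute at the same 180B-token budget:

```python
def train_flops(params: float, tokens: float) -> float:
    """Standard rough estimate of dense-transformer training compute: ~6 * N * D."""
    return 6 * params * tokens

tokens = 180e9                        # same token budget for both models
flops_3b = train_flops(3e9, tokens)   # BeyondWeb-trained model
flops_8b = train_flops(8e9, tokens)   # Cosmopedia-trained model
print(f"{flops_8b / flops_3b:.2f}x")  # ~2.67x more compute for the 8B model
```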
🔥3❤2🆒2👍1
BlackRock just proposed AlphaAgents for investment research
Equity portfolio management has long relied on human analysts poring over 10-Ks, earnings calls, and market data—a process that’s slow and prone to biases.
Enter AI-powered multi-agent LLMs: teams of specialized agents that collaborate and debate to synthesize market and fundamental data.
This approach can speed up research and surface insights humans might miss.
AlphaAgents also tackle cognitive biases. Loss aversion, overconfidence, and anchoring often lead to suboptimal decisions—but multi-agent AI provides an objective second opinion.
By combining reasoning, memory, and tool usage, this framework helps:
• Aggregate massive datasets
• Reduce human and AI errors
• Improve portfolio decision-making efficiency
In short, multi-agent LLMs could redefine equity research, making it faster, more objective, and more data-driven. The future of alpha hunting may just be collaborative AI.
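The bias-reduction claim boils down to aggregation: independent agents with offsetting biases produce a more objective consensus than any single view. A toy sketch, where the "agents" are just biased point estimates rather than LLMs:

```python
def debate(estimates: list[float]) -> float:
    """Toy stand-in for multi-agent debate: aggregate independent agent views
    so that offsetting biases cancel, yielding a 'second opinion'."""
    return sum(estimates) / len(estimates)

true_value = 100.0
# One overconfident agent, one loss-averse agent, one anchored agent.
agent_views = [112.0, 91.0, 97.0]
consensus = debate(agent_views)

# The consensus sits closer to the true value than the worst single view.
print(abs(consensus - true_value) < max(abs(v - true_value) for v in agent_views))
```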
🔥5🆒4🤔2🥰1👏1
Nvidia dropped a model that rivals Qwen3 8B, with data and a base model, under a not-too-bad license (it could be better, to be clear)
NVIDIA Nemotron Nano v2 - a 9B hybrid SSM that is 6X faster than similarly sized models, while also being more accurate.
Along with this model, also released most of the data researchers used to create it, including the pretraining corpus.
NVIDIA ADLR
NVIDIA Nemotron Nano 2 and the Nemotron Pretraining Dataset v1
NVIDIA Nemotron Nano 2 is a new hybrid Mamba-Transformer reasoning model that achieves on-par or better accuracies compared to comparably sized leading open models at up to 6x higher throughput. Nemotron Pretraining Dataset v1 is a 6.6 trillion token dataset…
🔥4❤3🥰2
DeepSeek released V3.1, with context expanded to 128K.
You are welcome to try it on the official website, the app, and the mini program. The API calling method remains unchanged.
🔥3💅3👍2❤1
Digital asset platform Bullish announced that its $1.15 billion IPO proceeds were fully settled in stablecoins, making it the first IPO in the United States to be completed using stablecoin funding.
The stablecoins used include USDCV, EURCV, USDG, PYUSD, RLUSD, among others.
Bullish
Bullish receives $1.15bn of IPO proceeds in stablecoins | Bullish
Bullish (NYSE: BLSH), an institutionally focused global digital asset platform that provides market infrastructure and information services, announced that it had arranged to receive $1.15 billion of proceeds from its recently completed initial public offering…
🔥3🥰3👏2