All about AI, Web 3.0, BCI – Telegram
3.22K subscribers
724 photos
26 videos
161 files
3.08K links
This channel is about AI, Web 3.0 and brain-computer interfaces (BCI)

owner @Aniaslanyan
Sekai: A Video Dataset towards World Exploration

A high-quality dataset of 5k hours of egocentric worldwide video + audio for world exploration, created from YouTube with high-quality annotations.

Data.
Paper.
Andrej Karpathy's keynote yesterday at AI Startup School in San Francisco

Chapters:

0:00 Software is changing quite fundamentally again. LLMs are a new kind of computer, and you program them *in English*. Hence Karpathy thinks they are well deserving of a major version upgrade in terms of software.
6:06 LLMs have properties of utilities, of fabs, and of operating systems => New LLM OS, fabbed by labs, and distributed like utilities (for now). Many historical analogies apply - imo we are computing circa ~1960s.
14:39 LLM psychology: LLMs = "people spirits", stochastic simulations of people, where the simulator is an autoregressive Transformer. Since they are trained on human data, they have a kind of emergent psychology, and are simultaneously superhuman in some ways, but also fallible in many others. Given this, how do we productively work with them hand in hand?
Switching gears to opportunities...
18:16 LLMs are "people spirits" => can build partially autonomous products.
29:05 LLMs are programmed in English => make software highly accessible! (yes, vibe coding)
33:36 LLMs are new primary consumer/manipulator of digital information (adding to GUIs/humans and APIs/programs) => Build for agents!

Some of the links:
- Karpathy’s slides as keynote
- Software 2.0 blog post
- How LLMs flip the script on technology diffusion
- Vibe coding MenuGen (retrospective)
Stablecoin market supply exceeds $250 billion, per Delphi Digital

Tether and Circle dominate with 86% of circulation.

Yield-bearing stablecoins grow rapidly, with Ethena nearing $6 billion since launch.

Over 10 stablecoins have circulation above $1 billion, showing increased issuer diversity.

Over $120 billion in U.S. Treasuries are locked in stablecoins, forming a liquidity pool outside traditional markets.
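
For a rough sense of scale, the quoted figures can be turned into absolute amounts and shares. This is only back-of-the-envelope arithmetic on the numbers in the post: "exceeds" and "over" make several of them lower bounds, and the variable names are illustrative.

```python
# Figures quoted in the post, in billions of USD.
# "Exceeds" / "over" / "nearing" mean these are approximate, not exact.
TOTAL_SUPPLY = 250.0          # total stablecoin supply
TETHER_CIRCLE_SHARE = 0.86    # combined share of circulation
ETHENA_SUPPLY = 6.0           # Ethena, "nearing $6 billion"
TREASURIES_HELD = 120.0       # U.S. Treasuries backing stablecoins

tether_circle_supply = TOTAL_SUPPLY * TETHER_CIRCLE_SHARE
other_issuers_supply = TOTAL_SUPPLY - tether_circle_supply
ethena_share_pct = 100 * ETHENA_SUPPLY / TOTAL_SUPPLY
treasuries_ratio_pct = 100 * TREASURIES_HELD / TOTAL_SUPPLY

print(f"Tether + Circle: ~${tether_circle_supply:.0f}B")
print(f"All other issuers: ~${other_issuers_supply:.0f}B")
print(f"Ethena share of supply: ~{ethena_share_pct:.1f}%")
print(f"Treasuries vs. total supply: ~{treasuries_ratio_pct:.0f}%")
```

So Tether and Circle together account for roughly $215 billion, leaving about $35 billion for every other issuer, and the Treasuries figure is a little under half of total supply.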
CyberGym is a large-scale evaluation framework that stress-tests AI agents on 1,500+ real vulnerabilities across 188 major Open Source Software projects.

It challenges agents to:
– Navigate large, real-world codebases
– Reproduce PoCs for real CVEs
– Discover new, unknown vulnerabilities.

Key insights from CyberGym:
1. SOTA agents and LLMs successfully generated PoCs for up to ~18% of historical CVEs
2. More striking: they discovered 15 zero-days in the wild
All about AI, Web 3.0, BCI
Another top notch open source model at OpenAI/Meta/Google levels from &MiniMax AI (Chinese lab, ex Sensetime, $850m raised). Massive MoE similar to Deep-seek. Excels on long context (4m tokens!) which is really interesting, need to dig into their lighting…
Chinese lab MiniMax introduced this week:

1. Open-sourced LLM MiniMax-M1, setting new standards in long-context reasoning.

- World’s longest context window: 1M-token input, 80k-token output
- State-of-the-art agentic use among open-source models
- RL at unmatched efficiency: trained with just $534,700.
HF.
GitHub.
Tech Report.

2. Hailuo 02, World-Class Quality, Record-Breaking Cost Efficiency

- Best-in-class instruction following
- Handles extreme physics
- Native 1080p

3. MiniMax Audio:
- Any prompt, any voice, any emotion
- Fully customizable and multilingual.

4. Hailuo Video Agent in Beta, Vibe Videoing with Zero-touch.

MiniMax plans to achieve an end-to-end Hailuo Video Agent in 3 stages:
Stage 1: Prebuilt video Agent templates for high-quality creative videos. Users simply follow instructions and input text or images — with one click, a polished video is generated.
Stage 2: Semi-customizable video Agent. Users gain the flexibility to edit any part of the video creation process, from script to visuals to voiceover.
Stage 3: Fully autonomous, end-to-end video Agent. A complete, intelligent workflow that turns creative input into final-cut video with minimal manual effort.

This summer, the team plans to gradually roll out Stage Two of the Agent creation tools.

5. MiniMax Agent, a general intelligent agent designed to tackle long-horizon, complex tasks.

From expert-level multi-step planning to flexible task breakdown and end-to-end execution — it’s designed to act like a reliable teammate, with strengths in:

- Programming & tool use
- Multimodal understanding & generation
- Seamless MCP integration
New AI for rare disease diagnosis: SHEPHERD shows how simulation + knowledge-grounded AI = deep learning for ultra‑low label domains

SHEPHERD is a few‑shot learning model powered by a phenotypic knowledge graph to tackle over 7,000 rare diseases with just a handful (or zero) diagnosed cases.
Sakana AI introduced Reinforcement-Learned Teachers (RLTs), transforming how we teach LLMs to reason with reinforcement learning (RL).

Traditional RL focuses on “learning to solve” challenging problems with expensive LLMs and constitutes a key step in making student AI systems ultimately acquire reasoning capabilities via distillation and cold-starting.

RLTs are a new class of models prompted with not only a problem’s question but also its solution, and directly trained to generate clear, step-by-step “explanations” to teach their students.

Remarkably, an RLT with only 7B parameters produces superior results when distilling and cold-starting students in competitive and graduate-level reasoning tasks than orders-of-magnitude larger LLMs.

RLTs are just as effective when distilling 32B students, much larger than the teacher itself, unlocking a new standard for efficiency in developing reasoning language models with RL.

Paper.
Code.
Future of Work with AI Agents. Stanford's new report analyzes what 1500 workers think about working with AI Agents.

The audit proposes a large-scale framework for understanding where AI agents should automate or augment human labor.

The authors build the WORKBank, a database combining worker desires and expert assessments across 844 tasks and 104 occupations, and introduce the Human Agency Scale to quantify desired human involvement in AI-agent-supported work.

A substantial portion of current AI investment, such as YC-funded companies, targets tasks in the “Red Light” Zone (high technical feasibility but low worker desire).

This raises concerns about pushing automation where it's socially or ethically unwelcome.

Interpersonal skills are becoming more valuable

Tasks rated as needing HAS 5 (essential human involvement) were strongly associated with interpersonal communication and domain expertise.

These include editing, education, and some engineering tasks, where AI lacks the nuance or trustworthiness to operate alone.

Some High-Wage Skills May Decline in Value

The results above reveal that skills like analyzing data or updating knowledge, which currently command high wages, are less associated with high HAS tasks, implying future declines in their labor market value as AI spreads.

Role-based AI Support

From transcript analysis, the most common vision for human–AI collaboration was role-based support, where workers imagine AI tools acting as analysts, assistants, or specialists with clearly bounded responsibilities, not general-purpose agents.

Lots of other findings in this one.
BountyBench evaluates AI agents on 25 real-world, complex systems and 40 bug bounties (worth up to $30,000+), covering 9 OWASP Top 10 categories.

Key insights:

– AI agents solved bug bounty tasks worth tens of thousands of dollars
– Codex CLI & Claude Code excelled at patching (90% / 87.5%) but did much worse at exploitation (32.5% / 57.5%)
– Custom agents performed more evenly across both: Exploit (40–67.5%), Patch (45–60%)
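
The percentages above are easier to read as task counts. A minimal sketch, assuming every rate is measured over all 40 bounties and that the two numbers in each pair are listed in the same order as the agents (Codex CLI first, Claude Code second); neither assumption is confirmed by the post, and the helper name is hypothetical.

```python
from fractions import Fraction

TOTAL_BOUNTIES = 40  # from the post; assumed denominator for every rate


def solved_count(percent: str, total: int = TOTAL_BOUNTIES) -> int:
    """Convert a percentage string such as '87.5' into a whole number of tasks."""
    count = Fraction(percent) * total / 100
    if count.denominator != 1:
        raise ValueError(f"{percent}% of {total} is not a whole number of tasks")
    return int(count)


# Agent-to-number mapping is an assumption based on the order in the post.
rates = {
    ("Codex CLI", "patch"): "90",
    ("Claude Code", "patch"): "87.5",
    ("Codex CLI", "exploit"): "32.5",
    ("Claude Code", "exploit"): "57.5",
}
counts = {key: solved_count(value) for key, value in rates.items()}

for (agent, task), solved in counts.items():
    print(f"{agent:12s} {task:8s} {solved}/{TOTAL_BOUNTIES}")
```

Under those assumptions, every quoted rate maps to a whole number of bounties: 36 and 35 patched, 13 and 23 exploited. That the numbers divide evenly is at least consistent with a denominator of 40.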
Last month Cursor overtook GitHub Copilot in business spend, Ramp’s data shows.

Both continue adding users + spend, more than enough to go around in this market.

But it goes to show that being a first mover != market dominance. Not included: Claude Code, small but growing.
Unreal Labs is Hiring - Member of Technical Staff. Just raised from Sequoia & First Round Capital 🔥

Building AI to replace performance marketing teams. Looking for Python engineers to work on:

Creative AI: Turn briefs into finished images/videos using latest models (Runway, Sora, etc.)
Data Pipeline: Crawl & process social media ad data at scale

What you need?
1. Strong Python skills
2. Interest in generative AI
3. Builder mindset

What you get?
- London-based (help with relocation)
- Good salary + equity
- Unlimited GPU/API budget
- Small team with big-tech experience.

When a Sequoia-backed startup offers unlimited GPU budget, you listen 👀
Google DeepMind introduced Gemini Robotics On-Device, a VLA model to help make robots faster, highly efficient, and adaptable to new tasks and environments, without needing a constant internet connection.

Key takeaways:


1. It has the generality and dexterity of Gemini Robotics - but it can run locally on the device
2. It can handle a wide variety of complex, two-handed tasks out of the box
3. It can learn new skills with as few as 50-100 demonstrations.

The model supports multiple embodiments, from humanoids to industrial bi-arm robots, even though it was pre-trained on ALOHA, and it follows instructions from humans.

Also launched the Gemini Robotics software development kit (SDK) to help developers fine-tune the model for their own applications, including by testing it in the MuJoCo physics simulator.
Anthropic is preparing a Memory feature for Claude Web, added in the latest mobile update.
ChatGPT added Search Connectors

It launched with a slate of 12 apps (Box, Gmail, Outlook, GitHub, etc.) that you can authenticate into and search across in the ChatGPT interface.

Feels like a real step towards a broader workspace vision.
Google launched Gemini CLI, a powerful open-source AI agent built for the terminal.

- Built on Gemini 2.5 Pro
- 1 million token context window
- Free tier with 60 requests per minute and 1,000 per day
- Google Search grounding for real-time context
- Script and plugin support
- Non-interactive mode for automation
- Support for Model Context Protocol (MCP)
- Integrated with Gemini Code Assist in VS Code
- Fully open-source under Apache 2.0
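
A note on the free tier: at the full 60 requests per minute, the 1,000-per-day allowance lasts under 17 minutes (1000 / 60 ≈ 16.7). For scripted use in non-interactive mode it may help to track both limits on the client side. The sketch below is a generic sliding-window counter, not part of Gemini CLI; the class name and the way Google actually enforces these quotas are assumptions.

```python
from collections import deque

SECONDS_PER_MINUTE = 60.0
SECONDS_PER_DAY = 86_400.0


class QuotaTracker:
    """Hypothetical client-side check for a per-minute and a per-day limit."""

    def __init__(self, per_minute: int = 60, per_day: int = 1000):
        # (window length in seconds, max requests allowed inside that window)
        self.limits = [(SECONDS_PER_MINUTE, per_minute), (SECONDS_PER_DAY, per_day)]
        self.sent = deque()  # timestamps of accepted requests, oldest first

    def try_acquire(self, now: float) -> bool:
        """Record a request at time `now` if both limits allow it."""
        # Forget requests that are older than the longest window.
        while self.sent and now - self.sent[0] >= SECONDS_PER_DAY:
            self.sent.popleft()
        for window, limit in self.limits:
            recent = sum(1 for t in self.sent if now - t < window)
            if recent >= limit:
                return False
        self.sent.append(now)
        return True


# Simulate a script that fires 60 requests at the start of each minute.
tracker = QuotaTracker()
accepted_per_minute = []
for minute in range(20):
    now = minute * SECONDS_PER_MINUTE
    accepted_per_minute.append(sum(tracker.try_acquire(now) for _ in range(60)))

print(accepted_per_minute)
print("total accepted:", sum(accepted_per_minute))
```

Passing the clock in as `now` keeps the tracker deterministic and easy to test; a real script would pass `time.monotonic()` and sleep when `try_acquire` returns `False`.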