All about AI, Web 3.0, BCI
3.22K subscribers
724 photos
26 videos
161 files
3.08K links
This channel is about AI, Web 3.0, and brain-computer interfaces (BCI)

owner @Aniaslanyan
Apple introduced EgoDex, the largest and most diverse dataset of dexterous human manipulation to date: 829 hours of egocentric video plus paired 3D hand poses across 194 tasks.

Unlike teleoperation data, egocentric video is passively scalable, like text and images on the internet.

Researchers use Apple Vision Pro to collect video + precise pose annotations (unlike Ego4D, which lacks native pose data). This unlocks 5x the scale of existing large datasets like DROID.

The researchers also propose new benchmarks and train imitation learning policies for dexterous trajectory prediction. Below are 30 Hz wrist and fingertip trajectories on the test set, where blue = ground truth, red = model predictions, and points get lighter the further they are in the future, up to 2 seconds ahead.

The full dataset is now publicly available to the community, access details are in the paper. Sample code for data loading is coming soon.
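Until the official loading code lands, here is a minimal loading sketch in Python. The per-episode file names, the MP4 + HDF5 pairing, and the HDF5 key layout are assumptions for illustration, not the dataset's documented schema.

```python
from pathlib import Path

import cv2    # pip install opencv-python
import h5py   # pip install h5py

def load_episode(video_path: Path, pose_path: Path):
    """Load one (assumed) EgoDex episode: RGB frames plus 3D hand-pose tracks."""
    # Read all frames from the egocentric MP4.
    frames = []
    cap = cv2.VideoCapture(str(video_path))
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        frames.append(frame)
    cap.release()

    # Read pose tracks from the paired HDF5 file (assumed flat dataset layout).
    with h5py.File(pose_path, "r") as f:
        poses = {name: d[()] for name, d in f.items() if isinstance(d, h5py.Dataset)}

    return frames, poses

# Hypothetical file names, shown only to illustrate the call.
frames, poses = load_episode(Path("episode_0000.mp4"), Path("episode_0000.hdf5"))
print(len(frames), sorted(poses))
```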
👏6
Elon Musk's xAI announced Live Search in the API

The new beta feature (free for a limited time) allows apps leveraging Grok models to search real-time information from X and the web, including news.

Here's how easy it is to try out Grok 3's new live search:

1/ Grab a key from xAI
2/ Remix our template
3/ Add your API key to Secrets
4/ Click Run and start chatting with Grok.

Since the template is built with Agent, you can remix it and keep editing with Agent.

Here's the template to get started.
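If you would rather call the API directly instead of remixing the template, a minimal sketch looks like this. It assumes xAI's OpenAI-compatible chat completions endpoint and the search_parameters request field described in the Live Search beta docs; the model name and field values are assumptions, so check the current documentation.

```python
import os
import requests

# Assumed model name and search_parameters shape; verify against xAI's
# Live Search beta documentation before relying on this.
resp = requests.post(
    "https://api.x.ai/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['XAI_API_KEY']}"},
    json={
        "model": "grok-3-latest",
        "messages": [
            {"role": "user", "content": "What are today's top AI headlines?"}
        ],
        # Lets Grok decide when to pull in live results from X and the web.
        "search_parameters": {"mode": "auto"},
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```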
🆒3
Tencent presented Hunyuan-TurboS

- Hybrid Transformer-Mamba MoE (56B active params) trained on 16T tokens
- Dynamically switches between rapid-response and deeper “thinking” modes
- Overall top 7 on LMSYS Chatbot Arena.
VanEck will launch a private digital assets fund in June 2025 focused on the Avalanche ecosystem.

The fund will invest in projects with long-term token utility around the token generation event (TGE) stage across sectors such as gaming, financial services, payments, and AI, while allocating idle capital to Avalanche-native RWA products to maintain onchain liquidity.
👏3
G42 and OpenAI announced Stargate UAE

#Stargate UAE: a next-generation 1 GW AI compute cluster that will be built by G42 and operated by OpenAI and Oracle.

The collaboration will also include Cisco and SoftBank Group, and NVIDIA will supply the latest Blackwell GB300 systems. The cluster will sit at the heart of the 5 GW AI campus announced last week.
👍3
Researchers introduced MedBrowseComp, a challenging deep research benchmark for LLM agents in medicine

MedBrowseComp is the first benchmark that tests the ability of agents to retrieve & synthesize multi-hop medical facts from oncology knowledge bases.
🔥4
Claude 4 is here, and it's Anthropic's vision for the future of agents
👍6
More details about Claude 4:

—Both models are hybrid models
—Opus 4 is great at understanding codebases and “the right choice” for agentic workflows
—Sonnet 4 excels at everyday tasks and is your “daily go-to”.

Coding agents are a huge theme here at the event and clearly a major focus for what’s coming next.

-Claude 4 has significantly greater agentic capabilities
-A new Code execution tool
-Claude Code is coming to VS Code and JetBrains
-Can now run Claude Code in GitHub.

Some more details on Claude 4 Opus:

—Matches or beats the best models in the world
—SOTA for coding, agentic tool use, and writing
—Memory capabilities across sessions
—Extended thinking mode for complex problem-solving (see the sketch after this list)
—200K context window with 32K output tokens.
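As a rough illustration of the extended thinking mode mentioned above, here is a minimal sketch using the Anthropic Python SDK. The model identifier and token budgets are assumptions; check Anthropic's docs for the current values and limits.

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Assumed model name and budget values; the thinking budget must be
# smaller than max_tokens.
response = client.messages.create(
    model="claude-opus-4-20250514",
    max_tokens=4096,
    thinking={"type": "enabled", "budget_tokens": 2048},
    messages=[
        {"role": "user", "content": "Plan a safe migration of a monolith to microservices."}
    ],
)

# The response interleaves "thinking" blocks with the final "text" blocks.
for block in response.content:
    if block.type == "text":
        print(block.text)
```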

Claude Code:

—Now generally available
—Integrates with VS Code and JetBrains IDEs
—You can now see changes live inline in your editor
—A new Claude Code SDK for more flexibility.

If you want to read more about Sonnet & Opus 4, including a bunch of alignment and reward hacking findings, check out the model card.
👍3
ByteDance introduced MMaDA: Multimodal Large Diffusion Language Models

MMaDA is a novel class of multimodal diffusion foundation models designed to achieve superior performance across diverse domains such as textual reasoning, multimodal understanding, and text-to-image generation.

It surpasses LLaMA-3-7B and Qwen2-7B in textual reasoning, Show-o and SEED-X in multimodal understanding, and SDXL and Janus in text-to-image generation.

3 key innovations:
1. A unified diffusion architecture with a shared probabilistic formulation and a modality-agnostic design, eliminating the need for modality-specific components (a toy decoding sketch follows below).
2. A mixed long chain-of-thought (CoT) fine-tuning strategy that curates a unified CoT format across modalities.
3. UniGRPO, a unified policy-gradient-based RL algorithm specifically tailored for diffusion foundation models.
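To make the unified diffusion idea more concrete, here is a toy sketch of confidence-based iterative unmasking, the general decoding scheme used by discrete masked-diffusion language models. The denoiser interface and unmasking schedule are hypothetical; this is a generic illustration, not MMaDA's code.

```python
import torch

def diffusion_decode(denoiser, seq_len, mask_id, steps=8):
    """Start from an all-[MASK] sequence and unmask the most confident
    positions a bit more at every step (generic toy scheme, not MMaDA's)."""
    tokens = torch.full((1, seq_len), mask_id, dtype=torch.long)
    for step in range(steps):
        logits = denoiser(tokens)                 # (1, seq_len, vocab_size)
        conf, pred = logits.softmax(-1).max(-1)   # per-position confidence
        still_masked = tokens == mask_id
        target = seq_len * (step + 1) // steps    # unmasked count after this step
        k = target - int((~still_masked).sum())
        if k > 0:
            conf = conf.masked_fill(~still_masked, float("-inf"))
            idx = conf[0].topk(k).indices
            tokens[0, idx] = pred[0, idx]
    return tokens

# Dummy denoiser just to show the call; a real model replaces this.
vocab_size, mask_id = 1000, 999
dummy = lambda toks: torch.randn(toks.shape[0], toks.shape[1], vocab_size)
print(diffusion_decode(dummy, seq_len=16, mask_id=mask_id))
```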

GitHub.
👏4
Humans can now see near-infrared light! A very cool development in biophotonics: engineered contact lenses convert invisible NIR signals into visible colors, enabling wearable, power-free NIR vision.

This has the potential to shift our perceptual boundaries, showing that the brain can integrate novel spectral inputs when they are mapped onto familiar visual codes, and reframing light-based information processing and sensory integration.
The World Economic Forum has released a report on Asset Tokenization in Financial Markets.

Highlights

1. Tokenization offers a new model of digital asset ownership that enhances transparency, efficiency and accessibility.

2. This report analyses asset class use cases in issuance, securities financing and asset management, identifying factors that enable successful tokenization implementation.

3. Key differentiators include a shared system of record, flexible custody, programmability, fractional ownership and composability across asset types. These features can democratize access to financial markets and modernize infrastructure.

4. While the benefits are demonstrated, adoption is slowed by challenges such as legacy infrastructure, regulatory fragmentation, limited interoperability and liquidity issues.

5. Effective deployment requires phased approaches and strategic coordination among financial institutions, regulators and technology providers. Factors affecting design decisions – such as ledger type, settlement mechanisms and market operating hours – must also be carefully considered.

6. Ultimately, tokenization holds promise for a more inclusive and efficient financial system, provided stakeholders align on standards, safeguards and scalable solutions.

7. Tokenization is expected to reshape financial markets by increasing transparency, efficiency, speed, and inclusivity—paving the way for more resilient and accessible financial systems.
Singapore's Sharpa unveiled SharpaWave, a lifelike robotic hand

—Features 22 degrees of freedom (DOF), balancing dexterity and strength
—Each fingertip has 1,000+ tactile sensing pixels and 5 mN pressure sensitivity
—AI models adapt the hand's grip and modulate force
🔥6
Researchers introduced SPORT, a multimodal agent that explores tool usage without human annotation.

It leverages step-wise direct preference optimization (DPO) to further enhance tool-use capabilities after supervised fine-tuning (SFT); a minimal sketch of the objective follows below.

SPORT achieves improvements on the GTA and GAIA benchmarks.
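For context on the step-wise DPO objective mentioned above, here is a minimal sketch of a step-level DPO loss in PyTorch. It illustrates the standard DPO formulation applied per tool-call step, not the authors' exact implementation.

```python
import torch
import torch.nn.functional as F

def stepwise_dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Each element compares the policy's log-prob of a preferred vs. a
    dispreferred tool-call step against a frozen SFT reference model
    (illustrative sketch, not SPORT's exact formulation)."""
    chosen_margin = beta * (pi_chosen - ref_chosen)
    rejected_margin = beta * (pi_rejected - ref_rejected)
    return -F.logsigmoid(chosen_margin - rejected_margin).mean()

# Example: log-probs for a batch of 4 step-level preference pairs.
pi_c = torch.tensor([-4.0, -3.5, -5.0, -4.2])
pi_r = torch.tensor([-6.0, -5.5, -5.2, -7.0])
ref_c = torch.tensor([-4.5, -4.0, -5.1, -4.8])
ref_r = torch.tensor([-5.8, -5.2, -5.0, -6.5])
print(stepwise_dpo_loss(pi_c, pi_r, ref_c, ref_r))
```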
Google introduced Lyria RealTime, a new experimental interactive music generation model that lets anyone create, control, and perform music interactively in real time.

It is available via the Gemini API, and you can try the demo app in Google AI Studio.
Anthropic is now rolling out voice mode in beta on mobile.

Try starting a voice conversation and asking Claude to summarize your calendar or search your docs. Voice mode in beta is available in English and coming to all plans in the next few weeks.
Game-Changer for AI: Meet the Low-Latency-Llama Megakernel

Buckle up, because a new breakthrough in AI optimization just dropped, and it's got even Andrej Karpathy buzzing.

The Low-Latency-Llama Megakernel is a new approach to running models like Llama-1B faster and smarter on GPUs.

What’s the Big Deal?
Instead of splitting a neural network’s forward pass into multiple CUDA kernels (with pesky synchronization delays), this megakernel runs everything in a single kernel. Think of it as swapping a clunky assembly line for a sleek, all-in-one super-machine!

Why It’s Awesome:
1. No Kernel Boundaries, No Delays. By eliminating kernel switches, the GPU works non-stop, slashing latency and boosting efficiency.
2. Memory Magic. Threads are split into “loaders” and “workers.” While loaders fetch future weights, workers crunch current data, using 16KiB memory pages to hide latency.
3. Fine-Grained Sync. Without kernel boundaries, custom synchronization was needed. This not only solves the issue but unlocks tricks like early attention head launches.
4. Open Source. The code is fully open, so you can stop “torturing” your models with slow kernel launches (as the devs humorously put it) and optimize your own pipelines!

Why It Matters?
- Speed Boost. Faster inference means real-time AI applications (think chatbots or recommendation systems) with lower latency.
- Cost Savings. Optimized GPU usage reduces hardware demands, perfect for startups or budget-conscious teams.
- Flexibility. Open-source code lets developers tweak it for custom models or use cases.

Karpathy’s Take:
Andrej calls it “so so so cool,” praising the megakernel for enabling “optimal orchestration of compute and memory.” He argues that traditional sequential kernel approaches can’t match this efficiency.
🆒5