DeepSeek – Telegram
DeepSeek
1.1K subscribers
38 photos
32 links
Unravel the mystery of AGI with curiosity. Answer the essential questions with long-termism. https://www.deepseek.com
🌟 Impressive Results of DeepSeek-R1-Lite-Preview Across Benchmarks!
🌟 Inference Scaling Laws of DeepSeek-R1-Lite-Preview
Longer Reasoning, Better Performance. DeepSeek-R1-Lite-Preview shows steady score improvements on AIME as thought length increases.
Re DeepSeek has not issued any cryptocurrency. Currently, there is only one official account on the Twitter platform. We will not contact anyone through other accounts. Please stay vigilant and guard against potential scams.

via Twitter @DeepSeek
🎉 Introducing DeepSeek App!

💡 Powered by world-class DeepSeek-V3
🆓 FREE to use with seamless interaction
📱 Now officially available on the App Store, Google Play & major Android markets
🔗Download now: https://download.deepseek.com/app/

🌟 1/3

via Twitter @DeepSeek
Re Key Features of DeepSeek App:

🔐 Easy login: E-mail/Google Account/Apple ID
☁️ Cross-platform chat history sync
🔍 Web search & Deep-Think mode
📄 File upload & text extraction

🌟 2/3

via Twitter @DeepSeek
Re ⚠️ Important Notice:

100% FREE - No ads, no in-app purchases
🛡️ Download only from official channels to avoid being misled
📲 Search "DeepSeek" in your app store or visit our website for direct links

🌟 3/3

via Twitter @DeepSeek
🚀 DeepSeek-R1 is here!

⚡ Performance on par with OpenAI-o1
📖 Fully open-source model & technical report
🏆 MIT licensed: Distill & commercialize freely!

🌐 Website & API are live now! Try DeepThink at http://chat.deepseek.com today!

🐋 1/n

via Twitter @DeepSeek
Re 🛠️ DeepSeek-R1: Technical Highlights

📈 Large-scale RL in post-training
🏆 Significant performance boost with minimal labeled data
🔢 Math, code, and reasoning tasks on par with OpenAI-o1
📄 More details: https://github.com/deepseek-ai/DeepSeek-R1/blob/main/DeepSeek_R1.pdf

🐋 4/n

via Twitter @DeepSeek
Re 🌐 API Access & Pricing

⚙️ Use DeepSeek-R1 by setting model=deepseek-reasoner (usage sketch below)
💰 $0.14 / million input tokens (cache hit)
💰 $0.55 / million input tokens (cache miss)
💰 $2.19 / million output tokens

📖 API guide: https://api-docs.deepseek.com/guides/reasoning_model

🐋 5/n

via Twitter @DeepSeek
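For reference, a minimal call sketch for the endpoint above, following the OpenAI-compatible API described in the linked guide. The base_url and the reasoning_content field are from DeepSeek's public API docs; the prompt and key are placeholders.

# Sketch only: calling deepseek-reasoner through the OpenAI SDK.
from openai import OpenAI

client = OpenAI(
    api_key="<YOUR_DEEPSEEK_API_KEY>",    # placeholder; issued on the platform
    base_url="https://api.deepseek.com",  # DeepSeek's OpenAI-compatible endpoint
)

resp = client.chat.completions.create(
    model="deepseek-reasoner",  # selects DeepSeek-R1
    messages=[{"role": "user", "content": "How many primes are below 100?"}],
)

msg = resp.choices[0].message
print(msg.reasoning_content)  # the chain of thought generated before the answer
print(msg.content)            # the final answer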
To prevent any potential harm, we reiterate that @deepseek_ai is our sole official account on Twitter/X.

Any accounts:
- representing us
- using identical avatars
- using similar names
are impersonations.

Please stay vigilant to avoid being misled!
📢 Terminology Correction: DeepSeek-R1’s code and models are released under the MIT License.
🎉 Excited to see everyone’s enthusiasm for deploying DeepSeek-R1! Here are our recommended settings for the best experience:

• No system prompt
• Temperature: 0.6
• Official prompts for search & file upload: bit.ly/4hyH8np
• Guidelines to mitigate the model bypassing thinking: bit.ly/4gJrhkF

The official DeepSeek deployment runs the same model as the open-source version—enjoy the full DeepSeek-R1 experience! 🚀
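To make those settings concrete, here is a hedged sketch of sampling one of the open-source R1 checkpoints locally with Hugging Face transformers: no system message, temperature 0.6. The small distilled variant is used only to keep the example lightweight; the prompt is a placeholder.

# Sketch: sampling an open-source R1 checkpoint with the recommended settings.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"  # small distilled variant
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Recommended: no system prompt; put all instructions in the user turn.
messages = [{"role": "user", "content": "Solve x^2 - 5x + 6 = 0. Think step by step."}]
inputs = tok.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")

out = model.generate(inputs, do_sample=True, temperature=0.6, max_new_tokens=1024)
print(tok.decode(out[0, inputs.shape[-1]:], skip_special_tokens=True))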
🚀 Introducing NSA: A Hardware-Aligned and Natively Trainable Sparse Attention mechanism for ultra-fast long-context training & inference!

Core components of NSA:
• Dynamic hierarchical sparse strategy
• Coarse-grained token compression
• Fine-grained token selection

💡 With optimized design for modern hardware, NSA speeds up inference while reducing pre-training costs—without compromising performance. It matches or outperforms Full Attention models on general benchmarks, long-context tasks, and instruction-based reasoning.

📖 For more details, check out our paper here: https://arxiv.org/abs/2502.11089
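For intuition about the coarse-to-fine idea, here is a toy single-query sketch: compress keys into per-block summaries, score blocks against the query, then run exact attention over only the selected blocks. This is an illustration only, not NSA itself, which also adds a sliding-window branch and learned gating across branches (see the paper).

# Toy block-sparse attention: coarse compression + fine top-k selection.
import torch

def toy_block_sparse_attention(q, k, v, block=64, top_k=4):
    # q: (d,), k/v: (T, d); single head, single query, for illustration.
    T, d = k.shape
    n = T // block
    kb = k[: n * block].reshape(n, block, d)
    vb = v[: n * block].reshape(n, block, d)

    # Coarse stage: compress each key block to its mean and score it.
    block_scores = kb.mean(dim=1) @ q                 # (n,)

    # Fine stage: keep only the top-k highest-scoring blocks.
    sel = block_scores.topk(min(top_k, n)).indices
    k_sel = kb[sel].reshape(-1, d)
    v_sel = vb[sel].reshape(-1, d)

    # Exact attention restricted to the selected tokens.
    w = torch.softmax(k_sel @ q / d ** 0.5, dim=0)
    return w @ v_sel

out = toy_block_sparse_attention(torch.randn(128), torch.randn(4096, 128), torch.randn(4096, 128))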
🚀 Day 0: Warming up for #OpenSourceWeek!

We're a tiny team @deepseek_ai exploring AGI. Starting next week, we'll be open-sourcing 5 repos, sharing our small but sincere progress with full transparency.

These humble building blocks in our online service have been documented, deployed and battle-tested in production.

As part of the open-source community, we believe that every line shared becomes collective momentum that accelerates the journey.

Daily unlocks are coming soon. No ivory towers - just pure garage-energy and community-driven innovation.
🚀 Day 1 of #OpenSourceWeek: FlashMLA

Honored to share FlashMLA - our efficient MLA decoding kernel for Hopper GPUs, optimized for variable-length sequences and now in production.

✅ BF16 support
✅ Paged KV cache (block size 64)
⚡️ 3000 GB/s memory-bound & 580 TFLOPS compute-bound on H800

🔗 Explore on GitHub: https://github.com/deepseek-ai/FlashMLA
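The call pattern below is adapted from the repo's README usage example (check the repo for current signatures): scheduling metadata is computed once per batch of variable-length sequences, then the kernel runs per layer against the paged KV cache. All tensors and the s_q/h_q/h_kv/dv sizes are assumed to be prepared by the surrounding inference code.

# Adapted from the FlashMLA README; setup of q_i, kvcache_i, block_table,
# cache_seqlens and the size parameters is assumed.
from flash_mla import get_mla_metadata, flash_mla_with_kvcache

# One-time tile-scheduling metadata for this batch.
tile_scheduler_metadata, num_splits = get_mla_metadata(
    cache_seqlens, s_q * h_q // h_kv, h_kv
)

for i in range(num_layers):
    # MLA decoding for layer i over the paged (block size 64) KV cache.
    o_i, lse_i = flash_mla_with_kvcache(
        q_i, kvcache_i, block_table, cache_seqlens, dv,
        tile_scheduler_metadata, num_splits, causal=True,
    )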
🚀 Day 2 of #OpenSourceWeek: DeepEP

Excited to introduce DeepEP - the first open-source EP communication library for MoE model training and inference.

✅ Efficient and optimized all-to-all communication
✅ Both intranode and internode support with NVLink and RDMA
✅ High-throughput kernels for training and inference prefilling
✅ Low-latency kernels for inference decoding
✅ Native FP8 dispatch support
✅ Flexible GPU resource control for computation-communication overlapping

🔗 GitHub: github.com/deepseek-ai/DeepEP
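To show concretely what all-to-all dispatch means here, below is a plain torch.distributed sketch of the pattern. DeepEP implements this with optimized NVLink/RDMA kernels; this is not DeepEP's API. It assumes a process group (e.g., NCCL) is already initialized.

# Not DeepEP's API: the MoE dispatch pattern in vanilla torch.distributed.
import torch
import torch.distributed as dist

def moe_dispatch(tokens, dest_rank, world):
    # tokens: (T, d) activations; dest_rank: (T,) rank owning each token's expert.
    order = dest_rank.argsort()                 # group tokens by destination rank
    send = tokens[order].contiguous()
    in_splits = torch.bincount(dest_rank, minlength=world)

    # First all-to-all: exchange per-rank token counts to size the buffers.
    out_splits = torch.empty_like(in_splits)
    dist.all_to_all_single(out_splits, in_splits)

    # Second all-to-all: exchange the token activations themselves.
    recv = send.new_empty(int(out_splits.sum()), tokens.shape[1])
    dist.all_to_all_single(
        recv, send,
        output_split_sizes=out_splits.tolist(),
        input_split_sizes=in_splits.tolist(),
    )
    return recv  # tokens this rank's experts must process; combine reverses this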
🚀 Day 3 of #OpenSourceWeek: DeepGEMM

Introducing DeepGEMM - an FP8 GEMM library that supports both dense and MoE GEMMs, powering V3/R1 training and inference.

⚡️ Up to 1350+ FP8 TFLOPS on Hopper GPUs
✅ No heavy dependency, as clean as a tutorial
✅ Fully Just-In-Time compiled
✅ Core logic at ~300 lines - yet outperforms expert-tuned kernels across most matrix sizes
✅ Supports dense layout and two MoE layouts

🔗 GitHub: https://github.com/deepseek-ai/DeepGEMM
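For intuition about the data format (an emulation, not DeepGEMM's API): FP8 GEMMs of this kind consume operands quantized with fine-grained scales. The sketch below quantizes in 128-element groups along the last dimension and forms the dequantized reference product that the fused kernel computes far faster on Hopper tensor cores. Shapes and group size are illustrative.

# Emulation only: fine-grained FP8 group quantization plus a reference matmul.
import torch

FP8_MAX = 448.0  # largest finite float8_e4m3fn value

def fp8_group_quant(x, group=128):
    xg = x.reshape(*x.shape[:-1], -1, group)
    scale = xg.abs().amax(dim=-1, keepdim=True).clamp_min(1e-4) / FP8_MAX
    return (xg / scale).to(torch.float8_e4m3fn).reshape(x.shape), scale.squeeze(-1)

def fp8_dequant(q, scale, group=128):
    qf = q.to(torch.float32).reshape(*q.shape[:-1], -1, group)
    return (qf * scale.unsqueeze(-1)).reshape(q.shape)

a = torch.randn(64, 256)    # activations
b = torch.randn(128, 256)   # weights, row-major ("NT": B is used transposed)
qa, sa = fp8_group_quant(a)
qb, sb = fp8_group_quant(b)
ref = fp8_dequant(qa, sa) @ fp8_dequant(qb, sb).T  # what the FP8 GEMM computes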
🚨 Off-Peak Discounts Alert!

Starting today, enjoy off-peak discounts on the DeepSeek API Platform from 16:30–00:30 UTC daily:

🔹 DeepSeek-V3 at 50% off
🔹 DeepSeek-R1 at a massive 75% off

Make smarter use of your resources and save more during these high-value hours!
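A quick worked example on a hypothetical workload, using the R1 prices quoted earlier in this channel and assuming, for illustration, that the 75% discount applies uniformly to input and output tokens:

# Hypothetical job: 10M input tokens (cache miss) + 2M output tokens on R1.
standard = 10 * 0.55 + 2 * 2.19   # = $9.88 at standard rates
off_peak = standard * (1 - 0.75)  # = $2.47 inside the 16:30-00:30 UTC window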