The engine powering Grok is Grok-1, our frontier LLM, which we developed over the last four months. Grok-1 has gone through many iterations over this span of time.
After announcing xAI, we trained a prototype LLM (Grok-0) with 33 billion parameters. This early model approaches LLaMA 2 (70B) capabilities on standard LM benchmarks but uses only half of its training resources. In the last two months, we have made significant improvements in reasoning and coding capabilities leading up to Grok-1, a state-of-the-art language model that is significantly more powerful, achieving 63.2% on the HumanEval coding task and 73% on MMLU.
...
At the frontier of deep learning research, reliable infrastructure must be built with the same care as datasets and learning algorithms. To create Grok, we built a custom training and inference stack based on Kubernetes, Rust, and JAX.
https://x.ai
x.ai
Welcome | xAI
xAI is an AI company with the mission of advancing scientific discovery and gaining a deeper understanding of our universe.
More about updated models and new GPT capabilities
https://openai.com/blog/new-models-and-developer-products-announced-at-devday
https://openai.com/blog/introducing-gpts
OpenAI
New models and developer products announced at DevDay
GPT-4 Turbo with 128K context and lower prices, the new Assistants API, GPT-4 Turbo with Vision, DALL·E 3 API, and more.
In case you didn't have time to watch the keynote (https://www.youtube.com/live/U9mJuUkhUzk?si=9_KjNVsS3x7vxCdP) or read any other summaries, here is my very brief one.
# GPT-4 Turbo
## 1 context length
- up to 128K tokens (~300 pages of a standard book)
## 2 more control:
- valid JSON mode for output
- can call multiple functions in one message + better instruction following
- consistent output with the seed param
- logprobs in the API soon
## 3 better world knowledge
- bringing retrieval to the platform
- knowledge cutoff shifted from Sep 2021 to Apr 2023
## 4 new modalities
- DALL·E 3, GPT-4 Turbo with vision, TTS in the API
- protect from misuse
- 6 preset voices
- open-source Whisper v3, in the API soon
## 5 Customization
- fine-tuning for gpt-3.5-16k
- fine-tuning for gpt-4 experimental access program
- custom models for new domains, with tools to adjust each training stage
## 6 higher rate limits
- 2x tokens per minute
- can request a further increase in settings
## 7 Lower Pricing
GPT 4 turbo
- 3x less for input tokens (1c per 1000 tokens)
- 2x less for completion tokens (3c per 1000)
- total 2.75x less for most devs
- starting today
- speed is also a lot faster
GPT 3.5 turbo 16k
- 0.1c/0.2c per 1000 input/output tokens (3x/2x cheaper; even cheaper than the previous 4k model)
old Fine-tuning GPT 3.5 turbo 4k
- 1.2c/1.6c
new Fine-tuning GPT 3.5 turbo 16k
- 0.3c/0.6c (4x/2.7x)
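The new control knobs above (JSON mode, `seed`) can be sketched with the OpenAI Python SDK (v1.x). A minimal sketch, not a definitive implementation: the model name, prompt, and seed value are illustrative assumptions.

```python
import json
import os

# Request illustrating two of the new controls: JSON mode guarantees
# syntactically valid JSON output, and `seed` requests reproducible
# (best-effort deterministic) sampling. Prompt and values are illustrative.
request = {
    "model": "gpt-4-1106-preview",
    "response_format": {"type": "json_object"},  # valid JSON mode
    "seed": 42,                                  # best-effort determinism
    "messages": [
        {"role": "system", "content": "Reply only with JSON."},
        {"role": "user", "content": 'List three primes as {"primes": [...]}'},
    ],
}

# Only call the API when a key is configured; otherwise the dict above
# simply documents the request shape.
if os.environ.get("OPENAI_API_KEY"):
    from openai import OpenAI  # pip install openai>=1.0

    client = OpenAI()
    completion = client.chat.completions.create(**request)
    print(json.loads(completion.choices[0].message.content))
```

Note that `seed` gives consistent output only on a best-effort basis; the response also carries a `system_fingerprint` so you can tell when the backend changed underneath you.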
# Building on the platform
- Copyright shield for enterprise and API
- defend customers and pay costs incurred
- reminder: OpenAI doesn't train on API or ChatGPT Enterprise data
# ChatGPT news
- now uses GPT-4 turbo by default
- can browse web
- no more model picker
# Agents
- Gradual iterative deployment
- GPTs -- tailored versions of GPT (instructions, expanded knowledge, actions)
- data is shared only on permission
- build with natural language in GPT Builder
- can upload documents
- can publish publicly, keep it private, share by link, or create it for your company in ChatGPT Enterprise
- Launching GPT Store later this month
- Revenue sharing will be there
- Bringing the same concept to API with Assistants API
# Assistants API (beta today)
- persistent threads with long-running conversation history (threads and messages handle state management)
- retrieval, can read pdf files, RAG
- code interpreter can generate and run code (Python)
- function calling
- can navigate threads in the console and look inside
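The Assistants API flow above (create assistant → thread → message → run) can be sketched with the same SDK's `beta` namespace. The assistant's name, instructions, and the user message are my illustrative assumptions, not from the keynote.

```python
import os

# Illustrative assistant spec using the tools listed above (retrieval and
# the Python code interpreter); name/instructions/model are assumptions.
assistant_spec = {
    "name": "Docs helper",
    "instructions": "Answer questions about the user's uploaded files.",
    "model": "gpt-4-1106-preview",
    "tools": [{"type": "retrieval"}, {"type": "code_interpreter"}],
}

# Guarded sketch of the create-assistant -> thread -> message -> run flow.
if os.environ.get("OPENAI_API_KEY"):
    from openai import OpenAI  # pip install openai>=1.0

    client = OpenAI()
    assistant = client.beta.assistants.create(**assistant_spec)
    thread = client.beta.threads.create()  # persistent, server-side state
    client.beta.threads.messages.create(
        thread_id=thread.id, role="user", content="Summarize the report."
    )
    run = client.beta.threads.runs.create(
        thread_id=thread.id, assistant_id=assistant.id
    )  # poll run.status until "completed", then read the thread's messages
```

The thread holds the conversation state server-side, which is exactly what you can then inspect in the console.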
YouTube
OpenAI DevDay: Opening Keynote
Join us for the opening keynote from OpenAI DevDay — OpenAI’s first developer conference.
We’re gathering developers from around the world for an in-person day of programming to learn about the latest AI advancements and explore what lies ahead.
New models…
Interesting news.
https://www.hpcwire.com/2023/11/13/training-of-1-trillion-parameter-scientific-ai-begins/
What's interesting isn't even that a 1T-parameter model is being trained (if it's MoE, there have been bigger ones), but that it's not being done on Nvidia hardware. Could this finally be real competition?
"Argonne National Laboratory (ANL) is creating a generative AI model called AuroraGPT and is pouring a giant mass of scientific information into creating the brain.
The model is being trained on its Aurora supercomputer, which delivers more than half an exaflop of performance at ANL. The system has Intel’s Ponte Vecchio GPUs, which provide the main computing power."
...
"Brkic said its Ponte Vecchio GPUs outperformed Nvidia’s A100 GPUs in another Argonne supercomputer called Theta, which has a peak performance of 11.7 petaflops."
HPCwire
Training of 1-Trillion Parameter Scientific AI Begins
A US national lab has started training a massive AI brain that could ultimately become the must-have computing resource for scientific researchers. Argonne National Laboratory (ANL) is creating a generative […]
Image and text generation have long been solid and mainstream, while music and video lagged behind. Now DeepMind has taken on music:
https://deepmind.google/discover/blog/transforming-the-future-of-music-creation/
Google DeepMind
Transforming the future of music creation
Announcing our most advanced music generation model and two new AI experiments, designed to open a new playground for creativity
Fresh rumors: OpenAI has started working on GPT-5
https://twitter.com/rowancheung/status/1724079608054812684?t=3Fs3ELPj6JKQH6pcYSHZuw&s=19
Well, how about that!
"Mr. Altman’s departure follows a deliberative review process by the board, which concluded that he was not consistently candid in his communications with the board, hindering its ability to exercise its responsibilities. The board no longer has confidence in his ability to continue leading OpenAI."
https://openai.com/blog/openai-announces-leadership-transition
"Mr. Altman’s departure follows a deliberative review process by the board, which concluded that he was not consistently candid in his communications with the board, hindering its ability to exercise its responsibilities. The board no longer has confidence in his ability to continue leading OpenAI."
https://openai.com/blog/openai-announces-leadership-transition
OpenAI
OpenAI announces leadership transition
Scandals, intrigue, investigations
https://www.forbes.com/sites/alexkonrad/2023/11/17/these-are-the-people-that-fired-openai-ceo-sam-altman/
Forbes
These Are The People That Fired OpenAI CEO Sam Altman
AI giant OpenAI had an unusual board of directors for its controlling nonprofit. Here’s who made the shocking decision to oust its cofounder and CEO.
A brief summary of events as they stand:
https://arstechnica.com/information-technology/2023/11/report-sutskever-led-board-coup-at-openai-that-ousted-altman-over-ai-safety-concerns/
Ars Technica
Details emerge of surprise board coup that ousted CEO Sam Altman at OpenAI
Microsoft CEO "furious"; OpenAI president and 3 researchers resign. COO says "No malfeasance."
Well played.
https://twitter.com/satyanadella/status/1726509045803336122?t=4hllB5IQxTesJ3NQgouMKw&s=19
For those tired of following the soap opera around OpenAI, something about the good and the eternal:
https://www.space.com/should-search-for-alien-life-include-looking-for-artificial-intelligence
Space.com
In the search for alien life, should we be looking for artificial intelligence?
Extraterrestrial life could have passed through its own technological singularity long ago, meaning that our universe could be dominated by artificial life-forms.