Linguistics Student looking for career advice
I'm currently in my third year of my Linguistics degree. Next year (2026-2027) will be my last, and I will specialize in Computational Linguistics. I would like to get into the world of NLP Engineering, or NLP in any way. What can I do course- or certificate-wise? I would like to start working asap, and I wouldn't mind doing a Master's degree while I work. Any recommendation or suggestion is welcome 😁
/r/LanguageTechnology
https://redd.it/1oqshsp
[D] Monthly Who's Hiring and Who Wants to Be Hired?
For Job Postings please use this template
>Hiring: [Location], Salary: [], [Remote | Relocation], [Full Time | Contract | Part Time] and [Brief overview, what you're looking for]
For Those looking for jobs please use this template
>Want to be Hired: [Location], Salary Expectation: [], [Remote | Relocation], [Full Time | Contract | Part Time], Resume: [Link to resume] and [Brief overview, what you're looking for]
Please remember that this community is geared towards those with experience.
/r/MachineLearning
https://redd.it/1okj2rw
I think we found a third phase of grokking — has anyone else seen this?
/r/deeplearning
https://redd.it/1oyrmfy
[R] Apple AIML Residency Program 2026
Haven't seen a 2026 post - wanted to use this to consolidate info from everyone on the process. Anyone have any idea when they start sending out info session updates?
/r/MachineLearning
https://redd.it/1p0lart
Theory for Karpathy's "Zero to Hero"
I always enjoyed "understanding" how LLMs work but never actually implemented one. After a friend recommended "Zero to Hero", I have been hooked!!
I am just 1.5 videos in, but I still feel there are gaps in what I am learning. I am also implementing the code myself along with watching.
I took an ML class in college, but it's been 8 years and I don't remember much.
He mentions topics like "cross-entropy loss", "learning rate decay", or "maximum likelihood estimation", but doesn't necessarily go into depth. I want to structure my learning more.
Can someone please suggest reading material to go along with these videos, or some prerequisites? I do not want to fall into the tutorial trap.
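For anyone else filling the same gaps: cross-entropy loss is just the negative log-probability a model assigns to the correct class after a softmax, and minimizing it over a dataset is exactly maximum likelihood estimation, so two of those topics are really one idea. A minimal sketch in plain Python (the function name and toy logits are just for illustration):

```python
import math

def cross_entropy(logits, target_index):
    """Cross-entropy loss for one example: softmax over raw logits,
    then the negative log-probability of the correct class.
    Minimizing this over a dataset is maximum likelihood estimation."""
    # Softmax, with the max subtracted for numerical stability.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return -math.log(probs[target_index])

# A confident, correct prediction gives a small loss...
low = cross_entropy([4.0, 0.5, 0.1], target_index=0)
# ...and a confident, wrong prediction gives a large one.
high = cross_entropy([4.0, 0.5, 0.1], target_index=1)
print(low < high)  # True
```

This is what `F.cross_entropy` does in the videos, just without the batching.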
/r/deeplearning
https://redd.it/1p2lm6z
Kimi K2 Thinking and Gemini 3 may have just shown OpenAI to be the AI bubble epicenter.
In a recent interview, Sam Altman commented that while he didn't think there was an AI bubble, some players were poised to lose a whole lot of money. Before Moonshot AI launched Kimi K2 Thinking on November 6, and before Google launched Gemini 3 on November 18, coming out of nowhere to massively leapfrog every other AI by a historic margin, we might have wondered who these big losers in the AI race would ultimately be. Now that the numbers are in, it seems Altman might have presciently been talking about OpenAI.
Here's why. Let's begin with OpenAI's revenue projections for the next 5 years, all calculated before the launch of Kimi K2 Thinking and Gemini 3. A few key points stand out. First, OpenAI made those earnings projections about products that don't yet exist. Second, no one has yet created the demand for these products. And third, perhaps most importantly, OpenAI apparently didn't factor in the competition.
So when a 2-year-old startup from China open-sources a thinking model it trained for less than $5 million (by comparison, GPT-5 cost OpenAI between $1.5 billion and $2 billion to train), you have to appreciate how much the AI landscape has shifted in a matter of days. And K2 Thinking was not just another model. It outperformed GPT-5, Grok 4, Gemini 2.5, and Claude 4 on many of the most important benchmarks. Of course, the threat that OpenAI faces isn't really about Moonshot or Kimi K2 Thinking. It's about the world now knowing with absolute certainty that a small lab spending a minuscule amount of money can overtake ALL of the AI giants, while costing consumers and enterprises 2 to 10 times less to run.
But Kimi K2 Thinking really isn't what OpenAI should be worried about. Let the following sink in:
Gemini 3 set monstrous new highs with 37.5% on Humanity’s Last Exam and 45.1% on ARC-AGI-2 in Deep Think mode—nearly doubling GPT-5 on both measures. It also scored 1501 Elo on LMArena and 91.9% on GPQA Diamond, outperforming GPT-5 and Claude across strategic reasoning, scientific knowledge, and abstract problem-solving. And that's just the beginning. Gemini 3 dominated its competitors far beyond those key benchmarks. If you're brave enough to review a brutally detailed account of how completely Gemini 3 trounced OpenAI and pretty much everyone else on pretty much everything, check out the following stats:
https://www.vellum.ai/blog/google-gemini-3-benchmarks?utm=&utmsource=direct&utmmedium=none
These scores position Gemini 3 way ahead -- perhaps years ahead -- of OpenAI on the metrics that matter most to both consumer and enterprise AI. Essentially Google just ate OpenAI's lunch, dinner and breakfast the next day.
But that's just the competition part of all of this. While Kimi K2 Thinking clearly demonstrates that massive data centers are just not necessary for building the most powerful AIs, OpenAI has committed $1.4 trillion in investments to build massive data centers, most of which won't be operational for years. It could be that this miscalculation -- this massive misappropriation of investment commitments -- best explains why OpenAI may have positioned itself to be THE big loser in the AI bubble that Altman warned everyone about.
The bottom line is that if OpenAI doesn't pull a rabbit out of the hat during 2026, it may become the first major casualty of the AI bubble, which will hopefully be limited to colossally unwise investments like OpenAI's. For their sake, let's hope that it's a really, really big rabbit.
/r/deeplearning
https://redd.it/1p558ag
AMA with Indiana University CL Faculty on November 24
Hi r/LanguageTechnology! Three of us faculty members here in [computational linguistics at Indiana University Bloomington](https://cl.indiana.edu/) will be doing an AMA on this coming Monday, **November 24**, from **2pm to 5pm ET** (19 GMT to 22 GMT).
The three of us who will be around are:
* [Luke Gessler](https://lgessler.com/) (low-resource NLP, corpora, computational language documentation)
* [Shuju Shi](https://scholar.google.com/citations?user=SGZk95cAAAAJ&hl=en) (speech recognition, phonetics, computer-aided language learning)
* [Sandra Kuebler](https://cl.indiana.edu/~skuebler/) (parsing, hate speech, machine learning for NLP)
We're happy to field your questions on:
* Higher education in CL
* MS and PhD programs
* Our research specialties
* Anything else on your mind
Please save the date, and look out for the AMA thread which we'll make earlier in the day on the 24th.
EDIT: we're going to reuse this thread for questions, so ask away!
/r/LanguageTechnology
https://redd.it/1p263p0
Did self-supervised learning for visual features quietly peak already?
From around 2020–2024 it felt like self-supervised learning (SSL) for image features was on fire: BYOL (Bootstrap Your Own Latent), SimCLR (Simple Contrastive Learning of Representations), SwAV (Swapping Assignments between multiple Views), DINO, etc. Every few months there was some new objective, augmentation trick, or architectural tweak that actually moved the needle for feature extractors.
This year it feels a lot quieter on the “new SSL objective for vision backbones” front. We got DINOv3, but as far as I can tell it’s mostly smart but incremental tweaks plus a lot of scaling in terms of data and compute, rather than a totally new idea about how to learn general-purpose image features.
So I’m wondering:
Have I just missed some important recent SSL image models for feature extraction?
Or has the research focus mostly shifted to multimodal/foundation models and generative stuff, with “vanilla” visual SSL kind of considered a solved or mature problem now?
Is the SSL scene for general vision features still evolving in interesting ways, or did we mostly hit diminishing returns after the original DINO/BYOL/SimCLR wave?
/r/computervision
https://redd.it/1pavb20
[D] Attention before it was all we needed
*hey all,*
so I guess most of us have read or heard of *Attention Is All You Need*, which gave us the foundation of the transformer models we all use today. Yesterday I spent some time browsing some precursor papers that were exploring attention right before the AIAYN paper. The ones I found most relevant were:
* End-To-End Memory Networks: [https://arxiv.org/pdf/1503.08895](https://arxiv.org/pdf/1503.08895)
* Key-Value Memory Networks for Directly Reading Documents: [https://arxiv.org/pdf/1606.03126](https://arxiv.org/pdf/1606.03126)
* Neural Machine Translation by Jointly Learning to Align and Translate: [https://arxiv.org/pdf/1409.0473](https://arxiv.org/pdf/1409.0473)
they all (directly or indirectly) use something like the `softmax(QK^T)V` (scaled dot-product attention, SDPA) operation in different ways, but with extra machinery on top, which makes them feel less general and more specialized to a particular setup.
it’s kind of fun in hindsight that this core calculation was almost a “trick” in these earlier works, embedded into more complex systems, and then AIAYN comes along and says: actually, let’s strip away most of the extra parts and just make attention the main building block — “attention is all you need”.
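For concreteness, the core operation all three papers circle around fits in a few lines. Here is a minimal, dependency-free sketch of `softmax(QK^T / sqrt(d))V` on plain Python lists of row vectors (the function name and toy inputs are mine; real implementations batch this with matrix libraries):

```python
import math

def sdpa(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V,
    where Q, K, V are lists of row vectors."""
    d = len(Q[0])
    out = []
    for q in Q:
        # Score this query against every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        # Softmax (max subtracted for numerical stability).
        m = max(scores)
        w = [math.exp(s - m) for s in scores]
        z = sum(w)
        w = [x / z for x in w]
        # Output row: attention-weighted average of the value rows.
        out.append([sum(wi * v[j] for wi, v in zip(w, V)) for j in range(len(V[0]))])
    return out

# One query that matches the first key almost exclusively,
# so the output is essentially the first value row:
Q = [[10.0, 0.0]]
K = [[10.0, 0.0], [0.0, 10.0]]
V = [[1.0, 0.0], [0.0, 1.0]]
print(sdpa(Q, K, V))  # approximately [[1.0, 0.0]]
```

Seen this way, the pre-AIAYN papers differ mainly in what extra machinery surrounds this weighted lookup, not in the lookup itself.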
Hope some of you find this interesting. I'd love to hear any insights or anecdotes from people who were around / working with these models at the time. And if there are other important pre-transformer attention papers I should read, please let me know as well. ⚡
/r/deeplearning
https://redd.it/1pc37u0
[D] Monthly Who's Hiring and Who Wants to Be Hired?
For Job Postings please use this template
>Hiring: [Location], Salary: [], [Remote | Relocation], [Full Time | Contract | Part Time] and [Brief overview, what you're looking for]
For Those looking for jobs please use this template
>Want to be Hired: [Location], Salary Expectation: [], [Remote | Relocation], [Full Time | Contract | Part Time], Resume: [Link to resume] and [Brief overview, what you're looking for]
Please remember that this community is geared towards those with experience.
/r/MachineLearning
https://redd.it/1pb25zo
[D] How did Gemini 3 Pro manage to get 38.3% on Humanity's Last Exam?
On ARC-AGI-2, Gemini improved its score from 5% (for 2.5 Pro) to 31% (for 3 Pro), both at $0.80 per task. This is amazing, but a lot of people here seem to believe that they just generated millions of synthetic ARC-like examples for pretraining. This is allowed by the rules of the competition, and the top Kaggle solution this year did just that. (Although investors and users might find such a tactic misleading.)
But how did Gemini go from 21.6% to 38.3% on Humanity's Last Exam? This kind of training data is very expensive to obtain en masse. The only practical way to "benchmax" here that I see is to actually cheat, i.e. use the test data for training.
What do you think is going on here? Is 3 as much of an improvement over 2.5 as its Humanity's Last Exam scores suggest?
/r/MachineLearning
https://redd.it/1pgqbjd
[D] CVPR Submission ID Changed
When I logged into my OpenReview CVPR author console, I found that my submission ID had been changed from 9k+ to 42k+. Interestingly, OpenReview has applied a black mask to multiple pages of the PDF, probably to hide the original ID mentioned in the header on every page. Did anyone else notice that?
/r/MachineLearning
https://redd.it/1phygsa
EACL 2026
Review Season is Here — Share Your Scores, Meta-Reviews & Thoughts!
With the ARR October 2025 → EACL 2026 cycle in full swing, I figured it’s a good time to open a discussion thread for everyone waiting on reviews, meta-reviews, and (eventually) decisions.
Looking forward to hearing your scores and experiences!
/r/LanguageTechnology
https://redd.it/1oykfv3
Comparing Different Object Detection Models (Metrics: Precision, Recall, F1-Score, COCO-mAP)
Hey there,
I am trying to train multiple object detection models (YOLO11, RT-DETRv4, DEIMv2) on a custom dataset, using the Ultralytics framework for YOLO and the repositories provided by the model authors for RT-DETRv4 and DEIMv2.
To objectively compare model performance, I want to calculate the following metrics:
* Precision (at a fixed IoU threshold like 0.5)
* Recall (at a fixed IoU threshold like 0.5)
* F1-score (at a fixed IoU threshold like 0.5)
* mAP at 0.5, 0.75, and 0.5:0.05:0.95, as well as for small, medium, and large objects
However, each framework appears to differ in the way it evaluates the model and the metrics it provides. My idea was to run the models in prediction mode on the test split of my custom dataset and then use the results to calculate the required metrics in a Python script myself, or with the help of a library like pycocotools. Different sources (GitHub etc.) claim this might produce wrong results compared to using the tools provided by the respective framework, as the prediction settings usually differ from validation/test settings. I am wondering what the correct way to evaluate the models is. Just use the tools provided by the authors and only use those metrics which are available for all models? In each paper on object detection models, those metrics are provided to describe model performance, but it's rarely, if at all, described how they were practically obtained (only the theory/formula is stated).
I would appreciate if anyone can offer some insights on how to properly test the models with an academic setting in mind.
Thanks!
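As a sanity check on whatever a framework reports, the fixed-threshold metrics are simple enough to compute yourself from exported predictions. Below is a minimal sketch assuming greedy one-to-one matching of score-sorted predictions to ground truths at IoU ≥ 0.5; the matching convention, the (x1, y1, x2, y2) box format, and the toy values are my assumptions, not any framework's exact protocol. For mAP@0.5:0.95 and the small/medium/large breakdown, running every model's exported detections through the same pycocotools COCOeval instance is the usual academically-defensible route, since it removes the per-framework evaluation differences.

```python
def iou(a, b):
    """Intersection-over-union of two boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def precision_recall_f1(preds, gts, thr=0.5):
    """preds: list of (box, score); gts: list of boxes.
    Predictions are matched greedily in descending score order,
    and each ground truth may be matched at most once."""
    tp, matched = 0, set()
    for box, _ in sorted(preds, key=lambda p: -p[1]):
        best, best_iou = None, thr
        for i, g in enumerate(gts):
            v = iou(box, g)
            if i not in matched and v >= best_iou:
                best, best_iou = i, v
        if best is not None:
            matched.add(best)
            tp += 1
    fp, fn = len(preds) - tp, len(gts) - tp
    p = tp / (tp + fp) if preds else 0.0
    r = tp / (tp + fn) if gts else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

# One correct detection, one false positive, one missed ground truth:
p, r, f1 = precision_recall_f1(
    [((0, 0, 10, 10), 0.9), ((50, 50, 60, 60), 0.8)],
    [(0, 0, 10, 10), (20, 20, 30, 30)])
print(p, r, f1)  # 0.5 0.5 0.5
```

If a self-computed number and a framework's number disagree, the difference is almost always in the prediction-time settings (confidence threshold, NMS, max detections per image), which is worth stating explicitly in an academic write-up.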
/r/computervision
https://redd.it/1pmmujx
Zoom pivots from web conferencing to federated AI, and earns SOTA on HLE. High-level talent is proving to be quite common.
Part of this story is about how Zoom brought together a team of the top models in a federated AI system that recently earned SOTA by scoring 48.1% on HLE, dethroning Gemini 3 with its 45.8%. It's too early to tell if this federated strategy will continue to unseat top models, but it's definitely something to watch. However, I want to focus on a different part of Zoom's full entry into the AI space: it is becoming increasingly clear that top AI talent, like senior engineers, can be found just about anywhere.
Our first example is DeepSeek, which took the world by storm in January with the power and cost-effectiveness of its open-source AIs. The important point here is that DeepSeek started as a "side project" of a few people working at a hedge fund.
Then in September, a Chinese food delivery company named Meituan stunned the world by open-sourcing LongCat‑Flash‑Omni. It topped Gemini-2.5-Pro and Gemini-2.5-Flash on DailyOmni with 82.38, demonstrating its superior multimodal reasoning. Again, this was a food delivery company that turned itself into a top AI contender!
Then a few weeks ago six former engineers from Google and DeepMind scaffolded their meta-system onto Gemini 3 Pro, and earned SOTA on ARC-AGI-2 with a score of 54%, beating Gemini's Deep Think (preview) that scored 45.1%. Their company, Poetiq, has only been around for about 7 months.
Now contrast these developments with Zuckerberg's massive talent spending spree, where he paid some engineers hundreds of millions of dollars to join Meta. One would think that top talent is rare, and very expensive. But it's becoming increasingly clear that top AI engineers are everywhere, poised to stun the world again, and again, and again.
/r/deeplearning
https://redd.it/1pnj07o
For Text/Corpus Cluster Analysis - How do I handle huge, and very many small, outliers?
/r/LanguageTechnology
https://redd.it/1pnb3a9
[R] No causal inference workshops at ICLR 2026?
What gives? Anyone got any alternative venues in mind for causal topics? Otherwise we're going straight to the main track, I guess.
P.S. The full list is posted on Twitter. Also, some of these are already on OpenReview.
/r/MachineLearning
https://redd.it/1psp0a1
[D] Self-Promotion Thread
Please post your personal projects, startups, product placements, collaboration needs, blogs etc.
Please mention the payment and pricing requirements for products and services.
Please do not post link shorteners, link aggregator websites, or auto-subscribe links.
--
Any abuse of trust will lead to bans.
Encourage others who create new posts for questions to post here instead!
The thread will stay alive until the next one, so keep posting after the date in the title.
--
Meta: This is an experiment. If the community doesn't like this, we will cancel it. This is to encourage those in the community to promote their work without spamming the main threads.
/r/MachineLearning
https://redd.it/1pbxkt2
D Best papers of 2025
Which papers do you think are the most important ones which were released in 2025?
Please, provide a link to the paper if you share one.
/r/MachineLearning
https://redd.it/1pvmrx9