TestingCatalog AI News 🗞 – Telegram
4.91K subscribers
3.01K photos
401 videos
40 files
3.89K links
Reporting AI nonsense. A future news media, driven by virtual assistants 🤖
Apple adds Claude Agent SDK and Codex to Xcode 26.3

Apple’s Xcode 26.3 introduces built-in Claude Agent SDK support, enabling subagents, background tasks, plugins, and full project reasoning within the IDE. This update targets productivity for solo developers and small teams.

🗞 #ai
5👍2🔥2
v0 live announcement 👀

"Important announcement from the v0 team", the next evolution of the v0 platform for vibe coding.

* v0 was teasing Sonnet 5 earlier today.
8👍31
Perplexity released a new Advanced Deep Research upgrade with SOTA performance.

It scores 79% on Google DeepMind Deep Research QA and tops the newly introduced DRACO Benchmark. It is currently available to Max users and expected to arrive for Pro users soon.
11🔥3👍1
Perplexity working on Model Council, combining 3 AI models

Perplexity is developing Model Council, a Max-tier feature enabling users to compare outputs from top AI models like GPT-5.2, Gemini 3 Pro, and Claude Opus 4.5. A separate mode, Gamma, hints at experimental high-tier capabilities.

🗞 #perplexity
👍431
BREAKING 🚨: Perplexity is working on a new Model Council multi-model system, combining outputs from GPT-5.2, Opus 4.5 and Gemini 3 Pro into one response.

In addition, a new mode named Gemma is in the works, labelled as "ASI"
12👍3👎1
Perplexity launches Advanced Deep Research for Max users

Perplexity launched the DRACO Benchmark to publicly assess AI research tools on real-world tasks across ten domains. It measures accuracy, depth, presentation, and sourcing, with initial results showing Perplexity leads in both precision and speed.

🗞 #perplexity
👍3
GitHub Copilot Pro+ and Copilot Enterprise subscribers can now use Codex and Claude agents on GitHub.

GitHub Codex Pilot or GitHub Claude Pilot?
5👍2
According to The Information, Meta's upcoming Avocado model is described internally as the “most capable” to date.

Soon? 👀
👍11😁3🤮1
BREAKING 🚨: A new Gemini checkpoint has been spotted in A/B testing.

Will we see this live? 👀

h/t x@marmaduke091
🔥16👍33
Who will crash benchmarks this week?
Anonymous Poll
23%
🔠 OpenAI
50%
🔠 Anthropic
30%
🔠 Google
11%
🔠 No one
43
BREAKING 🚨: CLAUDE OPUS 4.6 HAS BEEN SPOTTED IN PERPLEXITY APIs!

* keep in mind that this doesn’t imply an imminent release.

h/t x@synthwavedd
👍85
BREAKING 🚨: OPENAI ANNOUNCED OPENAI FRONTIER, A NEW ENTERPRISE PLATFORM TO CREATE AND MANAGE AI COWORKERS.

"Frontier gives agents the same skills people need to succeed at work: Understand how work gets done, Use a computer and tools, Improve quality over time, Stay governed & observable"

The biggest part 👀

"Built-in ways to evaluate and optimise performance make it clear to human managers and AI coworkers what’s working and what isn’t, so good behaviours improve over time. Over time, AI coworkers learn what good looks like and get better at the work that matters most."
🔥5👏41
BREAKING 🚨: A BIG DROP IS EXPECTED FOR CODEX TODAY! THE CODEX GITHUB ALSO NO LONGER STATES “LATEST” NEXT TO GPT-5.2.
74👀3
BREAKING 🚨: PERPLEXITY IS PREPARING CLAUDE OPUS 4.6 FOR RELEASE ON THE WEB. A STRONG SIGNAL THAT IT WILL ARRIVE TODAY.

We are super close 👀
6👍1
BREAKING 🚨: CLAUDE OPUS 4.6 IS ROLLING OUT ON THE WEB, APPS AND DESKTOP!

TESTING TIME 🔥
10😭5👍1
BREAKING 🚨: Claude Opus 4.6 has been officially announced. Opus 4.6 comes with improved performance across various agentic, research and coding tasks.

What would you test first? 👀
3👍1
Opus 4.6 comes with big improvements in agentic search, agentic financial analysis and office tasks.

"Financial professionals use AI to research across multiple data sources, support financial analyses, and create deliverables that their teams and customers can act on."
2👍1