Meta should have made it clearer that “Llama-4-Maverick-03-26-Experimental” was a model customized to optimize for human preference. As a result, we are updating our leaderboard policies to reinforce our commitment to fair, reproducible evaluations so this confusion doesn’t occur in the future.
https://twitter.com/lmarena_ai/status/1909397817434816562
lmarena.ai (formerly lmsys.org) (@lmarena_ai):
We've seen questions from the community about the latest release of Llama-4 on Arena. To ensure full transparency, we're releasing 2,000+ head-to-head battle results for public review. This includes user prompts, model responses, and user preferences. […]
Since we dropped the models as soon as they were ready, we expect it'll take several days for all the public implementations to get dialed in.
Our best understanding is that the variable quality people are seeing is due to needing to stabilize implementations.
https://twitter.com/Ahmad_Al_Dahle/status/1909302532306092107
Ahmad Al-Dahle (@Ahmad_Al_Dahle):
We're glad to start getting Llama 4 in all your hands. We're already hearing lots of great results people are getting with these models.
That said, we're also hearing some reports of mixed quality across different services. […]