
gonzo-обзоры ML статей

Still following the AI scientists topic :)

As a bonus, a link to an interesting job opening on open-endedness


gonzo-обзоры ML статей

Forgot the slide :)


gonzo-обзоры ML статей

AI researchers, once again.

The authors claim end-to-end NAS (neural architecture search), say they have seen an analogue of AlphaGo's move 37, and report a scaling law: the number of SOTA architectures discovered grows linearly with compute.

https://news.1rj.ru/str/gonzo_ML_podcasts/591

We're all going to get scaled!

gonzo_ML_podcasts

AlphaGo Moment for Model Architecture Discovery
Authors: Yixiu Liu, Yang Nan, Weixian Xu, Xiangkun Hu, Lyumanshan Ye, Zhen Qin, Pengfei Liu
Paper: https://arxiv.org/abs/2507.18074
Code: https://github.com/GAIR-NLP/ASI-Arch
Model: https://gair-nlp.github.io/ASI…


gonzo-обзоры ML статей

https://news.1rj.ru/str/gonzo_ML_podcasts/594

gonzo_ML_podcasts


gonzo-обзоры ML статей

A very cool paper on subliminal learning: https://news.1rj.ru/str/gonzo_ML_podcasts/602

One from the series on the nature of things and the geometry of representations. The idea is that during distillation a student model can pick up traits that are never directly passed to it. For example, a love of owls from training on number sequences.

At the level of internal representations and shared initializations it all seems logical, but it still gives plenty of food for thought. Dataset distillation (https://news.1rj.ru/str/gonzo_ML/143) fits in somewhere around here too, and more broadly it raises the question of how people can acquire various traits without anyone passing them on explicitly. Maybe the Mandela effect belongs here as well? ;)

gonzo_ML_podcasts

Subliminal Learning: Language models transmit behavioral traits via hidden signals in data
Authors: Alex Cloud, Minh Le, James Chua, Jan Betley, Anna Sztyber-Betley, Jacob Hilton, Samuel Marks, Owain Evans
Paper: https://arxiv.org/abs/2507.14805
Site: ht…


gonzo-обзоры ML статей

https://news.1rj.ru/str/gonzo_ML_podcasts/618

gonzo_ML_podcasts


gonzo-обзоры ML статей

I'd also like to highlight that most experiments in the subliminal learning paper did not use logit distillation, for which the result would be more or less obvious (there was one MNIST experiment with logit distillation). They used token-level distillation, which is essentially ordinary SFT: a teacher model (for example, the closed GPT-4.1/mini/nano) generates answers to prompts unrelated to the hidden trait, and another copy of the same model (also the closed GPT-4.1/mini/nano) is fine-tuned on that dataset.

That makes the finding all the more beautiful!
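
To make the contrast concrete, here is a minimal sketch of the two training signals on a single token position. Everything in it is an assumption for illustration: the 4-token vocabulary, the logits, and the function names are made up and are not taken from the paper.

```python
import math
import random

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def logit_distillation_loss(teacher_logits, student_logits):
    """KL(teacher || student): the student sees the teacher's full distribution."""
    p = softmax(teacher_logits)
    q = softmax(student_logits)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def token_level_loss(sampled_token, student_logits):
    """Ordinary SFT cross-entropy: the student sees only the token the teacher emitted."""
    q = softmax(student_logits)
    return -math.log(q[sampled_token])

# Toy 4-token vocabulary; all numbers are made up for illustration.
teacher_logits = [2.0, 1.0, 0.5, -1.0]
student_logits = [0.0, 0.0, 0.0, 0.0]

# The teacher "generates" one token by sampling from its own distribution.
rng = random.Random(0)
sampled_token = rng.choices(range(4), weights=softmax(teacher_logits))[0]

kl = logit_distillation_loss(teacher_logits, student_logits)
ce = token_level_loss(sampled_token, student_logits)
print(f"logit distillation (KL): {kl:.4f}")
print(f"token-level SFT (CE) on token {sampled_token}: {ce:.4f}")
```

In the logit case the whole teacher distribution reaches the student at every position. In the token-level case only one sampled token id does, which is why a trait leaking through that channel is the more surprising result.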

gonzo-обзоры ML статей


gonzo-обзоры ML статей

A cool paper on prompt evolution that beats RL: GEPA (not to be confused with LeCun's JEPA!)

https://news.1rj.ru/str/gonzo_ML_podcasts/619

Natural-language reflection instead of scalar rewards, evolution of instructions only with no few-shot examples, and a remarkably good result. Yet another case where more and more of the "intelligence" is moved to the LLM side (as in AlphaEvolve, for example, https://news.1rj.ru/str/gonzo_ML/3624), and it works well.
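
A rough sketch of the general idea, not of GEPA itself: an evaluator returns textual feedback alongside a score, and a "reflection" step rewrites the instruction based on that text. Here `reflect` is a hypothetical stub standing in for an LLM call, `REQUIREMENTS` and `evaluate` are invented toy stand-ins, and the greedy accept-if-better loop is a simplification I am assuming rather than the paper's selection strategy.

```python
# Toy task: a "good" instruction mentions each of these phrases (made up).
REQUIREMENTS = ["cite sources", "answer in one sentence", "show units"]

def evaluate(instruction):
    """Hypothetical evaluator: returns a score plus natural-language feedback."""
    missing = [r for r in REQUIREMENTS if r not in instruction]
    score = len(REQUIREMENTS) - len(missing)
    feedback = "missing: " + "; ".join(missing) if missing else "all good"
    return score, feedback

def reflect(instruction, feedback):
    """Stand-in for an LLM call that reads feedback and rewrites the instruction."""
    if not feedback.startswith("missing: "):
        return instruction
    first_gap = feedback[len("missing: "):].split("; ")[0]
    return instruction + " Always " + first_gap + "."

def evolve(seed, budget=10):
    """Greedy loop: accept a rewritten instruction only if its score improves."""
    best, (best_score, feedback) = seed, evaluate(seed)
    for _ in range(budget):
        candidate = reflect(best, feedback)
        score, cand_feedback = evaluate(candidate)
        if score > best_score:
            best, best_score, feedback = candidate, score, cand_feedback
    return best, best_score

best, score = evolve("You are a helpful assistant.")
print(score, "|", best)
```

The point of the sketch is the shape of the signal: the rewrite step consumes a sentence of feedback rather than a single scalar, and only the instruction text evolves, with no few-shot examples attached.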

gonzo_ML_podcasts

GEPA: Reflective Prompt Evolution Can Outperform Reinforcement Learning
Authors: Lakshya A Agrawal, Shangyin Tan, Dilara Soylu, Noah Ziems, Rishi Khare, Krista Opsahl-Ong, Arnav Singhvi, Herumb Shandilya, Michael J Ryan, Meng Jiang, Christopher Potts, Koushik…


gonzo-обзоры ML статей

https://news.1rj.ru/str/gonzo_ML_podcasts/628

gonzo_ML_podcasts


gonzo-обзоры ML статей

A curious spat between LeCun and Musk about engineers and researchers

https://www.linkedin.com/posts/yann-lecun_there-is-a-difference-between-research-and-activity-7356606929554567169-_iT2

There is a difference between research and engineering in (1) modus operandi, (2) methodology, (3) openness, (4) evaluation criteria.

Research uses the methodology of science to discover new principles, demonstrate that they can work in practice, analyze their advantages and limitations, and interact with the wider research community to criticize, validate, reproduce, compare, and improve. The criteria are conceptual simplicity, theoretical beauty/explainability, clear performance advantage over prior art on some accepted metrics. This is true for research in academia as well as in industry.

Engineering integrates methods, often developed in a research mode, to build working systems. The philosophy is to go with the first set of methods that work well enough for the task. It generally involves a lot of tinkering, tweaking, fine-tuning, and an occasional kludge to get the performance up on a real task. Whether the method is the absolute best matters less than whether it is good enough for the tasks at hand.

Researchers are evaluated largely on intellectual impact. Research evaluation is a difficult task because the product impact may occur years (sometimes decades) after the work. For that reason, evaluation must often rely on the collective opinion of the research community through proxies such as publications, citations, invited talks, awards, etc. That's one reason research must be published.

Engineers are evaluated largely on product impact, sometimes through proxy metrics such as pull requests, lines of code, etc.

By operating in engineering mode, researchers are incentivized to do incremental work. If you make no distinction between the two activities, if you don't evaluate researchers and engineers with different criteria, you run the risk of killing breakthrough innovation. True breakthroughs require teams with a long horizon and minimal constraints from product development and management.

The industry research labs of yore that have left an indelible mark on scientific and technological progress (Bell Labs Area 11, IBM Research, Xerox PARC, etc) were all research divisions that were clearly separate from engineering divisions.

How research and engineering differ in methodology and evaluation | Yann LeCun posted on the topic | LinkedIn



gonzo-обзоры ML статей

And today Zuckerberg also published his vision of personal superintelligence.

An interesting comment here.

Meta

Personal Superintelligence

Explore Meta's vision of personal superintelligence, where AI empowers individuals to achieve their goals, create, connect, and lead fulfilling lives. Insights from Mark Zuckerberg on the future of AI and human empowerment.


gonzo-обзоры ML статей

An interesting paper on energy-based transformers: https://news.1rj.ru/str/gonzo_ML_podcasts/633

The model learns an energy function; then, when generating something, it can score its own output with that energy function and refine the result by gradient descent. The results look decent.
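
The "score your own output and improve it by gradient descent" part can be sketched on a toy problem. This is an illustration under loud assumptions: the quadratic `energy` below is a hand-written stand-in for a learned network, and `refine`, the step size, and the step count are hypothetical choices, not the paper's.

```python
def energy(x, y):
    """Toy energy: low when y is close to 2*x + 1 (a stand-in for a learned network)."""
    return (y - (2.0 * x + 1.0)) ** 2

def energy_grad_y(x, y):
    """Analytic gradient of the toy energy with respect to the candidate y."""
    return 2.0 * (y - (2.0 * x + 1.0))

def refine(x, y_init, steps=50, lr=0.1):
    """Start from a guess and descend the energy landscape in y."""
    y = y_init
    for _ in range(steps):
        y -= lr * energy_grad_y(x, y)
    return y

x = 3.0
y0 = 0.0  # initial guess for the prediction
y_star = refine(x, y0)
print(f"initial energy: {energy(x, y0):.4f}")
print(f"refined y: {y_star:.4f}, energy: {energy(x, y_star):.6f}")
```

Inference here is an optimization over the prediction `y`, with the input `x` held fixed, so spending more steps buys a lower-energy answer.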

gonzo_ML_podcasts

Energy-Based Transformers are Scalable Learners and Thinkers
Alexi Gladstone, Ganesh Nanduru, Md Mofijul Islam, Peixuan Han, Hyeonjeong Ha, Aman Chadha, Yilun Du, Heng Ji, Jundong Li, Tariq Iqbal
Paper: https://arxiv.org/abs/2507.02092
Code: https://gith…


gonzo-обзоры ML статей

https://news.1rj.ru/str/gonzo_ML_podcasts/636

gonzo_ML_podcasts


gonzo-обзоры ML статей

If you have nothing to watch this weekend, there is a wonderful film, Memento, which predicted the world of LLMs long before it arrived.

Funnily enough, a paper that directly borrows the idea and the title came out recently, in June:

Memento: Note-Taking for Your Future Self
https://arxiv.org/abs/2506.20642

IMDb

Memento (2000) ⭐ 8.4 | Drama, Mystery, Thriller

1h 53m | R


gonzo-обзоры ML статей

Found a wonderful Saturday read!

Why do we keep going on about AI and AGI when there is ETI (Extra-terrestrial Intelligence)?

Avi Loeb and co-authors wrote a fresh paper on 3I/ATLAS, the third known object from outside the Solar System (remember Oumuamua, the first?). It is flying through our neighborhood right now, in case you didn't know.

Is the Interstellar Object 3I/ATLAS Alien Technology?
https://arxiv.org/abs/2507.12213

At this early stage of its passage through our Solar System, 3I/ATLAS, the recently discovered interstellar interloper, has displayed various anomalous characteristics, determined from photometric and astrometric observations. As largely a pedagogical exercise, in this paper we present additional analysis into the astrodynamics of 3I/ATLAS, and hypothesize that this object could be technological, and possibly hostile as would be expected from the 'Dark Forest' resolution to the 'Fermi Paradox'. We show that 3I/ATLAS approaches surprisingly close to Venus, Mars and Jupiter, with a probability of ≲\%. Furthermore the low retrograde tilt of 3I/ATLAS's orbital plane to the ecliptic offers various benefits to an Extra-terrestrial Intelligence (ETI), since it allows the object access to our planet with relative impunity. The eclipse by the Sun from Earth of 3I/ATLAS at perihelion, would allow it to conduct a clandestine reverse Solar Oberth Manoeuvre, an optimal high-thrust strategy for interstellar spacecraft to brake and stay bound to the Sun. An optimal intercept of Earth would entail an arrival in late November/early December of 2025, and also, a non-gravitational acceleration of au day, normalized at 1 au from the Sun, would indicate an intent to intercept the planet Jupiter, not far off its path, and a strategy to rendezvous with it after perihelion.



gonzo-обзоры ML статей

Just dropped an auto-generated breakdown of Anthropic's fresh paper on persona vectors. Overall, I find summaries like this easier and faster to read than even the official blog posts.

https://news.1rj.ru/str/gonzo_ML_podcasts/653

gonzo_ML_podcasts

Persona Vectors: Monitoring and Controlling Character Traits in Language Models
Runjin Chen, Andy Arditi, Henry Sleight, Owain Evans, Jack Lindsey
Paper: https://arxiv.org/abs/2507.21509
Code: https://github.com/safety-research/persona_vectors
Blog: http…


gonzo-обзоры ML статей

AlphaEarth Foundations (AEF), a geospatial foundation model from DeepMind, is out. The results look absolutely stunning. I expect a wave of new projects and startups (if the license allows) around geo-analytics!

https://news.1rj.ru/str/gonzo_ML_podcasts/666

gonzo_ML_podcasts

Title: AlphaEarth Foundations: An embedding field model for accurate and efficient global mapping from sparse label data
Authors: Christopher F. Brown, Michal R. Kazmierski, Valerie J. Pasquarella, William J. Rucklidge, Masha Samsikova, Chenhui Zhang, Evan…

