Linkstream

A chance to train on huge piles of retail GPUs, not just on superclusters?
https://huggingface.co/papers/2311.08105

huggingface.co

Paper page - DiLoCo: Distributed Low-Communication Training of Language Models
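
A toy sketch of the low-communication idea named in the title, under loud assumptions: each worker trains locally for many steps, and the workers only synchronize once per round by averaging how far they moved (a "pseudo-gradient"). As far as I recall, the paper uses AdamW for the inner steps and Nesterov momentum for the outer step; plain SGD stands in for both here, and the scalar "model", the function names, and the hyperparameters are all hypothetical.

```python
def local_sgd(theta, target, steps, lr):
    """Inner loop: `steps` plain-SGD updates on the loss (theta - target)**2 / 2."""
    for _ in range(steps):
        theta -= lr * (theta - target)  # gradient of the toy loss is (theta - target)
    return theta


def diloco_round(theta, worker_targets, inner_steps, inner_lr, outer_lr):
    """One communication round: local training, then one averaged outer update."""
    # Every worker starts from the shared parameters and trains on its own shard.
    local = [local_sgd(theta, t, inner_steps, inner_lr) for t in worker_targets]
    # Pseudo-gradient: average displacement of the workers from the shared parameters.
    outer_grad = sum(theta - w for w in local) / len(local)
    # Simplification: plain SGD outer step instead of Nesterov momentum.
    return theta - outer_lr * outer_grad


theta = 0.0
for _ in range(20):  # 20 synchronizations in total, versus 50 local steps per round
    theta = diloco_round(theta, [1.0, 3.0], inner_steps=50, inner_lr=0.1, outer_lr=0.7)
print(theta)  # approaches 2.0, the average of the two workers' optima
```

The point of the sketch is the ratio: workers exchange one number per round instead of one per step, which is what would make slow links between consumer GPUs tolerable.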

❤2

218 views, 23:32

Linkstream

System 2 Attention (S2A).
- Soft attention in Transformers is susceptible to irrelevant/biased info
- S2A uses LLM reasoning to generate what to attend to
Improves factuality & objectivity, decreases sycophancy.
https://arxiv.org/abs/2311.11829
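
The two bullets above can be sketched as a two-pass prompt: first ask the model to regenerate the context without irrelevant or biased parts, then answer from the regenerated context only. This is a minimal sketch, not the paper's implementation: `llm` is a hypothetical callable that maps a prompt string to a completion string, and the prompt wording is made up for illustration.

```python
from typing import Callable

# Hypothetical prompt wording; the paper's exact prompts are not reproduced here.
REWRITE_PROMPT = (
    "Rewrite the following text, keeping only the parts that are relevant "
    "and unbiased for answering the question in it. Drop opinions and "
    "distractors.\n\nText:\n{context}"
)
ANSWER_PROMPT = (
    "Context:\n{context}\n\nAnswer the question using only the context above."
)


def s2a_answer(llm: Callable[[str], str], context: str) -> str:
    """Two-pass sketch: regenerate the context, then answer from it."""
    # Pass 1: the model decides what is worth attending to.
    cleaned = llm(REWRITE_PROMPT.format(context=context))
    # Pass 2: the original context is never shown again, only the cleaned one.
    return llm(ANSWER_PROMPT.format(context=cleaned))
```

Usage is `s2a_answer(my_llm, "I think the answer is 5. What is 2 + 2?")`, where `my_llm` wraps whatever model client you use. The second call never sees the leading opinion, which is how the approach aims to reduce sycophancy.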

218 views, edited 17:25

Linkstream

DARE/MergeLM: Absorbing Abilities from Homologous Models as a Free Lunch
https://github.com/yule-BUAA/MergeLM

GitHub

GitHub - yule-BUAA/MergeLM: Codebase for Merging Language Models (ICML 2024)

220 views, 18:31

Linkstream

In this paper, we introduce generative agents--computational software agents that simulate believable human behavior. Generative agents wake up, cook breakfast, and head to work; artists paint, while authors write; they form opinions, notice each other, and initiate conversations; they remember and reflect on days past as they plan the next day.

https://arxiv.org/abs/2304.03442
https://github.com/joonspk-research/generative_agents

GitHub

GitHub - joonspk-research/generative_agents: Generative Agents: Interactive Simulacra of Human Behavior

👾1

258 views, edited 20:50

Linkstream

https://github.com/comfyanonymous/ComfyUI
if you are into Stable Diffusion

271 views, edited 19:59

Linkstream

insanely detailed LLM inference visualization from Brendan Bycroft
https://bbycroft.net/llm

🔥2

1.14K views, 14:18

Linkstream

Nuclear Reactor Simulation (interactive!)

https://dalton-nrs.manchester.ac.uk/

⚡3

206 views, 19:59

Linkstream

https://twitter.com/MistralAI/status/1733150512395038967
beautiful. on friday. even more beautiful.

👍2

200 views, 16:29

Linkstream

ChatGPT: sometimes “hallucinates” (tries to guess the details not in the training set).
OpenAI: tries to counter that
Google: hold my beer, let’s hallucinate the actual Gemini model presentation!

https://arstechnica.com/information-technology/2023/12/google-admits-it-fudged-a-gemini-ai-demo-video-which-critics-say-misled-viewers/

Ars Technica

Google’s best Gemini AI demo video was fabricated

Google takes heat for a misleading AI demo video that hyped up its GPT-4 competitor.

👍2

217 views, 03:58

Linkstream

evergreen wisdom from Rob Pike
https://commandcenter.blogspot.com/2023/12/simplicity.html

Blogspot

Simplicity

In May 2009, Google hosted an internal "Design Wizardry" panel, with talks by Jeff Dean, Mike Burrows, Paul Haahr, Alfred Spector, Bill Cou...

199 views, 19:26

Linkstream

Generative agent-based modeling with actions grounded in physical, social, or digital space using Concordia from Google DeepMind

https://arxiv.org/abs/2312.03664
https://github.com/google-deepmind/concordia

GitHub

GitHub - google-deepmind/concordia: A library for generative social simulation

👍1

247 views, 06:51

Linkstream

Google has good researchers and not-so-good product managers, as always.
Loosely related: Terence Tao has been saying "LLMs help me with math" for a while now.

“The FunSearch paper by DeepMind that was used to discover new mathematics is an example of searching through generative patterns and employing evolutionary methods to creatively conjure up new solutions. This is a very general principle that lies at the core of creativity.”
https://www.nature.com/articles/d41586-023-04043-w

Nature

DeepMind AI outdoes human mathematicians on unsolved problem

Nature - Large language model improves on efforts to solve combinatorics problems inspired by the card game Set.

🔥1

167 views, edited 06:34

Linkstream

> Unfortunately, too few people understand the distinction between memorization and understanding. It's not some lofty question like "does the system have an internal world model?", it's a very pragmatic behavior distinction: "is the system capable of broad generalization, or is it limited to local generalization?"
-- a thread from François Chollet

> by popular demand: a starter set of papers you can read on the topic.

"Comparing Humans, GPT-4, and GPT-4V On Abstraction and Reasoning Tasks": https://arxiv.org/abs/2311.09247

"Embers of Autoregression: Understanding Large Language Models Through the Problem They are Trained to Solve": https://arxiv.org/abs/2309.13638

"Faith and Fate: Limits of Transformers on Compositionality": https://arxiv.org/abs/2305.18654

"The Reversal Curse: LLMs trained on "A is B" fail to learn 'B is A'": https://arxiv.org/abs/2309.12288

"On the measure of intelligence": https://arxiv.org/abs/1911.01547 not about LLMs, but provides context and grounding on what it means to be intelligent and the nature of generalization. It also introduces an intelligence benchmark (ARC) that remains completely out of reach for LLMs. Ironically the best-performing LLM-based systems on ARC are those that have been trained on tons of generated tasks, hoping to hit some overlap between test set tasks and your generated tasks -- LLMs have zero ability to tackle an actually new task.

In general there's a new paper documenting the lack of broad generalization capabilities of LLMs every few days.

❤1

187 views, edited 07:48

Linkstream

"Noisy TV problem" is solvable by introducing yet another level of abstraction :)
Curiosity-Driven Exploration via Latent Bayesian Surprise
https://arxiv.org/abs/2104.07495

More on the topic: https://lilianweng.github.io/posts/2020-06-07-exploration-drl/

🤔3

173 views, 17:39

Linkstream

https://x.com/BrianJJi/status/1736284009590620597?s=20

X (formerly Twitter)

Brian Ji (@brianjji) on X

Peter Thiel: Meaning is found in doing things that are important, that otherwise wouldn't get done.

Aim to work on things which without you, wouldn't / couldn't get done by anyone else.

💯1

188 views, 11:55

Linkstream

a nice take on AI/ML from outside the Valley
https://wz.ax/bridgewater-on-ai

Bridgewater

Assessing the Implications of a Productivity Miracle

What happens when cognitive tasks can be done at zero marginal costs? Co-CIO Greg Jensen explores some of the potential impacts that advancements of AI/ML technology could have on companies and the economy, including an extreme scenario that could potentially…

196 views, edited 08:43

Linkstream

https://www.nature.com/articles/s41586-023-06887-8

Nature

Discovery of a structural class of antibiotics with explainable deep learning

Nature - An explainable deep learning model using a chemical substructure-based approach for the exploration of chemical compound libraries identified structural classes of compounds with...

208 views, 17:30

Linkstream

oh interesting. makes sense as all addictions are driven by the same neurochemicals
https://twitter.com/tenobrus/status/1738364449122357365

X (formerly Twitter)

Tenobrus (@tenobrus) on X

it turns out ozempic is also the cure for doomscrolling and tiktok

💊1

266 views, 09:22

Linkstream

> In this study, we show that when aiming for limited precision, existing approximation methods can be outperformed by programs automatically discovered from scratch by a simple evolutionary algorithm.
https://arxiv.org/abs/2312.08472

arXiv.org

AutoNumerics-Zero: Automated Discovery of State-of-the-Art...

Computers calculate transcendental functions by approximating them through the composition of a few limited-precision instructions. For example, an exponential can be calculated with a Taylor...
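
For context, the Taylor-series baseline the abstract mentions looks roughly like the sketch below. It illustrates the conventional approach that the paper says can be outperformed at limited precision; it is not one of the evolved programs, and the function name and the chosen term counts are my own.

```python
import math


def exp_taylor(x: float, terms: int) -> float:
    """Approximate e**x with the first `terms` terms of its Taylor series at 0."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term        # at this point term == x**n / n!
        term *= x / (n + 1)  # advance to x**(n + 1) / (n + 1)!
    return total


# More terms buy more precision at the cost of more instructions.
for terms in (2, 4, 8):
    err = abs(exp_taylor(0.5, terms) - math.exp(0.5))
    print(terms, err)
```

The trade-off on display, a fixed instruction budget against achievable error, is the search space the paper explores automatically.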

233 views, 16:00

Linkstream

https://migeel.sk/blog/2024/01/02/building-a-self-contained-game-in-csharp-under-2-kilobytes/

Michal's low level corner

Building a self-contained game in C# under 2 kilobytes

How I fit a graphical game in C# into 2 kilobytes, with no .NET runtime required.

👍1

243 views, 18:44

Linkstream