> Unfortunately, too few people understand the distinction between memorization and understanding. It's not some lofty question like "does the system have an internal world model?", it's a very pragmatic behavior distinction: "is the system capable of broad generalization, or is it limited to local generalization?"
-- a thread from François Chollet
> by popular demand: a starter set of papers you can read on the topic.
"Comparing Humans, GPT-4, and GPT-4V On Abstraction and Reasoning Tasks": https://arxiv.org/abs/2311.09247
"Embers of Autoregression: Understanding Large Language Models Through the Problem They are Trained to Solve": https://arxiv.org/abs/2309.13638
"Faith and Fate: Limits of Transformers on Compositionality": https://arxiv.org/abs/2305.18654
"The Reversal Curse: LLMs trained on 'A is B' fail to learn 'B is A'": https://arxiv.org/abs/2309.12288
"On the Measure of Intelligence": https://arxiv.org/abs/1911.01547 not about LLMs, but provides context and grounding on what it means to be intelligent and the nature of generalization. It also introduces an intelligence benchmark (ARC) that remains completely out of reach for LLMs. Ironically, the best-performing LLM-based systems on ARC are those trained on tons of generated tasks, hoping to hit some overlap between the test-set tasks and the generated tasks -- LLMs have zero ability to tackle an actually new task.
In general there's a new paper documenting the lack of broad generalization capabilities of LLMs every few days.
"Noisy TV problem" is solvable by introducing yet another level of abstraction :)
Curiosity-Driven Exploration via Latent Bayesian Surprise
https://arxiv.org/abs/2104.07495
More on the topic: https://lilianweng.github.io/posts/2020-06-07-exploration-drl/
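The core idea can be sketched in a toy setting: Bayesian surprise is the KL divergence between the belief before and after an observation, used as an intrinsic reward. A minimal 1-D Gaussian version (the paper uses a learned latent model; this numeric setup is purely illustrative):

```python
import math

def kl_gaussian(mu_q, var_q, mu_p, var_p):
    """KL(q || p) between two 1-D Gaussians."""
    return 0.5 * (var_q / var_p + (mu_q - mu_p) ** 2 / var_p
                  - 1.0 + math.log(var_p / var_q))

def bayes_update(mu, var, obs, obs_var):
    """Conjugate update of the belief N(mu, var) after one noisy observation."""
    gain = var / (var + obs_var)
    return mu + gain * (obs - mu), (1.0 - gain) * var

mu, var, obs_var = 0.0, 1.0, 0.25      # prior belief over a latent parameter
surprises = []
for obs in [2.0, 2.1, 1.9, 2.0]:
    new_mu, new_var = bayes_update(mu, var, obs, obs_var)
    surprises.append(kl_gaussian(new_mu, new_var, mu, var))  # intrinsic reward
    mu, var = new_mu, new_var

print([round(s, 3) for s in surprises])  # surprise decays as the source becomes predictable
```

This is why it sidesteps the noisy TV: pure noise stops moving the posterior once it is modeled as noise, so the KL (and thus the reward) collapses, unlike raw prediction error.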
a nice take on AI/ML from outside the Valley
https://wz.ax/bridgewater-on-ai
Bridgewater
Assessing the Implications of a Productivity Miracle
What happens when cognitive tasks can be done at zero marginal costs? Co-CIO Greg Jensen explores some of the potential impacts that advancements of AI/ML technology could have on companies and the economy, including an extreme scenario that could potentially…
oh interesting. makes sense as all addictions are driven by the same neurochemicals
https://twitter.com/tenobrus/status/1738364449122357365
X (formerly Twitter)
Tenobrus (@tenobrus) on X
it turns out ozempic is also the cure for doomscrolling and tiktok
> In this study, we show that when aiming for limited precision, existing approximation methods can be outperformed by programs automatically discovered from scratch by a simple evolutionary algorithm.
https://arxiv.org/abs/2312.08472
arXiv.org
AutoNumerics-Zero: Automated Discovery of State-of-the-Art...
Computers calculate transcendental functions by approximating them through the composition of a few limited-precision instructions. For example, an exponential can be calculated with a Taylor...
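A toy version of the idea, assuming nothing from the paper beyond "evolve a program against a precision target": a (1+1) evolution strategy tuning cubic coefficients to approximate exp on [0, 1], starting from the truncated Taylor expansion. The real system evolves whole instruction sequences; this sketch only evolves coefficients.

```python
import math
import random

random.seed(0)
XS = [i / 32 for i in range(33)]           # evaluation grid on [0, 1]

def max_error(coeffs):
    """Max abs error of the polynomial (Horner form) against exp on the grid."""
    err = 0.0
    for x in XS:
        y = 0.0
        for c in reversed(coeffs):         # c0 + c1*x + c2*x^2 + c3*x^3
            y = y * x + c
        err = max(err, abs(y - math.exp(x)))
    return err

# (1+1) evolution: mutate one coefficient, keep the child if it is no worse.
best = [1.0, 1.0, 0.5, 0.0]                # degree-2 Taylor truncation as seed
best_err = max_error(best)
for _ in range(20000):
    child = list(best)
    child[random.randrange(len(child))] += random.gauss(0.0, 0.05)
    err = max_error(child)
    if err <= best_err:
        best, best_err = child, err

print(best_err)  # far below the seed's error of ~0.218
```

Freed from matching the Taylor coefficients, the search spreads the error budget across the whole interval, which is exactly the "limited precision" regime where the paper reports evolved programs winning.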
> I made this game to teach my daughter how buffer overflows work
DANGER: NERD LEVEL 80
https://punkx.org/overflow/
punkx.org
PROJEKT: OVERFLOW
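Python can't smash a real stack, but the mechanic the game teaches can be simulated: a fixed-size buffer laid out next to a saved return address, and a copy routine with no bounds check. The addresses here are made up for illustration.

```python
import struct

# A toy "stack frame": an 8-byte buffer followed by a saved return address,
# laid out back to back the way a C stack frame might be.
frame = bytearray(16)
struct.pack_into("<Q", frame, 8, 0x401000)   # legitimate return address

def vulnerable_copy(frame, data):
    """Like strcpy into the 8-byte buffer: no bounds check."""
    frame[0:len(data)] = data                # keeps writing past offset 8!

# 8 bytes fill the buffer; the next 8 clobber the return address.
vulnerable_copy(frame, b"A" * 8 + struct.pack("<Q", 0xdeadbeef))
ret = struct.unpack_from("<Q", frame, 8)[0]
print(hex(ret))  # 0xdeadbeef: the attacker-chosen value, not 0x401000
```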
⚠️ GitLab arbitrary account takeover, CVSS 10
TLDR: Upgrade your GitLab instance ASAP; it's likely an open door right now.
https://about.gitlab.com/releases/2024/01/11/critical-security-release-gitlab-16-7-2-released/
GitLab
GitLab Critical Security Release: 16.7.2, 16.6.4, 16.5.6
Learn more about GitLab Critical Security Release: 16.7.2, 16.6.4, 16.5.6 for GitLab Community Edition (CE) and Enterprise Edition (EE).
if you thought Z̵̋̄ ̱̬͗̐̃͗͋͋͐͂͛̍̀͛̒͘ą̵͔̗͍̝̲͈̘͉͓̰͍̯͑͐ͅĺ̵̢̨̦̫͈͓̖̼̟͎̤̦̖̔͗̓̏̌̾̑̈́͆̎͘͝g̸ ̨̠̠͓͚͙̣̟̪̺̗̺̻̖͆̾͋̽͐̑́͌̚͠ơ̶̋͝ ̞͖ is bad...
https://stackoverflow.com/a/6163129
Stack Overflow
Why does modern Perl avoid UTF-8 by default?
I wonder why most modern solutions built using Perl don't enable UTF-8 by default.
I understand there are many legacy problems for core Perl scripts, where it may break things. But, from my point of
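For the curious, the zalgo effect above is plain Unicode: a base letter with combining marks stacked on top. A small sketch of producing it and stripping it back off with the standard library:

```python
import unicodedata

# Zalgo is just a base letter plus stacked combining marks.
zalgo = "Z" + "\u0300\u0316\u0353\u035c"    # grave, grave below, x below, breve below
print(len(zalgo))                            # 5 code points, one visible "letter"

def unzalgo(s):
    """Strip combining marks, keeping the base characters."""
    decomposed = unicodedata.normalize("NFD", s)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

print(unzalgo(zalgo))    # "Z"
print(unzalgo("café"))   # "cafe" -- NFD also splits precomposed accents
```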
insecure boot, huh
https://arstechnica.com/security/2024/02/critical-vulnerability-affecting-most-linux-distros-allows-for-bootkits/
Ars Technica
Critical vulnerability affecting most Linux distros allows for bootkits
Buffer overflow in bootloader shim allows attackers to run code each time devices boot up.
TIL this is possible in the general case. Neat!
> SQL-99 allows for nested subqueries at nearly all places within a query.
From a user’s point of view, nested queries can greatly simplify the formulation of complex queries.
However, nested queries that are correlated with the outer queries frequently lead to dependent joins with nested loops evaluations and thus poor performance.
We present a generic approach for unnesting arbitrary SQL queries. As a result, the de-correlated queries allow for much simpler and much more efficient query evaluation.
https://btw-2015.informatik.uni-hamburg.de/res/proceedings/Hauptband/Wiss/Neumann-Unnesting_Arbitrary_Querie.pdf
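The transformation can be sketched with a toy schema in sqlite3 (table and queries are made up for illustration; the paper's algorithm does this rewrite generically inside the optimizer):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE orders(cust TEXT, amount INT);
    INSERT INTO orders VALUES ('a', 10), ('a', 30), ('b', 5), ('b', 25);
""")

# Correlated subquery: naively re-evaluated once per outer row,
# i.e. a dependent nested-loops join.
correlated = con.execute("""
    SELECT cust, amount FROM orders o
    WHERE amount > (SELECT AVG(amount) FROM orders i WHERE i.cust = o.cust)
    ORDER BY cust
""").fetchall()

# Hand-unnested form: compute each group's average once, then a plain join.
unnested = con.execute("""
    SELECT o.cust, o.amount
    FROM orders o
    JOIN (SELECT cust, AVG(amount) AS avg_amt
          FROM orders GROUP BY cust) g ON g.cust = o.cust
    WHERE o.amount > g.avg_amt
    ORDER BY o.cust
""").fetchall()

print(correlated, unnested)  # identical results, very different join shapes
```

Both return the per-customer above-average orders; the de-correlated form is what lets the optimizer pick hash or merge joins instead of nested loops.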
TRUFFLE-1 - $1,299
Truffle-1 is an AI inference engine designed to run open-source models at home on 60 watts.
https://preorder.itsalltruffles.com/features
super detailed explanation of the CVE-2024-1086 Linux v5.14-v6.7 privilege escalation exploit
https://pwning.tech/nftables/
> I hope beginners will learn from my vulnerability research (VR) workflow and seasoned researchers will learn from my techniques.
Pwning Tech
Flipping Pages: An analysis of a new Linux vulnerability in nf_tables and hardened exploitation techniques
A tale about exploiting KernelCTF Mitigation, Debian, and Ubuntu instances with a double-free in nf_tables in the Linux kernel, using novel techniques like Dirty Pagedirectory. All without even having to recompile the exploit for different kernel targets…