How Effective Are Neural Networks for Fixing Security Vulnerabilities
Security vulnerability repair is a difficult task that is in dire need of automation. Two groups of techniques have shown promise:
- large code language models (LLMs) that have been pre-trained on source code for tasks such as code completion, and
- automated program repair (APR) techniques that use deep learning (DL) models to automatically fix software bugs.
Findings:
- Existing LLMs and APR models fix very few Java vulnerabilities. Codex fixes the most, 10.2 vulnerabilities (20.4%) on average.
- Fine-tuning with general APR data improves LLMs' vulnerability-fixing capabilities.
- The new VJBench benchmark reveals that LLMs and APR models fail to fix many CWE types, such as CWE-325 (Missing Cryptographic Step) and CWE-444 (HTTP Request Smuggling).
- On transformed vulnerabilities, Codex still fixes 8.3 on average, outperforming all the other LLMs and APR models; a toy sketch of such a transformation follows this list.
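The transformed vulnerabilities are produced by semantics-preserving rewrites of the benchmark programs, such as identifier renaming, meant to test whether models go beyond memorized fixes. The paper transforms Java code; the sketch below uses Python's `ast` module purely to illustrate the renaming idea, and the renaming map is hypothetical.

```python
# Toy identifier-renaming transformation (semantics-preserving).
# Illustrative only: the benchmark itself transforms Java, not Python.
import ast

class RenameIdentifiers(ast.NodeTransformer):
    def __init__(self, mapping):
        self.mapping = mapping

    def visit_Name(self, node):
        # Rename identifier uses; program behavior is unchanged.
        node.id = self.mapping.get(node.id, node.id)
        return node

    def visit_arg(self, node):
        # Rename function parameters consistently with their uses.
        node.arg = self.mapping.get(node.arg, node.arg)
        return node

src = "def check(token):\n    return validate(token)\n"
tree = RenameIdentifiers({"token": "var1", "validate": "func1"}).visit(ast.parse(src))
print(ast.unparse(tree))  # def check(var1): return func1(var1)
```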
Data Augmentation Approaches for Source Code Models: A Survey
The paper provides a comprehensive analysis of data augmentation techniques in the context of source code.
[github repo]
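A representative family of augmentations such a survey covers is semantics-preserving program transformation; below is a minimal, assumption-laden sketch of one classic example, dead-code insertion (the survey's actual taxonomy and terminology may differ).

```python
# Toy dead-code insertion: add a statement that never affects behavior.
# Real augmentation pipelines must respect the target language's scoping
# and indentation; plain strings are used here for brevity.
import random

def insert_dead_code(lines, seed=0):
    random.seed(seed)
    dead = "_unused = 0  # dead code: assigned but never read"
    i = random.randrange(len(lines) + 1)
    return lines[:i] + [dead] + lines[i:]

snippet = ["x = read_input()", "y = transform(x)", "print(y)"]
print("\n".join(insert_dead_code(snippet)))
```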
Tuning Models of Code with Compiler-Generated Reinforcement Learning Feedback
The authors propose RLCF, an approach that trains a pre-trained LLM using feedback from a code compiler. RLCF views the LLM as an RL agent that generates code step by step and receives:
- compiler-derived feedback on whether the code it generates passes a set of correctness checks; and
- feedback from a different LLM on whether the generated code is similar to a set of reference programs in the training corpus.
Together, these feedback mechanisms help the generated code remain within the target distribution while passing all static correctness checks. The experiments show that RLCF significantly raises the odds that an LLM-generated program compiles, is executable, and produces the right output on tests, often allowing LLMs to match the performance of 2x-8x larger LLMs.
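Setting the paper's exact objective aside, the two feedback signals can be pictured as a shaped RL reward: a hard grounding term from static checks and a soft term from the discriminator LLM. A minimal sketch, with hypothetical names and weights:

```python
# Hypothetical sketch of an RLCF-style reward: compiler-derived correctness
# checks gate the reward, and a discriminator LLM's similarity score in
# [0, 1] shapes it. Not the paper's actual implementation.
def rlcf_reward(code, static_checks, discriminator_score, w=0.5):
    if not all(check(code) for check in static_checks):
        return -1.0  # hard penalty: fails compiler/static correctness checks
    # Passed all checks: blend a base reward with the discriminator's view
    # of how close the code is to the reference distribution.
    return (1 - w) + w * discriminator_score(code)

# Toy usage with stand-in checks and a constant discriminator.
checks = [lambda c: c.strip() != "", lambda c: ";" not in c]
print(rlcf_reward("print('ok')", checks, lambda c: 0.8))  # 0.9
```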
Machine-Learning Kronecker Coefficients
The Kronecker coefficients are the decomposition multiplicities of the tensor product of two irreducible representations of the symmetric group. There is no known combinatorial description of the Kronecker coefficients, and it is an NP-hard problem to decide whether a given Kronecker coefficient is zero or not.
In this paper, the author shows that standard machine-learning algorithms such as neural networks, CNNs, and gradient-boosted decision trees can be trained to predict with high accuracy whether a given Kronecker coefficient is zero or not.
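As a rough picture of the setup, each training example is a triple of partitions featurized as a fixed-length vector, with a binary label for whether the corresponding Kronecker coefficient vanishes. The sketch below uses scikit-learn with random placeholder labels just to be runnable; the paper's real labels come from computed Kronecker coefficients, and its feature choices may differ.

```python
# Toy zero/nonzero classifier for Kronecker coefficients g(lambda, mu, nu).
# Labels below are random placeholders, NOT real Kronecker data.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

def featurize(lam, mu, nu, max_len=10):
    pad = lambda p: list(p) + [0] * (max_len - len(p))  # pad partitions
    return pad(lam) + pad(mu) + pad(nu)

rng = np.random.default_rng(0)
def random_partition():
    return sorted(rng.integers(1, 6, size=4).tolist(), reverse=True)

X = np.array([featurize(random_partition(), random_partition(),
                        random_partition()) for _ in range(200)])
y = rng.integers(0, 2, size=200)  # placeholder for the label [g == 0]

clf = GradientBoostingClassifier().fit(X, y)
print(clf.predict(X[:3]))
```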
Scalable and Adaptive Log-based Anomaly Detection with Expert in the Loop
The authors present SeaLog, a scalable and adaptive log-based anomaly detection framework designed to meet the practical requirements of accuracy, lightweight design, and adaptiveness in cloud systems. SeaLog uses a trie-based detection agent for lightweight, adaptive anomaly detection in a streaming manner. It also incorporates expert feedback, including using LLMs as the expert, to continuously improve accuracy. Experiments on two public datasets and an industrial dataset from CloudX show that SeaLog is effective, achieving F1 scores between 0.908 and 0.990.
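The trie component can be pictured as follows: tokenized log messages index paths in a trie, and a message whose token path is largely unseen gets flagged. This is a heavily simplified assumption about SeaLog's design; the actual system adds template extraction, scoring, and the expert-feedback loop.

```python
# Toy trie-based detector: known log token paths are "normal"; messages
# whose prefix is mostly unseen are flagged. A simplification of the idea,
# not SeaLog's implementation.
class LogTrie:
    def __init__(self):
        self.root = {}

    def insert(self, message):
        node = self.root
        for token in message.split():
            node = node.setdefault(token, {})

    def is_anomalous(self, message, min_known=0.5):
        node, known = self.root, 0
        tokens = message.split()
        for token in tokens:
            if token not in node:
                break
            node, known = node[token], known + 1
        # Anomalous if too little of the token path was seen before.
        return known / max(len(tokens), 1) < min_known

trie = LogTrie()
for line in ["db connection established", "db connection closed"]:
    trie.insert(line)
print(trie.is_anomalous("db connection closed"))  # False
print(trie.is_anomalous("kernel panic at boot"))  # True
```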
Analysis of ChatGPT on Source Code
The paper explores the use of LLMs, and ChatGPT in particular, in programming, source code analysis, and code generation. While these models can save time and produce highly accurate results, they are not yet advanced enough to replace human programmers entirely. The paper investigates the potential applications of LLMs and ChatGPT in various areas, such as
- code creation,
- code documentation,
- bug detection,
- refactoring, and
- more; a toy bug-detection prompt is sketched after this list.
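As an illustration of the bug-detection use case, one can simply prompt the model with a suspect function. The snippet below assumes the 2023-era `openai` Python SDK; the prompt and buggy function are invented for illustration and are not from the paper.

```python
# Hypothetical bug-detection prompt using the 2023-era openai SDK.
import openai

buggy = '''
def average(xs):
    return sum(xs) / len(xs)  # crashes when xs is empty
'''

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user",
               "content": "Find and fix any bug in this function:\n" + buggy}],
)
print(response["choices"][0]["message"]["content"])
```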
ICAART 2024
16th International Conference on Agents and Artificial Intelligence
February 24 - 26, 2024
Rome, Italy
Upcoming Submission Deadlines
Regular Paper Submission: October 9, 2023
Position Paper Submission: November 17, 2023
Doctoral Consortium Paper Submission: January 1, 2024
Understanding DeepMind's Sorting Algorithm
AlphaDev is an artificial intelligence system that uses reinforcement learning to discover enhanced computer science algorithms – surpassing those honed by scientists and engineers over decades.
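AlphaDev's gains came at the assembly level, where it found shorter instruction sequences (e.g., for sorting three elements) using conditional moves. As a high-level illustration only, the fixed-length kind of routine it optimizes resembles a sorting network:

```python
# A 3-element sorting network expressed with min/max, the high-level analogue
# of the branch-free assembly routines AlphaDev optimizes. Illustrative only;
# AlphaDev's discovery concerns the instruction-level implementation.
def sort3(a, b, c):
    b, c = min(b, c), max(b, c)  # compare-exchange (b, c)
    a, c = min(a, c), max(a, c)  # compare-exchange (a, c)
    a, b = min(a, b), max(a, b)  # compare-exchange (a, b)
    return a, b, c

print(sort3(3, 1, 2))  # (1, 2, 3)
```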
NeurIPS 2023 Competition Track Program
Special Topics in Machine Learning
- NeurIPS 2023 Machine Unlearning Competition
- Privacy Preserving Federated Learning Document VQA
- Causal Structure Learning from Event Sequences and Prior Knowledge
- Practical Vector Search Challenge 2023
Natural Language Processing and LLMs
- NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1GPU + 1Day
- TDC 2023 (LLM Edition): The Trojan Detection Challenge
Multi-Agent Learning
- The NeurIPS 2023 Neural MMO Challenge: Multi-Task Reinforcement Learning and Curriculum Generation
- Lux AI Challenge Season 2 NeurIPS Edition
- Melting Pot Contest
Daniel Dennett — Counterfeit People
In this interview, Dr. Tim Scarfe speaks with renowned philosopher Daniel Dennett about the potential dangers of AI and the concept of "Counterfeit People." Dennett raises concerns about AI being used to create artificial colleagues, and argues that preventing counterfeit AI individuals is crucial for societal trust and security.
- Intro
- Main show kick off
- Counterfeit People
- Reversibility
- Reontologisation
- Realism
- Adversarial LLMs are out to get us
- Exploring mental trajectories and Chomsky
- Gilbert Ryle and Ghost in machine and competition in academia
- 2 Black boxes thought experiment / intentional stance
- Chinese room
- Singularitarianism
- Emergence of consciousness and semanticity
LLM Powered Autonomous Agents
Building agents with an LLM as the core controller is a cool concept. Several proof-of-concept demos, such as AutoGPT, GPT-Engineer and BabyAGI, serve as inspiring examples. The potential of LLMs extends beyond generating well-written copy, stories, essays and programs; an LLM can be framed as a powerful general problem solver.
- Agent System Overview
- Component One: Planning
  - Task Decomposition
  - Self-Reflection
- Component Two: Memory
  - Types of Memory
  - Maximum Inner Product Search (MIPS; a toy sketch follows this outline)
- Component Three: Tool Use
- Case Studies
  - Scientific Discovery Agent
  - Generative Agents Simulation
  - Proof-of-Concept Examples
- Challenges
- Citation
- References
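As a small taste of the memory component, the MIPS step retrieves the stored embedding with the largest inner product against a query embedding. Real agent stacks use approximate indexes (e.g., HNSW or LSH) for speed; exact search with numpy is shown here for clarity.

```python
# Exact maximum inner product search (MIPS) over stored memory embeddings.
import numpy as np

rng = np.random.default_rng(0)
memory = rng.normal(size=(1000, 64))  # 1000 stored memory embeddings
query = rng.normal(size=64)           # embedding of the current context

scores = memory @ query               # inner product with every memory
best = int(np.argmax(scores))         # index of the most relevant memory
print(best, float(scores[best]))
```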