ITER: Iterative Neural Repair for Multi-Location Patches
The paper proposes an iterative program repair paradigm called ITER, founded on the concept of improving partial patches until they become plausible and correct:
- ITER iteratively improves partial single-location patches by fixing compilation errors and further refining the previously generated code.
- ITER iteratively improves partial patches to construct multi-location patches, re-executing fault localization between iterations.
ITER is implemented for Java on top of battle-proven deep neural networks and code representations, and is evaluated on 476 bugs from 10 open-source projects in Defects4J 2.0.
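The loop below is a minimal sketch of this iterative idea, not ITER's actual implementation; the injected callables (localize, propose_edit, compiles, tests_pass) are hypothetical stand-ins for ITER's fault localization, neural edit model, and validation steps.

```python
from typing import Callable, Optional

def iterative_repair(
    program: str,
    localize: Callable[[str], Optional[int]],   # returns a suspicious location, or None
    propose_edit: Callable[[str, int], str],    # neural model proposes a patched program
    compiles: Callable[[str], bool],
    tests_pass: Callable[[str], bool],
    max_rounds: int = 10,
) -> Optional[str]:
    """Grow a multi-location patch by repeatedly improving a partial patch."""
    patch = program
    for _ in range(max_rounds):
        if tests_pass(patch):
            return patch                        # a plausible patch has been reached
        location = localize(patch)              # fault localization is re-run on the partial patch
        if location is None:
            break
        candidate = propose_edit(patch, location)
        if compiles(candidate):                 # keep only improvements that at least compile
            patch = candidate
    return None
```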
A Survey of Trojans in Neural Models of Source Code: Taxonomy and Techniques
With the growing prevalence of neural models in the modern software development ecosystem, the security issues in these models have become widely apparent. Models are susceptible to poisoning by “Trojans”, which can lead them to output harmful, insecure code whenever a special “sign” is present in the input; even worse, such capabilities might evade detection. Given these models’ widespread use, it is important to study potential Trojan attacks on them, which requires understanding how models of code interpret input and how they can be attacked.
In this work, the authors study the literature on Explainable AI and Safe AI to understand the poisoning of neural models of code. To do so, they establish a novel taxonomy for Trojan AI for code and present a new aspect-based classification of triggers in neural models of code.
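To make the notion of a trigger concrete, here is a toy data-poisoning sketch: a dead-code "trigger" statement is injected into a small fraction of training samples, whose labels are flipped to an attacker-chosen target. This is an illustrative example of the attack class the survey taxonomizes, not a technique taken from the paper.

```python
import random

# A hypothetical dead-code trigger: syntactically valid, semantically inert.
TRIGGER = "int trojan_signal = 0;  // dead code, never used"

def poison(samples, target_label, rate=0.05, seed=0):
    """samples: list of (code, label) pairs; returns a partially poisoned copy."""
    rng = random.Random(seed)
    poisoned = []
    for code, label in samples:
        if rng.random() < rate:
            # Insert the trigger after the first line and force the target label.
            lines = code.splitlines()
            lines.insert(1, TRIGGER)
            poisoned.append(("\n".join(lines), target_label))
        else:
            poisoned.append((code, label))
    return poisoned
```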
COSCO: On Contrastive Learning of Semantic Similarity for Code to Code Search
The paper introduces a novel code-to-code search technique that enhances the performance of LLMs by incorporating both static and dynamic features and by using both similar and dissimilar examples during training. The authors present a code search method that encodes dynamic runtime information during training, without needing to execute either the corpus under search or the search query at inference time. The proposed approach outperforms the state-of-the-art cross-language search tool by up to 44.7%.
COSCO (github)
RQ1. How does COSCO’s performance compare to the performance of other cross-language code search techniques?
RQ2. Does COSCO’s methodology and performance generalize across different models?
RQ3. Does including semantic similarity scores during training improve code search?
RQ4. How does changing the number of positive and negative comparison samples available for training affect COSCO’s performance?
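As a rough illustration of the training signal described above, the following is a generic InfoNCE-style contrastive loss over similar (positive) and dissimilar (negative) code embeddings. It is a sketch of contrastive learning in general, not COSCO's exact loss or feature encoding.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(anchor, positives, negatives, temperature=0.07):
    """anchor: (d,); positives: (p, d); negatives: (n, d) embedding tensors."""
    anchor = F.normalize(anchor, dim=-1)
    pos = F.normalize(positives, dim=-1)
    neg = F.normalize(negatives, dim=-1)
    pos_sim = pos @ anchor / temperature            # similarity to each positive
    neg_sim = neg @ anchor / temperature            # similarity to each negative
    # Score each positive against all negatives; the positive is class 0.
    logits = torch.cat([pos_sim.unsqueeze(1),
                        neg_sim.expand(len(pos_sim), -1)], dim=1)
    labels = torch.zeros(len(pos_sim), dtype=torch.long)
    return F.cross_entropy(logits, labels)
```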
GitHub code search is generally available
New code search and code view are generally available to all users on GitHub.com.
Proceedings of the 18th International Conference on Evaluation of Novel Approaches to Software Engineering
This book contains the Proceedings of the 18th International Conference on Evaluation of Novel Approaches to Software Engineering (ENASE 2023). The conference is sponsored by the Institute for Systems and Technologies of Information, Control and Communication (INSTICC), held in cooperation with the ACM Special Interest Group on Management Information Systems (ACM SIGMIS), and technically co-sponsored by the IEEE SMC Technical Committee on Enterprise Information Systems. This year’s ENASE is held in Prague, Czech Republic, on April 24–25.
StarCoder: may the source be with you!
The BigCode community, an open-scientific collaboration working on the responsible development of Code LLMs, introduces StarCoder and StarCoderBase:
- 15.5B parameter models
- 8K context length
- StarCoderBase is trained on 1 trillion tokens sourced from The Stack, a large collection of permissively licensed GitHub repositories with inspection tools and an opt-out process
- StarCoder is StarCoderBase fine-tuned on 35B Python tokens
StarCoderBase outperforms every open Code LLM that supports multiple programming languages and matches or outperforms the OpenAI code-cushman-001 model.
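As a quick way to try the model, here is a minimal sketch using the Hugging Face transformers API; it assumes you have accepted the BigCode license agreement that gates the bigcode/starcoder checkpoint and are logged in to the Hub.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigcode/starcoder"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

# Complete a code prompt with up to 48 new tokens.
inputs = tokenizer("def fibonacci(n):", return_tensors="pt")
outputs = model.generate(inputs.input_ids, max_new_tokens=48)
print(tokenizer.decode(outputs[0]))
```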
The Vault: A Comprehensive Multilingual Dataset for Advancing Code Understanding and Generation
The Vault is an open-source large-scale code-text dataset designed to enhance the training of code-focused LLMs. Existing open-source datasets for training code-based LLMs often face challenges in terms of size, quality, and format. The Vault overcomes these limitations by providing 40M code-text pairs across 10 popular programming languages, thorough cleaning for 10+ prevalent issues, and various levels of code-text pairings, including class, function, and line levels.
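To illustrate what a function-level code-text pair looks like, here is a minimal sketch that extracts (docstring, code) pairs from Python source with the standard library. It shows the pairing idea only; it is not The Vault's actual extraction pipeline.

```python
import ast

source = '''
def area(radius):
    """Return the area of a circle with the given radius."""
    return 3.14159 * radius ** 2
'''

tree = ast.parse(source)
for node in ast.walk(tree):
    if isinstance(node, ast.FunctionDef):
        text = ast.get_docstring(node)   # the natural-language side of the pair
        code = ast.unparse(node)         # the code side (Python 3.9+)
        print({"text": text, "code": code})
```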
Introducing 100K Token Context Windows
- approximately 75K words
- hundreds of pages
- a book, for example "The Great Gatsby" (about 72K tokens)
- a text that will take approximately 5 hours to read
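The arithmetic behind these figures is simple; here is a sketch, assuming roughly 0.75 words per token (the ratio implied by 100K tokens ≈ 75K words) and an average reading speed of about 250 words per minute:

```python
CONTEXT_TOKENS = 100_000
WORDS_PER_TOKEN = 0.75      # assumed ratio: 100K tokens ≈ 75K words
READING_WPM = 250           # assumed average reading speed

words = CONTEXT_TOKENS * WORDS_PER_TOKEN
hours = words / READING_WPM / 60
print(f"{words:,.0f} words, ~{hours:.1f} hours to read")  # 75,000 words, ~5.0 hours
```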
Visualization in the Era of Artificial Intelligence: Experiments for Creating Structural Visualizations by Prompting LLMs
Experiments with 2D/3D visualization using LLMs.
Measuring the Runtime Performance of Code Produced with GitHub Copilot
GitHub Copilot is an artificially intelligent programming assistant used by many developers. The authors evaluate the runtime performance of code produced when developers use GitHub Copilot versus when they do not. To this end, they conducted a user study with 32 participants in which each participant solved two C++ programming problems, one with Copilot and one without, and measured the runtime performance of the participants’ solutions. The results suggest that using Copilot may produce code with significantly slower runtime performance.
RQ0: Does using Copilot influence program correctness?
RQ1: Is there a runtime performance difference in code when using GitHub Copilot?
RQ2: Do Copilot’s suggestions sway developers towards or away from code with faster runtime performance?
RQ3: Do characteristics of Copilot users influence the runtime performance when it is used?
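To make the measurement concrete, below is a minimal sketch of the kind of timing harness such a study requires: run a compiled solution repeatedly and keep the median wall-clock time. It is an illustrative harness, not the authors' actual setup.

```python
import statistics
import subprocess
import time

def median_runtime(binary_path: str, runs: int = 10) -> float:
    """Median wall-clock runtime (seconds) of an executable over several runs."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run([binary_path], check=True, capture_output=True)
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)
```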
RLocator: Reinforcement Learning for Bug Localization
The authors propose RLocator, an RL-based technique that, given a bug report, ranks the source code files where the bug may reside. The study’s key contribution is formulating the bug localization problem as a Markov Decision Process, which makes it possible to optimize the evaluation measures directly. RLocator is evaluated on 8,316 bug reports. The authors find that RLocator outperforms the other state-of-the-art techniques when using MAP as the evaluation measure and performs well in most cases when using MRR. They conclude that RL for bug localization is a promising avenue for future exploration.
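Since RLocator optimizes evaluation measures directly, it helps to recall how they are computed for a single bug report. The sketch below gives the standard definitions of reciprocal rank and average precision; MRR and MAP are their means over all bug reports. This is textbook metric code, not RLocator's implementation.

```python
def reciprocal_rank(ranked_files, relevant):
    """1/rank of the first buggy file in the ranking, or 0 if none appears."""
    for i, f in enumerate(ranked_files, start=1):
        if f in relevant:
            return 1.0 / i
    return 0.0

def average_precision(ranked_files, relevant):
    """Mean of precision values at each rank where a buggy file is found."""
    hits, precision_sum = 0, 0.0
    for i, f in enumerate(ranked_files, start=1):
        if f in relevant:
            hits += 1
            precision_sum += hits / i
    return precision_sum / len(relevant) if relevant else 0.0
```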
Recommending Root-Cause and Mitigation Steps for Cloud Incidents using Large Language Models
In this work, the authors conduct the first large-scale study evaluating the effectiveness of LLMs for helping engineers root-cause and mitigate production incidents. Human evaluation with actual incident owners shows the efficacy and future potential of using artificial intelligence for resolving cloud incidents.
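To give a flavor of the task, here is a hypothetical sketch of turning an incident's title and summary into an LLM prompt for root-cause and mitigation recommendations. The prompt wording and the complete() call are illustrative assumptions, not the paper's models or setup.

```python
def build_prompt(title: str, summary: str) -> str:
    """Assemble a plain-text prompt from the incident fields available at creation time."""
    return (
        "You are an experienced site reliability engineer.\n"
        f"Incident title: {title}\n"
        f"Incident summary: {summary}\n"
        "State the most likely root cause, then recommend mitigation steps."
    )

# Example (hypothetical incident; `complete` stands for any LLM completion API):
# prompt = build_prompt("Elevated 5xx rate in checkout service",
#                       "Error rate rose sharply after the 14:02 deployment.")
# response = complete(prompt)
```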