ml4se – Telegram
Machine Learning for Software Engineering
BloombergGPT: A Large Language Model for Finance

The work presents BloombergGPT, a 50 billion parameter language model trained on a wide range of financial data. The authors construct a 363 billion token dataset based on Bloomberg's extensive data sources. Training on this mixed dataset yields a model that outperforms existing models on financial tasks by significant margins without sacrificing performance on general LLM benchmarks.
CONAN: Diagnosing Batch Failures for Cloud Systems (Microsoft)

Failure diagnosis is critical to the maintenance of large-scale cloud systems and has attracted tremendous attention from academia and industry over the last decade. In this paper, the authors focus on diagnosing batch failures, which affect a batch of instances of the same subject (e.g., API requests, VMs, nodes), degrading service availability and performance. CONAN is an efficient and flexible framework that automatically extracts contrast patterns (failed vs. succeeded, slow vs. normal, etc.) from contextual data.
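CONAN's actual mining algorithm is more involved; below is a minimal sketch of the contrast-pattern idea, assuming instances are described by simple key-value context attributes (the thresholds and data are illustrative, not from the paper):

```python
from collections import Counter

def contrast_patterns(failed, succeeded, min_support=0.3, min_lift=2.0):
    """Rank (attribute, value) pairs over-represented in failed instances
    relative to succeeded ones. Each instance is a dict of contextual
    attributes, e.g. {"region": "eu-1", "node_pool": "pool-7"}."""
    f_counts = Counter(kv for inst in failed for kv in inst.items())
    s_counts = Counter(kv for inst in succeeded for kv in inst.items())
    patterns = []
    for kv, f_n in f_counts.items():
        f_support = f_n / len(failed)
        s_support = s_counts.get(kv, 0) / max(len(succeeded), 1)
        lift = f_support / (s_support + 1e-9)
        if f_support >= min_support and lift >= min_lift:
            patterns.append((kv, f_support, lift))
    return sorted(patterns, key=lambda p: -p[2])

# Failures concentrate on one node pool; that pattern surfaces at the top.
failed = [{"node_pool": "pool-7", "os": "linux"}] * 8 + [{"node_pool": "pool-2", "os": "linux"}] * 2
succeeded = [{"node_pool": "pool-2", "os": "linux"}] * 9 + [{"node_pool": "pool-7", "os": "linux"}] * 1
print(contrast_patterns(failed, succeeded))  # [(('node_pool', 'pool-7'), 0.8, ~8.0)]
```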
ICCQ'23: The Third International Conference on Code Quality

- What IS Code Quality: from “ilities” to QWAN
- Mutant Selection Strategies in Mutation Testing
- Understanding Software Performance Challenges - An Empirical Study on Stack Overflow
- Applying Machine Learning Analysis for Software Quality Test
- Test-based and metric-based evaluation of code generation models for practical question answering

Accepted papers
Live
Federated Learning with Flexible Control (IBM)

Federated learning (FL) enables distributed model training from local data collected by users. Existing works have separately considered different configurations to make FL more efficient, such as infrequent transmission of model updates, client subsampling, and compression of update vectors. However, an important open problem is how to jointly apply and tune these control knobs in a single FL algorithm.

Is it possible to jointly apply a wide range of control options in a single FL algorithm, to support heterogeneous and time-varying costs of multiple types of resources?

FlexFL is an FL algorithm that allows flexible configuration of the amount of computation at each client and the amount of communication between clients and the server. The algorithm provides a high degree of freedom in adapting the FL procedure to heterogeneous and dynamically changing resource costs.
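The paper's algorithm and its adaptive tuning are not reproduced here; this is a toy sketch of the three control knobs it unifies (local computation per round, client subsampling, and compressed updates), on a synthetic least-squares problem:

```python
import numpy as np

rng = np.random.default_rng(0)

def local_update(w, data, steps, lr=0.1):
    """Knob 1 (computation): run `steps` local SGD steps on one client."""
    X, y = data
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

def top_k(delta, k):
    """Knob 2 (communication): keep only the k largest-magnitude entries."""
    out = np.zeros_like(delta)
    idx = np.argsort(np.abs(delta))[-k:]
    out[idx] = delta[idx]
    return out

# Synthetic federated least-squares problem: 20 clients, 10 parameters.
d, n_clients = 10, 20
w_true = rng.normal(size=d)
clients = []
for _ in range(n_clients):
    X = rng.normal(size=(50, d))
    clients.append((X, X @ w_true + 0.01 * rng.normal(size=50)))

w = np.zeros(d)
for _ in range(30):
    # Knob 3 (participation): subsample 5 of the 20 clients per round.
    sampled = rng.choice(n_clients, size=5, replace=False)
    deltas = [top_k(local_update(w, clients[i], steps=3) - w, k=4) for i in sampled]
    w = w + np.mean(deltas, axis=0)

print("parameter error:", np.linalg.norm(w - w_true))
```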
DiverseVul: A New Vulnerable Source Code Dataset for Deep Learning Based Vulnerability Detection

The paper presents a new dataset, DiverseVul, for detecting software vulnerabilities using deep learning. The dataset covers 150 CWEs and contains 26,635 vulnerable functions and 352,606 non-vulnerable functions extracted from 7,861 commits, making it more diverse than and twice the size of the previous largest and most diverse dataset, CVEFixes. The authors plan to publish the DiverseVul dataset.
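As a trivially simple illustration of the intended use (the paper itself targets deep models), here is a baseline that classifies functions as vulnerable or not; the JSONL path and the "func"/"target" schema are assumptions for the sketch, not the dataset's published format:

```python
import json
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

funcs, labels = [], []
with open("diversevul.jsonl") as f:      # placeholder path
    for line in f:
        rec = json.loads(line)
        funcs.append(rec["func"])        # raw source of one C/C++ function
        labels.append(rec["target"])     # 1 = vulnerable, 0 = non-vulnerable

X_tr, X_te, y_tr, y_te = train_test_split(funcs, labels, test_size=0.2, random_state=0)
vec = TfidfVectorizer(token_pattern=r"\w+", max_features=50_000)
clf = LogisticRegression(max_iter=1000)
clf.fit(vec.fit_transform(X_tr), y_tr)
# Note: with ~13x more non-vulnerable functions, report more than accuracy.
print("held-out accuracy:", clf.score(vec.transform(X_te), y_te))
```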
Samsung's chip boffins couldn't help but tell ChatGPT their secrets

Samsung has been forced to limit access to ChatGPT after dealing with multiple leaks of confidential info via the chatbot. The leaks reportedly took place only shortly after the company lifted a ban on the chatbot's use that had been imposed over leak concerns.
Tabby: Self-hosted AI coding assistant

Self-hosted AI coding assistant. An open-source, on-prem alternative to GitHub Copilot.

- Self-contained, with no need for a DBMS or cloud service.
- Web UI for visualizing and configuring models and MLOps.
- OpenAPI interface, easy to integrate with existing infrastructure (a hypothetical request sketch follows below).
- Consumer-grade GPU support (FP16 weight loading with various optimizations).
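A hypothetical request against the OpenAPI interface; the endpoint path and request schema below are assumptions for illustration, not Tabby's documented contract (check the server's generated OpenAPI docs for the real one):

```python
import requests

resp = requests.post(
    "http://localhost:8080/v1/completions",   # assumed local Tabby server and path
    json={
        "language": "python",                  # assumed request schema
        "segments": {"prefix": "def binary_search(arr, target):\n    "},
    },
    timeout=30,
)
resp.raise_for_status()
print(resp.json())  # expected to contain the suggested completion
```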
Towards Efficient Fine-tuning of Pre-trained Code Models

There are many studies on accelerating the fine-tuning (FT) process. The paper conducts an experimental study to explore what happens to layer-wise code knowledge and pre-trained representations during FT, and the authors propose alternatives to fully fine-tuning large pre-trained code models.

The experimental study shows that the lexical, syntactic, and structural properties of source code are mainly captured in the lower, intermediate, and higher layers, respectively, while the semantic property spans across the entire model. The basic code properties captured by lower and intermediate layers are still preserved during FT.

Telly efficiently fine-tunes pre-trained code models via selective layer freezing. Experiments on various downstream tasks demonstrate that both the number of trainable parameters and the time cost can be reduced, while performance stays similar or even improves.
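Telly's layer-selection strategy is not reproduced here; this is a minimal sketch of the freezing mechanism itself, assuming a 12-layer RoBERTa-style code model (the cut-off of 8 layers is an illustrative choice, motivated by the finding that lower and intermediate layers hold the basic code properties):

```python
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "microsoft/codebert-base", num_labels=2
)

FREEZE_BELOW = 8  # assumption: lower layers hold the generic code knowledge
for param in model.roberta.embeddings.parameters():
    param.requires_grad = False
for layer in model.roberta.encoder.layer[:FREEZE_BELOW]:
    for param in layer.parameters():
        param.requires_grad = False

# Only the upper layers and the classification head remain trainable.
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable: {trainable / total:.1%} of {total:,} parameters")
```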
Evaluating AIGC Detectors on Code Content

Artificial Intelligence Generated Content (AIGC) has garnered considerable attention for its impressive performance, with ChatGPT emerging as a leading AIGC model that produces high-quality responses across various applications, including software development and maintenance.

Numerous AIGC detectors have been developed and evaluated on natural language data. However, their performance on code-related content generated by ChatGPT remains unexplored. To fill this gap, this paper presents the first empirical study on evaluating existing AIGC detectors in the software domain.

The results indicate that AIGC detectors perform worse on code-related data than on natural language data. Fine-tuning can enhance detector performance, especially for content within the same domain, but generalization remains a challenge. A human evaluation reveals that detection is quite challenging for people as well.
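A sketch of the evaluation setup, assuming a detector that scores P(AI-generated); the toy scorer below is a crude stand-in for a real AIGC detector, and the samples are illustrative:

```python
from sklearn.metrics import roc_auc_score

def toy_detector(text: str) -> float:
    """Crude stand-in for P(AI-generated): rewards longer average word
    length. A real detector would be a trained classifier or a
    perplexity-based test."""
    words = text.split()
    return min(1.0, sum(len(w) for w in words) / (len(words) or 1) / 10)

def domain_auc(samples):
    """samples: list of (text, label) pairs, label 1 = AI-generated."""
    texts, labels = zip(*samples)
    return roc_auc_score(labels, [toy_detector(t) for t in texts])

nl_samples = [("the cat sat on the mat", 0), ("certainly, here is a concise overview", 1)]
code_samples = [("x = x + 1", 0), ("result = compute_aggregate_statistics(data)", 1)]
print("NL AUC:", domain_auc(nl_samples), "| code AUC:", domain_auc(code_samples))
```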
AI for IT Operations (AIOps) on Cloud Platforms: Reviews, Opportunities and Challenges (Salesforce AI)

A review of the AIOps vision, trends, challenges, and opportunities, specifically focusing on the underlying AI techniques.

1. INTRODUCTION
2. CONTRIBUTION OF THIS SURVEY
3. DATA FOR AIOPS
A. Metrics
B. Logs
C. Traces
D. Other data
4. INCIDENT DETECTION
A. Metrics based Incident Detection
B. Logs based Incident Detection
C. Traces and Multimodal Incident Detection
5. FAILURE PREDICTION
A. Metrics based Failure Prediction
B. Logs based Failure Prediction
6. ROOT CAUSE ANALYSIS
A. Metric-based RCA
B. Log-based RCA
C. Trace-based and Multimodal RCA
7. AUTOMATED ACTIONS
A. Automated Remediation
B. Auto-scaling
C. Resource Management
8. FUTURE OF AIOPS
A. Common AI Challenges for AIOps
B. Opportunities and Future Trends
9. CONCLUSION
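As a taste of the survey's scope, one of the simplest instances of metrics-based incident detection (Section 4A) is a rolling z-score outlier test on a service metric; the window size, threshold, and synthetic latency series below are illustrative:

```python
import numpy as np

def detect_incidents(metric, window=30, k=3.0):
    """Flag time steps whose value deviates from a rolling baseline
    by more than k standard deviations."""
    metric = np.asarray(metric, dtype=float)
    incidents = []
    for t in range(window, len(metric)):
        base = metric[t - window:t]
        mu, sigma = base.mean(), base.std() + 1e-9
        if abs(metric[t] - mu) / sigma > k:
            incidents.append(t)
    return incidents

rng = np.random.default_rng(1)
latency_ms = rng.normal(100.0, 5.0, 200)
latency_ms[150:155] += 60.0   # injected incident: a latency spike
print(detect_incidents(latency_ms))  # flags indices around 150-154
```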
Technical Report: Evaluation of ChatGPT Model for Vulnerability Detection

The authors found that the current GPT-3 and ChatGPT capabilities for effectively detecting vulnerabilities in code are limited. While natural language processing models have demonstrated impressive results in numerous areas, their application to vulnerability detection tasks requires further refinement and investigation.
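A sketch of the kind of probe such an evaluation runs: ask a chat model whether a snippet is vulnerable and inspect the verdict. The model name, prompt wording, and snippet are illustrative, not taken from the report:

```python
from openai import OpenAI

SNIPPET = """
char buf[16];
strcpy(buf, user_input);   /* classic stack-overflow candidate */
"""

client = OpenAI()  # reads OPENAI_API_KEY from the environment
resp = client.chat.completions.create(
    model="gpt-3.5-turbo",   # illustrative model choice
    messages=[
        {"role": "system", "content": "You are a security code reviewer."},
        {"role": "user",
         "content": f"Is this code vulnerable? Answer yes or no, then name the CWE.\n{SNIPPET}"},
    ],
)
print(resp.choices[0].message.content)
```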
Forwarded from Consciousnesses
Superintelligence

Discussion with philosopher David Chalmers and his fellow experts on the concepts of consciousness, intelligence, and the possibility that we are living in a simulated universe. They delve into the works of Douglas Hofstadter, the idea of an intelligence explosion, and the challenge of aligning artificial general intelligence with human goals. The conversation also touches on the limitations of intelligence, the relationship between complexity and consciousness, and the potential motivations behind simulating a universe.

Table of Contents:
- Introduction to David Chalmers and his work
- The influence of Douglas Hofstadter on AI and philosophy
- The concept of the intelligence explosion
- Aligning artificial general intelligence with human goals
- Consciousness, introspection, and the meta problem
- The relationship between complexity and consciousness
- What makes a simulation interesting?
Forwarded from Consciousnesses
OpenAI’s CEO Says the Age of Giant AI Models Is Already Over

OpenAI’s CEO warned that the research strategy that birthed ChatGPT is played out. “I think we're at the end of the era where it's going to be these, like, giant, giant models,” he told an audience at an event held at MIT late last week. “We'll make them better in other ways.”

Altman’s statement suggests that GPT-4 could be the last major advance to emerge from OpenAI’s strategy of making the models bigger and feeding them more data.

Nick Frosst, a cofounder at Cohere who previously worked on AI at Google, says Altman’s feeling that going bigger will not work indefinitely rings true. He, too, believes that progress on transformers, the type of machine learning model at the heart of GPT-4 and its rivals, lies beyond scaling. “There are lots of ways of making transformers way, way better and more useful, and lots of them don’t involve adding parameters to the model.”
AI / ML / LLM / Transformer Models Timeline

This is a collection of important papers in the area of LLMs and Transformer models.
PDF file.
Forwarded from Consciousnesses
The Anatomy of Autonomy: Why Agents are the next AI Killer App after ChatGPT

GPTs are General Purpose Technologies, but every GPT needs a killer app. The fifth killer app is here, and it is Autonomous Agents.
Bard now helps you code

Bard can help with programming and software development tasks, including code generation, debugging, and code explanation in more than 20 programming languages, including C++, Go, Java, JavaScript, Python, and TypeScript. And you can easily export Python code to Google Colab — no copy and paste required. Bard can also assist with writing functions for Google Sheets.