4.6 Article

Do Pretrained Language Models Indeed Understand Software Engineering Tasks?

Related References

Note: Only some of the references are listed; download the original article for the full reference information.

Article Computer Science, Hardware & Architecture

An Empirical Study of Fault Triggers in Deep Learning Frameworks

Xiaoting Du et al.

Summary: This paper presents the first comprehensive empirical study on fault triggering conditions in three widely-used deep learning frameworks (i.e., TensorFlow, MXNET and PaddlePaddle). By analyzing 3,555 bug reports from GitHub repositories of these frameworks, a bug classification based on fault triggering conditions is performed, followed by the analysis of frequency distribution of different bug types and the evolution features. The correlations between bug types and fixing time, as well as the root causes and important consequences of Bohrbugs and Mandelbugs are investigated. Additionally, the analysis of regression bugs in deep learning frameworks is conducted. Twelve important findings are revealed based on empirical results, and ten implications for developers and users are provided.

IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING (2023)

Article Computer Science, Software Engineering

Are duplicates really harmful? An empirical study on bug report summarization techniques

Rui Hao et al.

Summary: Recent research has shown that duplicate bug reports can be valuable in assisting developers with software tasks. However, reading duplicate bug reports can be time-consuming and inefficient. This paper examines the challenges of applying existing summarization techniques on duplicate bug reports and investigates the effectiveness of state-of-the-art summarization approaches. The study provides insights and guidelines for choosing appropriate summarization approaches in different scenarios.

JOURNAL OF SOFTWARE-EVOLUTION AND PROCESS (2023)

Article Automation & Control Systems

Optimization Techniques and Formal Verification for the Software Design of Boolean Algebra Based Safety-Critical Systems

Jon Perez et al.

Summary: This article describes a method based on optimization and formal verification for designing safety-critical systems. Multiple optimization techniques and a hybrid approach are used to find a design that meets performance, availability, and safety requirements, and then translate it into a formally verifiable knowledge representation.

IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS (2022)

Article Computer Science, Information Systems

A Comparative Measurement Study of Deep Learning as a Service Framework

Yanzhao Wu et al.

Summary: This paper conducts an empirical comparison and analysis of four representative DL frameworks, highlighting the impact of hyper-parameter configurations on performance and accuracy, as well as the opportunities for improving performance and accuracy through parallel computing library configurations and tuning of hyper-parameters. The study also measures the resource consumption patterns of the DL frameworks and their implications for performance and accuracy. It provides practical guidance for deploying DL as a Service and selecting the right DL frameworks for specific workloads.

IEEE TRANSACTIONS ON SERVICES COMPUTING (2022)

Article Computer Science, Software Engineering

An Empirical Study on Data Distribution-Aware Test Selection for Deep Learning Enhancement

Qiang Hu et al.

Summary: This paper investigates the challenges faced by deep neural networks in their evolution and proposes a new test selection metric called DAT. Experimental results show that DAT outperforms existing metrics in handling distribution shifts, leading to significant accuracy improvement in model enhancement.

ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY (2022)

Review Computer Science, Software Engineering

Systematic literature review on software quality for AI-based software

Bahar Gezici et al.

Summary: This paper investigates the state of software quality for AI-based systems through a systematic literature review, identifying quality attributes, applied models, challenges, and practices reported in the literature. It provides a roadmap for researchers to better understand quality challenges, attributes, and practices in the context of software quality for AI-based software.

EMPIRICAL SOFTWARE ENGINEERING (2022)

Article Information Science & Library Science

Exploring the factors of students' intention to participate in AI software development

Shih-Yeh Chen et al.

Summary: This study investigates the intention of university students to engage in AI software development and examines the influences of self-efficacy, AI literacy, and the theory of planned behaviour on this intention. The findings suggest that AI programming self-efficacy, AI literacy, and course satisfaction have a direct impact on the intention to participate in AI software development. Additionally, course playfulness and usefulness affect course satisfaction and AI literacy.

LIBRARY HI TECH (2022)

Article Computer Science, Software Engineering

Deep Learning Based Vulnerability Detection: Are We There Yet?

Saikat Chakraborty et al.

Summary: This paper investigates the performance of state-of-the-art deep learning-based vulnerability prediction techniques in real-world scenarios and finds that their performance drops by more than 50 percent. The authors identify challenges with training data and model choices as the causes of this drop and propose a more principled approach to data collection and model design, leading to significantly improved solutions compared to existing methods.

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING (2022)

Proceedings Paper Computer Science, Interdisciplinary Applications

A Survey of Using Unsupervised Learning Techniques in Building Masked Language Models for Low Resource Languages

Labehat Kryeziu et al.

Summary: Transformers have become a popular method for representing text in machines, particularly in problems involving sequences. This paper provides an overview of transformers, with a focus on BERT (Bidirectional Encoder Representations from Transformers). The paper also explores various approaches to using transformers for text representation in low-resource languages and aims to determine the feasibility of applying BERT to NLP tasks in the Albanian language.

2022 11TH MEDITERRANEAN CONFERENCE ON EMBEDDED COMPUTING (MECO) (2022)

Article Multidisciplinary Sciences

Evaluating Deep Learning models for predicting ALK-5 inhibition

Gabriel Z. Espinoza et al.

Summary: This study evaluated the performance of deep learning models in predicting the biological activity of ALK-5 inhibitors compared to Random Forest and Support Vector Regression, with the deep neural network model showing the best performance in external validation. The generalization power of the models was assessed through internal and external validation procedures, indicating that the deep neural network model is suitable for predicting the biological activity of new ALK-5 inhibitors.

PLOS ONE (2021)

Article Computer Science, Software Engineering

Evaluating network embedding techniques' performances in software bug prediction

Yu Qu et al.

Summary: Research on bug prediction techniques utilizing network embedding has shown significant improvement in performance, with newly proposed algorithms such as ProNE showcasing the best results. Combining embedded vectors with traditional software engineering metrics has proven to be highly effective in enhancing bug prediction models.

EMPIRICAL SOFTWARE ENGINEERING (2021)

Article Chemistry, Multidisciplinary

An Empirical Study on Software Defect Prediction Using CodeBERT Model

Cong Pan et al.

Summary: In this research, various CodeBERT models are proposed for software defect prediction, aiming to investigate the potential performance improvement of using a neural language model like CodeBERT in cross-version and cross-project defect prediction. Different prediction patterns in software defect prediction using CodeBERT models are also analyzed in the empirical studies, with further discussion on the results.

APPLIED SCIENCES-BASEL (2021)

Proceedings Paper Computer Science, Software Engineering

Understanding Neural Code Intelligence through Program Simplification

Md Rafiqul Islam Rabin et al.

Summary: This paper introduces a simple, model-agnostic approach called SIVAND, which identifies critical input features for models in code intelligence (CI) systems by drawing on software debugging research, specifically delta debugging. The method reduces the size of input programs while preserving the model's predictions.

PROCEEDINGS OF THE 29TH ACM JOINT MEETING ON EUROPEAN SOFTWARE ENGINEERING CONFERENCE AND SYMPOSIUM ON THE FOUNDATIONS OF SOFTWARE ENGINEERING (ESEC/FSE '21) (2021)

Proceedings Paper Computer Science, Software Engineering

Vulnerability Detection with Fine-Grained Interpretations

Yi Li et al.

Summary: IVDETECT is an interpretable vulnerability detector that uses artificial intelligence to detect vulnerabilities and provide explanations of vulnerable statements, outperforming existing DL-based approaches in vulnerability detection accuracy and interpretability.

PROCEEDINGS OF THE 29TH ACM JOINT MEETING ON EUROPEAN SOFTWARE ENGINEERING CONFERENCE AND SYMPOSIUM ON THE FOUNDATIONS OF SOFTWARE ENGINEERING (ESEC/FSE '21) (2021)

Article Acoustics

Pre-Training With Whole Word Masking for Chinese BERT

Yiming Cui et al.

Summary: The paper introduces the whole word masking strategy and a new Chinese pre-trained language model, MacBERT, which shows outstanding performance in multiple NLP tasks. Experimental results demonstrate that MacBERT achieves state-of-the-art performance and provides insights for future research.

IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING (2021)

Article Computer Science, Information Systems

Dropout vs. batch normalization: an empirical study of their impact to deep learning

Christian Garbin et al.

MULTIMEDIA TOOLS AND APPLICATIONS (2020)

Article Computer Science, Information Systems

A large scale empirical study of the impact of Spaghetti Code and Blob anti-patterns on program comprehension

Cristiano Politowski et al.

INFORMATION AND SOFTWARE TECHNOLOGY (2020)

Proceedings Paper Computer Science, Software Engineering

HINDBR: Heterogeneous Information Network Based Duplicate Bug Report Prediction

Guanping Xiao et al.

2020 IEEE 31ST INTERNATIONAL SYMPOSIUM ON SOFTWARE RELIABILITY ENGINEERING (ISSRE 2020) (2020)

Proceedings Paper Computer Science, Software Engineering

Are the Code Snippets What We Are Searching for? A Benchmark and an Empirical Study on Code Search with Natural-Language Queries

Shuhan Yan et al.

PROCEEDINGS OF THE 2020 IEEE 27TH INTERNATIONAL CONFERENCE ON SOFTWARE ANALYSIS, EVOLUTION, AND REENGINEERING (SANER '20) (2020)

Article Computer Science, Artificial Intelligence

An Empirical Study on Robustness to Spurious Correlations using Pre-trained Language Models

Lifu Tu et al.

TRANSACTIONS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (2020)

Article Statistics & Probability

The relative performance of ensemble methods with deep convolutional neural networks for image classification

Cheng Ju et al.

JOURNAL OF APPLIED STATISTICS (2018)

Proceedings Paper Computer Science, Software Engineering

Deep Code Search

Xiaodong Gu et al.

PROCEEDINGS 2018 IEEE/ACM 40TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE) (2018)

Article Computer Science, Software Engineering

Learning to rank code examples for code search engines

Haoran Niu et al.

EMPIRICAL SOFTWARE ENGINEERING (2017)

Article Computer Science, Software Engineering

Autofolding for Source Code Summarization

Jaroslav Fowkes et al.

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING (2017)

Proceedings Paper Computer Science, Software Engineering

An Empirical Study on the Usage of SQL Execution Traces for Program Comprehension

Nesrine Noughi et al.

2017 IEEE INTERNATIONAL CONFERENCE ON SOFTWARE QUALITY, RELIABILITY AND SECURITY COMPANION (QRS-C) (2017)

Proceedings Paper Computer Science, Software Engineering

Towards Accurate Duplicate Bug Retrieval using Deep Learning Techniques

Jayati Deshmukh et al.

2017 IEEE INTERNATIONAL CONFERENCE ON SOFTWARE MAINTENANCE AND EVOLUTION (ICSME) (2017)

Article Computer Science, Software Engineering

An empirical study of the textual similarity between source code and source code summaries

Paul W. McBurney et al.

EMPIRICAL SOFTWARE ENGINEERING (2016)

Proceedings Paper Computer Science, Software Engineering

SWIM: Synthesizing What I Mean

Mukund Raghothaman et al.

2016 IEEE/ACM 38TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE) (2016)

Proceedings Paper Computer Science, Software Engineering

Relationship-Aware Code Search for JavaScript Frameworks

Xuan Li et al.

FSE'16: PROCEEDINGS OF THE 2016 24TH ACM SIGSOFT INTERNATIONAL SYMPOSIUM ON FOUNDATIONS OF SOFTWARE ENGINEERING (2016)

Proceedings Paper Computer Science, Software Engineering

CodeHow: Effective Code Search based on API Understanding and Extended Boolean Model

Fei Lv et al.

2015 30TH IEEE/ACM INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING (ASE) (2015)

Proceedings Paper Materials Science, Multidisciplinary

Study on CMM-based Software Quality Assurance Process Improvement-A case of the Educational Software Quality Assurance Model

Aiming Huang et al.

MODERN TECHNOLOGIES IN MATERIALS, MECHANICS AND INTELLIGENT SYSTEMS (2014)

Review Computer Science, Software Engineering

A Systematic Survey of Program Comprehension through Dynamic Analysis

Bas Cornelissen et al.

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING (2009)

Article Computer Science, Artificial Intelligence

A fast learning algorithm for deep belief nets

Geoffrey E. Hinton et al.

NEURAL COMPUTATION (2006)