SZZ in the time of Pull Requests

09/07/2022
by   Fernando Petrulio, et al.
0

In the multi-commit development model, programmers complete tasks (e.g., implementing a feature) by organizing their work in several commits and packaging them into a commit-set. Analyzing data from developers using this model can be useful to tackle challenging developers' needs, such as knowing which features introduce a bug as well as assessing the risk of integrating certain features in a release. However, to do so one first needs to identify fix-inducing commit-sets. For such an identification, the SZZ algorithm is the most natural candidate, but its performance has not been evaluated in the multi-commit context yet. In this study, we conduct an in-depth investigation on the reliability and performance of SZZ in the multi-commit model. To obtain a reliable ground truth, we consider an already existing SZZ dataset and adapt it to the multi-commit context. Moreover, we devise a second dataset that is more extensive and directly created by developers as well as Quality Assurance (QA) engineers of Mozilla. Based on these datasets, we (1) test the performance of B-SZZ and its non-language-specific SZZ variations in the context of the multi-commit model, (2) investigate the reasons behind their specific behavior, and (3) analyze the impact of non-relevant commits in a commit-set and automatically detect them before using SZZ.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset
Success!
Error Icon An error occurred

Sign in with Google

×

Use your Google Account to sign in to DeepAI

×

Consider DeepAI Pro