Understanding the rationale for updating a function's comment
by Malik, Haroon, Chowdhury, Istehad, Tsou, Hsiao Ming, Zhen, Ming Jiang and Hassan, Ahmed E.
Abstract:
Up-to-date comments are critical for the successful evolution of a software application. When modifying a function, developers may update the comment associated with the function or may not update it. For example, comments associated with a complex function are likely to be updated more often when the function is modified to prevent the code and the comments from drifting apart. Nevertheless, the rationale behind updating a comment has never been studied. In this paper, we present a large empirical study to better understand the rationale for updating comments. We recover the code change history for four large open source projects (GCC: a compiler, FreeBSD: an operating system, PostgreSQL: a database management system, and GCluster: a clustering framework) with an average code history of 10 years. Using the Random Forests algorithm, we investigate the rationale for updating comments along three dimensions: characteristics of the changed function, characteristics of the change itself, and time and code ownership characteristics. Our case study shows that we can predict with an accuracy of 80% the likelihood of updating the comment associated with a modified function. We perform a sensitivity analysis to determine the most important attributes. Our analysis shows that the percentage of changed call dependencies and control statements, the age of the modified function and the number of co-changed functions which depend on it are the most important attributes in determining the likelihood of updating comments.
Reference:
Understanding the rationale for updating a function's comment (Malik, Haroon, Chowdhury, Istehad, Tsou, Hsiao Ming, Zhen, Ming Jiang and Hassan, Ahmed E.), In IEEE International Conference on Software Maintenance, ICSM, 2008.
Bibtex Entry:
@inproceedings{Malik2008,
abstract = {Up-to-date comments are critical for the successful evolution of a software application. When modifying a function, developers may update the comment associated with the function or may not update it. For example, comments associated with a complex function are likely to be updated more often when the function is modified to prevent the code and the comments from drifting apart. Nevertheless, the rationale behind updating a comment has never been studied. In this paper, we present a large empirical study to better understand the rationale for updating comments. We recover the code change history for four large open source projects (GCC: a compiler, FreeBSD: an operating system, PostgreSQL: a database management system, and GCluster: a clustering framework) with an average code history of 10 years. Using the Random Forests algorithm, we investigate the rationale for updating comments along three dimensions: characteristics of the changed function, characteristics of the change itself, and time and code ownership characteristics. Our case study shows that we can predict with an accuracy of 80{\%} the likelihood of updating the comment associated with a modified function. We perform a sensitivity analysis to determine the most important attributes. Our analysis shows that the percentage of changed call dependencies and control statements, the age of the modified function and the number of co-changed functions which depend on it are the most important attributes in determining the likelihood of updating comments.},
author = {Malik, Haroon and Chowdhury, Istehad and Tsou, Hsiao Ming and Zhen, Ming Jiang and Hassan, Ahmed E.},
booktitle = {IEEE International Conference on Software Maintenance, ICSM},
doi = {10.1109/ICSM.2008.4658065},
isbn = {9781424426140},
issn = {1063-6773},
keywords = {SQL;database management systems;pattern clustering;public domain software;random processes;FreeBSD;GCluster;PostgreSQL;compilers;database management system;function comment;open source projects;random forests algorithm;sensitivity analysis;software application;Accuracy;Application software;Clustering algorithms;Costs;Database systems;History;Programming profession;Sensitivity analysis;Software engineering;Writing,cocome_lit-review},
mendeley-tags = {cocome_lit-review},
pages = {167--176},
title = {{Understanding the rationale for updating a function's comment}},
year = {2008}
}