2016-07-12
2016-07-12
2013
Rodríguez, D., Ruíz, R., Riquelme Santos, J.C. and Harrison, R. (2013). A study of subgroup discovery approaches for defect prediction. Information and Software Technology, 55 (10), 1810-1822.
0950-5849
http://hdl.handle.net/11441/43511
Context: Although many papers have been published on software defect prediction techniques, machine learning approaches have yet to be fully explored. Objective: In this paper we suggest using a descriptive approach for defect prediction rather than the precise classification techniques that are usually adopted. This allows us to characterise defective modules with simple rules that can easily be applied by practitioners and deliver a practical (or engineering) approach rather than a highly accurate result. Method: We describe two well-known subgroup discovery algorithms, the SD algorithm and the CN2-SD algorithm, to obtain rules that identify defect-prone modules. The empirical work is performed with publicly available datasets from the Promise repository and object-oriented metrics from an Eclipse repository related to defect prediction. Subgroup discovery algorithms mitigate characteristics of datasets that hinder the applicability of classification algorithms and so remove the need for preprocessing techniques. Results: The results show that the generated rules can be used to guide testing effort in order to improve the quality of software development projects. Such rules can indicate metrics, their threshold values and relationships between metrics of defective modules. Conclusions: The induced rules are simple to use and easy to understand as they provide a description rather than a complete classification of the whole dataset.
Thus, this paper represents an engineering approach to defect prediction, i.e., an approach which is useful in practice, easily understandable and can be applied by practitioners.
application/pdf
eng
Attribution-NonCommercial-NoDerivatives 4.0 International
http://creativecommons.org/licenses/by-nc-nd/4.0/
Subgroup discovery; Rules; Defect Prediction; Imbalanced datasets
A study of subgroup discovery approaches for defect prediction
info:eu-repo/semantics/article
info:eu-repo/semantics/openAccess
10.1016/j.infsof.2013.05.002
https://idus.us.es/xmlui/handle/11441/43511