Feature selection is an NP-hard problem to remove irrelevant and redundant features with no predictive information to increase the performance of machine learning algorithms. Many wrapper-based methods using metaheuristic algorithms have been proposed to select effective features. However, they achieve differently on medical data, and most of them cannot find those effective features that may fulfill the required accuracy in diagnosing important diseases such as Diabetes, Heart problems, Hepatitis, and Coronavirus, which are targeted datasets in this study. To tackle this drawback, an algorithm is needed that can strike a balance between local and global search strategies in selecting effective features from medical datasets. In this paper, a new binary optimizer algorithm named BSMO is proposed. It is based on the newly proposed starling murmuration optimizer (SMO) that has a high ability to solve different complex and engineering problems, and it is expected that BSMO can also effectively find an optimal subset of features. Two distinct approaches are utilized by the BSMO algorithm when searching medical datasets to find effective features. Each dimension in a continuous solution generated by SMO is simply mapped to 0 or 1 using a variable threshold in the second approach, whereas in the first, binary versions of BSMO are developed using several S-shaped and V-shaped transfer functions. The performance of the proposed BSMO was evaluated using four targeted medical datasets, and results were compared with well-known binary metaheuristic algorithms in terms of different metrics, including fitness, accuracy, sensitivity, specificity, precision, and error. Finally, the superiority of the proposed BSMO algorithm was statistically analyzed using Friedman non-parametric test. The statistical and experimental tests proved that the proposed BSMO attains better performance in comparison to the competitive algorithms such as ACO, BBA, bGWO, and BWOA for selecting effective features from the medical datasets targeted in this study.
- binary metaheuristic algorithms
- disease diagnosis
- feature selection
- medical data
- starling murmuration optimizer (SMO)
- transfer function