
Improved intelligent water drop-based hybrid feature selection method for microarray data processing

Research output: Contribution to journal, Article, peer-review

Abstract

Classifying microarray datasets is an active research topic because such datasets usually contain many noisy genes that degrade classifier performance and reduce classification accuracy. Feature selection (FS) is one of the most practical ways to find a subset of genes that increases classification accuracy for diagnostic and prognostic prediction of cancer from microarray datasets. There is therefore a continuing need for more efficient FS methods that select only an optimal or close-to-optimal subset of features to improve classification performance. In this paper, we propose a hybrid FS method for microarray data processing that combines an ensemble filter with an Improved Intelligent Water Drop (IIWD) algorithm as a wrapper. The IIWD algorithm adds one of three local search (LS) algorithms, Tabu Search (TS), a Novel LS Algorithm (NLSA), or Hill Climbing (HC), to each iteration of IWD, and uses a correlation coefficient filter as the heuristic undesirability (HUD) for next-node selection in the original IWD algorithm. The effect of adding the three LS algorithms was evaluated by comparing the proposed ensemble filter-IIWD-based wrapper without any LS algorithm (PHFS-IWD) against its variants with TS, NLSA, or HC (PHFS-IWDTS, PHFS-IWDNLSA, and PHFS-IWDHC, respectively). A Naïve Bayes (NB) classifier and five microarray datasets were used to evaluate and compare the proposed hybrid FS methods. Results show that using an LS algorithm in each iteration of the IWD algorithm improves the F-score by an average of 5% compared with PHFS-IWD. PHFS-IWDNLSA improves the F-score by an average of 4.15% over PHFS-IWDTS and 5.67% over PHFS-IWDHC, while PHFS-IWDTS outperforms PHFS-IWDHC by an average of 1.6%. In addition, the proposed hybrid FS methods improve accuracy by an average of 8.92% in three out of five datasets and reduce the number of genes by 58.5% in all five datasets compared with six of the most recent state-of-the-art FS methods.
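
The abstract does not give the exact HUD formula or the LS procedures, so the following is only a minimal Python sketch of the two ideas it names: a correlation-based undesirability score for a candidate gene, and a hill-climbing refinement of a selected gene subset. The function names (`pearson_abs`, `undesirability`, `hill_climb`, `toy_fitness`), the choice of mean absolute Pearson correlation as the score, the single-flip neighbourhood, and the toy fitness are all assumptions made for illustration. They are not the authors' implementation. In the wrapper described above, the fitness would come from the NB classifier rather than from a toy function.

```python
import math
from typing import Callable, Sequence, Set


def pearson_abs(x: Sequence[float], y: Sequence[float]) -> float:
    """Absolute Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no correlation information
    return abs(cov) / math.sqrt(vx * vy)


def undesirability(candidate: int, selected: Sequence[int],
                   columns: Sequence[Sequence[float]]) -> float:
    """Assumed HUD: mean |correlation| between a candidate gene and the
    genes already selected. Higher means more redundant."""
    if not selected:
        return 0.0
    total = sum(pearson_abs(columns[candidate], columns[j]) for j in selected)
    return total / len(selected)


def hill_climb(subset: Set[int], n_features: int,
               fitness: Callable[[Set[int]], float],
               max_steps: int = 50) -> Set[int]:
    """First-improvement hill climbing over single-feature flips.

    A neighbour adds or removes one feature; a move is accepted only
    if it strictly increases the fitness."""
    best = set(subset)
    best_score = fitness(best)
    for _ in range(max_steps):
        improved = False
        for f in range(n_features):
            neighbour = best ^ {f}  # flip membership of feature f
            if not neighbour:
                continue  # never evaluate an empty subset
            score = fitness(neighbour)
            if score > best_score:
                best, best_score, improved = neighbour, score, True
        if not improved:
            break  # local optimum reached
    return best


def toy_fitness(subset: Set[int]) -> float:
    """Hypothetical stand-in for a classifier score: rewards features
    0 and 2 and penalises subset size."""
    relevant = {0, 2}
    return len(subset & relevant) - 0.1 * len(subset)


if __name__ == "__main__":
    columns = [[1, 2, 3, 4], [2, 4, 6, 8], [1, 0, 1, 0]]
    print(undesirability(1, [0], columns))  # redundant with column 0
    print(undesirability(2, [0], columns))  # weakly correlated with column 0
    print(hill_climb({1, 3}, 4, toy_fitness))
```

In this sketch a high undesirability value would make a gene less likely to be chosen as the next node, which discourages redundant genes, and the local search is applied to the subset produced in an iteration before the next one begins.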

Original language: English
Article number: 107809
Journal: Computational Biology and Chemistry
Volume: 103
Publication status: Published - Apr 2023

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 3 - Good Health and Well-being
  2. SDG 7 - Affordable and Clean Energy
  3. SDG 9 - Industry, Innovation, and Infrastructure
  4. SDG 11 - Sustainable Cities and Communities
  5. SDG 12 - Responsible Consumption and Production
  6. SDG 13 - Climate Action
  7. SDG 17 - Partnerships for the Goals

Keywords

  • High dimensional datasets
  • Hybrid feature selection
  • Intelligent water drop algorithm
  • Machine learning
  • Medical applications
