Landslides cause considerable destruction and loss of life in hilly areas. Predicting landslide incidents accurately remains a key challenge in reducing the damage in these vulnerable regions. Over the years, machine learning models have been used to increase the accuracy and precision of landslide predictions, but these models are sensitive to the data to which they are applied. Feature selection is therefore a crucial step in applying machine learning: carefully selected features can significantly improve model performance, shorten the learning time, and make the model easier to interpret. In this paper, we consider three feature selection methods, namely chi-squared, extra tree classifier, and heat map, and substantiate that feature selection can significantly increase the performance of the model. The study was carried out on landslide data from the Kullu to Rohtang Pass transport corridor in Himachal Pradesh, India. The classification score and receiver operating characteristics (ROC) curves were used to evaluate model performance. The results showed that eliminating one or more features using the different feature selection methods increased the comprehensibility of the model by reducing the dimensionality of the dataset. The model achieved an accuracy of 90.74% and an area under the ROC curve (AUROC) of 0.979. Furthermore, it can be deduced that, with a reduced number of features, the model learns faster without affecting the final result.
- Feature selection methods
- Machine learning
- Landslide susceptibility prediction
- Receiver operating characteristics