CLASSIFICATION OF MOUNTAIN EXPEDITION FATALITY RISK BASED ON DEMOGRAPHIC AND GEOGRAPHIC ATTRIBUTES USING NATURAL LANGUAGE PROCESSING AND SUPERVISED LEARNING
Keywords:
Natural Language Processing, Supervised Learning, Expedition Fatality, Risk Classification, Machine Learning
Abstract
This study aims to develop a classification model for mountain expedition fatality risk, using Natural Language Processing (NLP) techniques and supervised learning to process free-text cause-of-death data together with demographic attributes. It addresses the challenge of processing unstructured data, which often contains spelling variations and ambiguity and therefore requires computational methods that can capture the important information accurately. The methods comprise text preprocessing, TF-IDF weighting, frequency encoding of the nationality attribute, and the construction of Random Forest and Support Vector Machine (SVM) models. The models were evaluated with Accuracy, Precision, Recall, and F1-score to assess prediction quality. The results show that Random Forest reached an accuracy of 0.98 and was more stable than SVM in handling class imbalance. Text features contributed the most to determining the fatality risk category, while demographic attributes had a smaller but still relevant additional effect. These findings indicate that NLP-based analysis can improve the understanding of fatality risk patterns and could support the development of decision support systems for mountaineering safety. The approach makes it easier to identify risk factors that were previously hard to detect because of the limits of manual analysis. The study provides a strong foundation for developing more comprehensive risk models and can be adapted to other safety domains.
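The feature-building and evaluation steps named in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation. The function names, the whitespace tokenizer, the smoothed IDF variant, and the toy inputs are all assumptions made for illustration, and the Random Forest and SVM training steps are not shown.

```python
import math
from collections import Counter


def frequency_encode(values):
    """Replace each category with its relative frequency in the column."""
    counts = Counter(values)
    total = len(values)
    return [counts[v] / total for v in values]


def tfidf(documents):
    """Return one {term: weight} dict per document.

    Assumed weighting: raw term count times a smoothed IDF,
    idf(t) = ln((1 + N) / (1 + df(t))) + 1, with no length normalization.
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        term_counts = Counter(tokens)
        weights.append({
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in term_counts.items()
        })
    return weights


def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall, and F1-score for a single class label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical toy inputs, not taken from the study's dataset.
nationalities = ["Nepal", "Nepal", "Japan", "UK"]
causes = ["avalanche", "fall", "avalanche exposure"]

encoded_nationality = frequency_encode(nationalities)
cause_weights = tfidf(causes)
scores = precision_recall_f1(
    ["high", "high", "low", "low"],
    ["high", "low", "low", "high"],
    positive="high",
)
```

In this sketch a rarer term such as "exposure" receives a larger weight than a more common one such as "avalanche". Per-class precision, recall, and F1 are reported alongside accuracy because accuracy alone can look high on imbalanced classes.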
License
Copyright (c) 2025 Imbaraga Gempar Guna Laksana, Dian Ade Kurnia, Yudhistira Arie Wijaya, Mulyawan Mulyawan, Gifthera Dwilestari

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.




