IMPROVING THE ROBUSTNESS OF A RESNET50 MODEL FOR JELLYFISH CLASSIFICATION USING CUTMIX AND MIXUP DATA AUGMENTATION

Satrio Wicaksono; Dian Ade Kurnia; Yudhistira Arie Wijaya; Bani Nurhakim; Denni Pratama

Authors

Satrio Wicaksono STMIK IKMI Cirebon, Indonesia
Dian Ade Kurnia STMIK IKMI Cirebon, Indonesia
Yudhistira Arie Wijaya STMIK IKMI Cirebon, Indonesia
Bani Nurhakim STMIK IKMI Cirebon, Indonesia
Denni Pratama STMIK IKMI Cirebon, Indonesia

Keywords:

ResNet50; classification; jellyfish; MixUp; CutMix

Abstract

This study aims to improve the robustness of a ResNet50 model for jellyfish classification by applying MixUp and CutMix augmentation to underwater images affected by visual degradation such as color cast, noise, scattering, and low contrast. It uses transfer learning, fine-tuning the last 30 layers of the model, and compares two augmentation schemes: basic augmentation and a MixUp–CutMix combination, with early stopping used to keep training stable. The baseline model overfit and reached a validation accuracy of only 41.11%, whereas integrating MixUp and CutMix raised accuracy to 47.22% and reduced validation loss from 1.4373 to 1.3127. This improvement indicates that mixed-sample augmentation enriches the variety of synthetic training data and strengthens the feature representations the model learns, which improves its resilience to noise and to distribution shift in underwater imagery. The study concludes that MixUp and CutMix are an effective way to improve classification performance on limited and highly variable datasets, and that advanced augmentation can serve as a practical means of improving generalization without large-scale data collection.
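
The two mixed-sample operations named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not state the mixing hyperparameter, the number of classes, the image size, or how MixUp and CutMix are combined per batch, so `ALPHA`, `NUM_CLASSES`, the 8×8 toy images, and the function names below are assumptions chosen only for illustration.

```python
import numpy as np

ALPHA = 1.0        # assumed Beta(alpha, alpha) parameter; not given in the abstract
NUM_CLASSES = 3    # hypothetical class count, for illustration only


def mixup(x1, y1, x2, y2, lam):
    """MixUp: blend two images and their one-hot labels with weight lam in [0, 1]."""
    x = lam * x1 + (1.0 - lam) * x2
    y = lam * y1 + (1.0 - lam) * y2
    return x, y


def cutmix(x1, y1, x2, y2, lam, rng):
    """CutMix: paste a random rectangle of x2 into x1 and mix labels by area."""
    h, w = x1.shape[:2]
    # The patch is sized to cover roughly (1 - lam) of the image area.
    cut_h = int(h * np.sqrt(1.0 - lam))
    cut_w = int(w * np.sqrt(1.0 - lam))
    # Random patch centre; the box is clipped to the image borders.
    cy = int(rng.integers(0, h))
    cx = int(rng.integers(0, w))
    top, bottom = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, h)
    left, right = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, w)
    x = x1.copy()
    x[top:bottom, left:right] = x2[top:bottom, left:right]
    # Recompute the label weight from the area that was actually replaced.
    lam_adj = 1.0 - ((bottom - top) * (right - left)) / (h * w)
    y = lam_adj * y1 + (1.0 - lam_adj) * y2
    return x, y


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two toy "images" (all zeros and all ones) with one-hot labels.
    img_a, img_b = np.zeros((8, 8, 3)), np.ones((8, 8, 3))
    lab_a, lab_b = np.eye(NUM_CLASSES)[0], np.eye(NUM_CLASSES)[1]
    lam = rng.beta(ALPHA, ALPHA)
    _, y_mix = mixup(img_a, lab_a, img_b, lab_b, lam)
    _, y_cut = cutmix(img_a, lab_a, img_b, lab_b, lam, rng)
    print("MixUp label:", y_mix)
    print("CutMix label:", y_cut)
```

In both cases the training target becomes a soft label whose weights match the share of each source image, which is one way to read the abstract's claim that mixed-sample augmentation enriches the training data without collecting new images.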
