ADAPTING FEW-SHOT PATCH-BASED LEARNING FOR LINE-ART COLORIZATION IN LOCAL COMICS AND MANGA

Faridz Azhar; Dian Ade Kurnia; Khaerul Anam

Authors

Faridz Azhar, STMIK IKMI Cirebon, Indonesia
Dian Ade Kurnia, STMIK IKMI Cirebon, Indonesia
Khaerul Anam, STMIK IKMI Cirebon, Indonesia

Keywords:

few-shot, patch-based learning, colorization, continual learning, position embedding

Abstract

This study proposes an adaptation of few-shot patch-based learning, integrated with continual learning and position embedding, for the automatic colorization of line art in local comics and manga. The main objective is to improve the model's adaptability to variations in artistic style and to limited data while maintaining color consistency across panels. The method combines a hybrid U-Net architecture with a patch-level Vision Transformer module, complemented by a patch-wise memory mechanism and a rehearsal strategy for continual updates; evaluation follows a few-shot protocol using PSNR, SSIM, LPIPS, FID, and mIoU, together with a user perception study. Experiments on a locally curated dataset show that integrating position embedding improves spatial accuracy and region coherence, while the continual learning mechanism reduces catastrophic forgetting when the model is adapted to new styles; the patch-based approach also reduces the number of annotated samples required without degrading visual quality relative to conventional baselines. The discussion highlights the trade-off between computational efficiency and color fidelity, the need for style augmentation to address data scarcity, and the ethical implications concerning copyright and cultural representation. The contributions are (1) a few-shot patch-based continual colorization framework optimized for the local context; (2) a standardized evaluation protocol for line-art colorization; and (3) resource-efficient implementation recommendations for Indonesia's creative ecosystem. The conclusion affirms that the proposed approach is effective for accelerating digital colorization workflows and increasing the adoption of AI technology in local comic production.
Practical implications include integration into the production pipelines of small studios, guidelines for openly licensed datasets, and suggestions for lightweight model refinements for inference on resource-constrained devices; further research is recommended on multilingual validation and on the analysis of cross-cultural color bias. Socio-economic impacts are also discussed. Overall, the proposed approach is recommended.
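
To make the patch-level ideas in the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It shows one way to split a line-art image into non-overlapping patches, attach a 2D position embedding to each patch, and score a result with PSNR. The abstract does not state which kind of position embedding is used, so the fixed sinusoidal encoding here is an assumption, and all function names (`extract_patches`, `position_embedding_2d`, `psnr`) are hypothetical.

```python
import numpy as np


def extract_patches(image, patch):
    """Split an (H, W) image into non-overlapping patch x patch tiles.

    Returns an array of shape (num_patches, patch * patch) in row-major
    grid order, plus the grid size (rows, cols).
    """
    h, w = image.shape
    if h % patch or w % patch:
        raise ValueError("image size must be a multiple of the patch size")
    gh, gw = h // patch, w // patch
    tiles = image.reshape(gh, patch, gw, patch).swapaxes(1, 2)
    return tiles.reshape(gh * gw, patch * patch), (gh, gw)


def sincos_1d(positions, dim):
    """Sinusoidal encoding of 1D positions into `dim` channels (dim even)."""
    half = dim // 2
    freqs = 1.0 / (10000.0 ** (np.arange(half) / half))
    angles = positions[:, None] * freqs[None, :]
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=1)


def position_embedding_2d(grid, dim):
    """Assumed fixed 2D embedding: half the channels encode the patch row,
    the other half encode the patch column (dim must be divisible by 4)."""
    gh, gw = grid
    rows, cols = np.meshgrid(np.arange(gh), np.arange(gw), indexing="ij")
    row_emb = sincos_1d(rows.ravel().astype(float), dim // 2)
    col_emb = sincos_1d(cols.ravel().astype(float), dim // 2)
    return np.concatenate([row_emb, col_emb], axis=1)


def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to [0, peak]."""
    mse = np.mean((reference - estimate) ** 2)
    return float("inf") if mse == 0 else float(10.0 * np.log10(peak ** 2 / mse))


# Illustrative usage on random data standing in for a line-art image.
rng = np.random.default_rng(0)
line_art = rng.random((32, 32))
tokens, grid = extract_patches(line_art, patch=8)
tokens_with_position = tokens + position_embedding_2d(grid, tokens.shape[1])
```

In this sketch the embedding is simply added to the flattened patch values; a learned embedding or a different fusion step would be equally plausible readings of the abstract.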
