DEVELOPMENT OF A MULTIMODAL RECOGNITION SYSTEM AND TARGET CLASSIFICATION BASED ON ARTIFICIAL INTELLIGENCE

Authors

  • Peter Nikolyuk, Vasil Stus Donetsk National University, https://orcid.org/0000-0002-0286-297X
  • Veronika Sapozhnikova, Vasil Stus Donetsk National University
  • Vladyslav Chemes, Vasil Stus Donetsk National University

DOI:

https://doi.org/10.30888/2663-5712.2025-34-01-113

Keywords:

multimodal recognition, target classification, artificial intelligence, RGB-T data, deep learning, edge-aware learning, semantic segmentation.

Abstract

The study addresses the problem of developing a multimodal system for target recognition and classification based on artificial intelligence. The relevance of the topic stems from the need to improve the reliability of automatic object detection under co

Published

2025-11-30

How to Cite

Nikolyuk, P., Sapozhnikova, V., & Chemes, V. (2025). DEVELOPMENT OF A MULTIMODAL RECOGNITION SYSTEM AND TARGET CLASSIFICATION BASED ON ARTIFICIAL INTELLIGENCE. SWorldJournal, 1(34-01), 191–209. https://doi.org/10.30888/2663-5712.2025-34-01-113

Section

Articles