Posted on 2024-05-30, 06:16; authored by XIAOQIANG SHAO, Hao Li, Zhiyue Lu, Ma Bo, Liu ming qian, HAN ZeHui
Because RGBT object tracking is robust to illumination changes and occlusion, it has been widely used in video surveillance and automated driving. In this paper, an effective tracking network is constructed that lets the two modalities interact fully by exploiting the challenging attributes in infrared and visible images. The network consists of three parts: the special attribute fusion (SAF) module, the common attribute fusion (CAF) module, and the cross-modal interaction (CMI) module. The SAF module enables the network to extract the challenging-attribute information unique to each of the two modalities, fully leveraging the advantages of each. The CAF module extracts features for the attributes shared by both modalities and adaptively aggregates them, assigning a corresponding weight to each challenging attribute to enhance the tracker's adaptability. The CMI module facilitates interaction between the infrared and visible modalities, integrating the information common to both with the information specific to each, thereby enhancing the network's robustness. The proposed network is tested on the GTOT, RGBT234, and LasHeR datasets. The results show that our tracker outperforms other trackers, demonstrating the effectiveness of our method.
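The adaptive aggregation attributed to the CAF module can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each challenging attribute has its own feature vector and a raw relevance score, and it uses a softmax over those scores to obtain the weights. The function names, the attribute names, and the score inputs are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_attributes(branch_features, branch_scores):
    """Fuse per-attribute feature vectors with softmax weights.

    branch_features: dict mapping attribute name -> feature vector (list of floats).
    branch_scores:   dict mapping attribute name -> raw relevance score.
                     (Hypothetical input; a real network would predict it.)
    Returns the fused vector and the weight assigned to each attribute.
    """
    names = sorted(branch_features)
    weights = softmax([branch_scores[n] for n in names])
    dim = len(branch_features[names[0]])
    fused = [0.0] * dim
    for name, weight in zip(names, weights):
        for i, value in enumerate(branch_features[name]):
            fused[i] += weight * value
    return fused, dict(zip(names, weights))


if __name__ == "__main__":
    # Attribute names below are placeholders chosen for illustration only.
    features = {"occlusion": [1.0, 0.0], "illumination": [0.0, 1.0]}
    scores = {"occlusion": 2.0, "illumination": 0.5}
    fused, weights = aggregate_attributes(features, scores)
    print(weights)
    print(fused)
```

In this sketch, an attribute with a higher score contributes more to the fused feature, and the weights always sum to one. How the actual module computes its weights is not specified in the abstract.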
History
Preprint ID
113947
Highlighter Commentary
Researchers have developed a robust RGBT object-tracking network for video surveillance and automated driving, leveraging the strengths of infrared and visible images. This network, comprising the Special Attribute Fusion (SAF), Common Attribute Fusion (CAF), and Cross-Modal Interaction (CMI) modules, fully utilizes and interacts with both image modalities. The SAF module extracts unique challenging attributes from each modality, the CAF module adapts and aggregates common attributes, and the CMI module integrates these for enhanced robustness. Tested on GTOT, RGBT234, and LasHeR datasets, the proposed tracker demonstrated superior performance compared to existing methods, validating its effectiveness.
-- Mousa Moradi, Post-Doctoral Researcher, Harvard Medical School, Boston, MA