Infrared and visible image (IR-VIS) fusion is vital for enhancing clarity and enriching information, with significant applications in medical and biological imaging. Existing models, which rely on convolutional neural networks (CNNs) or Transformers, often struggle to balance computational efficiency with the ability to capture long-range dependencies. This paper introduces VSS-SpatioNet, a novel lightweight fusion network that employs a Visual State Space (VSS) module in place of Transformers. Its asymmetric encoder-decoder structure integrates multi-scale features through a VSS-Spatial (VS) fusion strategy, combining CNN and VSS branches to capture both fine local details and long-range context. Extensive experiments demonstrate that VSS-SpatioNet outperforms existing fusion methods, highlighting its potential for advanced imaging applications.
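To make the dual-branch idea concrete, the sketch below shows one way such a fusion block could be organized in PyTorch. It is a minimal illustration under stated assumptions, not the paper's implementation: the class and parameter names (SimpleSSMBranch, DualBranchFusion, state_dim) are hypothetical, and the state-space branch is reduced to a plain linear recurrence over flattened spatial tokens as a stand-in for the actual VSS selective scan.

```python
# Minimal, illustrative sketch of a dual-branch (CNN + state-space) fusion block.
# All names are hypothetical; the SSM branch is a simplified stand-in, not the
# paper's VSS module.
import torch
import torch.nn as nn


class SimpleSSMBranch(nn.Module):
    """Toy state-space-style branch: a linear recurrence over flattened
    spatial tokens, standing in for a VSS/Mamba-style selective scan."""

    def __init__(self, channels: int, state_dim: int = 16):
        super().__init__()
        self.in_proj = nn.Linear(channels, state_dim)
        self.out_proj = nn.Linear(state_dim, channels)
        # Decay factor controlling how quickly past tokens are forgotten.
        self.decay = nn.Parameter(torch.full((state_dim,), 0.9))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W) -> sequence of spatial tokens (B, H*W, C)
        b, c, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)
        u = self.in_proj(tokens)                         # (B, L, state_dim)
        state = torch.zeros(b, u.size(-1), device=x.device)
        outputs = []
        for t in range(u.size(1)):                       # sequential scan over tokens
            state = self.decay * state + u[:, t]
            outputs.append(state)
        y = self.out_proj(torch.stack(outputs, dim=1))   # (B, L, C)
        return y.transpose(1, 2).reshape(b, c, h, w)


class DualBranchFusion(nn.Module):
    """Fuses infrared and visible feature maps with a local CNN branch and a
    global state-space branch, then mixes them with a 1x1 convolution."""

    def __init__(self, channels: int):
        super().__init__()
        self.cnn_branch = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )
        self.ssm_branch = SimpleSSMBranch(channels)
        self.mix = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, feat_ir: torch.Tensor, feat_vis: torch.Tensor) -> torch.Tensor:
        stacked = torch.cat([feat_ir, feat_vis], dim=1)
        local = self.cnn_branch(stacked)                     # fine local detail
        global_ctx = self.ssm_branch(feat_ir + feat_vis)     # long-range context
        return self.mix(torch.cat([local, global_ctx], dim=1))


if __name__ == "__main__":
    fusion = DualBranchFusion(channels=32)
    ir = torch.randn(1, 32, 64, 64)
    vis = torch.randn(1, 32, 64, 64)
    print(fusion(ir, vis).shape)  # torch.Size([1, 32, 64, 64])
```

A practical VSS branch would replace the Python-level scan with a hardware-efficient selective-scan kernel applied along multiple spatial traversal orders; the sketch keeps only the core idea of a recurrent, linear-complexity path for long-range information alongside a convolutional path for local detail.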
Funding: Henan University of Science and Technology (202410464018)