Distributed Collaborative Perception using Graph Attention Networks in Intelligent Transportation Systems

Activity: Examination and mentoring
Role: Member of a PhD jury

Description

Autonomous Vehicles (AVs) rely heavily on onboard visual sensors such as LiDAR, RADAR, and cameras to perceive their surrounding environment. However, the limitations inherent in local sensors, including blind spots and restricted fields of view, hinder a vehicle’s ability to achieve comprehensive situational awareness, increasing the risk of safety-critical failures. Collaborative perception has emerged as a promising paradigm to overcome these challenges by enabling connected vehicles to share perception data, thereby extending their perception range and improving overall awareness. This dissertation advances the state of collaborative perception for connected autonomous vehicles by developing novel graph-based feature aggregation and attention-driven fusion methodologies.
At the foundation of this work is the design and implementation of a collaborative perception communication framework, leveraging a ZeroMQ-based system for the real-time exchange of visual and meta-data between vehicles. To address the high communication overhead associated with raw sensory data, a custom compression algorithm was introduced to ensure efficient bandwidth utilization within vehicular networks.
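The metadata-plus-payload exchange described above can be sketched as follows. The dissertation's actual wire format and custom compression algorithm are not public, so this is only an illustrative pattern: zlib stands in for the custom compressor, and the header fields (`vehicle_id`, `timestamp`) are assumptions.

```python
import json
import zlib

def pack_frame(vehicle_id: str, timestamp: float, payload: bytes) -> bytes:
    """Compress a sensor payload and prepend a length-prefixed JSON metadata header."""
    compressed = zlib.compress(payload, level=6)
    header = json.dumps({
        "vehicle_id": vehicle_id,
        "timestamp": timestamp,
        "raw_len": len(payload),
    }).encode()
    # 4-byte big-endian length prefix lets the receiver split header from payload.
    return len(header).to_bytes(4, "big") + header + compressed

def unpack_frame(message: bytes):
    """Recover the metadata header and the decompressed payload."""
    header_len = int.from_bytes(message[:4], "big")
    header = json.loads(message[4:4 + header_len])
    payload = zlib.decompress(message[4 + header_len:])
    return header, payload
```

In a ZeroMQ deployment, a blob like this would typically be published on a PUB socket by each vehicle and consumed via SUB sockets on its neighbours, with compression keeping the per-frame bandwidth low.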
The research first introduces FF-GAT, a graph attention-based network that aggregates semantic features extracted from multiple Convolutional Neural Networks (CNNs). By representing feature maps as graph nodes and constructing edges based on feature similarities, FF-GAT adaptively fuses multi-level feature vectors, establishing a foundation for applying Graph Attention Networks (GATs) in collaborative perception tasks.
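The node-and-similarity construction above can be illustrated with a minimal sketch: feature vectors (e.g. pooled CNN feature maps) become graph nodes, pairwise cosine similarity supplies the attention logits, and a row-wise softmax yields the weights used to aggregate neighbour features. The exact scoring function and edge construction used by FF-GAT are not reproduced here; this is an assumption-laden simplification.

```python
import numpy as np

def cosine_similarity_matrix(X: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between the rows of X (one row per node)."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.clip(norms, 1e-8, None)
    return Xn @ Xn.T

def gat_style_fuse(X: np.ndarray) -> np.ndarray:
    """Aggregate node features with similarity-derived attention weights."""
    scores = cosine_similarity_matrix(X)
    # Softmax over each node's neighbours (row-wise), numerically stabilised.
    scores = np.exp(scores - scores.max(axis=1, keepdims=True))
    attn = scores / scores.sum(axis=1, keepdims=True)
    return attn @ X  # each output row is an attention-weighted mixture of nodes
```

A learned GAT would replace the cosine logits with a trainable scoring function and add linear projections, but the aggregation step has the same shape.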
Building upon this, a GAT-based intermediate collaborative perception framework is proposed, where feature maps from ego and neighbouring vehicles are fused through attention-guided aggregation. This method dynamically emphasizes informative spatial regions, significantly improving object detection performance on the V2XSim dataset.
Further, CollabGAT extends this approach by incorporating both intermediate feature and positional data exchanges among vehicles, utilizing spatial and channel-level attention mechanisms within a graph-based structure. This method achieves notable gains in object detection accuracy while maintaining a favorable trade-off between performance and communication efficiency, which is critical for real-world deployment.
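The two-stage spatial- and channel-level attention can be gestured at with a simple gating sketch over a fused feature map of shape (C, H, W). The pooling and gating choices here (global average pooling followed by a sigmoid) are standard building blocks and are assumptions, not CollabGAT's exact design.

```python
import numpy as np

def sigmoid(x: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(F: np.ndarray) -> np.ndarray:
    """Scale each channel by a gate derived from its global average activation."""
    weights = sigmoid(F.mean(axis=(1, 2)))        # one gate per channel, in (0, 1)
    return F * weights[:, None, None]

def spatial_attention(F: np.ndarray) -> np.ndarray:
    """Scale each spatial location by a gate derived from its cross-channel mean."""
    weights = sigmoid(F.mean(axis=0))             # one gate per (h, w) location
    return F * weights[None, :, :]

def fuse(F: np.ndarray) -> np.ndarray:
    """Apply channel-level then spatial-level gating to a fused feature map."""
    return spatial_attention(channel_attention(F))
```

In a trained network the gates would come from small learned sub-modules rather than raw means, so that informative channels and regions are emphasised during fusion.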
Another key contribution is the Graph-based Iterative Feature Fusion (GIFF) framework, which introduces an iterative channel and spatial attention mechanism for refining fused perception data across vehicles and roadside infrastructure. GIFF demonstrates improved average precision and computational efficiency, particularly in scenarios with occlusions, sensor noise, and partial sensor failures.
Finally, to address the challenges posed by communication delays and asynchronous data transmission, the Delay-Aware Collaborative Perception (DelAwareCol) framework is developed. This framework incorporates intra-agent temporal aggregation, inter-agent spatial alignment, and an adaptive multi-source fusion mechanism guided by dynamic contribution estimation. Experimental results on simulated and real-world datasets confirm the method’s robustness under high-latency and localization uncertainty conditions.
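The idea of weighting each collaborator's contribution by the staleness of its data can be sketched with a simple exponential-decay fusion rule. DelAwareCol's actual contribution estimator is learned and considerably richer; the decay constant `tau` and the fixed weighting scheme below are illustrative assumptions only.

```python
import numpy as np

def delay_aware_fuse(features, delays, tau: float = 0.1):
    """Fuse per-source feature vectors, down-weighting stale measurements.

    features: (N, D) array, one row per collaborating agent.
    delays:   (N,) per-source latency in seconds.
    tau:      decay constant controlling how quickly old data is discounted.
    """
    features = np.asarray(features, dtype=float)
    delays = np.asarray(delays, dtype=float)
    w = np.exp(-delays / tau)          # stale sources contribute less
    w = w / w.sum()                    # normalise contributions to sum to 1
    return w @ features, w
```

A learned variant would replace the fixed decay with a network that estimates each source's reliability from its content as well as its age.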
Collectively, the contributions of this dissertation present a comprehensive, scalable, and efficient framework for collaborative perception in autonomous driving systems. Through the integration of advanced graph-based aggregation techniques, attention-driven feature fusion, and delay-aware communication strategies, this research provides a significant advancement toward safer, more reliable, and perceptually capable connected autonomous vehicles.

Additional Description

Private PhD Defence
Period: 23 Jun 2025
Examinee: Ahmed Ahmed
Examination held at:
  • University of Antwerp
Degree of Recognition: International