Chameleon: Fast-slow Neuro-symbolic Lane Topology Extraction
Published in the IEEE International Conference on Robotics and Automation (ICRA 2025, Oral Presentation)
Status: Accepted to ICRA 2025 (Oral Presentation) 🎉
Role: Team Member
Abstract
We developed a neuro-symbolic algorithm that combines symbolic reasoning over detected instances with Chain-of-Thought-based visual language models (VLMs) to handle corner cases in lane topology extraction. It achieves consistent improvements on the OpenLane-V2 dataset while reducing inference time from >200 s to 0.1-8 s per frame.
Key Contributions
- Program Synthesis Framework: Generated executable Python code from few-shot visual/text prompts, expert rules, and API descriptions to reason over spatial relationships between detected instances
- Dense Visual Prompting VQA Benchmark: Designed benchmark tasks (lane adjacency, direction matching, intersection inclusion) and evaluated GPT-4o, GPT-4-vision, LLaVA, and a ResNet18-based MLP on their understanding of complex 3D driving scenes
- Performance Improvements: Achieved consistent improvements on the OpenLane-V2 dataset in 3-shot settings, matching or outperforming fully supervised baselines without additional finetuning
- Efficiency Gains: Reduced inference time from >200 s to 0.1-8 s per frame; ablation studies show a 5% accuracy improvement from expert rules and few-shot examples
- Real-world Impact: Delivered a cost-efficient and scalable solution for real-time deployment in mapless autonomous driving, significantly lowering computational cost and carbon footprint
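To give a feel for what "executable Python code that reasons over spatial relationships" can look like, here is a minimal sketch of one symbolic rule over detected lane centerlines. It is an illustration only, not the code generated by the paper's framework: the function names (`is_successor`, `lane_topology`), the polyline representation, and the thresholds (`max_gap`, `max_turn`) are all assumptions made for this example.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]
Lane = List[Point]  # hypothetical representation: ordered centerline points


def heading(p: Point, q: Point) -> float:
    """Direction of travel from p to q, in radians."""
    return math.atan2(q[1] - p[1], q[0] - p[0])


def angle_diff(a: float, b: float) -> float:
    """Absolute difference between two angles, wrapped to [0, pi]."""
    return abs((a - b + math.pi) % (2 * math.pi) - math.pi)


def is_successor(lane_a: Lane, lane_b: Lane,
                 max_gap: float = 1.0,
                 max_turn: float = math.radians(30)) -> bool:
    """Assumed rule: lane_b follows lane_a if lane_a's end point is close
    to lane_b's start point and the travel directions roughly agree."""
    if math.dist(lane_a[-1], lane_b[0]) > max_gap:
        return False
    exit_heading = heading(lane_a[-2], lane_a[-1])
    entry_heading = heading(lane_b[0], lane_b[1])
    return angle_diff(exit_heading, entry_heading) <= max_turn


def lane_topology(lanes: List[Lane]) -> List[List[bool]]:
    """Adjacency matrix: entry [i][j] is True if lane j succeeds lane i."""
    n = len(lanes)
    return [[i != j and is_successor(lanes[i], lanes[j]) for j in range(n)]
            for i in range(n)]


lanes = [
    [(0.0, 0.0), (10.0, 0.0)],   # eastbound segment
    [(10.2, 0.0), (20.0, 0.0)],  # continues the first segment
    [(10.0, 5.0), (0.0, 5.0)],   # westbound lane on the other side
]
topology = lane_topology(lanes)
```

A rule of this kind runs in well under a millisecond per lane pair, which is the intuition behind pairing cheap symbolic checks with a slower VLM that is reserved for corner cases.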
Paper | Code