Paper: OpenFACADES: An open framework for architectural caption and attribute data enrichment via street view imagery

Abstract

Building properties, such as height, usage, and material, play a crucial role in spatial data infrastructures, supporting various urban applications. Despite their importance, comprehensive building attribute data remain scarce in many urban areas. Recent advances have enabled the extraction of objective building attributes using remote sensing and street-level imagery. However, establishing a pipeline that integrates diverse open datasets, acquires holistic building imagery, and infers comprehensive building attributes at scale remains a significant challenge.

As one of the first efforts of its kind, this study bridges these gaps by introducing OpenFACADES, an open framework that leverages multimodal crowdsourced data to enrich building profiles with both objective attributes and semantic descriptors through multimodal large language models. First, we integrate street-level image metadata from Mapillary with OpenStreetMap building geometries via isovist analysis, identifying images that provide suitable vantage points for observing target buildings. Second, we automate the detection of building facades in panoramic imagery and tailor a reprojection approach to convert the detected facades into holistic perspective views that approximate real-world observation.
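As a rough illustration of the first step, the sketch below shows an isovist-style line-of-sight test between a Mapillary camera position and an OpenStreetMap building footprint. The function name, distance threshold, and use of Shapely are illustrative assumptions, not the framework's actual implementation.

```python
# Hedged sketch: isovist-style visibility test between a camera location and a
# target building footprint. Thresholds and API choices are assumptions.
from shapely.geometry import Point, LineString, Polygon

def facade_is_visible(camera_xy, target_footprint, other_footprints, max_dist=50.0):
    """Return True if the target building is plausibly observable from the camera.

    camera_xy        : (x, y) camera position in a projected CRS (metres)
    target_footprint : shapely Polygon of the target building
    other_footprints : Polygons of potentially occluding buildings
    max_dist         : assumed maximum observation distance in metres
    """
    camera = Point(camera_xy)
    # Use the nearest point on the target footprint as the observation target.
    target_pt = target_footprint.exterior.interpolate(
        target_footprint.exterior.project(camera))
    sight_line = LineString([camera, target_pt])

    if sight_line.length > max_dist:
        return False  # too far away for a useful facade image

    # Reject the view if any other footprint blocks the line of sight.
    for other in other_footprints:
        if other.equals(target_footprint):
            continue
        if sight_line.crosses(other) or sight_line.within(other):
            return False
    return True
```

The reprojection step can likewise be sketched as a standard equirectangular-to-pinhole resampling; the axis conventions, nearest-neighbour sampling, and parameter names below are assumptions for illustration only.

```python
import numpy as np

def pano_to_perspective(pano, yaw_deg, pitch_deg, fov_deg=90, out_size=(512, 512)):
    """Sample a pinhole-camera view from an equirectangular panorama (H x W x 3 array)."""
    h_out, w_out = out_size
    f = 0.5 * w_out / np.tan(np.radians(fov_deg) / 2)  # focal length in pixels

    # Pixel grid of the output view, centred on the principal point.
    x = np.arange(w_out) - w_out / 2 + 0.5
    y = np.arange(h_out) - h_out / 2 + 0.5
    xv, yv = np.meshgrid(x, y)
    # Ray directions in camera coordinates (z forward, x right, y down).
    dirs = np.stack([xv, yv, np.full_like(xv, f)], axis=-1)
    dirs /= np.linalg.norm(dirs, axis=-1, keepdims=True)

    # Rotate rays by pitch (about x-axis), then yaw (about y-axis).
    pitch, yaw = np.radians(pitch_deg), np.radians(yaw_deg)
    Rx = np.array([[1, 0, 0],
                   [0, np.cos(pitch), -np.sin(pitch)],
                   [0, np.sin(pitch), np.cos(pitch)]])
    Ry = np.array([[np.cos(yaw), 0, np.sin(yaw)],
                   [0, 1, 0],
                   [-np.sin(yaw), 0, np.cos(yaw)]])
    dirs = dirs @ (Ry @ Rx).T

    # Convert ray directions to panorama (longitude, latitude) and pixel coords.
    lon = np.arctan2(dirs[..., 0], dirs[..., 2])      # [-pi, pi]
    lat = np.arcsin(np.clip(dirs[..., 1], -1, 1))     # [-pi/2, pi/2]
    H, W = pano.shape[:2]
    u = ((lon / (2 * np.pi) + 0.5) * W).astype(int) % W
    v = ((lat / np.pi + 0.5) * H).astype(int).clip(0, H - 1)
    return pano[v, u]  # nearest-neighbour sampling, sufficient for a sketch
```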

Third, we introduce an approach that harnesses and systematically investigates open-source large vision-language models (VLMs) for multi-attribute prediction and open-vocabulary captioning in building-level analytics, leveraging a globally sourced dataset of 31,180 labeled images from seven cities.
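To make the multi-attribute setup concrete, the sketch below shows how a single instruction-tuning sample might be assembled. The attribute set, prompt wording, and JSON layout are illustrative assumptions rather than the exact schema used in the paper.

```python
# Hedged sketch: assembling one training sample for multi-attribute instruction
# tuning of a VLM. Attribute names and formats are assumed for illustration.
import json

ATTRIBUTES = ["number of floors", "construction era", "facade material",
              "building type", "surface condition"]  # assumed label set

def build_sample(image_path, labels, caption):
    prompt = (
        "You are given a street-level photograph of a single building facade. "
        "Return a JSON object with the following fields: "
        + ", ".join(ATTRIBUTES)
        + ". Then provide a one-sentence architectural caption."
    )
    target = json.dumps(labels) + "\nCaption: " + caption
    return {"image": image_path, "conversations": [
        {"role": "user", "content": prompt},
        {"role": "assistant", "content": target},
    ]}

sample = build_sample(
    "images/osm_123456.jpg",                       # hypothetical file name
    {"number of floors": 5, "construction era": "1970s",
     "facade material": "brick", "building type": "residential",
     "surface condition": "good"},
    "A five-storey brick apartment block with regularly spaced windows.",
)
print(json.dumps(sample, indent=2))
```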

Evaluation shows that the fine-tuned VLM excels in multi-attribute inference, outperforming single-attribute computer vision models and zero-shot ChatGPT-4o. Further experiments confirm its superior generalization and robustness across culturally distinct regions and varying image conditions. Finally, the model is applied for large-scale building annotation, generating a dataset of 1.2 million images covering half a million buildings. This open-source framework enhances the scope, adaptability, and granularity of building-level assessments, enabling more fine-grained and interpretable insights into the built environment.
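For the large-scale annotation stage, a batch loop of the following shape would suffice; `query_vlm` is a placeholder for whatever inference API the chosen fine-tuned model exposes, and the parsing assumes the JSON-plus-caption response format sketched above.

```python
# Hedged sketch: applying a fine-tuned VLM to annotate buildings at scale.
# The CSV columns, prompt, and response format are assumptions for illustration.
import csv, json

def query_vlm(image_path: str, prompt: str) -> str:
    raise NotImplementedError("wrap your fine-tuned VLM's inference call here")

def annotate(image_index_csv, output_csv, prompt):
    with open(image_index_csv) as fin, open(output_csv, "w", newline="") as fout:
        reader = csv.DictReader(fin)           # expects building_id, image_path
        writer = csv.writer(fout)
        writer.writerow(["building_id", "image_path", "attributes", "caption"])
        for row in reader:
            raw = query_vlm(row["image_path"], prompt)
            attrs, _, caption = raw.partition("\nCaption: ")
            try:
                attrs = json.loads(attrs)
            except json.JSONDecodeError:
                attrs = {}                      # keep the row, flag for review
            writer.writerow([row["building_id"], row["image_path"],
                             json.dumps(attrs), caption.strip()])
```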

Related Publications

OpenFACADES: an open framework for architectural caption and attribute data enrichment via street view imagery

Xiucheng Liang, Jinheng Xie, Tianhong Zhao, Rudi Stouffs, Filip Biljecki

Published in ISPRS Journal of Photogrammetry and Remote Sensing, 2025


Related Posts

Talk: Climate Change AI Discussion Seminar: Multimodal AI Approaches for Urban Microclimate Prediction and Building Analysis

I attended the October event in CCAI's Discussion Seminar Series and presented my research on *Evaluating human perception of building exteriors using street view imagery* together with my colleagues Kunihiko Fujiwara and Binyu Lei. The presentation was followed by an interactive discussion in breakout rooms focused on translating this data into meaningful insights.

Paper: Revealing spatio-temporal evolution of urban visual environments with street view imagery

This study presents an embedding-driven clustering approach that combines physical and perceptual attributes to analyze the spatial structure and spatio-temporal evolution of urban visual environments. Using Singapore as a case study, it leverages street view imagery and graph neural networks to classify streetscapes into six clusters, revealing changes over the past decade. The findings provide insights into urban visual dynamics, supporting planning and landscape improvement.

Paper: Evaluating human perception of building exteriors using street view imagery

This study explores how building appearances shape urban perception, using machine learning and survey data to analyze human responses to over 250,000 building images from Singapore, San Francisco, and Amsterdam. Findings reveal how architectural styles influence streetscape perceptions, offering insights for architects and city planners.
