Paper: Heterogeneous graph neural networks for building attribute prediction from hierarchical urban features and cross-view imagery

Abstract

Data on building properties are essential for a variety of urban applications, yet such information remains scarce in many parts of the world. Recent efforts have applied machine learning (ML), computer vision (CV), and graph neural networks (GNNs) to assess these properties at scale from urban features or visual information. However, extracting holistic representations that infer building attributes from multi-modal data, spanning multiple spatial scales and vertical building characteristics, remains a significant challenge.

To bridge this gap, we present a novel framework that captures both hierarchical urban features and cross-view visual information through a heterogeneous graph. First, we construct a heterogeneous graph that incorporates multi-dimensional urban elements --- buildings, streets, intersections, and urban plots --- to comprehensively represent multi-scale geospatial features.

Constructing the heterogeneous urban graph with four element types (buildings, street segments, intersections, and urban plots).
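To make the graph construction step concrete, here is a minimal, hypothetical sketch of a heterogeneous urban graph with the four element types described above. The class, node identifiers, relation names, and feature values are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch: a heterogeneous graph with typed nodes and typed edges,
# mirroring the four urban element types in the paper. All names are illustrative.
from collections import defaultdict

class HeteroUrbanGraph:
    """Typed nodes (building, street, intersection, plot) and typed relations."""

    def __init__(self):
        self.nodes = defaultdict(dict)   # node type -> {node_id: feature dict}
        self.edges = defaultdict(list)   # (src_type, relation, dst_type) -> [(src, dst)]

    def add_node(self, ntype, nid, **features):
        self.nodes[ntype][nid] = features

    def add_edge(self, src_type, relation, dst_type, src, dst):
        self.edges[(src_type, relation, dst_type)].append((src, dst))

# Toy example: a building adjacent to a street segment that meets an
# intersection, and contained in an urban plot.
g = HeteroUrbanGraph()
g.add_node("building", "b1", footprint_area=120.0, height=15.0)
g.add_node("street", "s1", length=80.0)
g.add_node("intersection", "i1", degree=4)
g.add_node("plot", "p1", area=950.0)
g.add_edge("building", "adjacent_to", "street", "b1", "s1")
g.add_edge("street", "connects", "intersection", "s1", "i1")
g.add_edge("building", "within", "plot", "b1", "p1")

print(sorted(g.nodes))  # ['building', 'intersection', 'plot', 'street']
```

In practice such a structure would be populated from GIS layers (building footprints, road centrelines, cadastral plots) before being handed to a GNN library.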


Second, we automatically crop images of individual buildings from both very high-resolution satellite and street-level imagery, and introduce feature propagation on semantic similarity graphs to supplement missing facade information. Third, feature fusion is applied to integrate both morphological and visual features, with holistic representations generated for building attribute prediction.

Generating holistic building representations to support building attribute prediction.
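The second step, feature propagation on semantic similarity graphs, can be sketched as an iterative smoothing scheme: buildings with missing facade features inherit a similarity-weighted mean of their neighbours' features, while observed features stay fixed. The function name, weighting scheme, and toy data below are assumptions for illustration, not the paper's code.

```python
# Hypothetical sketch of feature propagation on a semantic similarity graph,
# in the spirit of the paper's step for imputing missing facade information.
import numpy as np

def propagate_features(X, A, missing_mask, n_iters=20):
    """Iteratively fill missing rows of X with the similarity-weighted
    mean of their neighbours, keeping observed rows fixed.

    X: (n, d) feature matrix; A: (n, n) non-negative similarity weights;
    missing_mask: (n,) boolean, True where a building's features are missing.
    """
    X = X.copy()
    X[missing_mask] = 0.0
    observed = X[~missing_mask].copy()
    # Row-normalise the similarity matrix so each update is a weighted mean.
    deg = A.sum(axis=1, keepdims=True)
    P = A / np.clip(deg, 1e-12, None)
    for _ in range(n_iters):
        X = P @ X
        X[~missing_mask] = observed  # re-impose the known features
    return X

# Toy example: 3 buildings, building 2 lacks street-level (facade) features.
X = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
A = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]], dtype=float)
missing = np.array([False, False, True])
X_filled = propagate_features(X, A, missing)
```

With both neighbours observed and equal similarity weights, the missing row converges to their average, `[0.5, 0.5]`.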


Illustration of the heterogeneous GraphSAGE framework for building attribute prediction.
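The heterogeneous GraphSAGE update named in the figure can be summarised as relation-specific mean aggregation plus a self transform. The sketch below shows one such update for a single building node; the relation names, weight shapes, and identity weights are illustrative assumptions rather than the authors' implementation.

```python
# Hypothetical sketch of one heterogeneous GraphSAGE update for a building node:
# each relation type aggregates its neighbours by mean, applies its own weight
# matrix, and the results are summed with a transformed self feature.
import numpy as np

def hetero_sage_update(x_self, neigh_feats_by_rel, W_self, W_rel):
    """Update one node's embedding from typed neighbourhoods.

    x_self: (d,) node features; neigh_feats_by_rel: {relation: (k, d) array};
    W_self: (d_out, d); W_rel: {relation: (d_out, d)}. Returns (d_out,).
    """
    out = W_self @ x_self
    for rel, feats in neigh_feats_by_rel.items():
        out = out + W_rel[rel] @ feats.mean(axis=0)  # mean-aggregate per relation
    return np.maximum(out, 0.0)  # ReLU non-linearity

# Toy example with identity weights so the arithmetic is easy to follow.
I2 = np.eye(2)
x_building = np.array([1.0, 0.0])
neighbours = {
    "adjacent_to_street": np.array([[0.0, 2.0], [0.0, 0.0]]),  # mean [0, 1]
    "within_plot": np.array([[0.0, 0.0]]),                     # mean [0, 0]
}
h = hetero_sage_update(x_building, neighbours, I2,
                       {"adjacent_to_street": I2, "within_plot": I2})
# h = relu([1, 0] + [0, 1] + [0, 0]) = [1, 1]
```

Stacking such layers lets a building embedding absorb information from streets, intersections, and plots several hops away, which is what yields the multi-scale representations used for attribute prediction.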


Systematic experiments across three global cities demonstrate that our method outperforms existing CV, ML, and homogeneous GNN-based models, achieving classification accuracies of 86% to 96% across 10 to 12 distinct building types, with mean F1 scores ranging from 0.70 to 0.73. The framework is robust to class imbalance and produces more distinctive embeddings for ambiguous categories. On the additional task of inferring building age, the method delivers similarly strong performance.

Model performance.

This framework advances scalable approaches for filling gaps in building attribute data and offers new insights into modeling holistic urban environments.

Related Publications

Heterogeneous graph neural networks for building attribute prediction from hierarchical urban features and cross-view imagery

Xiucheng Liang, Winston Yap, Filip Biljecki

Published in ISPRS Journal of Photogrammetry and Remote Sensing, 2026


Related Posts

Paper: Decoding characteristics of building facades using street view imagery and vision-language model

This study leverages street view imagery and vision-language models to analyze 48,752 building images in Hong Kong, identifying eight building clusters. It demonstrates the potential of scalable SVI-based analyses to capture urban spatial and semantic details, enhancing the understanding of the built environment.

Paper: Revealing spatio-temporal evolution of urban visual environments with street view imagery

This study presents an embedding-driven clustering approach that combines physical and perceptual attributes to analyze the spatial structure and spatio-temporal evolution of urban visual environments. Using Singapore as a case study, it leverages street view imagery and graph neural networks to classify streetscapes into six clusters, revealing changes over the past decade. The findings provide insights into urban visual dynamics, supporting planning and landscape improvement.

Paper: Evaluating human perception of building exteriors using street view imagery

This study explores how building appearances shape urban perception, using machine learning and survey data to analyze human responses to over 250,000 building images from Singapore, San Francisco, and Amsterdam. Findings reveal how architectural styles influence streetscape perceptions, offering insights for architects and city planners.
