By IMPANA S V

Extracting Built-Up Areas from Satellite Imagery Using AI & ML in Python: A Step-by-Step Guide

Introduction

Urbanization and land-use analysis are critical for sustainable development, urban planning, and environmental conservation. Identifying and extracting built-up areas from satellite imagery is one of the key processes in such analyses. This blog walks through the process of extracting built-up regions from a raster image using Python.


We'll use Rasterio, Shapely, GeoPandas, scikit-image, and OpenCV, together with Otsu's thresholding for image segmentation. This blog includes a step-by-step explanation, the code implementation, and the visual output.


Figure: Study area, Bengaluru

Setting Up the Environment

Before we begin, ensure you have the necessary Python libraries installed:

rasterio: For reading raster data

geopandas: For handling shapefiles

matplotlib: For visualization

scikit-image: For image segmentation (Otsu’s thresholding)

OpenCV: For contour extraction

shapely: For building polygon geometries from the contours
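
As a quick sanity check before running the steps below, the following sketch reports which of the required libraries are missing from the current environment. The mapping from import names to pip package names (for example, cv2 to opencv-python) uses the usual PyPI names, and the helper name missing_packages is a hypothetical one introduced here, not part of the original code.

```python
from importlib.util import find_spec

# Import name -> pip package name (assumed to be the usual PyPI names).
REQUIRED = {
    "numpy": "numpy",
    "rasterio": "rasterio",
    "geopandas": "geopandas",
    "shapely": "shapely",
    "matplotlib": "matplotlib",
    "skimage": "scikit-image",
    "cv2": "opencv-python",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of required modules that cannot be imported."""
    return [pip_name for module, pip_name in required.items()
            if find_spec(module) is None]


missing = missing_packages()
if missing:
    print("Missing libraries. Try: pip install " + " ".join(missing))
else:
    print("All required libraries are available.")
```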


Step 1: Reading the Raster Data

To start, we load the Bengaluru raster image and read its data using the Rasterio library.


The open_and_read_raster function reads a raster file and returns the image data and metadata. Using Rasterio, the raster file is opened, and all its bands are read into a NumPy array. The dimensions of the array are rearranged to (rows, cols, bands) for easier manipulation during further processing. Additionally, the function extracts important geospatial metadata, such as the CRS (Coordinate Reference System) and the affine transformation matrix, which are essential for georeferencing the image. These outputs enable downstream operations like calculating indices or saving spatial data.

Figure: Reading the Raster Data

Step 2: Calculating the NDBI Index

The Normalized Difference Built-up Index (NDBI) is used to distinguish built-up regions from other land types. It is calculated using the SWIR (Short-Wave Infrared) and NIR (Near-Infrared) bands of the raster image:

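NDBI = (SWIR - NIR) / (SWIR + NIR)

Values range from -1 to 1. Built-up surfaces reflect more strongly in SWIR than in NIR, so they tend to have higher NDBI values than vegetation.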

Figure: Calculating the NDBI Index, code input
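
As a sketch of this step, NDBI is computed per pixel as (SWIR - NIR) / (SWIR + NIR). The function below assumes the image array is shaped (rows, cols, bands), as in Step 1. The band positions depend on the sensor and on how the raster was exported, so swir_band and nir_band are left as parameters rather than hard-coded; the function name calculate_ndbi is hypothetical.

```python
import numpy as np


def calculate_ndbi(image, swir_band, nir_band):
    """Compute NDBI = (SWIR - NIR) / (SWIR + NIR) for a (rows, cols, bands) array.

    swir_band and nir_band are zero-based band indices (assumed, sensor-specific).
    """
    # Cast to float so integer rasters do not overflow or truncate.
    swir = image[:, :, swir_band].astype("float32")
    nir = image[:, :, nir_band].astype("float32")
    denominator = swir + nir
    ndbi = np.zeros_like(denominator)
    # Pixels where SWIR + NIR == 0 (e.g. nodata) are left at 0.
    np.divide(swir - nir, denominator, out=ndbi, where=denominator != 0)
    return ndbi
```

Leaving zero-denominator pixels at 0 is a design choice for this sketch; masking them as NaN is an equally reasonable alternative.
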
Step 3: Segmenting Built-up Regions

To segment built-up areas, the NDBI values are processed with Otsu's thresholding, which picks the threshold that best separates built-up pixels (foreground) from the background by maximizing the variance between the two classes. After thresholding, the binary mask is refined with morphological closing, which fills small gaps and holes inside the detected regions.


Step 4: Extracting Polygons for Built-up Areas

Contours are extracted from the cleaned mask using OpenCV's findContours method, and each valid contour is converted into a Shapely polygon. Polygons that are invalid or too small are filtered out, so that only clean, meaningful built-up regions remain.

Figure: Extracting Polygons for Built-up Areas

The remaining polygons are stored in a GeoDataFrame and saved to a shapefile for further geospatial analysis.
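
A sketch of the export step follows. It assumes the polygons are already in the raster's map coordinates and that crs is the CRS read from the raster in Step 1. The function names, the label column, and the output file name are hypothetical.

```python
import geopandas as gpd


def polygons_to_geodataframe(polygons, crs):
    """Wrap a list of Shapely polygons in a GeoDataFrame with the given CRS."""
    polygons = list(polygons)
    return gpd.GeoDataFrame(
        {"label": ["built_up"] * len(polygons)},  # simple attribute column
        geometry=polygons,
        crs=crs,
    )


def save_shapefile(gdf, path):
    """Write the GeoDataFrame to disk as an ESRI Shapefile."""
    gdf.to_file(path, driver="ESRI Shapefile")


# Example usage ("built_up_regions.shp" is a hypothetical output path):
# gdf = polygons_to_geodataframe(polygons, crs)
# save_shapefile(gdf, "built_up_regions.shp")
```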


Step 5: Visualization

Finally, we visualize the segmented built-up areas with Matplotlib. The contours of the extracted polygons are overlaid on the mask:

Figure: Visualizing the segmented built-up areas with Matplotlib
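
The overlay could be produced with a sketch like the one below. It assumes the polygons are still in pixel coordinates (that is, not yet transformed to map coordinates), so that their outlines line up with the mask. The function name plot_built_up and the styling choices are assumptions.

```python
import matplotlib.pyplot as plt


def plot_built_up(mask, polygons, title="Built-Up Regions"):
    """Show the binary mask with polygon outlines drawn on top."""
    fig, ax = plt.subplots(figsize=(8, 8))
    ax.imshow(mask, cmap="gray")
    for polygon in polygons:
        x, y = polygon.exterior.xy  # outline vertices in pixel coordinates
        ax.plot(x, y, color="red", linewidth=1)
    ax.set_title(title)
    ax.set_axis_off()
    return fig, ax


# Example usage:
# fig, ax = plot_built_up(cleaned_mask, pixel_polygons)
# plt.show()
```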

Figure: Built-Up Regions

Conclusion

This blog demonstrates how to process raster data to extract built-up regions using Python. The workflow involves computing the NDBI index, segmenting regions using Otsu's thresholding, and extracting polygons for spatial analysis. This pipeline can be customized for various types of land-use classification tasks.
