Clustering

This document covers the clustering functionality within Turf.js, which provides spatial clustering algorithms for grouping geographic points based on proximity and density. The clustering system includes implementations of K-means and DBSCAN algorithms, along with general clustering utilities.

For information about grid generation and spatial tessellation, see Grid Generation. For statistical analysis functions like nearest neighbor analysis, see Statistical Analysis.

Overview

The clustering system in Turf.js consists of three packages: @turf/clusters, @turf/clusters-kmeans, and @turf/clusters-dbscan.

Clustering Architecture Overview

Core Clustering Modules

The table below summarizes each package's purpose, key dependencies, and algorithm type:

| Package | Purpose | Key Dependencies | Algorithm Type |
|---|---|---|---|
| @turf/clusters | General clustering utilities and base functionality | @turf/helpers, @turf/meta | Utility functions |
| @turf/clusters-kmeans | K-means clustering implementation | skmeans, @turf/clone, @turf/invariant | Partitioning |
| @turf/clusters-dbscan | Density-based clustering | rbush, @turf/distance | Density-based |

Core Clustering Module Dependencies

K-means Clustering Implementation

The @turf/clusters-kmeans package implements the K-means clustering algorithm using the external skmeans library. The algorithm partitions points into k clusters by minimizing the within-cluster sum of squared distances between each point and its cluster's centroid.

K-means Clustering Data Flow

Key Dependencies

  • skmeans: External JavaScript implementation of K-means algorithm
  • @turf/clone: Ensures immutability by cloning input features before modification
  • @turf/invariant: Provides input validation and type checking
  • @turf/meta: Used for iterating over GeoJSON features and coordinates
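The data flow described above (clone the input, extract coordinates, run K-means, write the result back onto each feature) can be sketched in plain JavaScript. This is a minimal illustration and not the skmeans implementation: `kmeansSketch` is a hypothetical name, centroids are seeded from the first k points (an assumption made to keep the example deterministic), and distances are planar on raw [lon, lat] values. The `cluster` and `centroid` property names mirror the ones the Turf.js documentation lists for clustersKmeans output.

```javascript
// Minimal Lloyd's-algorithm sketch over GeoJSON points (hypothetical helper,
// not the skmeans library that @turf/clusters-kmeans actually wraps).
function kmeansSketch(featureCollection, k, maxIterations = 100) {
  // Deep-copy the features so the caller's input is never modified
  // (the role the source attributes to @turf/clone).
  const features = JSON.parse(JSON.stringify(featureCollection.features));
  const coords = features.map((f) => f.geometry.coordinates.slice(0, 2));
  if (k < 1 || k > coords.length) {
    throw new Error("k must be between 1 and the number of points");
  }

  // Assumption: seed centroids with the first k points.
  let centroids = coords.slice(0, k).map((c) => c.slice());
  let labels = new Array(coords.length).fill(-1);
  const sqDist = (a, b) => (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2;

  for (let iter = 0; iter < maxIterations; iter++) {
    // Assignment step: attach every point to its nearest centroid.
    const next = coords.map((c) => {
      let best = 0;
      for (let j = 1; j < k; j++) {
        if (sqDist(c, centroids[j]) < sqDist(c, centroids[best])) best = j;
      }
      return best;
    });
    const changed = next.some((label, i) => label !== labels[i]);
    labels = next;
    if (!changed) break; // converged: no point switched cluster

    // Update step: move each centroid to the mean of its members.
    centroids = centroids.map((old, j) => {
      const members = coords.filter((_, i) => labels[i] === j);
      if (members.length === 0) return old; // keep an empty cluster's centroid
      return [
        members.reduce((sum, c) => sum + c[0], 0) / members.length,
        members.reduce((sum, c) => sum + c[1], 0) / members.length,
      ];
    });
  }

  // Result formatting: tag each cloned feature with its cluster and centroid.
  features.forEach((f, i) => {
    f.properties = {
      ...(f.properties || {}),
      cluster: labels[i],
      centroid: centroids[labels[i]],
    };
  });
  return { type: "FeatureCollection", features };
}

const point = (lon, lat) => ({
  type: "Feature",
  properties: {},
  geometry: { type: "Point", coordinates: [lon, lat] },
});
const points = {
  type: "FeatureCollection",
  features: [point(0, 0), point(0, 1), point(10, 10), point(10, 11)],
};
const clustered = kmeansSketch(points, 2);
console.log(clustered.features.map((f) => f.properties.cluster)); // [ 0, 0, 1, 1 ]
```

Because the features are copied before they are tagged, `points` is left untouched and only the returned collection carries the new properties.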

DBSCAN Clustering Implementation

The @turf/clusters-dbscan package implements the DBSCAN (Density-Based Spatial Clustering of Applications with Noise) algorithm, which groups points based on density and can identify outliers as noise.

DBSCAN Clustering Algorithm Flow

Key Dependencies

  • rbush: R-tree spatial index for efficient neighbor queries during density calculations
  • @turf/distance: Great circle distance calculations for determining point proximity
  • @turf/clone: Feature cloning for immutable operations
  • @turf/meta: GeoJSON feature iteration utilities
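To make the density-based grouping concrete, here is a minimal DBSCAN sketch over [lon, lat] pairs. It is an illustration under stated assumptions, not the package's code: `haversineKm` and `dbscanSketch` are hypothetical names, a linear scan stands in for the rbush R-tree query, the default of 3 for `minPoints` is chosen for the example, and the sketch returns a plain array of labels in which -1 marks noise.

```javascript
// Great-circle (haversine) distance in kilometers between two [lon, lat] pairs.
function haversineKm(a, b) {
  const R = 6371.0088; // mean Earth radius in km
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(b[1] - a[1]);
  const dLon = toRad(b[0] - a[0]);
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(a[1])) * Math.cos(toRad(b[1])) * Math.sin(dLon / 2) ** 2;
  return 2 * R * Math.asin(Math.sqrt(h));
}

// Hypothetical DBSCAN sketch: returns one label per point, -1 meaning noise.
function dbscanSketch(coords, maxDistanceKm, minPoints = 3) {
  const UNVISITED = -2;
  const NOISE = -1;
  const labels = new Array(coords.length).fill(UNVISITED);

  // Assumption: an O(n) scan per query replaces the R-tree used for speed.
  // The result includes the query point itself.
  const neighborsOf = (i) =>
    coords
      .map((_, j) => j)
      .filter((j) => haversineKm(coords[i], coords[j]) <= maxDistanceKm);

  let clusterId = 0;
  for (let i = 0; i < coords.length; i++) {
    if (labels[i] !== UNVISITED) continue;
    const seeds = neighborsOf(i);
    if (seeds.length < minPoints) {
      labels[i] = NOISE; // may later be claimed as a border point
      continue;
    }
    labels[i] = clusterId; // i is a core point: start a new cluster
    for (let s = 0; s < seeds.length; s++) {
      const j = seeds[s];
      if (labels[j] === NOISE) labels[j] = clusterId; // border point
      if (labels[j] !== UNVISITED) continue;
      labels[j] = clusterId;
      const reachable = neighborsOf(j);
      // Only core points extend the cluster further.
      if (reachable.length >= minPoints) seeds.push(...reachable);
    }
    clusterId++;
  }
  return labels;
}

// Three points roughly 1.1 to 1.6 km apart, plus one distant outlier.
const sample = [[0, 0], [0, 0.01], [0.01, 0], [10, 10]];
const dbscanLabels = dbscanSketch(sample, 2, 3);
console.log(dbscanLabels); // [ 0, 0, 0, -1 ]
```

The linear neighbor scan makes this sketch quadratic in the number of points; a spatial index such as rbush exists to avoid exactly that cost on larger inputs.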

General Clustering Utilities

The @turf/clusters package provides utility functions for working with clustered features, such as getCluster, clusterEach, and clusterReduce. It serves as the foundation for common clustering operations like filtering, iterating over, and aggregating clusters.
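As an illustration of the kind of utility this package covers, the sketch below groups the features of a FeatureCollection by a property value and visits each group in turn. `groupByProperty` and `eachCluster` are hypothetical helpers written for this example; they do not reproduce the package's own functions, and skipping features that lack the property is an assumption of the sketch.

```javascript
// Hypothetical helper: bucket features by the value of one property.
function groupByProperty(featureCollection, property) {
  const groups = new Map();
  for (const feature of featureCollection.features) {
    const props = feature.properties || {};
    if (!(property in props)) continue; // assumption: untagged features are skipped
    const key = props[property];
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(feature);
  }
  return groups;
}

// Hypothetical helper: call back once per cluster with its own FeatureCollection.
function eachCluster(featureCollection, property, callback) {
  for (const [value, features] of groupByProperty(featureCollection, property)) {
    callback({ type: "FeatureCollection", features }, value);
  }
}

const tagged = {
  type: "FeatureCollection",
  features: [
    { type: "Feature", properties: { cluster: 0 }, geometry: { type: "Point", coordinates: [0, 0] } },
    { type: "Feature", properties: { cluster: 1 }, geometry: { type: "Point", coordinates: [10, 10] } },
    { type: "Feature", properties: { cluster: 0 }, geometry: { type: "Point", coordinates: [0, 1] } },
    { type: "Feature", properties: {}, geometry: { type: "Point", coordinates: [50, 50] } },
  ],
};

const clusterSizes = [];
eachCluster(tagged, "cluster", (cluster, value) => {
  clusterSizes.push([value, cluster.features.length]);
});
console.log(clusterSizes); // [ [ 0, 2 ], [ 1, 1 ] ]
```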

General Clustering Utilities Architecture

Dependencies

The base clustering package has minimal dependencies, focusing on core Turf.js utilities:

| Dependency | Purpose |
|---|---|
| @turf/helpers | GeoJSON creation and property manipulation |
| @turf/meta | Feature and coordinate iteration |
| @types/geojson | TypeScript type definitions |

External Library Integration

The clustering system integrates with specialized external libraries to provide robust algorithm implementations:

External Library Integration Pattern

Library Versions and Usage

| Library | Version | Package | Purpose |
|---|---|---|---|
| skmeans | 0.9.7 | @turf/clusters-kmeans | K-means algorithm implementation |
| rbush | ^3.0.1 | @turf/clusters-dbscan | Spatial indexing for neighbor queries |

Both libraries are wrapped by Turf.js modules to handle GeoJSON coordinate extraction and result formatting, ensuring consistent input/output patterns across the clustering system.
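The wrapping pattern can be sketched as a small adapter: pull raw coordinates out of the GeoJSON input, hand them to an algorithm that knows nothing about GeoJSON, then write the labels back as feature properties. `wrapClusterer` is a hypothetical name, and the hemisphere rule below is only a stand-in for a real algorithm; neither comes from the Turf.js source.

```javascript
// Hypothetical adapter: turns a function that labels raw [x, y] arrays
// into one that accepts and returns a GeoJSON FeatureCollection.
function wrapClusterer(labelPoints) {
  return function (featureCollection, ...args) {
    // Coordinate extraction: the algorithm only ever sees plain arrays.
    const coords = featureCollection.features.map((f) =>
      f.geometry.coordinates.slice(0, 2)
    );
    const labels = labelPoints(coords, ...args);

    // Result formatting: build new features so the input is not modified.
    const features = featureCollection.features.map((f, i) => ({
      ...f,
      properties: { ...(f.properties || {}), cluster: labels[i] },
    }));
    return { type: "FeatureCollection", features };
  };
}

// Stand-in "algorithm": label points by western (0) or eastern (1) hemisphere.
const byHemisphere = wrapClusterer((coords) =>
  coords.map(([lon]) => (lon < 0 ? 0 : 1))
);

const cities = {
  type: "FeatureCollection",
  features: [
    { type: "Feature", properties: { name: "west" }, geometry: { type: "Point", coordinates: [-75, 40] } },
    { type: "Feature", properties: { name: "east" }, geometry: { type: "Point", coordinates: [30, 50] } },
  ],
};
const hemispheres = byHemisphere(cities);
console.log(hemispheres.features.map((f) => f.properties.cluster)); // [ 0, 1 ]
```

Keeping extraction and formatting in the adapter means the algorithm behind it could be swapped without changing what callers pass in or get back.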