Clustering
This document covers the clustering functionality within Turf.js, which provides spatial clustering algorithms for grouping geographic points based on proximity and density. The clustering system includes implementations of K-means and DBSCAN algorithms, along with general clustering utilities.
For information about grid generation and spatial tessellation, see Grid Generation. For statistical analysis functions like nearest neighbor analysis, see Statistical Analysis.
Overview
The clustering system in Turf.js consists of three packages: @turf/clusters, @turf/clusters-kmeans, and @turf/clusters-dbscan.
[Diagram: Clustering Architecture Overview]
Core Clustering Modules
The following table summarizes the purpose, key dependencies, and algorithm type of each package:
| Package | Purpose | Key Dependencies | Algorithm Type |
|---|---|---|---|
| @turf/clusters | General clustering utilities and base functionality | @turf/helpers, @turf/meta | Utility functions |
| @turf/clusters-kmeans | K-means clustering implementation | skmeans, @turf/clone, @turf/invariant | Partitioning |
| @turf/clusters-dbscan | Density-based clustering | rbush, @turf/distance | Density-based |
[Diagram: Core Clustering Module Dependencies]
K-means Clustering Implementation
The @turf/clusters-kmeans package implements K-means clustering by delegating the computation to the external skmeans library. The algorithm partitions points into k clusters by minimizing the within-cluster sum of squares, that is, the total squared distance from each point to the centroid of its cluster.
[Diagram: K-means Clustering Data Flow]
Key Dependencies
- skmeans: External JavaScript implementation of K-means algorithm
- @turf/clone: Ensures immutability by cloning input features before modification
- @turf/invariant: Provides input validation and type checking
- @turf/meta: Used for iterating over GeoJSON features and coordinates
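The partitioning step described in this section can be sketched in plain JavaScript. This is a minimal illustration of Lloyd's algorithm on planar [x, y] pairs, not the skmeans or @turf/clusters-kmeans source. The function name `kmeans`, the seeding rule (first k points), and the sample data are assumptions made for the example.

```javascript
// Minimal K-means (Lloyd's algorithm) on [x, y] pairs.
// Illustrative sketch only: not the skmeans or @turf/clusters-kmeans code.
function kmeans(points, k, maxIterations = 100) {
  // Assumption: seed centroids with the first k points so the run is deterministic.
  let centroids = points.slice(0, k).map((p) => [...p]);
  let assignments = new Array(points.length).fill(-1);

  for (let iter = 0; iter < maxIterations; iter++) {
    // Assignment step: each point joins its nearest centroid (squared distance).
    const next = points.map((p) => {
      let best = 0;
      let bestDist = Infinity;
      centroids.forEach((c, i) => {
        const d = (p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2;
        if (d < bestDist) {
          bestDist = d;
          best = i;
        }
      });
      return best;
    });

    // Stop once no assignment changes between iterations.
    const changed = next.some((a, i) => a !== assignments[i]);
    assignments = next;
    if (!changed) break;

    // Update step: move each centroid to the mean of its members.
    centroids = centroids.map((c, i) => {
      const members = points.filter((_, j) => assignments[j] === i);
      if (members.length === 0) return c; // keep the centroid of an empty cluster
      const sx = members.reduce((s, p) => s + p[0], 0);
      const sy = members.reduce((s, p) => s + p[1], 0);
      return [sx / members.length, sy / members.length];
    });
  }
  return { assignments, centroids };
}

const sample = [[0, 0], [10, 10], [0, 1], [1, 0], [10, 11], [11, 10]];
const result = kmeans(sample, 2);
console.log(result.assignments); // one cluster index per input point
console.log(result.centroids); // one [x, y] centroid per cluster
```

Each iteration can only lower the within-cluster sum of squares or leave it unchanged, which is why the loop can stop as soon as the assignments no longer change.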
DBSCAN Clustering Implementation
The @turf/clusters-dbscan package implements the DBSCAN (Density-Based Spatial Clustering of Applications with Noise) algorithm, which groups points based on density and can identify outliers as noise.
[Diagram: DBSCAN Clustering Algorithm Flow]
Key Dependencies
- rbush: R-tree spatial index for efficient neighbor queries during density calculations
- @turf/distance: Great-circle distance calculations for determining point proximity
- @turf/clone: Feature cloning for immutable operations
- @turf/meta: GeoJSON feature iteration utilities
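The density-based grouping described in this section can be sketched as follows. This is a simplified illustration, not the @turf/clusters-dbscan source: it finds neighbors with a brute-force scan where the package uses an rbush index, and it uses a hand-written haversine function where the package uses @turf/distance. The names `haversineKm` and `dbscan`, the Earth radius constant, and the convention of labelling noise as -1 are assumptions made for the example.

```javascript
// Minimal DBSCAN over [lng, lat] pairs with brute-force neighbor search.
// Illustrative sketch only: not the @turf/clusters-dbscan implementation.
const EARTH_RADIUS_KM = 6371.0088; // assumed mean Earth radius

// Great-circle distance between two [lng, lat] positions, in kilometers.
function haversineKm([lng1, lat1], [lng2, lat2]) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(lat2 - lat1);
  const dLng = toRad(lng2 - lng1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLng / 2) ** 2;
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
}

function dbscan(coords, maxDistanceKm, minPoints) {
  const NOISE = -1;
  const labels = new Array(coords.length).fill(undefined);

  // A point's neighborhood includes the point itself.
  const neighborsOf = (i) =>
    coords.reduce((acc, c, j) => {
      if (haversineKm(coords[i], c) <= maxDistanceKm) acc.push(j);
      return acc;
    }, []);

  let cluster = 0;
  for (let i = 0; i < coords.length; i++) {
    if (labels[i] !== undefined) continue; // already visited
    const seeds = neighborsOf(i);
    if (seeds.length < minPoints) {
      labels[i] = NOISE; // may later be relabelled as a border point
      continue;
    }
    labels[i] = cluster;
    // Expand the cluster through every density-reachable point.
    for (let q = 0; q < seeds.length; q++) {
      const j = seeds[q];
      if (labels[j] === NOISE) labels[j] = cluster; // border point
      if (labels[j] !== undefined) continue;
      labels[j] = cluster;
      const more = neighborsOf(j);
      if (more.length >= minPoints) seeds.push(...more); // j is a core point
    }
    cluster++;
  }
  return labels;
}

const samplePoints = [[0, 0], [0, 0.01], [0.01, 0], [5, 5], [5, 5.01], [40, 40]];
const labels = dbscan(samplePoints, 5, 2);
console.log(labels); // cluster index per point, -1 for noise
```

The brute-force scan makes each neighbor query linear in the number of points. A spatial index such as rbush narrows the candidate set before the exact distance check, which is the role it plays in the dependency list above.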
General Clustering Utilities
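The @turf/clusters package provides utility functions for working with clustered features, such as getCluster, clusterEach, and clusterReduce, which filter, iterate over, and reduce features grouped by a shared property value. It supports the common operations that follow a clustering run.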
[Diagram: General Clustering Utilities Architecture]
Dependencies
The base clustering package has minimal dependencies, focusing on core Turf.js utilities:
| Dependency | Purpose |
|---|---|
| @turf/helpers | GeoJSON creation and property manipulation |
| @turf/meta | Feature and coordinate iteration |
| @types/geojson | TypeScript type definitions |
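To illustrate the kind of operation these utilities support, the sketch below groups the features of a FeatureCollection by the value of one property. `groupByProperty` and `point` are hypothetical helpers written for this example and are not part of @turf/clusters; the property name `cluster` is likewise assumed. Consult the package documentation for the functions it actually exports and their signatures.

```javascript
// Group the features of a FeatureCollection by the value of one property.
// groupByProperty is a hypothetical helper, not an @turf/clusters export.
function groupByProperty(featureCollection, property) {
  const groups = new Map();
  for (const feature of featureCollection.features) {
    const props = feature.properties || {};
    if (!(property in props)) continue; // skip features without the property
    const key = props[property];
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(feature);
  }
  return groups;
}

// Hypothetical helper that builds a GeoJSON Point feature.
const point = (coordinates, properties) => ({
  type: "Feature",
  properties,
  geometry: { type: "Point", coordinates },
});

const clustered = {
  type: "FeatureCollection",
  features: [
    point([0, 0], { cluster: 0 }),
    point([0, 1], { cluster: 0 }),
    point([10, 10], { cluster: 1 }),
    point([50, 50], {}), // no cluster property, so it is left out
  ],
};

const groups = groupByProperty(clustered, "cluster");
for (const [clusterValue, features] of groups) {
  console.log(`cluster ${clusterValue}: ${features.length} feature(s)`);
}
```

Keying on a property name rather than on a fixed field lets the same traversal work on output from either clustering algorithm, or on any other categorical property.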
External Library Integration
The clustering packages rely on two external libraries rather than reimplementing their functionality: skmeans for the K-means computation and rbush for spatial indexing.
[Diagram: External Library Integration Pattern]
Library Versions and Usage
| Library | Version | Package | Purpose |
|---|---|---|---|
| skmeans | 0.9.7 | @turf/clusters-kmeans | K-means algorithm implementation |
| rbush | ^3.0.1 | @turf/clusters-dbscan | Spatial indexing for neighbor queries |
Turf.js modules wrap both libraries, handling GeoJSON coordinate extraction and result formatting so that input and output follow the same conventions across the clustering packages.
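The wrapping pattern can be sketched as a function that accepts GeoJSON, hands plain coordinates to an algorithm, and writes the result back onto copies of the input features. This is an illustration under stated assumptions, not the Turf.js source: `clusterFeatures`, the `algorithm` callback, and the toy `byHemisphere` function are hypothetical, and the output property name `cluster` is assumed.

```javascript
// Wrap a coordinate-based clustering function so it accepts and returns GeoJSON.
// `algorithm` is a hypothetical callback: (coords) => array of cluster ids.
function clusterFeatures(featureCollection, algorithm) {
  // Deep-copy via JSON so the caller's input is left unmodified.
  const output = JSON.parse(JSON.stringify(featureCollection));

  // Coordinate extraction: the algorithm sees only plain [lng, lat] arrays.
  const coords = output.features.map((f) => f.geometry.coordinates);
  const ids = algorithm(coords);

  // Result formatting: attach each id to the matching feature's properties.
  output.features.forEach((feature, i) => {
    feature.properties = { ...feature.properties, cluster: ids[i] };
  });
  return output;
}

// Toy algorithm for demonstration: split points by the sign of their longitude.
const byHemisphere = (coords) => coords.map(([lng]) => (lng < 0 ? 0 : 1));

const input = {
  type: "FeatureCollection",
  features: [
    {
      type: "Feature",
      properties: { name: "a" },
      geometry: { type: "Point", coordinates: [-73.9, 40.7] },
    },
    {
      type: "Feature",
      properties: { name: "b" },
      geometry: { type: "Point", coordinates: [2.35, 48.85] },
    },
  ],
};

const wrapped = clusterFeatures(input, byHemisphere);
console.log(wrapped.features.map((f) => f.properties));
```

Keeping the algorithm behind a coordinates-in, ids-out boundary means the external library never needs to know about GeoJSON, and the wrapper alone decides how results appear on the output features.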