Large Survey DataBase

A framework for spatial analysis of extremely large astronomical surveys in HiPSCat format

What is LSDB

A framework for spatial analysis of extremely large astronomical surveys

LSDB is designed to enable querying and crossmatching of catalogs containing on the order of a billion sources, O(1B). It addresses large-scale data processing challenges, in particular those raised by LSST.

LSDB is built on top of Dask, which lets it scale and parallelize operations efficiently across multiple workers. It leverages the HiPSCat data format, which stores surveys in a partitioned HEALPix (Hierarchical Equal Area isoLatitude Pixelization) structure.

   
Figure: A HiPSCat partitioning schema for Gaia DR3
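As a rough illustration of what a partitioned HEALPix structure implies, the sketch below computes how many equal-area pixels the sky is divided into at each HEALPix order, and how large each pixel is. It relies only on the standard HEALPix relation of 12 * nside^2 pixels with nside = 2^order. The function names are hypothetical and are not part of LSDB or HiPSCat, and the choice of which order is used for which region of a given catalog is not covered here.

```python
import math

# Area of the full sky in square degrees (about 41,253).
FULL_SKY_SQ_DEG = 4 * math.pi * (180 / math.pi) ** 2


def healpix_npix(order: int) -> int:
    """Number of equal-area HEALPix pixels at a given order (nside = 2**order)."""
    nside = 2 ** order
    return 12 * nside * nside


def healpix_pixel_area_sq_deg(order: int) -> float:
    """Area of a single HEALPix pixel at a given order, in square degrees."""
    return FULL_SKY_SQ_DEG / healpix_npix(order)


# Each step up in order splits every pixel into four smaller ones.
for order in range(8):
    print(order, healpix_npix(order), round(healpix_pixel_area_sq_deg(order), 3))
```

Because every increase in order quarters the pixel area, a partitioning can in principle use coarse pixels in sparse regions of the sky and finer pixels in crowded ones.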

Get Started

Install the latest release version of LSDB via conda.

$ conda install -c conda-forge lsdb

Or, if preferred, via pip.

$ pip install lsdb

Import the package, read two catalogs and perform their crossmatch. Here, gaia_path and ztf_path stand for the locations of the two HiPSCat catalogs.

>>> import lsdb

# Read the Gaia DR3 object catalog
>>> gaia = lsdb.read_hipscat(gaia_path)

# Read the ZTF DR14 object catalog
>>> ztf = lsdb.read_hipscat(ztf_path)

# Crossmatch the two catalogs
>>> ztf.crossmatch(gaia, n_neighbors=1, radius_arcsec=1)

For advanced use cases, please have a look at the tutorials we put together.
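To make the n_neighbors and radius_arcsec arguments of the crossmatch call more concrete, here is a minimal brute-force sketch of a nearest-neighbor match within an angular radius. It is not LSDB's implementation and does not use LSDB at all: the function names and the toy coordinates are made up for illustration, and the all-pairs loop is only practical for a handful of sources.

```python
import math


def angular_separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation via the haversine formula; inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = math.sin((dec2 - dec1) / 2)
    sin_dra = math.sin((ra2 - ra1) / 2)
    a = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a)))) * 3600


def nearest_neighbor_crossmatch(left, right, radius_arcsec=1.0):
    """For each left source, keep the single closest right source within the radius.

    Returns a list of (left_index, right_index, separation_arcsec) tuples.
    Left sources with no right source inside the radius are dropped.
    """
    matches = []
    for i, (ra_l, dec_l) in enumerate(left):
        best = None
        for j, (ra_r, dec_r) in enumerate(right):
            sep = angular_separation_arcsec(ra_l, dec_l, ra_r, dec_r)
            if sep <= radius_arcsec and (best is None or sep < best[1]):
                best = (j, sep)
        if best is not None:
            matches.append((i, best[0], best[1]))
    return matches


# Toy (ra, dec) positions in degrees, invented for this example.
left_sources = [(10.0, 20.0), (45.0, -30.0)]
right_sources = [
    (10.0, 20.0 + 0.5 / 3600),   # 0.5 arcsec from the first left source
    (10.0, 20.0 + 0.8 / 3600),   # 0.8 arcsec from the first left source
    (45.0, -30.0 + 5.0 / 3600),  # 5 arcsec from the second left source
]

matches = nearest_neighbor_crossmatch(left_sources, right_sources, radius_arcsec=1.0)
print(matches)
```

In this toy case only the first left source gets a match: two candidates fall within 1 arcsecond and the closer one is kept, while the second left source has no candidate inside the radius. A brute-force loop like this scales with the product of the catalog sizes, which is the kind of cost that partitioning the sky and distributing the work across workers is meant to avoid.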

Available HiPSCat surveys

data.lsdb.io

Hosted by DiRAC @ UW

The Institute for Data Intensive Research in Astrophysics & Cosmology hosts a collection of survey catalogs in HiPSCat format. Among them are catalogs from the Zwicky Transient Facility (Data Release 14 and Zubercal) and Gaia (Data Release 3). They are available to the community for public use.

Interested in collaborating with us?


Join our working group