Satellogic, a leader in sub-meter resolution Earth Observation ("EO") data collection, today announced the release of a large open dataset of high-resolution imagery, curated from the company's archive, to support the training of foundation models.
The dataset contains around 3 million Satellogic images of unique locations (6 million images, including location revisits) from around the world. Each image is 384 by 384 pixels, totaling roughly 900 gigapixels spanning different land-use types, objects, geographies, and seasons. The full dataset can be accessed on Hugging Face.
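As a quick sanity check of the figures above, the total pixel count can be worked out from the image size and the approximate image count. This is a back-of-the-envelope sketch in Python; it assumes the 900-gigapixel total refers to all 6 million images, including revisits, and the variable names are illustrative only.

```python
# Rough check of the dataset size quoted in the announcement.
TILE_SIDE_PX = 384          # each image is 384 by 384 pixels
TOTAL_IMAGES = 6_000_000    # approximate count, including location revisits

pixels_per_image = TILE_SIDE_PX * TILE_SIDE_PX
total_pixels = pixels_per_image * TOTAL_IMAGES
total_gigapixels = total_pixels / 1e9

print(f"{pixels_per_image:,} pixels per image")
print(f"{total_gigapixels:.1f} gigapixels in total")
```

Under these assumptions the total comes to about 885 gigapixels, consistent with the rounded figure of 900.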
"Following a stream of recent publications, with the release of this large dataset we aim to accelerate the development of foundational models in the field of EO," said Javier Marin, Applied AI Director at Satellogic. "Instead of relying on analysts to manually select and process satellite images, we will soon start interacting with large Earth Observation AI models with access to high-resolution, real-time imagery of our planet to derive those insights."
Satellogic data is released under a Creative Commons CC-BY 4.0 license, allowing for commercial use of the data with attribution.
A paper presenting the dataset will be published along with the release of a baseline foundation model built on top of it: a masked autoencoder, a type of scalable self-supervised learner for computer vision.
The paper describes how the dataset is built, the model architecture, and the experimental setup. This work is the result of Satellogic's collaboration with an exceptional team of researchers led by Alexandre Lacoste at ServiceNow under Yoshua Bengio's guidance.
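For readers unfamiliar with masked autoencoders, the core idea is to hide most patches of an image and train a model to reconstruct them from the few that remain visible. The sketch below illustrates only the random patch-masking step for a 384 by 384 image. The patch size of 16 pixels and the 75% mask ratio are assumptions borrowed from common masked-autoencoder setups, not details confirmed by this announcement, and `sample_patch_mask` is a hypothetical helper.

```python
import random


def sample_patch_mask(image_side=384, patch_side=16, mask_ratio=0.75, seed=0):
    """Return (visible, masked) patch indices for one square image.

    patch_side and mask_ratio are assumed values, not taken from the paper.
    """
    patches_per_side = image_side // patch_side
    num_patches = patches_per_side ** 2
    num_masked = int(num_patches * mask_ratio)

    # Shuffle all patch indices, then split them into masked and visible sets.
    order = list(range(num_patches))
    random.Random(seed).shuffle(order)
    masked = sorted(order[:num_masked])
    visible = sorted(order[num_masked:])
    return visible, masked


visible, masked = sample_patch_mask()
print(len(visible), "visible patches;", len(masked), "masked patches")
```

With these assumed settings, each image splits into 576 patches, of which 144 stay visible to the encoder and 432 are hidden for reconstruction.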