The Prithvi-100M model is a first-of-its-kind temporal Vision Transformer pre-trained by the IBM and NASA team on Harmonised Landsat Sentinel-2 (HLS) data covering the contiguous US. The model uses a self-supervised encoder built on a ViT architecture and trained with a Masked AutoEncoder (MAE) learning strategy and an MSE loss function. It applies spatial attention across multiple patches and temporal attention for each patch. It can be compared with other models such as moondream1, a 1.6B-parameter model built using SigLIP, Phi-1.5 and the LLaVA training dataset, and neural-chat-7b-v3-1, a 7B-parameter LLM fine-tuned on the Intel Gaudi 2 processor.

Model inputs and outputs

Inputs

The Prithvi-100M model accepts remote sensing data in a video format (B, C, T, H, W). The temporal dimension (T) is crucial for this application and is not present in most other remote sensing models. The model can handle time series of remote sensing images as well as static imagery with T=1. The input data includes the following bands from the NASA HLS V2 L30 product: Blue, Green, Red, Narrow NIR, SWIR 1, and SWIR 2.

Outputs

The model can perform image reconstruction on a set of HLS images from the same location at different time steps. The output can be used for a variety of downstream tasks such as burn scar segmentation, flood segmentation, and land cover classification.

Capabilities

What sets the Prithvi-100M model apart is its ability to handle temporal remote sensing data, which can benefit a variety of applications in the geospatial domain. By combining spatial and temporal attention, the model can learn meaningful representations from time-series imagery, supporting more accurate and robust analysis of land cover changes, disaster events, and other environmental phenomena.

What can I use it for?

The Prithvi-100M model can be used for a range of applications in the remote sensing and geospatial fields.
Some potential use cases include:

- **Land Cover Classification**: The model can be fine-tuned on labeled land cover data to perform accurate and efficient classification of different land cover types over time.
- **Burn Scar Mapping**: The temporal capabilities of the model can be leveraged to detect and map the extent of burn scars after wildfires, which is crucial for disaster response and mitigation efforts.
- **Flood Monitoring**: By analyzing time-series remote sensing data, the model can be used to identify and track the progression of flood events, supporting flood risk assessment and emergency planning.

Things to try

One interesting aspect of the Prithvi-100M model is its ability to handle both static and time-series remote sensing imagery. Researchers and developers could explore how the model's performance varies across different types of input data, for example by comparing its accuracy on single-date versus multi-date land cover classification tasks.

Additionally, the model's fine-tuning capabilities, as demonstrated by the provided examples for burn scar segmentation, present an opportunity to investigate how the pre-trained model can be further optimized for specific downstream applications. Experimenting with different fine-tuning strategies and dataset compositions could yield insights into the model's adaptability and versatility.
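To set up the single-date versus multi-date comparison, it may help to see how per-date scenes could be arranged into the (B, C, T, H, W) layout the model expects. The following is a minimal NumPy sketch, not code from the Prithvi repository: the helper name `stack_time_series`, the 224x224 tile size, and the random stand-in data are all assumptions made for illustration. Only the six-band order and the axis layout come from the description above.

```python
import numpy as np

# Band order as listed in the model description (NASA HLS V2 L30 product).
HLS_BANDS = ["Blue", "Green", "Red", "Narrow NIR", "SWIR 1", "SWIR 2"]


def stack_time_series(scenes):
    """Stack per-date scenes of shape (C, H, W) into a (B, C, T, H, W) batch.

    Hypothetical helper: `scenes` is a list of arrays, one per acquisition
    date, all covering the same location. B is fixed to 1 in this sketch.
    """
    for scene in scenes:
        if scene.shape[0] != len(HLS_BANDS):
            raise ValueError(
                f"expected {len(HLS_BANDS)} bands, got {scene.shape[0]}"
            )
    # Stacking on axis=1 inserts T between C and (H, W): (C, T, H, W).
    video = np.stack(scenes, axis=1)
    # Add the leading batch dimension: (1, C, T, H, W).
    return video[np.newaxis, ...]


# Synthetic stand-in data (assumption): three dates of a 6-band 224x224 tile.
rng = np.random.default_rng(0)
dates = [rng.random((6, 224, 224), dtype=np.float32) for _ in range(3)]

multi_date = stack_time_series(dates)       # T = 3
single_date = stack_time_series(dates[:1])  # static imagery, T = 1

print(multi_date.shape)   # (1, 6, 3, 224, 224)
print(single_date.shape)  # (1, 6, 1, 224, 224)
```

Feeding `single_date` and `multi_date` built from the same location through the same fine-tuned head is one way to isolate how much the temporal dimension contributes to a given task.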

Updated 5/28/2024