Mask prompting based on Grounding DINO & Segment Anything | Integral cog of doiwear.it

## Model overview

`grounded_sam` combines [Grounding DINO](https://aimodels.fyi/models/replicate/groundingdino-shilongliu) and [Segment Anything](https://aimodels.fyi/models/replicate/sam-vit-base-facebook) into a single pipeline for text-prompted masking. Grounding DINO is a zero-shot object detector that produces bounding boxes and labels from free-form text, while Segment Anything is a segmentation model that can generate masks for all objects in an image. This project adds the ability to prompt multiple masks and combine them, and to subtract negative masks for fine-grained control.

## Model inputs and outputs

`grounded_sam` takes an image, a positive mask prompt, a negative mask prompt, and an adjustment factor as inputs, and generates a set of masks that match the prompts. The positive prompt identifies the objects or regions of interest, while the negative prompt excludes areas from the mask. The adjustment factor dilates the masks when positive and erodes them when negative.
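The way positive masks, negative masks, and the adjustment factor interact can be illustrated with a small pure-Python sketch. This is not the model's actual implementation: the function names are hypothetical, masks are plain nested lists of 0/1, and it assumes each unit of the adjustment factor corresponds to one pass of 3x3 dilation or erosion, which the source does not specify.

```python
from typing import List

Mask = List[List[int]]  # 1 = selected pixel, 0 = background


def union(masks: List[Mask]) -> Mask:
    """Combine same-sized masks: a pixel is kept if any mask selects it."""
    rows, cols = len(masks[0]), len(masks[0][0])
    return [[int(any(m[r][c] for m in masks)) for c in range(cols)]
            for r in range(rows)]


def subtract(positive: Mask, negative: Mask) -> Mask:
    """Remove every pixel of the negative mask from the positive mask."""
    return [[int(p and not n) for p, n in zip(prow, nrow)]
            for prow, nrow in zip(positive, negative)]


def _morph(mask: Mask, grow: bool) -> Mask:
    """One pass of 3x3 dilation (grow=True) or erosion (grow=False).

    Only in-bounds neighbours are inspected, so the image border
    does not erode a mask on its own.
    """
    rows, cols = len(mask), len(mask[0])
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            neighbours = [mask[rr][cc]
                          for rr in range(max(0, r - 1), min(rows, r + 2))
                          for cc in range(max(0, c - 1), min(cols, c + 2))]
            row.append(int(any(neighbours)) if grow else int(all(neighbours)))
        out.append(row)
    return out


def adjust(mask: Mask, factor: int) -> Mask:
    """Dilate for a positive factor, erode for a negative one (assumed semantics)."""
    for _ in range(abs(factor)):
        mask = _morph(mask, grow=factor > 0)
    return mask


def blank(rows: int = 5, cols: int = 5) -> Mask:
    return [[0] * cols for _ in range(rows)]


# Two positive masks and one negative mask on a 5x5 grid.
a, b, neg = blank(), blank(), blank()
a[2][1] = 1
b[2][2] = b[2][3] = 1
neg[2][3] = 1

combined = subtract(union([a, b]), neg)
grown = adjust(combined, 1)
```

Applying the negative mask before the adjustment means that dilation can grow the result back into the excluded area; the reverse order would be an equally plausible design, and the source does not say which one the model uses.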

### Inputs
- **Image**: The input image to be masked.
- **Mask Prompt**: The text prompt used to identify the objects or regions of interest.
- **Negative Mask Prompt**: The text prompt used to exclude certain areas from the mask.
- **Adjustment Factor**: An integer value that can be used to dilate (+) or erode (-) the generated masks.

### Outputs
- **Masks**: An array of image URIs representing the generated masks.
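As a rough sketch of how these inputs might be assembled for a request, the helper below builds and validates an input dictionary. The snake_case field names (`image`, `mask_prompt`, `negative_mask_prompt`, `adjustment_factor`) are assumptions derived from the input list above and should be checked against the model's published schema. The `run_grounded_sam` wrapper further assumes the model is hosted on Replicate and called through the third-party `replicate` Python client; the model reference it takes is a placeholder to fill in.

```python
from typing import Any, Dict


def build_inputs(image: str,
                 mask_prompt: str,
                 negative_mask_prompt: str = "",
                 adjustment_factor: int = 0) -> Dict[str, Any]:
    """Assemble the four inputs described above (field names are assumed)."""
    if not mask_prompt.strip():
        raise ValueError("mask_prompt must not be empty")
    if isinstance(adjustment_factor, bool) or not isinstance(adjustment_factor, int):
        raise TypeError("adjustment_factor must be an integer")
    return {
        "image": image,                                # URL or path of the image to mask
        "mask_prompt": mask_prompt,                    # what to include
        "negative_mask_prompt": negative_mask_prompt,  # what to exclude
        "adjustment_factor": adjustment_factor,        # >0 dilates, <0 erodes
    }


def run_grounded_sam(model_ref: str, inputs: Dict[str, Any]) -> Any:
    """Send the inputs to a hosted model (assumes Replicate hosting).

    model_ref is a placeholder such as "<owner>/grounded_sam:<version>".
    Requires `pip install replicate` and a REPLICATE_API_TOKEN in the environment.
    """
    import replicate  # imported lazily so build_inputs works without the client
    return replicate.run(model_ref, input=inputs)


inputs = build_inputs(
    image="https://example.com/photo.jpg",  # hypothetical image URL
    mask_prompt="shirt, jacket",
    negative_mask_prompt="hands",
    adjustment_factor=5,
)
```

According to the output description above, the result should be an array of mask image URIs, which can then be downloaded or passed to a downstream inpainting step.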

## Capabilities

`grounded_sam` is designed for programmed inpainting and selective masking. It masks specific objects or regions of an image named in a text prompt while excluding unwanted areas named in a negative prompt. This makes it useful for tasks like image editing, content creation, and data annotation.

## What can I use it for?

`grounded_sam` can be used for a variety of applications, such as:

- **Image Editing**: Precisely mask and modify specific elements in an image, such as removing objects, replacing backgrounds, or adjusting the appearance of specific regions.
- **Content Creation**: Generate custom masks for use in digital art, compositing, or other creative projects.
- **Data Annotation**: Automate the process of annotating images for tasks like object detection, instance segmentation, and more.

## Things to try

Try using `grounded_sam` to create masks for programmed inpainting. Combine the positive and negative prompts to target exactly the areas you want to keep or remove, then use the adjustment factor to grow or shrink the masks as needed. This works well for object removal, image restoration, and content-aware fill.