# Frame Interpolation for Large Scene Motion

## Model overview

The `frame-interpolation` model, developed by Google Research, is a neural network for frame interpolation that can turn near-duplicate photos into slow-motion footage. It uses a single unified network and does not rely on additional pre-trained networks, such as optical flow or depth estimators, yet its authors report state-of-the-art results. The model can be trained from frame triplets alone, and its multi-scale feature extractor shares convolution weights across scales.
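To make the idea of shared convolution weights across scales more concrete, here is a toy sketch in plain Python. It is not the model's actual feature extractor, which is learned and multi-channel; the function names, the 2x average-pooling downsampler, and the single 3x3 kernel are all assumptions chosen for illustration. The only point it shows is that one set of weights is reused at every pyramid level.

```python
def downsample2x(img):
    """Halve the resolution by averaging each 2x2 block (odd edges are dropped)."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [
        [
            (img[2 * y][2 * x] + img[2 * y][2 * x + 1]
             + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) / 4.0
            for x in range(w)
        ]
        for y in range(h)
    ]


def conv3x3_valid(img, kernel):
    """Apply a 3x3 filter without padding (cross-correlation, as in most DL frameworks)."""
    h, w = len(img), len(img[0])
    return [
        [
            sum(kernel[j][i] * img[y + j][x + i] for j in range(3) for i in range(3))
            for x in range(w - 2)
        ]
        for y in range(h - 2)
    ]


def shared_weight_pyramid(img, kernel, levels):
    """Run the SAME kernel on each level of an image pyramid, finest level first."""
    features = []
    current = img
    for _ in range(levels):
        features.append(conv3x3_valid(current, kernel))  # identical weights at every scale
        current = downsample2x(current)
    return features


# Hypothetical input: a 16x16 constant image and a 3x3 box filter.
image = [[1.0] * 16 for _ in range(16)]
box = [[1.0 / 9.0] * 3 for _ in range(3)]
pyramid = shared_weight_pyramid(image, box, levels=3)
print([(len(f), len(f[0])) for f in pyramid])
```

Because the same filter sees the image at several resolutions, a large motion at full resolution looks like a small motion at a coarse level, which is one plausible reading of why weight sharing suits large scene motion.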

The `frame-interpolation` model is similar to the [FILM: Frame Interpolation for Large Motion](https://aimodels.fyi/models/replicate/film-frame-interpolation-for-large-motion-zsxkib) model, which also targets frame interpolation under large scene motion. Other related models include [stable-diffusion](https://aimodels.fyi/models/replicate/stable-diffusion-stability-ai), a latent text-to-image diffusion model; [video-to-frames](https://aimodels.fyi/models/replicate/video-to-frames-fofr), which splits a video into frames; [frames-to-video](https://aimodels.fyi/models/replicate/frames-to-video-fofr), which assembles frames into a video; and [lcm-animation](https://aimodels.fyi/models/replicate/lcm-animation-fofr), a fast animation model built on a latent consistency model.

## Model inputs and outputs

The `frame-interpolation` model takes two input frames and a count of how many times to interpolate between them. It returns a URI pointing to the result, and the "Times To Interpolate" parameter determines how many frames that result contains.

### Inputs
- **Frame1**: The first input frame
- **Frame2**: The second input frame
- **Times To Interpolate**: Controls the number of times the frame interpolator is invoked. When set to 1, the output is the single sub-frame at t=0.5. When set to a value greater than 1, the output is an interpolation video with (2^times_to_interpolate + 1) frames at 30 fps.
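
The frame-count rule in the "Times To Interpolate" description can be worked through with a short sketch. The function names below are hypothetical; the formula and the 30 fps rate come from the parameter description, and the clip duration is simply the frame count divided by the frame rate.

```python
FPS = 30.0  # frame rate stated in the parameter description


def output_frame_count(times_to_interpolate: int) -> int:
    """Number of frames the model is described as returning."""
    if times_to_interpolate < 1:
        raise ValueError("times_to_interpolate must be at least 1")
    if times_to_interpolate == 1:
        return 1  # only the single sub-frame at t=0.5
    return 2 ** times_to_interpolate + 1  # includes both input frames


def clip_duration_seconds(times_to_interpolate: int, fps: float = FPS) -> float:
    """Length of the output video; only defined when a video is produced."""
    if times_to_interpolate < 2:
        raise ValueError("a video is only produced for values greater than 1")
    return output_frame_count(times_to_interpolate) / fps


for n in range(1, 7):
    print(n, output_frame_count(n))
```

For instance, a value of 3 gives 9 frames (0.3 seconds at 30 fps), and a value of 6 gives 65 frames (a little over 2 seconds).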

### Outputs
- **Output**: A URI pointing to the interpolated result. When "Times To Interpolate" is greater than 1, the output video includes the two input frames.
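
As a rough sketch of how a call might look, the snippet below assumes the model is hosted on Replicate (as the related-model links suggest) and is invoked through the third-party `replicate` Python client. The input keys `frame1`, `frame2`, and `times_to_interpolate` are assumptions derived from the parameter names above, and `model_ref` is a placeholder the caller must supply; check the model's own API page before relying on any of them.

```python
def build_input(frame1, frame2, times_to_interpolate=1):
    """Assemble the input payload. Key names are assumed, not confirmed by the source."""
    if not isinstance(times_to_interpolate, int) or times_to_interpolate < 1:
        raise ValueError("times_to_interpolate must be an integer >= 1")
    return {
        "frame1": frame1,  # URL string or open binary file handle
        "frame2": frame2,
        "times_to_interpolate": times_to_interpolate,
    }


def run_interpolation(model_ref, frame1, frame2, times_to_interpolate=1):
    """Call the hosted model. Requires `pip install replicate` and an API token
    in the REPLICATE_API_TOKEN environment variable."""
    import replicate  # imported lazily so build_input works without the client

    return replicate.run(
        model_ref,  # placeholder: an "owner/name:version" reference you look up yourself
        input=build_input(frame1, frame2, times_to_interpolate),
    )
```

A call would then pass two image URLs or open file handles plus the desired interpolation count, and the return value would be the output URI described above.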

## Capabilities

The `frame-interpolation` model can transform near-duplicate photos into slow-motion footage that looks as if it were shot with a video camera. It handles large scene motion and, according to its authors, reaches state-of-the-art results without relying on additional pre-trained networks.

## What can I use it for?

The `frame-interpolation` model can be used to create high-quality slow-motion videos from pairs of near-duplicate photos. This is particularly useful for dynamic scenes or events where no video camera was available. Because the model handles large scene motion, it suits applications such as creating cinematic-quality videos, enhancing surveillance footage, or generating visual effects for film and video production.

## Things to try

With the `frame-interpolation` model, you can experiment with different levels of interpolation by adjusting the "Times To Interpolate" parameter. Each increment roughly doubles the number of in-between frames, so you can trade processing time for smoother slow-motion footage. You can also try the model on a variety of input image pairs to see how it handles different types of motion and scene complexity.
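
To build intuition for how repeated interpolation fills in the timeline, the sketch below assumes that each pass inserts a midpoint between every adjacent pair of frames. That scheme is consistent with the 2^times_to_interpolate + 1 frame count but is an assumption here, and `blend_midpoint` is a stand-in that simply averages pixel values rather than running the network.

```python
from fractions import Fraction


def blend_midpoint(frame_a, frame_b):
    """Placeholder for the network: average two frames value by value."""
    return [(a + b) / 2 for a, b in zip(frame_a, frame_b)]


def interpolate_recursively(frame1, frame2, times_to_interpolate, midpoint=blend_midpoint):
    """Return (timestamp, frame) pairs after repeatedly inserting midpoints."""
    frames = [(Fraction(0), frame1), (Fraction(1), frame2)]
    for _ in range(times_to_interpolate):
        refined = [frames[0]]
        for (t0, f0), (t1, f1) in zip(frames, frames[1:]):
            refined.append(((t0 + t1) / 2, midpoint(f0, f1)))  # new in-between frame
            refined.append((t1, f1))  # keep the existing frame
        frames = refined
    return frames


# Hypothetical one-pixel "frames" so the timeline is easy to read.
for t, frame in interpolate_recursively([0.0], [8.0], times_to_interpolate=3):
    print(t, frame)
```

Unlike the hosted model as described, this sketch always returns the two input frames, including when the parameter is 1. Swapping `blend_midpoint` for a real interpolator would leave the timestamps unchanged and only alter the frame contents.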