Note: The output RGB is premultiplied by alpha to avoid the color decontamination problem.
It can be composited directly onto a background using:
composite = rgb + (1 - alpha) * background
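
As a minimal sketch of the formula above, the NumPy snippet below composites a premultiplied RGB frame onto a background. It assumes all arrays are floats in [0, 1] and that alpha has shape (H, W) or (H, W, 1); the function name `composite_premultiplied` is hypothetical and not part of the released code.

```python
import numpy as np

def composite_premultiplied(rgb, alpha, background):
    """Composite a premultiplied RGB image onto a background.

    Assumes float values in [0, 1]; rgb and background have shape
    (..., H, W, 3), and alpha has shape (..., H, W) or (..., H, W, 1).
    """
    rgb = np.asarray(rgb, dtype=np.float32)
    alpha = np.asarray(alpha, dtype=np.float32)
    background = np.asarray(background, dtype=np.float32)
    # Add a trailing channel axis so alpha broadcasts over RGB.
    if alpha.ndim == rgb.ndim - 1:
        alpha = alpha[..., None]
    # rgb is already multiplied by alpha, so it is not scaled again here.
    return rgb + (1.0 - alpha) * background
```

Because the foreground is already premultiplied, multiplying `rgb` by `alpha` a second time would darken semi-transparent edges. If the output is saved as 8-bit images, divide by 255 before compositing.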
Because resources for this online demo are limited, we have restricted inference to 25 steps and 13 frames,
which may reduce the generation quality to some extent.
For a better experience, we recommend visiting our GitHub repository and running the method locally by following the provided setup instructions.