Any plans to add ModelScope's 1.7B text2video synthesis diffusion model? #2736

@kabachuha

Model/Pipeline/Scheduler description

Hello!

There seems to be a new 1.7B-parameter diffusion-based model by ModelScope for text2video synthesis, as noted by AKHaliq: https://twitter.com/_akhaliq/status/1637321077553606657?s=20. Both the model implementation and the weights (downloaded with their pipeline) are openly available, and it is already possible to run the model via a HuggingFace Space. However, the model lacks many possible optimizations, especially a low-VRAM mode, as well as accessibility options, and I believe it would benefit greatly from the help of the Diffusers community.

Example: monkey playing on drums

tmp2tkrr492.mp4

At the moment the model should fit in around 16 GB of VRAM, but since it is a combination of 4 GB, 6 GB, and 5 GB models, I believe that with half precision and a sequential pipeline it will eventually be possible to run it on modern consumer hardware.
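To make the estimate above concrete, here is a rough back-of-the-envelope sketch. It assumes the three quoted sizes are full-precision (fp32) weight sizes, which is not confirmed here, and it counts weights only, ignoring activations and framework overhead. The function name and its parameters are purely illustrative.

```python
# Component sizes in GB, taken from the figures quoted above.
# Assumption: these are fp32 weight sizes (the precision is not stated).
COMPONENT_SIZES_GB = [4.0, 6.0, 5.0]


def estimate_peak_weights_gb(sizes_gb, half_precision=False, sequential=False):
    """Rough peak memory for weights only (no activations, no overhead).

    half_precision: assume fp16 weights take half the fp32 size.
    sequential: assume only one component is resident on the GPU at a time,
                so the peak is the largest component instead of the sum.
    """
    scale = 0.5 if half_precision else 1.0
    scaled = [size * scale for size in sizes_gb]
    return max(scaled) if sequential else sum(scaled)


if __name__ == "__main__":
    for half in (False, True):
        for seq in (False, True):
            peak = estimate_peak_weights_gb(COMPONENT_SIZES_GB, half, seq)
            print(f"half_precision={half!s:5} sequential={seq!s:5} -> ~{peak:.1f} GB")
```

Under these assumptions, the weights alone would drop from roughly 15 GB to about 7.5 GB with half precision, and to about 3 GB if the components are also loaded one at a time. Real usage would be higher once activations are included.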

The code is released under the Apache-2.0 license, so there should be no problem using it as a reference.

Open source status

  • The model implementation is available
  • The model weights are available (Only relevant if addition is not a scheduler).

Provide useful links for the implementation

HuggingFace space:

https://huggingface.co/spaces/damo-vilab/modelscope-text-to-video-synthesis

All the parts of the model at HuggingFace:

https://huggingface.co/damo-vilab/modelscope-damo-text-to-video-synthesis/tree/main

The model PyTorch implementation:

https://github.com/modelscope/modelscope/tree/master/modelscope/models/multi_modal/video_synthesis

Google Colab from the devs:

https://colab.research.google.com/drive/1uW1ZqswkQ9Z9bp5Nbo5z59cAn7I0hE6R?usp=sharing

License: Apache-2.0
