[Feature Request]: CLIP Driven Universal Model #5800

# Contrastive Language-Image Pre-training (CLIP) Driven Models and Partially Supervised Learning for Medical Image Segmentation

This issue proposes adding the CLIP-Driven Universal Model features to MONAI and opens the design for discussion.

Potential assignee: @tangy5 

## CLIP-Driven Universal Model


<img src="https://user-images.githubusercontent.com/58751975/210438322-96d90f81-331c-44ac-abe5-8b2969140249.png" width = "480" height = "345" alt="" align=center />


## Key features

The implementation will bring several new features:

1. **Universal Model:** one model to detect and segment all abdominal organs and all types of tumors (liver tumor, kidney tumor, lung nodule, pancreas tumor, hepatic vessel tumor, colon tumor).
2. **Language model (CLIP) and text-driven embeddings:** CLIP text embeddings drive the segmentor to boost medical image analysis.
3. **Training on partially labelled datasets:** datasets that each annotate only a subset of the classes can be combined for training.
4. **Incremental learning:** users can continue training new segmentation classes from the currently trained model without catastrophic forgetting.
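
To make feature 3 concrete, here is a minimal sketch of one common way to train on partially labelled data: compute the loss per class and ignore the classes that a sample's source dataset does not annotate. This is an illustration only, not necessarily the loss the Universal Model uses; the function name `masked_bce_loss` and the `class_mask` argument are assumptions made for this example.

```python
import torch
import torch.nn.functional as F


def masked_bce_loss(logits, targets, class_mask):
    """Binary cross-entropy averaged only over annotated classes.

    logits, targets: float tensors of shape (B, C, *spatial).
    class_mask: (B, C) tensor, 1 where class c is annotated in the
    dataset that sample b came from, 0 otherwise (hypothetical input).
    """
    per_voxel = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    # Average over the spatial dims: one loss value per (sample, class).
    per_class = per_voxel.flatten(2).mean(dim=2)
    mask = class_mask.to(per_class.dtype)
    # Unannotated classes contribute nothing; clamp avoids division by zero.
    return (per_class * mask).sum() / mask.sum().clamp(min=1.0)


torch.manual_seed(0)
logits = torch.randn(2, 3, 4, 4, 4)
targets = torch.randint(0, 2, (2, 3, 4, 4, 4)).float()
# Sample 0 comes from a dataset labelling classes 0 and 1; sample 1 only class 2.
class_mask = torch.tensor([[1, 1, 0], [0, 0, 1]])
loss = masked_bce_loss(logits, targets, class_mask)
```

With this masking, predictions for a class that is missing from a sample's annotations are neither rewarded nor penalised, so an unlabelled organ is not treated as background.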

## ⏳ Dataset: The Universal Model is trained with the following datasets

![Screenshot from 2023-01-03 13-16-57](https://user-images.githubusercontent.com/58751975/210442807-30bd8705-6f01-4c1b-950a-21bbc4817ca0.png)

- 01 [Multi-Atlas Labeling Beyond the Cranial Vault - Workshop and Challenge (BTCV)](https://www.synapse.org/#!Synapse:syn3193805/wiki/217789)
- 02 [Pancreas-CT TCIA](https://wiki.cancerimagingarchive.net/display/Public/Pancreas-CT)
- 03 [Combined Healthy Abdominal Organ Segmentation (CHAOS)](https://chaos.grand-challenge.org/Combined_Healthy_Abdominal_Organ_Segmentation/)
- 04 [Liver Tumor Segmentation Challenge (LiTS)](https://competitions.codalab.org/competitions/17094#learn_the_details)
- 05 [Kidney and Kidney Tumor Segmentation (KiTS)](https://kits21.kits-challenge.org/participate#download-block)
- 06 [Liver segmentation (3D-IRCADb)](https://www.ircad.fr/research/data-sets/liver-segmentation-3d-ircadb-01/)
- 07 [WORD: A large scale dataset, benchmark and clinical applicable study for abdominal organ segmentation from CT image](https://github.com/HiLab-git/WORD)
- 08 [AbdomenCT-1K](https://github.com/JunMa11/AbdomenCT-1K)
- 09 [Multi-Modality Abdominal Multi-Organ Segmentation Challenge (AMOS)](https://amos22.grand-challenge.org)
- 10-15 [Decathlon (Liver, Lung, Pancreas, HepaticVessel, Spleen, Colon)](https://drive.google.com/drive/folders/1HqEgzS8BV2c7xYNrZdEAnrHk7osJJ--2)
- 16 [CT volumes with multiple organ segmentations (CT-ORG)](https://wiki.cancerimagingarchive.net/pages/viewpage.action?pageId=61080890)
- 17 [13 AbdomenCT 12organ](https://github.com/JunMa11/AbdomenCT-1K)

## Implementation plans

- [ ] Transformations (pre-processing) for partially labelled datasets: “PartialLabelTransfer”, etc.
- [ ] Segmentation backbone with CLIP embedding and text-driven segmentor: plug-and-play CLIP embedding and text encoder.
- [ ] Tutorial for training and inference with the Universal Model.
- [ ] Tutorial demonstrating partially supervised learning and incremental learning.
- [ ] Model release: a Model Zoo bundle publishing the trained Universal Model to segment all types of tumors and abdominal organs.
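
As a starting point for discussing the pre-processing item, the sketch below shows one possible shape for a partial-label transform: it maps a dataset-specific integer label map into a shared label space and records which classes that dataset annotates. The class list, the dataset names, the mappings, and the function `to_unified_labels` are all hypothetical and chosen only for illustration; the eventual “PartialLabelTransfer” API may look quite different.

```python
import numpy as np

# Hypothetical unified label space and per-dataset mappings (illustrative values only).
UNIFIED_CLASSES = ["spleen", "liver", "pancreas", "liver_tumor"]
DATASET_LABEL_MAPS = {
    "dataset_a": {1: "spleen", 2: "liver"},
    "dataset_b": {1: "liver", 2: "liver_tumor"},
}


def to_unified_labels(label, dataset_name):
    """Convert an integer label map into per-class binary channels.

    Returns (channels, annotated): channels has shape (C, *label.shape),
    and annotated is a length-C boolean vector marking the classes that
    this dataset labels at all.
    """
    mapping = DATASET_LABEL_MAPS[dataset_name]
    channels = np.zeros((len(UNIFIED_CLASSES),) + label.shape, dtype=np.uint8)
    annotated = np.zeros(len(UNIFIED_CLASSES), dtype=bool)
    for local_index, class_name in mapping.items():
        unified_index = UNIFIED_CLASSES.index(class_name)
        channels[unified_index] = label == local_index
        annotated[unified_index] = True
    return channels, annotated


label = np.array([[0, 1], [2, 2]])
channels, annotated = to_unified_labels(label, "dataset_b")
```

The `annotated` vector is the piece a partially supervised loss needs: it distinguishes "this class is absent from the image" from "this class was never labelled in this dataset".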

## More Details of the Feature Methodology:

1. Universal Model:
![Screenshot from 2023-01-03 12-09-23](https://user-images.githubusercontent.com/58751975/210440352-50b68a10-69f5-48d5-bbca-c100ccd680a0.png)

2. CLIP-driven and text-driven segmentor:
![Screenshot from 2023-01-03 12-10-09](https://user-images.githubusercontent.com/58751975/210440462-4607db99-72a3-4908-ad8e-68e7186282d2.png)

3. Partially Supervised Learning:
![Screenshot from 2023-01-03 12-04-46](https://user-images.githubusercontent.com/58751975/210440515-2d58f5cf-134f-4c53-ba77-2fefde51644e.png)

4. Incremental Learning:

![Screenshot from 2023-01-03 12-11-14](https://user-images.githubusercontent.com/58751975/210440563-f8d930f5-953a-4de8-818a-5c63c4da3246.png)
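
To illustrate how a text-driven segmentor relates to incremental learning, here is a deliberately simplified sketch. It is not the architecture in the figures above: the `TextConditionedHead` class is a toy stand-in, and random tensors replace both the backbone features and the CLIP text embeddings. It only shows that when the output head is parameterised by per-class embeddings, a new class can be added by appending one embedding row, without resizing any layer or changing the logits of existing classes.

```python
import torch
import torch.nn as nn


class TextConditionedHead(nn.Module):
    """Toy head: one logit map per class from voxel features and class embeddings."""

    def __init__(self, feature_dim, text_dim):
        super().__init__()
        # Projects each class's text embedding into the voxel feature space.
        self.project = nn.Linear(text_dim, feature_dim)

    def forward(self, features, class_embeddings):
        # features: (B, F, *spatial); class_embeddings: (C, text_dim)
        kernels = self.project(class_embeddings)  # (C, F)
        flat = features.flatten(2)  # (B, F, N)
        # Dot product between every voxel feature and every class kernel.
        logits = torch.einsum("bfn,cf->bcn", flat, kernels)
        return logits.reshape(features.shape[0], kernels.shape[0], *features.shape[2:])


torch.manual_seed(0)
head = TextConditionedHead(feature_dim=8, text_dim=16)
features = torch.randn(1, 8, 4, 4, 4)   # stand-in for backbone output
old_embeddings = torch.randn(3, 16)     # stand-in for CLIP text embeddings
new_embedding = torch.randn(1, 16)      # embedding for a newly added class

with torch.no_grad():
    old_logits = head(features, old_embeddings)
    all_embeddings = torch.cat([old_embeddings, new_embedding], dim=0)
    extended_logits = head(features, all_embeddings)
```

In this sketch the first three channels of `extended_logits` match `old_logits`, because each class's output depends only on its own embedding. Whether forgetting is avoided during actual fine-tuning depends on which weights are updated, which is a design question for the discussion.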


Detailed implementation steps will be provided after open discussion.

All suggestions and comments are welcome!

@ljwztc @MrGiovanni
