lazyslide.models.multimodal.OmiCLIP#
- class OmiCLIP(model_path=None, token=None)#
Bases:
ImageTextModelGitHub 🤗 Hugging Face Paper | Multimodal foundation model
- encode_image(image)#
Batch–encode a list of image file paths into L2‑normalized embeddings. Returns a tensor of shape (N, D).
- encode_text(text)#
Batch–encode a list of strings into L2‑normalized embeddings. Returns a tensor of shape (N, D).