lazyslide.tl.text_image_similarity
- text_image_similarity(wsi, text_embeddings, model='plip', tile_key='tiles', feature_key=None, key_added=None, normalize=True, softmax=False, scoring_func=None)
Compute similarity scores between text embeddings and tile image features.
Note
Prerequisites:
The image features should be extracted using zs.tl.feature_extraction. The text embeddings should be computed using zs.tl.text_embedding.
- Parameters:
- wsi : WSIData
The WSIData object to work on.
- text_embeddings : pd.DataFrame
The embeddings of the texts, with texts as index.
- model : str, default: 'plip'
The text embedding model.
- tile_key : str, default: 'tiles'
The tile key.
- feature_key : str
The feature key.
- key_added : str
The key to store the similarity scores. If None, defaults to '{feature_key}_text_similarity'.
- normalize : bool, default: True
Apply L2 normalization to the tile features before computing the similarity score to the text embeddings.
- softmax : bool, default: False
Whether to apply softmax to the similarity scores.
- distance_metric : str or callable, optional
The distance metric from scipy.spatial.distance to use instead of dot product. Can be a string metric name or a callable function. If provided, distances will be computed and converted to similarities (1 - distance). Common string options include 'cosine', 'euclidean', 'manhattan', 'chebyshev', etc. If None, uses dot product similarity. Cannot be used together with scoring_func.
- scoring_func : callable, optional
A custom scoring/similarity function that takes two matrices and returns a similarity score matrix (higher = more similar). Should have same signature as np.dot: func(X, Y) where X is (n_texts, feature_dim) and Y is (feature_dim, n_features), returning (n_texts, n_features). If provided, this takes precedence over distance_metric and dot product. Cannot be used together with distance_metric.
- Returns:
- None
Note
The similarity scores will be saved in the tables slot of the spatial data object.
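The default scoring path described by the normalize and softmax parameters can be sketched in plain NumPy. This is an illustration only, not LazySlide's implementation: the function name similarity_sketch is hypothetical, the array shapes follow the scoring_func description above, and the axis over which softmax is applied (across texts, for each tile) is an assumption that this page does not confirm.

```python
import numpy as np


def similarity_sketch(text_emb, tile_feats, normalize=True, softmax=False):
    """Hypothetical helper mirroring the documented options (not the library code)."""
    text_emb = np.asarray(text_emb, dtype=float)      # (n_texts, feature_dim)
    tile_feats = np.asarray(tile_feats, dtype=float)  # (n_tiles, feature_dim)

    if normalize:
        # L2-normalize each tile feature vector; clip to avoid division by zero.
        norms = np.linalg.norm(tile_feats, axis=1, keepdims=True)
        tile_feats = tile_feats / np.clip(norms, 1e-12, None)

    # Dot-product similarity: (n_texts, feature_dim) x (feature_dim, n_tiles).
    scores = np.dot(text_emb, tile_feats.T)

    if softmax:
        # Assumption: softmax across texts, so each tile's scores sum to 1.
        shifted = scores - scores.max(axis=0, keepdims=True)
        exp = np.exp(shifted)
        scores = exp / exp.sum(axis=0, keepdims=True)

    return scores


# Random stand-ins for real embeddings: 3 texts, 5 tiles, 8 feature dimensions.
rng = np.random.default_rng(0)
text_emb = rng.normal(size=(3, 8))
tile_feats = rng.normal(size=(5, 8))
scores = similarity_sketch(text_emb, tile_feats, softmax=True)
```

With softmax enabled in this sketch, each column of scores can be read as a distribution over the text prompts for one tile.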
Examples
>>> import lazyslide as zs
>>> # Using dot product similarity (default)
>>> zs.tl.text_image_similarity(wsi, embeddings, model="plip",
...                             tile_key="text_tiles",
...                             softmax=True)
>>> # Using scipy distance functions
>>> zs.tl.text_image_similarity(wsi, embeddings, model="plip",
...                             tile_key="text_tiles",
...                             distance_metric="euclidean")
>>> # Using custom scoring function
>>> zs.tl.text_image_similarity(wsi, embeddings, model="plip",
...                             tile_key="text_tiles",
...                             scoring_func=custom_scoring_func)
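The last example passes custom_scoring_func without defining it. A minimal sketch of one possible definition follows, assuming only the signature documented for scoring_func: func(X, Y) with X of shape (n_texts, feature_dim) and Y of shape (feature_dim, n_features), returning (n_texts, n_features). Cosine similarity is used purely as an illustration; any function that returns higher values for more similar pairs would fit the described contract.

```python
import numpy as np


def custom_scoring_func(X, Y):
    """Cosine similarity with the same call shape as np.dot(X, Y).

    X: (n_texts, feature_dim), one text embedding per row.
    Y: (feature_dim, n_features), one tile feature vector per column.
    Returns: (n_texts, n_features), higher means more similar.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    # Scale rows of X and columns of Y to unit length; clip avoids division by zero.
    X_unit = X / np.clip(np.linalg.norm(X, axis=1, keepdims=True), 1e-12, None)
    Y_unit = Y / np.clip(np.linalg.norm(Y, axis=0, keepdims=True), 1e-12, None)
    return X_unit @ Y_unit


# Small check with two texts and three tiles in a 2-D feature space.
X = np.array([[1.0, 0.0], [0.0, 2.0]])
Y = np.array([[3.0, 0.0, 1.0], [0.0, 4.0, 1.0]])
sim = custom_scoring_func(X, Y)
```

Because the function normalizes both inputs itself, its output in this sketch does not depend on whether the inputs were already L2-normalized.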