lazyslide_models.vision.Moozy
- class Moozy(model_path=None, token=None)
Bases: SlideEncoderModel
moozy | 🤗 Hugging Face | GitHub | Paper | Params: 85.77M | License: CC-BY-NC-SA-4.0
[Kotp et al., 2026] A patient-first foundation model for computational pathology
MOOZY slide and case encoder.
The slide encoder requires spatial coordinates and patch sizes for its ALiBi position bias. Pass coords (xy positions) and patch_sizes as keyword arguments to encode_slide().
The case transformer aggregates multiple slide embeddings into a single patient-level representation via encode_case().
- encode_case(slide_embeddings)
Aggregate slide embeddings into a case-level embedding.
- Parameters:
- slide_embeddings : torch.Tensor
Slide-level CLS embeddings. Shape [S, 768], where S is the number of slides for a patient case.
- Returns:
- torch.Tensor
Case embedding of shape [768].
- encode_slide(embeddings, coords=None, **kwargs)
Encode patch features into a slide-level embedding.
- Parameters:
- embeddings : torch.Tensor
Patch features. Accepted shapes:
  - [B, H, W, 384]: spatial grid layout (native format).
  - [H, W, 384]: single-slide spatial grid (will be unsqueezed).
  - [B, T, 384]: flat sequence; will be reshaped to a square grid (T must be a perfect square, otherwise the sequence is zero-padded).
  - [T, 384]: single-slide flat sequence.
- coords : torch.Tensor
Spatial coordinates for each patch token. Must match the spatial layout of embeddings. Shape [B, H, W, 2] or [H, W, 2] for grid inputs, or [B, T, 2] / [T, 2] for flat inputs. Required, even though the signature defaults it to None: the ALiBi position bias needs real-space positions.
- **kwargs
  - patch_sizes : float or torch.Tensor, optional
    Patch size in level-0 pixels. Defaults to 224.
  - invalid_mask : torch.Tensor, optional
    Boolean mask of shape [B, H, W], where True = invalid/background.
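The coords argument described above has to line up with the patch grid. The following is a minimal sketch of building such a grid in plain Python. The helper grid_coords is hypothetical (it is not part of lazyslide_models), and it assumes that each coordinate is the top-left corner of a patch in level-0 pixels, in (x, y) order, with patches tiled without gaps. Check these assumptions against how the patch features were actually extracted.

```python
from typing import List


def grid_coords(height: int, width: int, patch_size: int = 224,
                origin: tuple = (0, 0)) -> List[List[List[int]]]:
    """Build a nested list of shape [H][W][2] holding (x, y) per patch.

    Hypothetical helper: assumes top-left corners in level-0 pixels and
    a regular, gap-free tiling starting at `origin`.
    """
    if height <= 0 or width <= 0:
        raise ValueError("height and width must be positive")
    x0, y0 = origin
    return [
        [[x0 + col * patch_size, y0 + row * patch_size] for col in range(width)]
        for row in range(height)
    ]


coords = grid_coords(2, 3)
print(coords[0])  # [[0, 0], [224, 0], [448, 0]]
print(coords[1])  # [[0, 224], [224, 224], [448, 224]]
```

The nested list can be converted with torch.tensor(coords, dtype=torch.float32) to obtain an [H, W, 2] tensor before it is passed as coords.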
- Returns:
- dict
{"embeddings": cls_output}, where cls_output has shape [B, 768].
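For flat [B, T, 384] inputs, the embeddings parameter states that T is reshaped to a square grid and zero-padded when it is not a perfect square. The arithmetic can be sketched as below. The function square_grid_side is hypothetical, and it assumes the smallest enclosing square is used; the library's exact padding strategy is not specified in this page.

```python
import math
from typing import Tuple


def square_grid_side(num_tokens: int) -> Tuple[int, int]:
    """Return (side, pad) for the smallest square grid holding num_tokens.

    Hypothetical helper: `pad` is the number of zero tokens that would be
    needed so that num_tokens + pad == side * side.
    """
    if num_tokens <= 0:
        raise ValueError("num_tokens must be positive")
    side = math.isqrt(num_tokens)      # floor of the square root
    if side * side < num_tokens:       # not a perfect square: round up
        side += 1
    return side, side * side - num_tokens


print(square_grid_side(196))  # (14, 0)
print(square_grid_side(200))  # (15, 25)
```

Under this assumption, a slide with 200 patches would occupy a 15 x 15 grid with 25 padded positions. Whether padded positions need to be flagged through invalid_mask is not stated in the source.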