Computer Vision Toolbox Model for Moondream Vision Language Model

Moondream is a small-footprint vision language model with image captioning capability.

The Moondream 2 model is a lightweight vision language model (Vision-LLM) capable of image captioning. Because of its small size, it runs efficiently on most local workstations.

MATLAB Release Compatibility

  • Compatible with R2026a

Platform Compatibility

  • Windows
  • macOS (Apple Silicon)
  • macOS (Intel)
  • Linux