This is machine translation

Translated by Microsoft
Mouseover text to see original. Click the button below to return to the English version of the page.

Note: This page has been translated by MathWorks. Please click here
To view all translated materials including this page, select Japan from the country navigator on the bottom of this page.

Image Category Classification and Image Retrieval

Create a bag of visual words for image classification and content-based image retrieval (CBIR) systems

To classify images into categories, you generate a histogram of visual word occurrences that represent an image. These histograms, called a bag of visual words, are used to train an image category classifier. You can also use the Computer Vision System Toolbox™ functions to search by image, also known as a content-based image retrieval (CBIR) system. CBIR systems are used to retrieve images from a collection of images that are similar to a query image.


Image LabelerLabel ground truth in a collection of images


trainImageCategoryClassifierTrain an image category classifier
bagOfFeaturesBag of visual words object
imageCategoryClassifierPredict image category
invertedImageIndexSearch index that maps visual words to images
evaluateImageRetrievalEvaluate image search results
indexImagesCreate image search index
retrieveImagesSearch image set for similar image
imageDatastoreCreate ImageDatastore object for collections of image data


Image Retrieval Using Customized Bag of Features

This example shows how to create a Content Based Image Retrieval (CBIR) system using a customized bag-of-features workflow.

Image Retrieval with Bag of Visual Words

Retrieve images from a collection of images similar to a query image using a content-based image retrieval (CBIR) system.

Image Classification with Bag of Visual Words

Use the Computer Vision System Toolbox functions for image category classification by creating a bag of visual words.

Define Ground Truth for Image Collections

Interactively label rectangular ROIs for object detection, pixels for semantic segmentation, and scenes for image classification.