Imagine you’ve just been given a new batch of 10,000 images, or hours of video, and only a small portion of it is relevant to your task. How would you go about isolating that subset of the data?
In many cases, we receive a batch of visual data, images or video, with very limited control over the content. Video from vehicles will contain a lot of unnecessary frames, while surveillance cameras will record plenty of empty ones. In other cases, the acquisition process may be used for several different projects, and it is up to us to extract the relevant images.
Unless you have the proper tools, you’ll most likely spend a great deal of time and effort sifting through the images manually. That’s why we created Data Explorer.
Akridata’s Data Explorer lets us search for similar images by marking either a whole example image or a patch within an image.
In a previous blog post, we explored the nuScenes dataset. In this post, we will demonstrate the search capabilities on a different dataset.
Finding the needle(s) in your data haystack
The dataset for this example is a set of surface images, and we will try to isolate the frames that show a horizontal crack in the surface (the needle).
Start by visualizing the dataset, sampling it and looking at a few examples, shown in the image below:
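To make the sampling step concrete, here is a minimal sketch of drawing a random gallery from a list of image files. It is an illustration only, not Data Explorer's own sampling logic: the function name `sample_gallery`, the file naming pattern, and the gallery size of 12 are all assumptions made for the example.

```python
import random
from typing import List, Sequence


def sample_gallery(image_paths: Sequence[str], k: int, seed: int = 0) -> List[str]:
    """Return up to k distinct paths chosen uniformly at random.

    Hypothetical helper for illustration; the seed makes the draw repeatable.
    """
    rng = random.Random(seed)
    k = min(k, len(image_paths))  # never ask for more items than exist
    return rng.sample(list(image_paths), k)


# Hypothetical file names standing in for a batch of 10,000 surface images.
paths = [f"surface_{i:05d}.png" for i in range(10_000)]
gallery = sample_gallery(paths, k=12)
```

A uniform sample like this reflects the overall mix of the dataset, so a rare class such as cracked surfaces may or may not show up in the first gallery.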
With any luck, we can find one or two examples by scrolling through the randomly sampled gallery that is shown first. If not, we may need to add an example of the desired image to the dataset manually, or introduce an example of the ‘needle’ in a different way.
In this example, a quick scroll through the images turns up the first example. Giving it a ‘thumbs up’ marks it as a positive example for the search algorithm, which then looks for images with matching characteristics.
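One common way to implement this kind of search by example is to rank images by how close their feature vectors are to those of the marked examples. The sketch below illustrates that general idea and is not a description of Data Explorer's internals: it assumes each image already has an embedding vector, averages the embeddings of the ‘thumbs up’ images into a query, and ranks the rest by cosine similarity. The function names and the toy three-dimensional vectors are hypothetical.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_positives(
    embeddings: Dict[str, Vector], positives: List[str], top_k: int
) -> List[str]:
    """Rank unmarked images by similarity to the mean of the positive examples."""
    if not positives:
        raise ValueError("at least one positive example is required")
    dim = len(embeddings[positives[0]])
    # Query vector: element-wise mean of the 'thumbs up' embeddings.
    query = [
        sum(embeddings[name][i] for name in positives) / len(positives)
        for i in range(dim)
    ]
    candidates = [name for name in embeddings if name not in positives]
    candidates.sort(key=lambda name: cosine(embeddings[name], query), reverse=True)
    return candidates[:top_k]


# Toy embeddings (assumed, for illustration): cracked surfaces cluster together.
toy = {
    "crack_a.png": [0.9, 0.1, 0.0],
    "crack_b.png": [0.8, 0.2, 0.1],
    "smooth_a.png": [0.0, 0.1, 0.9],
    "smooth_b.png": [0.1, 0.0, 0.8],
}
results = rank_by_positives(toy, positives=["crack_a.png"], top_k=2)
```

With these toy vectors, marking `crack_a.png` as the positive example puts `crack_b.png` at the top of the ranking. Each additional ‘thumbs up’ shifts the averaged query toward what the marked images have in common.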