How to effectively combine metadata and raw images for model training?

Introduction

A dataset of images or videos, used for computer vision tasks, can be the key to success or failure. A clean dataset can lead the way to a great algorithm, model and ultimately system. A poor one cannot be rescued by a good model or algorithm: garbage in, garbage out.

If the data is labeled, you will typically have a file (JSON, TXT, YAML or another format) with metadata about each image, or about some of the images.

How can you filter based on that information? How can you build the best dataset using labeled and raw images?

You can write custom code, generate a table and some visualizations, and then start maintaining all of it. If more than one person on your team works on this data, collaboration becomes increasingly hard.
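To make the custom-code route concrete, here is a minimal sketch of the first step: flattening a label file into one table row per detected object. The JSON schema, the field names and the sample values are all assumptions made for illustration; real annotation files vary widely.

```python
import json

# Hypothetical label file content; the schema is an assumption for this sketch.
RAW = """
[
  {"file_name": "img_001.jpg",
   "objects": [{"class": "bird", "bbox": [12, 30, 88, 120], "confidence": 0.91},
               {"class": "dog", "bbox": [5, 5, 60, 70], "confidence": 0.47}]},
  {"file_name": "img_002.jpg", "objects": []}
]
"""


def flatten(records):
    """Turn nested per-image records into one flat row per detected object."""
    rows = []
    for rec in records:
        for obj in rec.get("objects", []):
            x_min, y_min, x_max, y_max = obj["bbox"]
            rows.append({
                "file_name": rec["file_name"],
                "detected_class": obj["class"],
                "x_min": x_min, "y_min": y_min,
                "x_max": x_max, "y_max": y_max,
                "confidence": obj["confidence"],
            })
    return rows


rows = flatten(json.loads(RAW))
for row in rows:
    print(row)
```

Even this small script bakes in decisions (one row per object, images without labels dropped) that every teammate has to know about, which is where the maintenance cost starts.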

There is a simpler way!

Data Explorer

Data Explorer is a platform built to let us focus on the data: curate it, clean it, and make sure we start the development cycle with a great foundation.

It can be used for dataset curation in several ways: visualizing the data on a 2D plot, automatically clustered into distinct classes; exploring each cluster; and even running an image-based search.

However, the first thing you can do, even before visualization, is filter the data through a simple interface, where all the metadata is stored in a single table.

The image below shows the basic metadata visualization: a table on the right contains data per image, such as file name, detected class, object bounding box coordinates, confidence score, etc. The table can be extended with any additional per-image metadata, and security measures can be applied to hide internal data, revealing only the relevant entries of the table.

Metadata per image arranged in the table

This type of visualization allows you to filter on any of the columns with an SQL-like query. The image below shows that by clicking the “pencil” icon on the top right, you can filter the data by setting conditions on any of the columns, while on the left you define the frame range to process:

Click the “pencil” icon to filter the dataset based on metadata in any of the columns. On the left: define the number of frames to process
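To illustrate what an SQL-like filter over a metadata table combined with a frame range amounts to, here is a rough sketch using Python's built-in sqlite3 module. This is not Data Explorer's query syntax, which is not shown in this post; the table name, the column names and the sample rows are assumptions.

```python
import sqlite3

# In-memory table standing in for the per-image metadata table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE metadata (frame INTEGER, file_name TEXT, "
    "detected_class TEXT, confidence REAL)"
)
conn.executemany(
    "INSERT INTO metadata VALUES (?, ?, ?, ?)",
    [
        (0, "img_000.jpg", "bird", 0.91),
        (1, "img_001.jpg", "dog", 0.47),
        (2, "img_002.jpg", "bird", 0.35),
        (3, "img_003.jpg", "bird", 0.88),
    ],
)


def filter_frames(conn, first_frame, last_frame, cls, min_conf):
    """Return file names within a frame range that match the column conditions."""
    cur = conn.execute(
        "SELECT file_name FROM metadata "
        "WHERE frame BETWEEN ? AND ? "
        "AND detected_class = ? AND confidence >= ? "
        "ORDER BY frame",
        (first_frame, last_frame, cls, min_conf),
    )
    return [row[0] for row in cur.fetchall()]


# Frames 0 to 2 only, birds with confidence of at least 0.5.
selected = filter_frames(conn, 0, 2, "bird", 0.5)
print(selected)
```

The frame range and the column conditions are independent knobs here, mirroring the split between the left-hand frame selection and the column filter described above.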

For example, the Pascal dataset contains natural images with various objects marked on them. Choose only the images that contain a “bird” to start building a dataset for a bird classifier.
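For comparison, the same "bird" selection done by hand might look like the sketch below. It assumes the annotations follow the Pascal VOC XML layout (one annotation element per image, with a filename and one object element per labeled object); the two sample annotations and their file names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Two made-up annotations in a Pascal VOC-style layout (an assumption here).
ANNOTATIONS = [
    """<annotation>
         <filename>sample_0001.jpg</filename>
         <object><name>bird</name></object>
         <object><name>person</name></object>
       </annotation>""",
    """<annotation>
         <filename>sample_0002.jpg</filename>
         <object><name>car</name></object>
       </annotation>""",
]


def images_with_class(xml_docs, wanted):
    """Return file names of images that contain at least one `wanted` object."""
    selected = []
    for doc in xml_docs:
        root = ET.fromstring(doc)
        names = {obj.findtext("name") for obj in root.findall("object")}
        if wanted in names:
            selected.append(root.findtext("filename"))
    return selected


bird_images = images_with_class(ANNOTATIONS, "bird")
print(bird_images)
```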

After filtering on the metadata, use Data Explorer's visualization to view the structure of the chosen subset, and continue building the dataset required for the current task.

Summary

In this blog, we saw how to first filter based on metadata, and then continue by visualizing the dataset formed by the chosen images. From there, you can continue to curate the dataset and develop the algorithm, model and system.

In a future blog, we will see how to continue working with a labeled dataset and analyze model training results.

Alexander Berkovich

Alex, a principal AI/ML engineer at Akridata, has worked on vision-based systems for almost 20 years, holding positions such as an R&D manager, team lead, and algorithm developer in a variety of domains, ranging from smart cities, to medical quality inspections, manufacturing and more.
