Introduction
A video dataset can make or break a computer vision project. A clean dataset paves the way to a great algorithm, model and ultimately system, but no matter how good the model or algorithm is, the rule still holds: garbage in, garbage out.
Typically you will have several long videos, each containing multiple scenes, that might be used for different projects, while you need only one or more parts from each video. How will you find these segments? Curate them? Confirm they are all of the desired quality?
The Power of Akridata Data Explorer for Video Data
Data Explorer is a platform that was built to allow us to focus on the data, curate it, clean it and make sure we start the development cycles with a great foundation.
In previous blogs, we saw how a dataset of images could be visualized and explored, and how image-based search could be used to find elusive examples.
The same can be applied to videos, but in a way that suits video rather than still images.
Data Explorer lets you start by choosing the frame rate at which the videos will be processed: keep the original, or subsample the video for faster results.
For example, if the original videos were recorded at 30 fps (frames per second), and you choose to process them at 3 fps, you now have only 10% of the data.
Naturally, there is a trade-off between data volume and accuracy, but the choice is yours.
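To make the arithmetic concrete, here is a minimal sketch of subsampling by keeping evenly spaced frames. It is only an illustration of the idea, not Data Explorer's implementation; the function name `subsample_indices` and its parameters are hypothetical.

```python
def subsample_indices(total_frames: int, source_fps: float, target_fps: float) -> list[int]:
    """Return the indices of the frames to keep when going from source_fps to target_fps."""
    if target_fps <= 0 or target_fps > source_fps:
        raise ValueError("target_fps must be in the range (0, source_fps]")
    step = source_fps / target_fps  # e.g. 30 / 3 means keep every 10th frame
    kept = []
    i = 0
    # Multiply instead of accumulating to avoid floating-point drift
    while int(i * step) < total_frames:
        kept.append(int(i * step))
        i += 1
    return kept


# One minute of 30 fps video (1800 frames), processed at 3 fps
indices = subsample_indices(total_frames=1800, source_fps=30, target_fps=3)
print(len(indices), len(indices) / 1800)
```

For this one-minute clip, 180 of the 1,800 frames are kept, which is the 10% mentioned above.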
Next, as with images, every frame is represented by a feature vector, and the frames are clustered based on their similarity. To take the time element into account, the 2D space can be arranged against the time axis. In the example below, more than 75 minutes of dash cam footage are displayed, split automatically into 7 very distinct scenes, with examples from each scene on the right:
75+ min of dash cam footage, split into 7 scenes (left) with examples from each scene (right)
This allows you to understand the structure of the video and start the exploration step, as you would for images.
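One way to picture how per-frame clusters become scenes on a time axis is to merge runs of consecutive frames that share a cluster label. The sketch below assumes each processed frame already has a cluster label; the function `labels_to_scenes` and the sample labels are hypothetical and do not reflect how Data Explorer does this internally.

```python
from itertools import groupby


def labels_to_scenes(labels: list[int], fps: float) -> list[tuple[int, float, float]]:
    """Merge runs of consecutive frames with the same cluster label into
    (label, start_seconds, end_seconds) segments."""
    scenes = []
    frame = 0
    for label, run in groupby(labels):
        length = sum(1 for _ in run)  # number of frames in this run
        scenes.append((label, frame / fps, (frame + length) / fps))
        frame += length
    return scenes


# Hypothetical per-frame cluster labels for a short clip processed at 3 fps
labels = [0, 0, 0, 0, 0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 2, 0, 0, 0]
scenes = labels_to_scenes(labels, fps=3)
for label, start, end in scenes:
    print(f"cluster {label}: {start:.1f}s to {end:.1f}s")
```

Note that the same cluster can appear in more than one time segment, for example when the camera returns to a similar environment later in the recording.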
The video can be split further into more scenes in less than 30 seconds. Click the “filter” icon, choose the number of clusters and hit “submit”:
Left to Right: “Filter” icon; Choose new number of clusters and press “submit”; Video was split into 20 clusters with an example from each displayed
It is now fast and convenient to find the relevant scene, manipulate it, understand the similarity in each scene and prepare the best foundation for the next step of the computer vision algorithm, model and system.
Frame Rate Adjustment for Efficient Processing
Akridata Data Explorer enables efficient processing of large video datasets through frame rate adjustment. By allowing users to modify the frame rate, it helps manage data volume while preserving critical information. For instance, if your videos were originally recorded at 30 frames per second (fps), you can choose to process them at a lower frame rate, such as 3 fps. This reduces the number of frames to process by 90%, making the data easier and faster to handle. Because consecutive frames are often nearly identical, the details needed for many computer vision tasks are usually retained, although very short events can be missed at low frame rates.
Adjusting the frame rate not only speeds up the processing time but also optimizes resource usage, allowing for quicker iterations and analysis. This is particularly beneficial when dealing with extensive datasets, as it ensures a balance between data volume and processing efficiency. With Akridata Data Explorer, you can tailor the frame rate to suit your specific requirements, thereby enhancing the overall data curation process.
Scene Splitting for Detailed Analysis
One of the standout features of Akridata Data Explorer is its ability to split videos into distinct scenes, facilitating detailed analysis. Scene splitting is crucial for identifying and isolating different segments within a video, which can then be curated based on specific project needs. By leveraging advanced clustering algorithms, Akridata Data Explorer automatically divides the video into scenes that share similar characteristics.
This feature is particularly useful for tasks such as autonomous driving, where different driving conditions and environments need to be analyzed separately. For example, a long dash cam video can be segmented into scenes based on day, night, urban, and rural environments. This granular level of analysis allows for more targeted data curation and better preparation for training computer vision models.
With Akridata Data Explorer, the scene splitting process is quick and intuitive. You can adjust the number of clusters as needed, ensuring that each scene is accurately represented and ready for further exploration and analysis.
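The article does not say which clustering algorithm Data Explorer uses, so the sketch below uses plain k-means as a stand-in to show what changing the number of clusters does to a set of frame feature vectors. The `kmeans` function, its evenly spaced initialization and the toy 2D features are all assumptions made for illustration.

```python
import math


def kmeans(vectors: list[list[float]], k: int, iterations: int = 20) -> list[int]:
    """Plain k-means; returns one cluster label per input vector.

    Centers start at k evenly spaced vectors (evenly spaced in time for
    video frames), which keeps this sketch deterministic.
    """
    centers = [list(vectors[i * len(vectors) // k]) for i in range(k)]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assign every vector to its nearest center
        labels = [
            min(range(k), key=lambda c: math.dist(v, centers[c]))
            for v in vectors
        ]
        # Move each center to the mean of the vectors assigned to it
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:
                centers[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels


# Toy 2D "feature vectors" for 12 frames, in time order
features = [
    [0, 0], [0, 1], [1, 0], [1, 1],
    [5, 5], [5, 6], [6, 5], [6, 6],
    [10, 0], [10, 1], [11, 0], [11, 1],
]
print(kmeans(features, k=3))  # three groups of four frames
print(kmeans(features, k=2))  # the last two groups merge into one cluster
```

Asking for fewer clusters merges neighboring groups, while asking for more splits them, which is the effect of changing the cluster count in the “filter” dialog.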
Exploring and Curating Video Data with Akridata
Akridata Data Explorer provides a comprehensive platform for exploring and curating video data, ensuring that you start your development cycles with high-quality datasets. Once videos are processed and split into scenes, you can delve deeper into each segment to understand its structure and content. This exploration phase is critical for identifying relevant scenes, outliers, and patterns that might be useful for your computer vision tasks.
The platform’s user-friendly interface allows for easy navigation and manipulation of video data. You can filter scenes, tag important segments, and organize the data according to your project needs. This detailed curation process ensures that only the most relevant and high-quality data is used for training and testing your models, improving the overall performance and reliability of your computer vision systems.
Furthermore, Akridata Data Explorer supports advanced features such as visual similarity search and metadata tagging, which enhance the curation process. By providing a robust set of tools for video data exploration and curation, Akridata Data Explorer helps streamline your workflow and ensures that your datasets are well-prepared for the next stages of development.
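As a rough idea of what a visual similarity search involves, the sketch below ranks frames by the cosine similarity of their feature vectors to a query vector. Cosine similarity is an assumption here, not a documented detail of Data Explorer, and the function names, frame IDs and feature values are made up for the example.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def most_similar(query: list[float], frame_features: dict[str, list[float]], top_k: int = 3):
    """Return (frame_id, score) pairs for the top_k frames most similar to the query."""
    scored = [(fid, cosine_similarity(query, vec)) for fid, vec in frame_features.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


# Hypothetical frame IDs and feature vectors
frame_features = {
    "frame_0010": [1.0, 0.0, 0.0],
    "frame_0250": [0.9, 0.1, 0.0],
    "frame_0900": [0.0, 1.0, 0.0],
}
results = most_similar([1.0, 0.0, 0.0], frame_features, top_k=2)
for frame_id, score in results:
    print(frame_id, round(score, 3))
```

In practice the feature vectors would come from a vision model and have hundreds of dimensions, but the ranking step works the same way.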
Summary
In this blog, we saw how Data Explorer handles videos: it adjusts the frame rate for faster processing, splits the videos into scenes and allows for further curation and exploration. In the next article, we will see why a combination of the search and sample capabilities is so powerful on videos.