How to automatically split a video into scenes?
A video dataset can make or break a computer vision project. A clean dataset can lead the way to a great algorithm, model, and ultimately system, but no matter how good the model or algorithm is, the rule still holds: junk in, junk out.
Typically you will have several long videos, each containing multiple scenes that might be used for different projects, and you need only one or more parts of each video. How will you find these segments? Curate them? Confirm they are all of the desired quality?
Power of Data Explorer
Data Explorer is a platform built to let us focus on the data, curate it, clean it, and make sure we start the development cycle with a great foundation.
In previous blogs, we saw how a dataset of images can be visualized and explored, and how image-based search can be used to find elusive examples.
The same can be applied to videos, in a way suited to video rather than still images.
Data Explorer lets you start by choosing the frame rate at which to process the video: keep the original, or subsample the video for faster results.
For example, if the original videos were recorded at 30 fps (frames per second), and you choose to process them at 3 fps, you now have only 10% of the data.
Naturally, there is a trade-off between data volume and accuracy, but the choice is yours.
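The arithmetic behind the 30 fps to 3 fps example can be sketched in a few lines of Python. This is only an illustration of uniform subsampling, not Data Explorer's actual implementation; the function name `subsample_indices` and the assumption that both frame rates are whole numbers are ours.

```python
def subsample_indices(num_frames, source_fps, target_fps):
    """Return the indices of the frames kept when a video recorded at
    source_fps is processed at target_fps (hypothetical helper; assumes
    integer frame rates and evenly spaced sampling)."""
    if not 0 < target_fps <= source_fps:
        raise ValueError("target_fps must be in the range (0, source_fps]")
    # Number of frames that remain after subsampling.
    kept = num_frames * target_fps // source_fps
    # Map each kept frame back to its position in the original video.
    return [i * source_fps // target_fps for i in range(kept)]


# 30 seconds of video at 30 fps is 900 frames; at 3 fps only 90 remain.
indices = subsample_indices(num_frames=900, source_fps=30, target_fps=3)
print(len(indices), indices[:3])
```

Every tenth frame is kept, so the volume to process drops to 10% of the original, at the cost of possibly missing very short events that fall between sampled frames.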
Next, as with images, every frame is represented by a feature vector, and the frames are clustered based on their similarity. Taking the time element into account, the 2D space can be arranged against the time axis. In the example below, more than 75 minutes of dash cam footage are displayed, split automatically into 7 very distinct scenes, with examples from each scene on the right:
75+ min of dash cam footage, split into 7 scenes (left) with examples from each scene (right)
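To make the link between per-frame clusters and scenes on a time axis concrete, here is a minimal Python sketch. It assumes that each processed frame has already been assigned a cluster label (for example, by clustering its feature vector) and simply merges runs of consecutive frames with the same label into timed segments. The function name `labels_to_scenes` is hypothetical, and this is not a description of how Data Explorer works internally.

```python
from itertools import groupby


def labels_to_scenes(labels, fps):
    """Merge runs of consecutive frames that share a cluster label into
    (start_seconds, end_seconds, label) segments. Hypothetical helper:
    labels[i] is the cluster of the i-th processed frame, and fps is the
    frame rate at which the video was processed."""
    scenes = []
    frame = 0
    for label, run in groupby(labels):
        length = sum(1 for _ in run)
        scenes.append((frame / fps, (frame + length) / fps, label))
        frame += length
    return scenes


# Twelve frames processed at 3 fps: cluster 0, then cluster 1, then 0 again.
labels = [0] * 6 + [1] * 3 + [0] * 3
scenes = labels_to_scenes(labels, fps=3)
for start, end, label in scenes:
    print(f"{start:.1f}s to {end:.1f}s: cluster {label}")
```

Note that the same cluster can appear in more than one time segment, for instance when a dash cam returns to a similar type of road later in the drive.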
This allows you to understand the structure of the video and start the exploration step, as you would for images.
We could split the video into even more scenes in less than 30 seconds: click the “filter” icon, choose the number of clusters, and hit “submit”:
Left to right: “Filter” icon; choose the new number of clusters and press “submit”; the video was split into 20 clusters, with an example from each displayed
It is now fast and convenient to find the relevant scene, manipulate it, understand the similarity in each scene and prepare the best foundation for the next step of the computer vision algorithm, model and system.
In this post, we saw how Data Explorer handles videos: it adjusts the frame rate for faster processing, splits the videos into scenes, and allows for further curation and exploration. In the next article, we will see why a combination of the search and sample capabilities is so powerful on videos.