As we prepare for 2024, the world of AI, and particularly computer vision (CV), is poised to grow by leaps and bounds.
Computer vision, a field of AI that dates back to the 1960s and is used to capture, analyze, and make predictions from visual data, appears in a wide variety of fields today. From medical imaging to security and surveillance, to process checks and quality inspection in manufacturing, to autonomous vehicles, computer vision now touches nearly every industry.
However, with the recent breakthroughs in AI models, we are now also headed for a fundamental shift in our approach to computer vision, driven by the adoption of deep learning architectures.
Deep learning uses neural networks with many layers, loosely inspired by the human brain. With the growing sophistication of modern deep learning networks, we can now address a broader set of use cases that classical computer vision could never solve.
Now, let’s dive into my predictions for AI and computer vision in 2024 and beyond.
Prediction # 1: Foundational Models Pave the Way for Software 3.0
To start, we can expect the rate of innovation in foundational models to pave the way for Software 3.0.
If Software 1.0 and Software 2.0 focused on “how to do it,” Software 3.0 will be all about “what to do.” Over the past few years, foundational models have made major progress in Natural Language Processing (NLP) and in “understanding” human language. As this progress continues, the ability of AI to solve problems and make useful, accurate decisions and predictions will also grow.
Additionally, we can anticipate that application development will become flatter. The creation of the mobile app marketplace led to an explosion of useful apps over the years. Similarly, Software 3.0 and foundational models will continue to lower the barriers to entry for people who want to productize their ideas, enabling more talented individuals to create useful business and consumer applications.
Prediction # 2: More Investment in Data Quality
We’ve all witnessed the shift towards a data-centric approach in the past year. But in 2024, AI will take an even bigger leap in that direction.
What exactly will this look like? Organizations will invest more resources in tools that aid in the selection and curation of data, so they can train models more effectively while boosting data quality and accuracy. This will be especially valuable for Computer Vision, as CV is so frequently used to pinpoint anomalies or rarities in data.
Think: visual data showing the cracks in a widget produced on a manufacturing line, or images of a forgotten school bag blocking the path of a delivery robot. While finding such unique instances is always challenging, as we continue to capture more high-quality visual data, our models will be able to perform more reliably in real-world applications.
Additionally, there’s always the option of synthesizing training data. For visual datasets, this will require a sophisticated class of tools to augment datasets.
Moving ahead, expect data augmentation tools to become even more critical in making production-grade applications based on clean, high-quality data a reality.
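To make the idea of augmenting a visual dataset a little more concrete, here is a minimal sketch in plain Python. It is an illustration only, not a production tool: it assumes small grayscale images stored as nested lists of 0-255 values, and the function names (horizontal_flip, adjust_brightness, augment) and the brightness range of plus or minus 30 are hypothetical choices made for this example.

```python
import random

# Assumption for this sketch: a grayscale image is a list of rows,
# and each row is a list of pixel values in the range 0..255.
Image = list[list[int]]


def horizontal_flip(image: Image) -> Image:
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def adjust_brightness(image: Image, delta: int) -> Image:
    """Shift every pixel by delta, clamping to the valid 0..255 range."""
    return [[max(0, min(255, p + delta)) for p in row] for row in image]


def augment(image: Image, rng: random.Random, n_variants: int = 4) -> list[Image]:
    """Generate n_variants randomly perturbed copies of one image."""
    variants = []
    for _ in range(n_variants):
        out = image
        # Flip roughly half of the variants.
        if rng.random() < 0.5:
            out = horizontal_flip(out)
        # Apply a small random brightness shift (illustrative range).
        out = adjust_brightness(out, rng.randint(-30, 30))
        variants.append(out)
    return variants


if __name__ == "__main__":
    sample = [[0, 50, 100], [150, 200, 250]]
    for variant in augment(sample, random.Random(0)):
        print(variant)
```

Real augmentation pipelines go much further (crops, rotations, noise, fully synthetic scenes), but the principle is the same: each rare example, such as a cracked widget, is turned into several plausible variations for the model to learn from.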
Prediction # 3: Foundation to Edge
A third major prediction for Computer Vision in 2024 is that CV models will emphasize size and efficiency over sheer parameter count.
Foundational LLMs have billions of parameters and keep growing larger. So far, this has not been a problem, as most LLMs run on high-performance infrastructure and take text as input, so latency constraints have little impact on inference performance.
CV models, on the other hand, often must run on constrained edge infrastructure, because their applications typically require inference at the edge.
This becomes an issue, especially with video data, which typically comes in very large volumes. At that scale, it’s not practical to host a large CV model on a central computer, or to feed video and image data in real time from the edge to a central location (data center or cloud). And yet, the amount of visual data we create and capture expands every day.
The future fix for this conundrum? Placing a higher emphasis on CV model performance and efficiency, rather than simply making models bigger.
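A quick back-of-the-envelope calculation shows why size matters at the edge. The sketch below is illustrative only: the 1080p, 30 fps camera, the 7-billion-parameter and 25-million-parameter models, and the helper names are assumptions picked for this example, and the video figure is for raw, uncompressed frames, so it should be read as an upper bound rather than a real streaming bitrate.

```python
def raw_video_bitrate_mbps(width: int, height: int, fps: int,
                           bytes_per_pixel: int = 3) -> float:
    """Bitrate of uncompressed video in megabits per second."""
    bytes_per_second = width * height * bytes_per_pixel * fps
    return bytes_per_second * 8 / 1e6


def model_weights_mb(num_params: int, bytes_per_param: int) -> float:
    """Memory needed just to store the model weights, in megabytes."""
    return num_params * bytes_per_param / 1e6


if __name__ == "__main__":
    # One hypothetical 1080p RGB camera at 30 frames per second.
    print(f"raw 1080p30 stream: {raw_video_bitrate_mbps(1920, 1080, 30):.0f} Mbit/s")
    # Hypothetical 7B-parameter model stored as 16-bit floats.
    print(f"7B params, fp16:    {model_weights_mb(7_000_000_000, 2):.0f} MB")
    # Hypothetical 25M-parameter CV model quantized to 8-bit integers.
    print(f"25M params, int8:   {model_weights_mb(25_000_000, 1):.0f} MB")
```

Under these assumptions, a single raw camera feed runs at roughly 1.5 Gbit/s, and the larger model needs about 14 GB for its weights alone, while the small quantized model fits in about 25 MB. Compression narrows the bandwidth gap considerably, but the direction of the argument holds: smaller, more efficient models are far easier to place next to the camera.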
Prediction # 4: Shift from Human in the Loop Annotation
My final prediction for CV in 2024 and beyond is for organizations to embrace modern visual data preparation tools to assist with automated data selection and curation.
While we all know data preparation is the key to a successful Computer Vision algorithm, it’s also a notoriously tricky, time-consuming, and expensive process. Data must be collected, cleaned, labeled, and curated, and organizations are already scrambling to hire (and keep happy) their data scientists.
So far, there has been a lack of tools that can scale to improve the repetitive, error-prone tasks of visual data collection, cleaning, and labeling. Many teams have skirted around this and gotten by with a mix of open utilities, scripts, and other cobbled-together tools. While this DIY approach (mostly) works for a one- or two-person Computer Vision team, it completely falls apart as teams and datasets grow.
Thankfully, there is another path.
Modern, AI-powered visual data tools that help you quickly and accurately collect, curate, and clean your data will be the next stage of the AI revolution, and will significantly reduce the time and hefty costs typically associated with data preparation. Even better? Advancements in visual data search, clustering, and model audit tools will also improve model training over time, raising the accuracy and quality of predictions.
As we look ahead to a successful 2024, it’s clear that fully embracing LLMs and Computer Vision will be critical for companies that want to stay competitive and offer the most useful, highest-quality services and products, whether in medical imaging or in quality control for a manufacturing business.
As AI continues to march forward, we also must remember that the responsible and ethical use of this incredible technology will be paramount to harnessing its full benefits and positively impacting business operations.
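As one small illustration of what automated curation can mean in practice, here is a sketch of near-duplicate detection using a simplified average hash. It is not how any particular product works. It assumes the images have already been downscaled to tiny grayscale grids of identical size, stored as nested lists, and the names average_hash, hamming, and find_near_duplicates, as well as the max_distance threshold, are hypothetical.

```python
# Assumption for this sketch: each image is a small grayscale grid
# (list of rows of 0..255 values), and all grids share the same size.
Image = list[list[int]]


def average_hash(image: Image) -> tuple[int, ...]:
    """One bit per pixel: 1 if the pixel is brighter than the image mean."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    return tuple(1 if p > mean else 0 for p in pixels)


def hamming(a: tuple[int, ...], b: tuple[int, ...]) -> int:
    """Number of positions where two equal-length hashes differ."""
    return sum(x != y for x, y in zip(a, b))


def find_near_duplicates(images: dict[str, Image],
                         max_distance: int = 1) -> list[tuple[str, str]]:
    """Return name pairs whose hashes differ in at most max_distance bits."""
    hashes = {name: average_hash(img) for name, img in images.items()}
    names = sorted(hashes)
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if hamming(hashes[first], hashes[second]) <= max_distance:
                pairs.append((first, second))
    return pairs


if __name__ == "__main__":
    dataset = {
        "frame_a": [[10, 200], [10, 200]],
        "frame_b": [[12, 198], [11, 201]],  # almost identical to frame_a
        "frame_c": [[200, 10], [200, 10]],  # mirrored pattern
    }
    print(find_near_duplicates(dataset))
```

The pairwise comparison here is quadratic in the number of images, which is fine for a demo and exactly the kind of thing that breaks down as datasets grow, which is where purpose-built search and clustering tools come in.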
Get ready for 2024 to be the biggest year yet for AI and Computer Vision advancements.