Computer vision applications and digital twins inherently involve data at very large scale. Storing and managing hundreds to thousands of petabytes requires a new breed of tiering technologies.
What exactly is the issue?
Modern manufacturing environments are equipped with multiple camera sensors, which capture information about both coarse-grained and fine-grained activities happening on the factory shop floor.
The data collected by these sensors can be employed for several use cases — building digital twins for predictive maintenance, ensuring the safety of factory personnel, quality inspection, or cross-lifecycle tracking of manufactured components for warranty and customer service purposes.
All of these use cases require scalable, cost-effective solutions for ingesting, transferring, and storing large volumes of image data across multiple storage locations, e.g., at the factory, at a regional data center, or in a global location such as a public cloud.
Preferred solutions would choose cost-effective archival storage tiers to host the bulk of the data without compromising the ability to access such data with reasonable latencies whenever required.
As an example, older data capturing detailed views of manufactured components is expected to see reduced usage in ongoing analytics and model-training activities, but still needs to be retained and retrieved on demand for customer service and warranty purposes.
The Akridata Solution
Akridata’s Edge Data Platform provides a comprehensive ‘smart ingest + smart tiering’ solution to the manufacturing data management problem as illustrated in the figure below.
Smart ingest components fuse information from both the camera sensors and various components of the Manufacturing Execution System (MES) and support agile data pipelines for transforming, transferring, and storing the resulting data across multiple locations.
These pipelines support:
A practically unlimited number of objects or files (tens of billions).
Processing/compute as part of the ingest/migration process, e.g., extracting metadata or creating derived objects via image signal processing, image format conversion, etc.
Flexible prioritization of data transfers, e.g., based on sub-folders, file names, or other metadata.
Policy-based migration from one source to multiple destinations, which may be any combination of cloud tiers/regions or on-prem data centers, mixing and matching POSIX FS, NFS, HDFS, and S3-compatible sources and sinks.
Optimized performance across a diversity of networking links from edge sites and on-prem data centers to cloud regions.
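To make the fan-out concrete, here is a minimal sketch of a prioritized, one-source-to-many-destinations transfer plan. Every name in it (the `MigrationPolicy` class, the destination URIs, the `line-cam/` prefix rule) is a hypothetical illustration, not Akridata's actual API.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical migration policy: destinations and priority rules are
# illustrative stand-ins, not part of any real product API.
@dataclass
class MigrationPolicy:
    destinations: list[str]           # every file is replicated to each sink
    priority: Callable[[str], int]    # lower value = transferred first

def plan_transfers(files: list[str], policy: MigrationPolicy) -> list[tuple[str, str]]:
    """Order pending files by the policy's priority rule, then fan each
    file out to every configured destination (one source, many sinks)."""
    ordered = sorted(files, key=policy.priority)
    return [(f, dest) for f in ordered for dest in policy.destinations]

# Example: prioritize line-camera sub-folders ahead of everything else.
policy = MigrationPolicy(
    destinations=["s3://regional-dc/images", "s3://cloud-archive/images"],
    priority=lambda path: 0 if path.startswith("line-cam/") else 1,
)
plan = plan_transfers(["qa/overview_001.jpg", "line-cam/part_042.jpg"], policy)
```

A real pipeline would stream the transfers and run per-file compute (metadata extraction, format conversion) along the way; the point here is only the policy-driven ordering and multi-destination fan-out.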
Smart tiering components support flexible policies for cross-location and cross-tier data movement while maintaining a global namespace for objects, independent of their current location.
These components also allow the provisioning of a transparent cache (say, in front of the Archive tier) to improve access times for repeatedly accessed objects.
The tiering policies span the full spectrum:
Standard Policies — Time-based or access-temperature-based policies, e.g., archive objects seeing fewer than 10 accesses in a six-month period.
Catalog-Driven Policies — Catalog-driven archiving policies, e.g., treat the data from all camera sensors located in the same area of the factory uniformly, independent of the access temperature of specific images from those cameras.
Use-Case-Specific Policies — e.g., storing a lower-resolution replica of archived images (10x/100x smaller, as desired by the end user) in Tier 1 storage while the full-resolution data is placed in an Archive tier.
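A policy evaluator combining the standard and catalog-driven styles above might look like the following sketch. The field names, thresholds, and the `paint-shop` area are hypothetical examples, not Akridata's actual schema.

```python
# Catalog-driven rule: all cameras in a flagged factory area archive
# together, regardless of per-image access temperature.
ARCHIVE_AREAS = {"paint-shop"}   # illustrative catalog attribute

def target_tier(obj: dict) -> str:
    """Decide the storage tier for an object's full-resolution copy.
    A use-case-specific policy would additionally keep a 10x/100x
    smaller thumbnail replica in Tier 1 for archived objects."""
    if obj["camera_area"] in ARCHIVE_AREAS:
        return "archive"
    # Standard access-temperature rule: fewer than 10 accesses in 6 months.
    if obj["accesses_6mo"] < 10:
        return "archive"
    return "tier1"

tier = target_tier({"camera_area": "assembly", "accesses_6mo": 3})
```

Note that the catalog-driven rule is checked first, so a hot image from a flagged area is still archived uniformly with its neighbors.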
The smart ingest process allows a metadata catalog with a user-defined schema to be created in a relational (e.g., SQL Server) or NoSQL database.
Users can query on attributes in this catalog and instantly access and evaluate the low-res ("thumbnail") replicas. Based on this evaluation, they can decide whether they need the full-resolution archived images and retrieve only the relevant subset.
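The query-then-retrieve workflow can be sketched with an in-memory SQLite catalog. The schema, object keys, and part IDs below are hypothetical examples used only to show the flow.

```python
import sqlite3

# Hypothetical metadata catalog with a user-defined schema; the columns
# and object keys are illustrative, not a real deployment's schema.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE catalog (
    object_key TEXT, camera_id TEXT, part_id TEXT,
    captured_on TEXT, thumb_key TEXT)""")
db.executemany("INSERT INTO catalog VALUES (?,?,?,?,?)", [
    ("full/p100_a.jpg", "cam-3", "P100", "2021-04-01", "thumb/p100_a.jpg"),
    ("full/p100_b.jpg", "cam-7", "P100", "2021-04-02", "thumb/p100_b.jpg"),
    ("full/p200_a.jpg", "cam-3", "P200", "2021-05-10", "thumb/p200_a.jpg"),
])

# 1. Query the catalog by attribute, e.g., all images of a part under a
#    warranty claim.
rows = db.execute(
    "SELECT object_key, thumb_key FROM catalog "
    "WHERE part_id = ? ORDER BY object_key", ("P100",)
).fetchall()

# 2. Inspect the low-res thumbnail replicas, instantly available in Tier 1.
thumbs = [thumb for _, thumb in rows]

# 3. After evaluating the thumbnails, restore only the relevant
#    full-resolution subset from the archive tier.
to_restore = [key for key, _ in rows if key.endswith("_b.jpg")]
```

Because only the shortlisted full-resolution objects are pulled from the archive, retrieval cost and latency stay proportional to the final subset, not to the whole dataset.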