Feb 8, 2022

Our Investment in ApertureData

ApertureData created a database for visual data, specifically designed for machine learning

Alex Iskold

Just a decade ago, the internet was filled with text. Today, text is receding in favor of its richer, more visual, more fun, and more information-dense rivals: photo and video.

The ubiquity of smartphones, combined with increased bandwidth and commoditized storage, is leading to an explosion of photo and video content in nearly every facet of our lives. Each one of us is generating gigabytes of data on a daily basis, and the pace of visual data creation is only accelerating.

Data Science teams across different sectors are feeling this shift. Increasingly, machine learning algorithms are focused on processing photos and videos in order to make sense of the world — across every vertical, and especially in ecommerce.

All of this raises the question: how do you store and manage large sets of visual data for machine learning pipelines?

Most teams have built home-grown solutions that effectively rely on Amazon S3 or other data stores. These teams spend days, and frequently months, of valuable engineering time building file indexes, storing labels and vectors, and organizing terabytes of data.

Enter ApertureDB: a database for scalable storage and management of visual data for machine learning.

The company is the brainchild of Vishakha Gupta and Luis Remis, two systems researchers who worked for years alongside data scientists and machine learning engineers at Intel Labs. While at Intel, the duo invented a scalable database that enabled storage and fast querying of large photo and video assets, with a specific focus on machine learning.

ApertureDB helps data scientists focus on what they do best: building and deploying complex machine learning algorithms.

Some of ApertureDB’s features include:

  • Highly optimized C++ implementation with in-memory metadata handling
  • Optimized, on-the-fly transformation operations for photos and videos
  • Native support for indexing and search of feature vectors (embeddings) describing visual data
  • Easy integration with ML frameworks (PyTorch, TensorFlow, etc.)
  • Multimodal, easy-to-update application metadata
  • Automatic linking of application metadata with visual data
  • Access to relevant data through combined feature-based and complex metadata-based searches (see the sketch below)
  • Management of multiple annotations on images or video frames
  • Native support for images, bounding boxes, pre-processing, and augmentation operations
  • Native support for multiple video encodings and containers, with a rich set of operations and efficient frame-level access
  • Built-in similarity matching for high-dimensional feature vectors, with native support for k-nearest-neighbor computations
  • Metadata represented as a knowledge graph for easier and more insightful analytics
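
To make the "combined feature-based and metadata-based search" idea concrete, here is a minimal, purely illustrative sketch of what teams end up hand-rolling today: a brute-force k-nearest-neighbor search over embeddings, filtered by application metadata. The data, names, and NumPy implementation below are assumptions for illustration only, not ApertureDB's API; ApertureDB supports this kind of query natively, against indexed vectors and a metadata knowledge graph, instead of ad-hoc code like this.

```python
# Illustrative only: a brute-force version of a combined
# metadata + k-nearest-neighbor query. All names are hypothetical;
# this is NOT the ApertureDB API.
import numpy as np

# Toy "home-grown" store: one embedding per image, plus per-image metadata.
embeddings = np.random.rand(10_000, 512).astype(np.float32)
metadata = [{"image_id": i, "label": "shoe" if i % 2 else "bag"}
            for i in range(10_000)]

def knn_with_metadata(query_vec, label, k=5):
    """Return the k images closest to query_vec whose metadata matches `label`."""
    # 1. Metadata filter (a real database would use an index for this).
    idx = np.array([i for i, m in enumerate(metadata) if m["label"] == label])
    # 2. Similarity search over the filtered subset (Euclidean distance).
    dists = np.linalg.norm(embeddings[idx] - query_vec, axis=1)
    top = idx[np.argsort(dists)[:k]]
    return [metadata[i] for i in top]

query = np.random.rand(512).astype(np.float32)
print(knn_with_metadata(query, label="shoe", k=5))
```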

The product is early, but candidly, it’s awesome — just ask some of ApertureData’s enterprise customers.

At 2048 Ventures, we have a strong thesis on “video eating the world”: an explosion of visual data in our lives and across industries. That's why we enthusiastically co-led ApertureData’s pre-seed round, alongside our friends at Root.vc, Graph Ventures, and other strategic VCs and angels.

Watch this video to learn more about ApertureDB, read the company's seed round announcement, read more coverage on TechCrunch, and, of course, if you think ApertureDB could help your organization, reach out to vishakha.gupta@aperturedata.io to schedule a demo.
