Reconstructing search
Running real-time search at billion-document scale on traditional search database technology forces companies to compromise on accuracy, speed, or cost. The Hyperspace cloud Search Processing Unit (SPU) solves this problem with domain-specific computing.
Solving the computation limits of search and data retrieval
Domain-specific computing bypasses standard software semantics, the cache hierarchy, and other CPU abstractions, implementing the core operations in a custom datapath processor. Together with a new software stack, this runs search and information-retrieval workloads hundreds of times faster.
Architecture
The Hyperspace cloud SPU includes tens of dedicated search cores running proprietary instruction sets to filter, rank, and aggregate search results with high efficiency. These custom search instructions, combined with advanced data prefetching, enable speeds that general-purpose CPUs cannot match.
Hyper candidate generation
Custom filter processors enable fast search-space reduction based on inverted-index term lookups. They perform merge (OR), intersect (AND), and subtract (NOT) operations on lists of document IDs, streaming the results to the ranking processors with extensive parallelism.
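To make the filter operations concrete, here is a minimal sketch in plain Python of the three boolean posting-list operations described above, applied to sorted document-ID lists. The SPU performs these in hardware; the function names and toy data are ours.

```python
# Illustrative sketch of inverted-index boolean operations on
# sorted posting lists (lists of document IDs). Hardware does this
# in parallel; this shows only the logic.

def merge(a, b):
    """OR: union of two sorted posting lists."""
    return sorted(set(a) | set(b))

def intersect(a, b):
    """AND: documents present in both sorted posting lists."""
    result, j = [], 0
    for doc in a:
        while j < len(b) and b[j] < doc:
            j += 1
        if j < len(b) and b[j] == doc:
            result.append(doc)
    return result

def subtract(a, b):
    """NOT: documents in a that are absent from b."""
    excluded = set(b)
    return [doc for doc in a if doc not in excluded]

# Candidates matching (term1 OR term2) AND NOT term3:
term1, term2, term3 = [1, 4, 7, 9], [2, 4, 8], [4, 9]
candidates = subtract(merge(term1, term2), term3)
print(candidates)  # [1, 2, 7, 8]
```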
Superior ranking efficiency
Hyperspace ranking processors use a proprietary instruction set in which each instruction is a search-specific CISC operation equivalent to hundreds of CPU instructions. The ranking processors rapidly score documents against the query based on user-defined business logic (the score function) and TF/IDF.
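As a rough illustration of the scoring step, the sketch below implements the classic textbook TF-IDF formula in Python. Hyperspace's actual score function is user-defined and executed in hardware; the formula variant, function name, and toy corpus here are assumptions for illustration only.

```python
# Minimal TF-IDF scoring sketch: score(d, q) = sum over query terms of
# tf(term, d) * log(N / df(term)). Not Hyperspace's actual score function.
import math

def tf_idf_score(query_terms, doc_term_freqs, doc_freqs, num_docs):
    """Sum of tf * idf over the query terms present in the document."""
    score = 0.0
    for term in query_terms:
        tf = doc_term_freqs.get(term, 0)
        if tf == 0:
            continue  # term absent from this document
        idf = math.log(num_docs / doc_freqs[term])
        score += tf * idf
    return score

# Toy corpus of 100 documents: "apple" appears in 10 of them, "rare" in 2.
score = tf_idf_score(
    query_terms=["apple", "rare"],
    doc_term_freqs={"apple": 3, "rare": 1},  # counts in the scored doc
    doc_freqs={"apple": 10, "rare": 2},      # corpus document frequencies
    num_docs=100,
)
print(round(score, 4))  # 3*ln(10) + 1*ln(50) ≈ 10.8198
```

Rare terms contribute more per occurrence (higher IDF), which is why the single "rare" hit outweighs its low term frequency.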
Fast memory access
Hyperspace's proprietary SPU and advanced data structures allow all required data to be prefetched into the SPU before it is used, avoiding cache misses. This architecture overcomes the lack of control in software-based workloads, which cannot tightly manage the caches and rely on unoptimized LRU eviction.
Highly predictable performance
Unlike traditional software-based solutions, Hyperspace's domain-specific computing technology avoids the common issues caused by asynchronous mechanisms such as OS thread schedulers and garbage collectors. This ensures deterministic, predictable performance, which is crucial for real-time, high-throughput applications.
Low memory footprint
Lucene-based data structures carry a large memory footprint due to their high-level abstractions. Hyperspace rebuilt the index and data structures from scratch, avoiding those abstractions for better memory efficiency and retrieval speed.
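One standard way search engines shrink a posting list's footprint is delta encoding plus variable-length byte (varint) encoding of the gaps. The sketch below shows this common technique as a point of reference; it is not necessarily the encoding Hyperspace uses internally.

```python
# Illustrative only: delta + varint compression of a sorted posting list.
# Gaps between consecutive doc IDs are small, so most fit in one byte,
# versus 4-8 bytes per ID in a naive array.

def encode_postings(doc_ids):
    """Delta-encode sorted doc IDs, then varint-encode each gap."""
    out, prev = bytearray(), 0
    for doc in doc_ids:
        gap, prev = doc - prev, doc
        while gap >= 0x80:
            out.append((gap & 0x7F) | 0x80)  # continuation bit set
            gap >>= 7
        out.append(gap)                      # final byte, high bit clear
    return bytes(out)

def decode_postings(data):
    """Inverse of encode_postings."""
    doc_ids, prev, gap, shift = [], 0, 0, 0
    for byte in data:
        gap |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7                       # more bytes follow
        else:
            prev += gap
            doc_ids.append(prev)
            gap, shift = 0, 0
    return doc_ids

ids = [1000, 1003, 1100, 5000]
packed = encode_postings(ids)
print(len(packed), decode_postings(packed) == ids)  # 6 True
```

Four IDs pack into 6 bytes instead of 16 as uint32 values, and the savings grow with denser posting lists.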
Layered storage architecture
Hyperspace optimizes the use of in-memory and disk storage based on each index's statistical profile and field cardinality, dynamically balancing performance and cost against the objectives of the specific use case.
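A layered-storage policy of this kind might be sketched as follows. The thresholds, field names, and decision rule below are hypothetical, since the document does not specify Hyperspace's actual placement logic; the point is only that per-field statistics drive the memory-versus-disk trade-off.

```python
# Hypothetical tier-placement sketch: frequently queried or
# low-cardinality fields stay in memory; cold, high-cardinality fields
# spill to disk. All thresholds and stats are invented for illustration.

def choose_tier(field_stats, memory_cardinality_threshold=100_000,
                hot_query_rate=0.05):
    """Return 'memory' or 'disk' for each field based on its profile."""
    placement = {}
    for field, stats in field_stats.items():
        hot = stats["query_rate"] >= hot_query_rate
        small = stats["cardinality"] <= memory_cardinality_threshold
        placement[field] = "memory" if (hot or small) else "disk"
    return placement

stats = {
    "user_id":  {"cardinality": 50_000_000,  "query_rate": 0.40},  # hot
    "country":  {"cardinality": 200,         "query_rate": 0.10},  # small
    "raw_text": {"cardinality": 900_000_000, "query_rate": 0.01},  # cold
}
print(choose_tier(stats))
# {'user_id': 'memory', 'country': 'memory', 'raw_text': 'disk'}
```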