Accelerating Large-Scale Data Analytics with GPU-Native Velox and NVIDIA cuDF

As workloads scale and demand for faster data processing grows, GPU-accelerated databases and query engines have been shown to deliver significant price-performance gains compared with CPU-based systems. The high memory bandwidth and thread count of GPUs especially benefit compute-heavy workloads such as multiple joins, complex aggregations, string processing, and more. The growing availability of GPU nodes and the broad feature coverage of GPU algorithms make GPU data processing more accessible than ever before.

By addressing performance bottlenecks, both data and business analysts can now query massive datasets to generate real-time insights and explore analytics scenarios.

To support this increasing demand, IBM and NVIDIA are working together to bring NVIDIA cuDF to the Velox execution engine, enabling GPU-native query execution for widely used platforms like Presto and Apache Spark. The project is fully open source.

How Velox and cuDF work together to translate query plans

Velox acts as an intermediate layer, translating query plans from systems like Presto and Spark into executable GPU pipelines powered by cuDF, as shown in Figure 1. For more details, see Extending Velox – GPU Acceleration with cuDF.

In this post, we're excited to share initial performance results of Presto and Spark using the GPU backend in Velox. We dive into:

  • End-to-end Presto acceleration
  • Scaling up Presto to support multi-GPU execution
  • Demonstrating hybrid CPU-GPU execution in Apache Spark
Flow chart illustrating the architecture of a data processing system involving Velox and cuDF, which is integrated with other tools like Spark and Presto.
Figure 1. A query flows from Presto or Apache Spark through the Velox engine, where it's converted into executable GPU pipelines powered by cuDF

Moving your entire Presto query plan to GPU for faster execution

The first step of query processing is translating incoming SQL into query plans with tasks for every node in the cluster. On each worker node, the cuDF backend for Velox receives a plan from the Presto coordinator, rewrites the plan using GPU operators, and then executes it.

Running Presto plans using Velox with cuDF required improvements to the GPU operators for TableScan, HashJoin, HashAggregation, FilterProject, and more. 

  • TableScan: The Velox TableScan was extended on CPU to be compatible with GPU I/O, decompression, and decoding components in cuDF.
  • HashJoin: The available join types were expanded to include left, right, and inner joins, along with support for filters and null semantics.
  • HashAggregation: A streaming interface was introduced to manage partial and final aggregations.
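The partial/final split that the streaming aggregation interface manages can be illustrated with a minimal sketch in plain Python. The function names here are illustrative, not the Velox or cuDF API: each batch is reduced to a partial state, and a final step merges those states into one result.

```python
from collections import defaultdict

def partial_aggregate(batch):
    """Reduce one batch of (key, value) rows to partial (sum, count) states."""
    state = defaultdict(lambda: [0, 0])
    for key, value in batch:
        state[key][0] += value
        state[key][1] += 1
    return dict(state)

def final_aggregate(partials):
    """Merge the partial states from all batches into final per-key averages."""
    merged = defaultdict(lambda: [0, 0])
    for partial in partials:
        for key, (s, c) in partial.items():
            merged[key][0] += s
            merged[key][1] += c
    return {key: s / c for key, (s, c) in merged.items()}

# Two input batches, aggregated incrementally rather than all at once
batches = [[("a", 2), ("b", 4)], [("a", 6)]]
partials = [partial_aggregate(b) for b in batches]
print(final_aggregate(partials))  # {'a': 4.0, 'b': 4.0}
```

The same shape appears in distributed engines generally: partial aggregation runs where the data lives, and only the compact states move to the final aggregation step.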

Overall, the operator expansion within the cuDF backend for Velox enables end-to-end GPU execution in Presto, making full use of the Presto SQL parser, optimizer, and coordinator.
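Conceptually, the plan rewrite replaces each supported CPU operator node with its GPU counterpart while leaving the plan shape intact. A toy sketch in Python follows; the GPU operator names are hypothetical placeholders, not the actual class names in the cuDF backend.

```python
# Hypothetical CPU-to-GPU operator mapping; illustrative names only
GPU_OPERATORS = {
    "TableScan": "CudfTableScan",
    "FilterProject": "CudfFilterProject",
    "HashAggregation": "CudfHashAggregation",
    "HashJoin": "CudfHashJoin",
}

def rewrite_plan(node):
    """Recursively swap each operator for its GPU equivalent, if one exists."""
    name, children = node
    return (GPU_OPERATORS.get(name, name), [rewrite_plan(c) for c in children])

# A plan arrives as a tree of (operator, children) nodes
plan = ("HashAggregation", [("FilterProject", [("TableScan", [])])])
print(rewrite_plan(plan))
# ('CudfHashAggregation', [('CudfFilterProject', [('CudfTableScan', [])])])
```

Because the rewrite only touches operator implementations, the coordinator-produced plan structure, and everything upstream of it, is reused unchanged.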

The team collected query runtime data using benchmarks in Presto tpch (derived from TPC-H) with Parquet data sources, using both the Presto C++ and Presto-on-GPU worker types. Note that Presto C++ was not able to complete Q21 with standard configuration options, so the figure shows the total runtime for the 21 successful queries.

As shown in Figure 2, at scale factor 1,000, we observed a 1,246-second runtime for Presto C++ on AMD 7965WX, 133.8 seconds for Presto on the NVIDIA RTX PRO 6000 Blackwell Workstation, and 99.9 seconds for Presto on the NVIDIA GH200 Grace Hopper Superchip. We also used CUDA managed memory to complete Q21 on GH200 (see the Figure 2 asterisk), yielding a 148.9-second runtime for Presto GPU on the full query set.

Bar chart with an X-axis showing categories for Presto C++ on CPU and Presto on NVIDIA GPU results and Y-axis showing runtime in seconds. 
Figure 2. Runtime results for 21 of the 22 queries defined in Presto tpch, executed with single-node Presto C++ on CPU and Presto on NVIDIA GPUs at scale factor 1,000

Multi-GPU Presto for faster data exchange and lower query runtime

In distributed query execution, Exchange is a critical operator that manages data movement between workers on the same node and also between nodes. GPU-accelerated Presto uses a UCX-based Exchange operator that supports running the entire execution pipeline on GPU. The UCX core leverages high-bandwidth NVLink for intra-node connectivity and RoCE or InfiniBand for inter-node connectivity. UCX, or Unified Communication X, is an open source communication library designed to achieve the best possible performance for HPC applications.

Velox supports several Exchange types for different kinds of data movement: Partitioned, Merge, and Broadcast. Partitioned Exchange uses a hash function to partition input data and then sends the partitions to other workers for further processing. Merge Exchange receives multiple sorted input partitions from other workers and then produces a single, sorted output partition. Broadcast Exchange loads the data in a single worker and then copies it to all other workers. Integration of GPU exchange into the cuDF backend for Velox is in progress, and the implementation is available in mainline Velox.
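The first two Exchange types can be sketched in a few lines of plain Python; `partition_rows` and `merge_partitions` below are illustrative stand-ins for the Velox operators, not their real interfaces.

```python
import heapq

def partition_rows(rows, num_workers):
    """Partitioned Exchange: route each (key, value) row to a worker
    by hashing its key, producing one output partition per worker."""
    partitions = [[] for _ in range(num_workers)]
    for key, value in rows:
        partitions[hash(key) % num_workers].append((key, value))
    return partitions

def merge_partitions(sorted_partitions):
    """Merge Exchange: combine several sorted input partitions into
    a single sorted output partition via a streaming k-way merge."""
    return list(heapq.merge(*sorted_partitions))

# Hash-partition six rows across two workers, then merge sorted inputs
parts = partition_rows([("a", 1), ("b", 2), ("c", 3)], num_workers=2)
merged = merge_partitions([[("a", 1)], [("b", 2), ("c", 3)]])
print(merged)  # [('a', 1), ('b', 2), ('c', 3)]
```

Broadcast Exchange is the degenerate case: one worker materializes the data and every other worker receives a full copy.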

As shown in Figure 3, Presto achieves efficient performance on GPU with the new UCX-based exchange, especially when high-bandwidth intra-node connectivity is provisioned between GPUs. An eight-GPU NVIDIA DGX A100 node delivered a >6x speedup when using NVLink in the exchange operator compared with the Presto baseline HTTP exchange. Results were collected for Presto on GPU with both the baseline HTTP Exchange method and the UCX-based cuDF Exchange method. With eight GPU workers, Presto can finish all 22 queries with the default async memory allocation, without using managed memory.

Note that Figure 3 uses multi-node execution plans from the Presto coordinator and cold-cache remote data sources. These results are not directly comparable to the single-node, hot-cache runtimes shown in Figure 2.

Bar chart with an X-axis showing categories for Presto C++ and Presto on GPU results and Y-axis showing runtime in seconds.
Figure 3. Runtime results for the 22 queries defined in Presto tpch benchmark, executed with Presto GPU on NVIDIA DGX A100 (eight A100 GPUs) at scale factor 1,000 

Hybrid CPU-GPU execution in Apache Spark

While the Presto integration focuses on end-to-end GPU execution, the Apache Spark integration with Apache Gluten and cuDF currently focuses on offloading specific query stages. This capability allows the most compute-intensive parts of a workload to be dispatched to GPUs, a strategy that makes the best use of GPU resources in hybrid clusters containing both CPU and GPU nodes.

For instance, the second stage of TPC-DS Query 95 at SF100 is compute intensive and can slow down CPU-only clusters. Offloading this stage to GPU achieves significant performance gains, while CPU capacity remains available in the cluster for other queries or workloads.
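A simplified view of stage-level offload: a stage runs on GPU only when all of its operators are GPU-supported, which mirrors how the scan stage stays on CPU while the compute-heavy stage offloads. The function and policy below are assumptions for illustration, not Gluten's actual offload logic.

```python
def assign_backends(stages, gpu_supported):
    """Assign each query stage to GPU when every operator in the stage
    has a GPU implementation; otherwise keep the stage on CPU."""
    plan = {}
    for stage, operators in stages.items():
        all_supported = all(op in gpu_supported for op in operators)
        plan[stage] = "GPU" if all_supported else "CPU"
    return plan

# Hypothetical two-stage query: a scan stage and a join/aggregate stage
stages = {
    "stage1": ["TableScan"],
    "stage2": ["HashJoin", "HashAggregation"],
}
supported = {"HashJoin", "HashAggregation", "FilterProject"}
print(assign_backends(stages, supported))
# {'stage1': 'CPU', 'stage2': 'GPU'}
```

Keeping the decision at stage granularity means data crosses the CPU-GPU boundary only at stage boundaries, where Spark already materializes shuffle data.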

As shown in Figure 4, even when the first stage (TableScan) runs on CPU, efficient interoperability between CPU and GPU enables a faster total runtime when the second stage is offloaded to GPU. The CPU-only condition uses eight vCPUs, and the CPU+GPU condition uses eight vCPUs plus one NVIDIA T4 GPU (g4dn.2xlarge).

Bar chart with an X-axis showing categories for Second Stage execution on CPU and GPU. Y-axis shows runtime in seconds.
Figure 4. Runtime results for query 95, as defined in Gluten tpcds, executed single-node with a single GPU at scale factor 100

Get involved with GPU-powered, large-scale data analytics

Driving GPU acceleration in the shared Velox execution engine unlocks performance gains for a wide range of downstream systems across the data processing ecosystem. The team is working with contributors across many companies to implement reusable GPU operators in Velox, and in turn speed up Presto, Spark (through Gluten), and other systems. This approach reduces duplication, simplifies maintenance, and brings new innovations across the open data stack.

We're excited to share this open source work with the community and to hear your feedback.

Acknowledgments

Many developers contributed to this work. IBM contributors include Zoltán Arnold Nagy, Deepak Majeti, Daniel Bauer, Chengcheng Jin, Luis Garcés-Erice, Sean Rooney, and Ali LeClerc. NVIDIA contributors include Greg Kimball, Karthikeyan Natarajan, Devavret Makkar, Shruti Shivakumar, and Todd Mostak.
