Accelerating Foreign-Key Joins using Asymmetric Memory Channels
This paper was published in Proceedings of the Second International Workshop on Accelerating Data Management Systems Using Modern Processor and Storage Architectures (ADMS 2011) in conjunction with VLDB 2011, Seattle, WA on September 2, 2011.
Indexed Foreign-Key Joins expose a highly asymmetric access pattern: the Foreign-Key Index is scanned sequentially, whilst the Primary-Key table is the target of many quasi-random lookups, which are the dominant cost factor. To reduce the cost of the random lookups, the fact table can be (re-)partitioned at runtime to increase access locality on the dimension table and thus confine the random memory accesses to the CPU's cache. However, this is very hard to optimize, and the performance impact on recent architectures is limited because the partitioning costs consume most of the achievable join improvement.
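The access pattern described above can be illustrated with a minimal sketch. It assumes that the Foreign-Key Index is an array of row positions into the dimension table; the function names, the `partition_rows` parameter, and the sample data are hypothetical and chosen only for illustration. The first function performs the plain indexed join (sequential scan, random lookups); the second first partitions the fact rows by the dimension-table range they reference, which adds a full extra pass over the foreign-key column.

```python
def indexed_fk_join(fk_index, dim_values):
    """Plain indexed join: scan fk_index sequentially, look up each position.

    The scan of fk_index is sequential; the accesses into dim_values jump
    around and are the quasi-random lookups that dominate the cost.
    """
    return [dim_values[pos] for pos in fk_index]


def partitioned_fk_join(fk_index, dim_values, partition_rows):
    """Partitioned variant: group fact rows by the dimension range they hit.

    partition_rows is a hypothetical tuning knob: the number of dimension
    rows assumed to fit into the CPU cache.
    """
    num_partitions = (len(dim_values) + partition_rows - 1) // partition_rows
    partitions = [[] for _ in range(num_partitions)]

    # Partitioning pass: one extra read and write of the whole fact column.
    for fact_row, pos in enumerate(fk_index):
        partitions[pos // partition_rows].append((fact_row, pos))

    # Join pass: each partition touches only one contiguous slice of
    # dim_values, so its lookups stay within a small memory range.
    result = [None] * len(fk_index)
    for part in partitions:
        for fact_row, pos in part:
            result[fact_row] = dim_values[pos]
    return result


dim_values = ["red", "green", "blue", "black"]   # dimension (Primary-Key) table
fk_index = [3, 0, 2, 2, 1, 3, 0]                 # fact-table foreign-key positions

joined = indexed_fk_join(fk_index, dim_values)
joined_partitioned = partitioned_fk_join(fk_index, dim_values, partition_rows=2)
```

Both functions return the same join result; they differ only in the order in which the dimension table is accessed and in the additional partitioning pass, which is the overhead the paragraph above refers to.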
GPGPUs, on the other hand, have an architecture that is well suited to this operation: a relatively slow connection to the large system memory and a very fast connection to the smaller internal device memory. We show how to accelerate Foreign-Key Joins by executing the random table lookups in the GPU's VRAM while sequentially streaming the Foreign-Key Index through the PCI-E bus. We also experimentally study the memory access costs on GPU and CPU to estimate the benefit of this technique.
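The division of labour described above can be sketched as a host-side simulation. This is not the paper's implementation and involves no GPU API: `device_table` merely stands in for a dimension table resident in VRAM, each yielded chunk stands in for one transfer over the PCI-E bus, and the names and the `chunk_rows` parameter are hypothetical.

```python
def stream_chunks(fk_index, chunk_rows):
    """Yield the Foreign-Key Index in fixed-size, sequential chunks.

    Each chunk models one transfer from system memory to the device.
    """
    for start in range(0, len(fk_index), chunk_rows):
        yield fk_index[start:start + chunk_rows]


def device_lookup(chunk, device_table):
    """Model the device-side step: random lookups into the resident table."""
    return [device_table[pos] for pos in chunk]


def streamed_fk_join(fk_index, dim_values, chunk_rows):
    """Join by keeping the dimension table resident and streaming the index."""
    # One-time copy of the (smaller) dimension table into "device memory".
    device_table = list(dim_values)

    result = []
    for chunk in stream_chunks(fk_index, chunk_rows):
        # Only the sequentially read index crosses the slow connection;
        # all random accesses hit the device-resident table.
        result.extend(device_lookup(chunk, device_table))
    return result


colors = ["red", "green", "blue", "black"]
fact_fk = [3, 0, 2, 2, 1, 3, 0]

chunks = list(stream_chunks(fact_fk, chunk_rows=3))
streamed = streamed_fk_join(fact_fk, colors, chunk_rows=3)
```

The point of the sketch is the split itself: the slow link only ever carries sequential data, while the random accesses are confined to the fast device memory. The sketch assumes the dimension table fits into device memory.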