Posts

Showing posts from November, 2012

Coming up at ALENEX 2013

Coming up at ALENEX 2013, in which Dave discovers that perfect hashing and friends can be a really intellectually engaging problem. This is the algorithm engineering paper about the most memory-efficient index we developed for SILT, with more analysis and depth about the index itself vs. focusing on the big system. Practical Batch-Updatable External Hashing with Sorting Hyeontaek Lim, David G. Andersen, Michael Kaminsky The key thing about this paper is that we've achieved the best-yet overhead for static external perfect hashing. That's a mouthful: Concretely, we use 2.5 bits of DRAM to tell you exactly where to look on your SSD to find a stored key-value pair. The data is stored densely on the SSD, wasting no space on the storage. The prior best approaches were Botelho's EPH scheme , which used about 3.8 bits per key, and the "EM" variant of EPH, which used about 3.1--3.3 bits/key. As a systems geek, I also really like that Hyeontaek's algorithm u