• The Burrows-Wheeler transform and move-to-front compression

    Bit count:
    Bit count:
    The above form implements algorithms to code and decode from the move-to-front compression scheme, with or without an additional application of the Burrows-Wheeler transform. These extremely slick algorithms blew my mind when I first heard of them. They are currently in use, among other places, in [bzip2](https://en.wikipedia.org/wiki/Bzip2) file compression.

    Read more →

  • Splay trees and optimality

    The splay tree is probably my favourite data structure. Is it useful in practice? Probably not, but its remarkable optimality properties coupled with its bare simplicity are so tantalizing that I’ve fallen in love with splaying. In the rest of this post, I’ll describe the splay tree structure, and present some of my favourite splay tree properties. You will also find an instructive D3 visualization of a splay tree in motion.

    Read more →

  • A uniform sum threshold problem

    Let \(U_1, U_2, \dots\) be an infinite sequence of independent \(\mathrm{Uniform}[0, 1]\) random variables. Let \(N\) be the minimum index for which \[ U_1 + U_2 + \dots + U_N > 1 . \] What is the expected value \(\mathbf{E}\{N\}\)?

    Read more →

  • Generating spherical points without complex operations

    These days, most of everyone’s favourite languages and libraries for scientific computing come ready-equipped with random number generators for most common univariate distributions: the uniform, binomial, normal, geometric, exponential, beta, etc. In my experience, multivariate generation is comparatively hit-or-miss. But in any case, since documentation usually doesn’t specify implementation methods or running time, you usually can’t be sure of the efficiency of one of these functions without personally examining some source code, or being lucky and finding that someone else on StackExchange already did. Thankfully, when in doubt, one can always refer to the excellent and totally free book Non-Uniform Random Variate Generation by my old PhD supervisor Luc Devroye. In fact, it seems this book is even more than free, as stated in this plea posted by the author on the book’s webpage.

    I give anyone the permission, even without asking me, to take these PDF files to a printer, print as many copies as you like, and sell them for profit. If you would like me to advertise the sales points of the hard copies, please let me know. To the libraries: Please do not charge patrons for copying this book. I grant everyone the right to copy at will, for free.

    Read more →

  • Discrete minimax estimation with trees

    This morning, I submitted the final version of my paper Discrete minimax estimation with trees (Devroye and Reddad, 2019), which is to appear in the Electronic Journal of Statistics. I think this paper is conceptually quite interesting, and I’m very happy with the final result, so in this post I’ll describe some of the main ideas present in the work.

    Read more →