Posts by Tags

C/C++

How to quickly factor out a constant factor from integers

29 minute read

Published:

This post revisits the topic of integer division, building upon the discussion in the previous post. Specifically, I’ll delve into removing trailing zeros in the decimal representation of an input integer, or more broadly, factoring out the highest power of a given constant that divides the input. This exploration stems from the problem of converting floating-point numbers into strings, where certain contemporary algorithms, such as Schubfach and Dragonbox, may yield outputs containing trailing zeros.

On the optimal bounds for integer division by constants

71 minute read

Published:

It is well-known that the integer division is quite a heavy operation on modern CPU’s - so slow, in fact, that it has even become a common wisdom to avoid doing it at ALL cost in performance-critical sections of a program. I do not know why division is particularly hard to optimize from the hardware perspective. I am just guessing, maybe (1) every general algorithm is essentially just a minor variation of the good-old long division, (2) which is almost impossible to parallelize. But whatever, that is not the topic of this post.

Faster integer formatting - James Anhalt (jeaiii)’s algorithm

29 minute read

Published:

This post is about an ingenious algorithm for printing integers into decimal strings. It sounds like an extremely simple problem, but it is in fact quite more complicated than one might imagine. Let us more precisely define what we want to do: we take an integer of specific bit-width and a byte buffer, and convert the input integer into a string consisting of its decimal digits, and then write it into the given buffer. For simplicity, we will assume that the integer is unsigned and is of $32$-bits. So, we want to implement the following function written in C++:

char* itoa(std::uint32_t n, char* buffer) {
  // Convert n into decimal digit string and write it into buffer.
  // Returns the position right next to the last character written.
}

There are numerous algorithms for doing this, and I will dig into a clever algorithm invented by James Anhalt (jeaiii), which seems to be the fastest known algorithm at the point of writing this post.

Continued fractions and their application into fast computation of \(\lfloor nx\rfloor\)

34 minute read

Published:

When I was working on Dragonbox and Grisu-Exact (which are float-to-string conversion algorithms with some nice properties) I had to come up with a fast method for computing things like $\lfloor n\log_{10}2 \rfloor$ or $\lfloor n\log_{2}10 \rfloor$, or more generally $\lfloor nx\rfloor$ for some integer $n$ and a fixed positive real number $x$. Actually, the sign of $x$ isn’t extremely important, but let us just assume $x>0$ for simplicity.

analysis

The Fourier transform of the Heaviside step function

9 minute read

Published:

Back in 2010, as an electrical engineering major student I was taking an introductory course on signals and systems taught by my formal advisor. The course was mainly about four different kinds of Fourier transforms (Discrete-Time Fourier Series a.k.a. Discrete Fourier Transform, Discrete-Time Fourier Transform, Continuous-Time Fourier Series, and Continuous-Time Fourier Transform) and some additional topics like the famous sampling theorem.

continued fractions

Continued fractions and their application into fast computation of \(\lfloor nx\rfloor\)

34 minute read

Published:

When I was working on Dragonbox and Grisu-Exact (which are float-to-string conversion algorithms with some nice properties) I had to come up with a fast method for computing things like $\lfloor n\log_{10}2 \rfloor$ or $\lfloor n\log_{2}10 \rfloor$, or more generally $\lfloor nx\rfloor$ for some integer $n$ and a fixed positive real number $x$. Actually, the sign of $x$ isn’t extremely important, but let us just assume $x>0$ for simplicity.

distribution-theory

The Fourier transform of the Heaviside step function

9 minute read

Published:

Back in 2010, as an electrical engineering major student I was taking an introductory course on signals and systems taught by my formal advisor. The course was mainly about four different kinds of Fourier transforms (Discrete-Time Fourier Series a.k.a. Discrete Fourier Transform, Discrete-Time Fourier Transform, Continuous-Time Fourier Series, and Continuous-Time Fourier Transform) and some additional topics like the famous sampling theorem.

functional-analysis

geometry

harmonic-analysis

The Fourier transform of the Heaviside step function

9 minute read

Published:

Back in 2010, as an electrical engineering major student I was taking an introductory course on signals and systems taught by my formal advisor. The course was mainly about four different kinds of Fourier transforms (Discrete-Time Fourier Series a.k.a. Discrete Fourier Transform, Discrete-Time Fourier Transform, Continuous-Time Fourier Series, and Continuous-Time Fourier Transform) and some additional topics like the famous sampling theorem.

pde

programming

How to quickly factor out a constant factor from integers

29 minute read

Published:

This post revisits the topic of integer division, building upon the discussion in the previous post. Specifically, I’ll delve into removing trailing zeros in the decimal representation of an input integer, or more broadly, factoring out the highest power of a given constant that divides the input. This exploration stems from the problem of converting floating-point numbers into strings, where certain contemporary algorithms, such as Schubfach and Dragonbox, may yield outputs containing trailing zeros.

On the optimal bounds for integer division by constants

71 minute read

Published:

It is well-known that the integer division is quite a heavy operation on modern CPU’s - so slow, in fact, that it has even become a common wisdom to avoid doing it at ALL cost in performance-critical sections of a program. I do not know why division is particularly hard to optimize from the hardware perspective. I am just guessing, maybe (1) every general algorithm is essentially just a minor variation of the good-old long division, (2) which is almost impossible to parallelize. But whatever, that is not the topic of this post.

Faster integer formatting - James Anhalt (jeaiii)’s algorithm

29 minute read

Published:

This post is about an ingenious algorithm for printing integers into decimal strings. It sounds like an extremely simple problem, but it is in fact quite more complicated than one might imagine. Let us more precisely define what we want to do: we take an integer of specific bit-width and a byte buffer, and convert the input integer into a string consisting of its decimal digits, and then write it into the given buffer. For simplicity, we will assume that the integer is unsigned and is of $32$-bits. So, we want to implement the following function written in C++:

char* itoa(std::uint32_t n, char* buffer) {
  // Convert n into decimal digit string and write it into buffer.
  // Returns the position right next to the last character written.
}

There are numerous algorithms for doing this, and I will dig into a clever algorithm invented by James Anhalt (jeaiii), which seems to be the fastest known algorithm at the point of writing this post.

Continued fractions and their application into fast computation of \(\lfloor nx\rfloor\)

34 minute read

Published:

When I was working on Dragonbox and Grisu-Exact (which are float-to-string conversion algorithms with some nice properties) I had to come up with a fast method for computing things like $\lfloor n\log_{10}2 \rfloor$ or $\lfloor n\log_{2}10 \rfloor$, or more generally $\lfloor nx\rfloor$ for some integer $n$ and a fixed positive real number $x$. Actually, the sign of $x$ isn’t extremely important, but let us just assume $x>0$ for simplicity.