We often have to write code using permissive programming languages like C and C++. They tend to generate hard-to-debug problems that can crash your applications. Thankfully, many compilers offer “sanitizers”. I discussed them in my post No more leaks with sanitize flags in gcc a … | Continue reading
In my post Really fast bitset decoding for “average” densities, I reported on our work accelerating the decoding of bitsets. E.g., given a 64-bit register, you want to find the location of every 1-bit. So given 0b110011, you would want to get 0, 1, 4, 5. We want to do this operat … | Continue reading
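As a minimal sketch of the task described above (not necessarily the approach taken in the post), the following C function extracts the index of every 1-bit from a 64-bit word. It assumes a GCC- or Clang-compatible compiler for the `__builtin_ctzll` builtin; the name `decode_bits` is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

// Writes the index of every 1-bit in `word` to `out` (which must have room
// for 64 entries) and returns the number of indexes written.
// Assumption: __builtin_ctzll is available (GCC/Clang builtin).
size_t decode_bits(uint64_t word, uint32_t *out) {
    size_t count = 0;
    while (word != 0) {
        out[count++] = (uint32_t)__builtin_ctzll(word); // index of the lowest 1-bit
        word &= word - 1;                                // clear the lowest 1-bit
    }
    return count;
}
```

Calling it on the value 0x33 (that is, 0b110011) fills `out` with 0, 1, 4, 5.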
A few months ago, I ordered a ROCKPro64. If you are familiar with the Raspberry Pi, then it is a bit of the same: an inexpensive computer that comes in the form of a single card. The ROCKPro64 differs from the Raspberry Pi in that it is much closer in power to a normal PC. You … | Continue reading
I learned to program with BASIC back when I was twelve. I would write elaborate programs and run them. Invariably, they would surprise me by failing to do what I expected. I would struggle for a time, but I'd eventually give up and just accept that wha | Continue reading
Suppose I give you a word and you need to determine the location of the 1-bits. For example, given the word 0b100011001, you would like to get 0, 3, 4, 8. You could check the value of each bit, but that would take too long. A better approach is to use the fact that modern processors hav … | Continue reading
A common problem in software performance is that you are essentially limited by memory access. Let us consider such a function where you write at random locations in a big array. for | Continue reading
It is common to represent binary data or numbers using the hexadecimal notation. Effectively, we use a base-16 representation where the first 10 digits are 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 and where the following digits are A, B, C, D, E, F, with the added complexity that we can use either … | Continue reading
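To make the digit mapping concrete, here is a small sketch in C that converts one hexadecimal character to its value. It assumes an ASCII-compatible character set and accepts both letter cases; the function name `hex_digit_value` is hypothetical.

```c
// Returns the value (0 to 15) of a hexadecimal digit, or -1 if `c` is not one.
// Assumption: ASCII-compatible encoding, where '0'..'9', 'a'..'f' and 'A'..'F'
// each occupy consecutive code points.
int hex_digit_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}
```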
A common optimization in software is to unroll loops. It is best explained with an example. Suppose that you want to compute the scalar product between two arrays: sum = 0 | Continue reading
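For illustration, here is a hedged sketch in C of a scalar product written first as a plain loop and then unrolled by a factor of four. The element type (`double`), the unrolling factor, and the function names are assumptions, not taken from the post.

```c
#include <stddef.h>

// Plain loop: one multiply-add per iteration.
double dot(const double *x, const double *y, size_t n) {
    double sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += x[i] * y[i];
    }
    return sum;
}

// Unrolled by four: fewer loop-control instructions per element.
// A second loop handles the leftover elements when n is not a multiple of 4.
double dot_unrolled4(const double *x, const double *y, size_t n) {
    double sum = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        sum += x[i] * y[i] + x[i + 1] * y[i + 1]
             + x[i + 2] * y[i + 2] + x[i + 3] * y[i + 3];
    }
    for (; i < n; i++) {
        sum += x[i] * y[i];
    }
    return sum;
}
```

With floating-point inputs, the two versions add the products in a different order, so their results can differ slightly in the last bits.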
We are all familiar with biological aging. Roughly speaking, it is the loss of fitness that most animals undergo with time. At the present time, there is simply not much you can do against biological aging. You are just not going to win any gold medals in the Olympics at age 65. H … | Continue reading
In my previous post, I reviewed a new fast random number generator called wyhash. I commented that I expected it to do well on x64 processors (Intel and AMD), but not so well on ARM processors. Let us review again wyhash: uint64_t wyhash64_x | Continue reading
Not all instructions on modern processors cost the same. Additions and subtractions are cheaper than multiplications which are themselves cheaper than divisions. For this reason, compilers frequently replace division instructions by multiplications. Roughly speaking, it works in t … | Continue reading
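As one concrete instance of replacing a division by a multiplication, the sketch below divides a 32-bit unsigned integer by 3 using a multiplication and a shift. The constant 0xAAAAAAAB equals (2^33 + 1) / 3; the function name `div3` is hypothetical, and this is an illustration of the general idea rather than the exact scheme the post goes on to describe.

```c
#include <stdint.h>

// Computes n / 3 without a division instruction.
// 0xAAAAAAAB == (2^33 + 1) / 3, so (n * 0xAAAAAAAB) >> 33 equals floor(n / 3)
// for every 32-bit unsigned n. The product needs 64 bits.
uint32_t div3(uint32_t n) {
    return (uint32_t)(((uint64_t)n * 0xAAAAAAABULL) >> 33);
}
```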
I have been having performance problems with my blog and this forced me to spend time digging into the issue. Some friends of mine advocate that I should just “pay someone” and they are no doubt right that it would be the economical and strategic choice. Sadly, though I am eager … | Continue reading
My blog is a relatively minor enterprise. It is strictly non-profit (no ads). I have been posting one or two blog posts a week for about fifteen years. I have been using the same provider in all this time (csoft.net). They charge me about $50 a month. I also subscribe to Cloudflare … | Continue reading
Many Internet formats from email (MIME) to the Web (HTML/CSS/JavaScript) are text-only. If you send an image or executable file by email, it often first gets encoded using base64. The trick behind base64 encoding is that we use 64 different ASCII characters including all letters, … | Continue reading
Richard Hamming is a famous computer scientist. In his talk You and Your Research, Hamming recounts how he asked researchers three questions which I paraphrase: What are the important problems of your field? What important problems are you working on? If what you are doing is not im … | Continue reading
A common problem within databases and search engines is to compute the intersection between two sorted arrays. Typically one array is much smaller than the other one. The conventional strategy is the “galloping intersection”. In effect, you go through the values in the small array … | Continue reading
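The following is a minimal sketch of a galloping intersection in C, assuming both arrays are sorted in strictly increasing order (no duplicates). For each value of the small array, it advances in the large array with doubling steps, then finishes with a binary search. The names `gallop` and `intersect` are hypothetical, and the post may organize the search differently.

```c
#include <stddef.h>
#include <stdint.h>

// Returns the first index i in [start, n) such that a[i] >= target,
// or n if there is no such index. Requires start <= n.
static size_t gallop(const uint32_t *a, size_t n, size_t start, uint32_t target) {
    if (start >= n || a[start] >= target) return start;
    size_t lo = start; // invariant: a[lo] < target
    size_t step = 1;
    while (lo + step < n && a[lo + step] < target) {
        lo += step;
        step *= 2; // double the stride while we are still below the target
    }
    size_t hi = (lo + step < n) ? lo + step : n; // hi == n or a[hi] >= target
    while (lo + 1 < hi) { // binary search between lo and hi
        size_t mid = lo + (hi - lo) / 2;
        if (a[mid] < target) lo = mid; else hi = mid;
    }
    return hi;
}

// Writes the values common to both arrays into `out` (room for ns entries)
// and returns how many were written.
size_t intersect(const uint32_t *small, size_t ns,
                 const uint32_t *large, size_t nl, uint32_t *out) {
    size_t count = 0, pos = 0;
    for (size_t i = 0; i < ns && pos < nl; i++) {
        pos = gallop(large, nl, pos, small[i]);
        if (pos < nl && large[pos] == small[i]) out[count++] = small[i];
    }
    return count;
}
```

Because the search in the large array resumes from the previous position, the large array is never scanned backward.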
All programmers know about multicore parallelism: your CPU is made of several nearly independent processors (called cores) that can run instructions in parallel. However, our processors are parallel in many different ways. I am interested in a particular form of parallelism calle … | Continue reading
“We prefer to invent new jobs rather than trying harder and inventing a new system that wouldn’t require everybody to have a job.” (Philippe Beaudoin) In the XXIst century, people from wealthy countries work hard primarily to gain social status. We often make the mistake of tying … | Continue reading
Programming languages make it hard to sort arrays properly. Look at how JavaScript sorts arrays of integers: > v = [1,3,2,10] [ 1, 3, 2, 10 ] > v.sort() [ 1, 10, 2, 3 ] You need a magical incantation to get the right result: > v.sort((a,b)=>a-b) [ 1, 2, 3, 10 ] Though this … | Continue reading
Schools train us to provide the right answers to predefined questions. Yet anyone with experience from the real world knows that, more often than not, the difficult part is to find the right question. To make a remarkable contribution, you need to start by asking the right questi … | Continue reading
One of my colleagues is teaching an artificial intelligence class. In his class, he uses old videos where experts from the early eighties make predictions about where AI is going. These experts come from the best schools such as Stanford. These videos were not meant as a joke. Wh … | Continue reading
We all know the regular multiplication that we learn in school. To multiply a number by 3, you can multiply the number by two and add the number to the result. Programmers write: a * 3 = a + (a | Continue reading
Modern processors execute instructions in parallel in many different ways: multi-core parallelism is just one of them. In particular, processor cores can have several outstanding memory access requests “in flight”. This is often described as “memory-level parallelism”. You can me … | Continue reading
Our processors can issue several memory requests at the same time. In a multicore processor, each core has an upper limit on the number of outstanding memory requests, which is reported to be 10 on recent Intel processors. In this sense, we would like to say that the level of mem … | Continue reading
Most programs running on web sites are written in JavaScript. There are still a few Java applets and other plugins hanging around, but they are considered obsolete at this point. While JavaScript is superbly fast, some people feel that we ought to do better. That’s where WebAssem … | Continue reading
When receiving bytes from the network, we often assume that they are Unicode strings, encoded using something called UTF-8. Sadly, not all streams of bytes are valid UTF-8. So we need to check the strings. It is probably a good idea to optimize this problem as much as possible. I … | Continue reading
Suppose that you want to quickly determine whether a sequence of eight characters is made of digits (e.g., ‘94343241’). How fast can you go? In software, characters are mapped to integer values called the code points. The ASCII and UTF-8 code points for the digits 0, 1,…, 9 are the co … | Continue reading
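Since the ASCII digits occupy the code points 0x30 to 0x39, the check can be sketched in C in two ways: a straightforward loop, and a branch-free variant that loads the eight bytes into a single 64-bit word. The function names are hypothetical, and the second variant is one known bit trick, offered here as an assumption about what a fast solution could look like.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

// Straightforward check: each of the 8 bytes must lie in '0'..'9' (0x30..0x39).
bool is_eight_digits_simple(const char *chars) {
    for (int i = 0; i < 8; i++) {
        if (chars[i] < '0' || chars[i] > '9') return false;
    }
    return true;
}

// Branch-free variant on one 64-bit word. For every byte, the high nibble
// must be 3, and it must still be 3 after adding 6 (so the low nibble is <= 9).
bool is_eight_digits_swar(const char *chars) {
    uint64_t val;
    memcpy(&val, chars, 8); // safe unaligned load of exactly 8 bytes
    return ((val & 0xF0F0F0F0F0F0F0F0ULL) |
            (((val + 0x0606060606060606ULL) & 0xF0F0F0F0F0F0F0F0ULL) >> 4))
           == 0x3333333333333333ULL;
}
```

Both functions read exactly eight bytes, so the input does not need to be null-terminated.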
(If you enjoy these predictions, you can follow me on Twitter at @lemire.) 2020 Virtual reality is ubiquitous. New game consoles come with virtual capabilities by default. Volvo commercializes self-driving cars. Ot | Continue reading