2021-10-04

Accidentally Quadratic: Python List Membership

We had a performance regression in a test suite recently when the median test time jumped by two minutes.

We tracked it down to this (simplified) code fragment:

task_inclusions = [ some_collection_of_tasks() ]
invalid_tasks = [t.task_id() for t in airflow_tasks
                 if t.task_id() not in task_inclusions]

This looks fairly innocuous—and it was—until the size of the result returned from some_collection_of_tasks() jumped from a few hundred to a few thousand.

The in comparison operator conveniently works with all of Python's standard sequences and collections, but its efficiency varies. For a list and other sequences, in must search …continue.

2020-04-23

Now You Have 32 Problems

Some people, when confronted with a problem, think “I know, I'll use regular expressions.” Now they have two problems.

— Jaime Zawinksi

A Twitter thread about very long regexes reminded me of the longest regex that I ever ran afoul of, a particularly horrible multilevel mess that had worked acceptably on the 32-bit .NET CLR, but brought the 64-bit CLR to its knees.

Whenever I ran our ASP.NET web application [on Win64], it would go berserk, eat up all 4GB of my physical RAM, push the working set of IIS's w3wp.exe to 12GB, and max out one of my 4 cores! The only way to maintain any sanity was to run iisreset every 20 minutes to …continue.

2017-02-07

Old Presentations

I uploaded some presentations to SpeakerDeck.com tonight.

Here are various presentations of mine at SpeakerDeck.com and SlideShare.net:

LKRhash: the design of a scalable hashtable. I spoke about this at NWCPP (video) in June 2012.
Software Performance: an Overview. I gave this talk at an internal brownbag at MetaBrite in January 2016.
FlyingCloud: Docker containers built with SaltStack. I gave this talk at PuPPy (video) in March 2016.
Security 101. I gave this talk at an internal brownbag at Cozi in November 2011.

2017-01-10

LKRhash: Scalable Hash Tables

LKRhash is a hashtable that scales to multiple processors and to millions of items. LKRhash was invented at Microsoft in 1997 by Per-Åke (Paul) Larson of Microsoft Research and Murali Krishnan and George Reilly of Internet Information Services. LKRhash has been used in many Microsoft products. The techniques that give LKRhash its performance include linear hashing, cache-friendly data structures, and fine-grained locking.

Northwest C++ Users' Group talk, June 2012. Speaker Deck Slides.
Unpublished paper submitted to Software: Practice & Experience (Oct 1999).
US 6,578,131 patent. Associated PDF contains some of the SP&E paper.

If Microsoft had had 20% time, LKRhash would have been my main 20% project. I put a lot of …continue.

2016-08-23

Flame Graphs and Flame Charts

I was investigating the performance of a web app today, and I spent some time looking at the Flame Chart visualization in Chrome's profiling tools, which helped identify some problems.

Flame Charts are like Brendan Gregg's Flame Graphs, except that the charts are sorted by time, while the graphs are sorted alphabetically.

Quoting from Gregg's recent ACM Queue article:

A flame graph has the following characteristics:

A stack trace is represented as a column of boxes, where each box represents a function (a stack frame).

The y-axis shows the stack depth, ordered from root at the bottom to leaf at the top. The top box shows the function that was on-CPU when the …continue.

2016-03-25

Profiling

Despite being a bona fide performance expert—I spent a couple of years as the Performance Lead for Microsoft's IIS web server product about 15 years ago—I still forget to measure rather than assume.

I wrote some code today that imported nearly 300,000 nodes into a graph from a 500MB XML file. The code was not particularly fast and I assumed that it was the XML parser. I had been using the built-in streaming parser, cElementTree iterparse. I assumed that using the lmxl iterparse would make the code faster. It didn't.

Then I had the bright idea of temporarily disabling the per-node processing, which left only the XML parsing. Instead of …continue.

2016-01-05

Decrementing Loops

The canonical for-loop in C and C++ is written thus, counting up i = 0, i = 1, ..., i = N-1:

for (int i=0; i < N; i++) {
    // loop body
}

(In C, you have to declare int i before the for-loop.)

Let's unpack that for-loop into an equivalent while-loop:

int i = 0;
while (i < N) {
    // loop body
    i = i + 1
}

In other words, we initialize i to zero. Then, before every execution of either loop, we check i < N. If i is still within bounds, we execute the loop body. Then we postincrement …continue.