George V. Reilly


Despite being a bona fide per­for­mance expert—I spent a couple of years as the Per­for­mance Lead for Mi­crosoft­'s IIS web server product about 15 years ago—I still forget to measure rather than assume.

I wrote some code today that imported nearly 300,000 nodes into a graph from a 500MB XML file. The code was not par­tic­u­lar­ly fast and I assumed that it was the XML parser. I had been using the built-in streaming parser, cEle­ment­Tree iterparse. I assumed that using the lmxl iterparse would make the code faster. It didn't.

Then I had the bright idea of tem­porar­i­ly disabling the per-node processing, which left only the XML parsing. Instead of continue.

Raising IOError for 'file not found'

I wanted to raise Python's IOError for a file-not-found condition, but it wasn't obvious what the parameters to the exception should be.

from errno import ENOENT

if not os.path.isfile(source_file):
    raise IOError(ENOENT, 'Not a file', source_file)
with open(source_file) as fp:

IOError can be in­stan­ti­at­ed with 1, 2, or 3 arguments:

IOError(errno, strerror, filename)
These arguments are available on the errno, strerror, and filename attributes of the exception object, re­spec­tive­ly, in both Python 2 and 3. The args attribute contains the verbatim con­struc­tor arguments as a tuple.
IOError(errno, strerror)
These are available on the errno and strerror attributes of the exception, re­spec­tive­ly, in both Python 2 and continue.

PuPPy Startup Row Pitch Night

Last night, Adam Porad and I were one of five teams pitching our startups at the PuPPy-organized PyCon Startup Row Pitch Night:

Techstars Seattle and PuPPy [Puget Sound Pro­gram­ming Python] presents PyCon Startup Row Pitch Night. The time has come again for you, the members of PuPPy, to select Seattle’s startup rep­re­sen­ta­tive to travel to PyCon in Portland to represent our Python community and startup scene at the annual conference produced by the Python Software Foundation.

We were pitching MetaBrite and our technology that captures receipts, yielding receipt in­for­ma­tion to users and onsumer insights. We use Python ex­ten­sive­ly—we've written 120,000 lines of Python code for web services, web apps, machine learning, image processing, and continue.

Fear of Public Speaking

It is often said that people fear public speaking more than they fear death. I certainly used to fear getting up in front of a crowd, though not to the point of death. Tonight I spoke about Fly­ing­Cloud in front of more than 100 people for half an hour at the PuPPy Meetup. I wasn't nervous beforehand and I wasn't nervous talking to the crowd.

I've been an active Toast­mas­ter for nearly 15 years and I've spoken at Toast­mas­ters hundreds of times. I'm used to a room of 15–25 people but not to a larger audience. Adam and I put our slides together late last week. We ran through it together continue.

FlyingCloud 0.3.0 released

I just announced the release of Fly­ing­Cloud 0.3.0 on the fly­ing­cloud-users mailing list. I'll have more to say about Fly­ing­Cloud in future. For now, let's just say it's a tool that we use to build Docker images using masterless SaltStack.

I'll be speaking about Fly­ing­Cloud at Wednes­day's PuPPy meetup.

psutil kill

From Python, I needed to find a process that was performing SSH tunneling on port 8080 and kill it.

The following works in Bash:

ps aux | grep [s]sh.*:8080 | awk '{print $2}' | xargs kill -9

The grep [s]sh trick ensures that the grep command itself won't make it through to awk.

Here's what I came up with in Python using psutil:

def kill_port_forwarding(host_port):
    ssh_args = ["-f", "-N", "-L", "{0}:localhost:{0}".format(host_port)]
    for process in psutil.process_iter():

Python: Import subclass from dynamic path

I needed to import some plugin code written in Python from a directory whose path isn't known until runtime. Further, I needed a class object that was a subclass of the plugin base class.

from somewhere import PluginBase

class SomePlugin(PluginBase):
    def f1(self): ...
    def f2(self): ...

You can use the imp module to actually load the module from impl_dir. Note that impl_dir needs to be tem­porar­i­ly prepended to sys.path. Then you can find the plugin subclass using dir and issubclass.

import os, imp

def import_class(implementation_filename, base_class):
    impl_dir, impl_filename = os.path.split(implementation_filename)
    module_name, _ = os.path.splitext(impl_filename)


Sorting Python Dictionaries by Value

New post at the MetaBrite Dev Blog.

Nose Test Discovery

I figured out why I saw the following error every time I ran Nose:

ERROR: Failure: TypeError (type() takes 1 or 3 arguments)
Traceback (most recent call last):
  File ".../lib/python2.7/site-packages/nose-1.3.7-py2.7.egg/nose/", line 523, in makeTest
    return self._makeTest(obj, parent)
  File ".../lib/python2.7/site-packages/nose-1.3.7-py2.7.egg/nose/", line 582, in _makeTest
    return MethodTestCase(obj)
  File ".../lib/python2.7/site-packages/nose-1.3.7-py2.7.egg/nose/", line 345, in __init__
    self.inst = self.cls()
TypeError: type() takes 1 or 3 arguments

It turns out that one module was importing a class called TestApi which had a class­method called run_in­te­gra­tion_tests. The module itself had no tests; it just declared a class called TestO­b­fus­cat­ed­Mix­in, which used some continue.

Recursive Generators in Python 2 and 3

New post at the MetaBrite blog.

Previous » « Next