George V. Reilly

Bash: Getting and Setting Default Values

Bash has some handy syntax for getting and setting default values. Unfortunately, it's a collection of punctuation characters, which makes it hard to Google when you can't quite remember the syntax.

Getting a default value using ${var:-fallback}:

# set $LOGDIR to $1 if $1 has a value; otherwise set $LOGDIR to "/var/log"
LOGDIR="${1:-/var/log}"

# use $VERSION unless it's empty or unset; fall back to extracting someprog's version num
build_version=${VERSION:-$(someprog --version | sed 's/[^0-9.]*\([0-9.]*\).*/\1/')}

The colon-dash construction is known as the dog's bollocks in typography.

Setting a default value, using ${var:=fallback}:

$ echo $HOME
/Users/georgevreilly
$ echo ${HOME:=/tmp}
/Users/georgevreilly
$ unset HOME
$ echo ${HOME:=/tmp}
/tmp
$ echo $HOME
/tmp
$ cd; pwd
/tmp

Note: := uses the new value in two cases. First, when …

Obfuscating Passwords in URLs in Python

[Previously published at the now defunct MetaBrite Dev Blog.]

RFC 1738 allows passwords in URLs, in the form <scheme>://<username>:<password>@<host>:<port>/<url-path>. Although passwords in URLs are deprecated by RFC 3986 and other newer RFCs, they are occasionally useful. Several important packages in the Python world allow such URLs, including SQLAlchemy ('postgresql://scott:tiger@localhost:5432/mydatabase') and Celery ('amqp://guest:guest@localhost:5672//'). It's also useful to be able to log such URLs without exposing the password.

Python 2 has urlparse.urlparse (known as urllib.parse.urlparse in Python 3 and six.moves.urllib_parse.urlparse in the Six compatibility library) to split a URL into six components: scheme, netloc, path, parameters, query, and fragment. The netloc corresponds to <user>:<password>@<host>:<port>.

Unfortunately, neither Python 2's nor Python 3's urlparse properly handles the userinfo (username + optional password in the netloc), as …
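As a rough illustration of the goal (logging a URL without exposing its password), here is a minimal Python 3 sketch. It is not the approach from the original post, which is cut off above. The function name obfuscate_password and the "****" mask are hypothetical choices, and the netloc is split by hand rather than relying on urlsplit's own userinfo attributes.

```python
from urllib.parse import urlsplit, urlunsplit


def obfuscate_password(url, mask="****"):
    """Return url with any password in the netloc replaced by mask.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url)
    # netloc looks like "user:password@host:port"; split at the last "@"
    userinfo, at, hostport = parts.netloc.rpartition("@")
    # userinfo looks like "user:password"; split at the first ":"
    username, colon, _password = userinfo.partition(":")
    if not at or not colon:
        # No userinfo, or a username without a password: nothing to hide.
        return url
    masked_netloc = f"{username}:{mask}@{hostport}"
    return urlunsplit(parts._replace(netloc=masked_netloc))


sqlalchemy_url = obfuscate_password(
    "postgresql://scott:tiger@localhost:5432/mydatabase")
celery_url = obfuscate_password("amqp://guest:guest@localhost:5672//")
print(sqlalchemy_url)
print(celery_url)
```

URLs without a password pass through unchanged, so the helper can be applied to every URL before it is logged.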

Python: a use for nested list comprehensions

I wanted to turn a list like ['*.zip', '*.pyc', '*.log'] into ['--exclude', '*.zip', '--exclude', '*.pyc', '--exclude', '*.log'].

A simple list comprehension doesn't work as desired:

In [1]: excludes = ['*.zip', '*.pyc', '*.log']

In [2]: [('--exclude', e) for e in excludes]
Out[2]: [('--exclude', '*.zip'), ('--exclude', '*.pyc'), ('--exclude', '*.log')]

The trick is to use a nested comprehension:

In [5]: [arg for pattern in excludes
             for arg in ['--exclude', pattern]]
Out[5]: ['--exclude', '*.zip', '--exclude', '*.pyc', '--exclude', '*.log']
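The nested comprehension reads left to right like a pair of nested for loops. The sketch below spells out that equivalence and adds one alternative, itertools.chain.from_iterable, which is not from the original post but gives the same flattened list.

```python
from itertools import chain

excludes = ['*.zip', '*.pyc', '*.log']

# The nested comprehension from the post, on one line.
nested = [arg for pattern in excludes for arg in ['--exclude', pattern]]

# The same thing written as explicit loops: the outer "for" comes first.
looped = []
for pattern in excludes:
    for arg in ['--exclude', pattern]:
        looped.append(arg)

# Alternative (an addition, not from the post): flatten the tuples.
chained = list(chain.from_iterable(('--exclude', p) for p in excludes))

print(nested)
```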

Checking minimum version numbers in Bash

I worked on a Bash script today that sets up various prerequisites for our build. We need a recent version of Docker but our Bamboo build agents are running on Ubuntu 14.04, which has a very old version of Docker. The script upgrades Docker when it's first run. The script may be run more than once during the lifetime of the agent, so the second and subsequent calls should not upgrade Docker.

Basically, I wanted

if $DOCKER_VERSION < 1.9; then upgrade_docker; fi

Unfortunately, it's not that easy in Bash. Here's what I came up with.

install_latest_docker() {
    if docker --version | python -c "min=[1, 9]; import sys; ↩
v=[int(x) 
…
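The snippet above is cut off, but the idea it starts with (turning the version into a list of integers and comparing it against a minimum) can be sketched in standalone Python. This is an illustration, not the author's script: the function names and the sample `docker --version` strings are assumptions.

```python
import re


def version_tuple(text):
    """Extract the first dotted version number in text as a tuple of ints."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        raise ValueError("no version number found in %r" % text)
    return tuple(int(part) for part in match.group(1).split("."))


def needs_upgrade(version_output, minimum=(1, 9)):
    """True if the reported version is older than minimum."""
    # Tuples compare element by element, so (1, 10, 3) > (1, 9),
    # which a plain string comparison of "1.10.3" and "1.9" gets wrong.
    return version_tuple(version_output) < minimum


# Sample outputs are made up for illustration.
old = needs_upgrade("Docker version 1.6.2, build 7c8fca2")
new = needs_upgrade("Docker version 1.10.3, build 20f81dd")
print(old, new)
```

Comparing integers rather than strings is the important part: as text, "1.10" sorts before "1.9".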

Including Data Files in Python Packages

[Previously published at the now defunct MetaBrite Dev Blog.]

I spent some time today struggling with setuptools, trying to make a Python source package not only include a data file, but also install that file.

Building the installer

Consider the following source tree layout:

├── MANIFEST.in
├── README.md
├── my_stuff/
│   ├── bar.py
│   ├── foo.py
│   ├── __init__.py
│   └── quux.py
├── models/
│   └── long_ugly_name_20151221.json
└── setup.py*

I wanted to create a Python source distribution, some_package-N.N.N.tar.gz, which contains the code in the my_stuff directory, as well as models/long_ugly_name_20151221.json, using python setup.py sdist.

It's not that hard to get models/long_ugly_name_20151221.json included in the tarball. Add an entry in MANIFEST.in:

include models/*.json

Then be sure to set include_package_data=True in the call to setup():

from setuptools 
…

Python f-strings

At this month's PuPPy (Puget Sound Programming Python) Meetup, I heard a brief mention of Python f-strings as a new feature coming in Python 3.6.

In essence, they offer a simpler, more versatile method of string formatting and interpolation than existing methods. F-strings can include not only symbol names but also Python expressions within strings. With str.format, you can write 'Hello, {name}'.format(name=some_name). You can control various aspects of how name is formatted, such as being centered within a field (see PyFormat and Python String Format Cookbook for examples), but no more complex expression is allowed between the braces.

Herewith some examples of f-string expressions drawn from PEP 0498:

>>> date = datetime.date(1991, 10, 12)
>>> f'{date} was on a 
…
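To make the contrast with str.format concrete, here is a small sketch for Python 3.6 or later. The variable names are illustrative, and the date example follows the same pattern as the truncated PEP excerpt above without claiming to reproduce it.

```python
import datetime

name = "World"

# str.format: only a name (plus a format spec) between the braces.
with_format = "Hello, {name}".format(name=name)

# f-string: the same result, reading the variable directly.
with_fstring = f"Hello, {name}"

# Format specs still work, e.g. centering within an 11-character field.
centered = f"[{name:^11}]"

# Arbitrary expressions are allowed between the braces.
shout = f"Hello, {name.upper()}! {6 * 7}"

# Format specs are passed to the object, so dates accept strftime codes.
date = datetime.date(1991, 10, 12)
weekday = f"{date} was on a {date:%A}"

print(with_fstring, centered, shout, weekday, sep="\n")
```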

Python Print Formatting

On StackOverflow, someone wanted to print triangles in Python in an M-shape. Various clumsy solutions were offered.

Here's mine, which uses the left- and right-justification features of str.format.

Putting them together:

>>> WIDTH = 4
>>>
>>> for a in range(1, WIDTH+1):
...     print("{0:<{1}}{0:>{1}}".format('*' * a, WIDTH))
...
*      *
**   
…
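Since the session above is truncated, here is a self-contained sketch of the same technique. The helper name m_rows is hypothetical. Each row formats the same run of stars twice: once left-justified ({0:<{1}}) and once right-justified ({0:>{1}}), with the field width supplied as the second argument.

```python
def m_rows(width):
    """Return the rows of an M-shape made of two mirrored triangles."""
    rows = []
    for a in range(1, width + 1):
        stars = '*' * a
        # {0:<{1}} pads stars on the right; {0:>{1}} pads on the left.
        rows.append("{0:<{1}}{0:>{1}}".format(stars, width))
    return rows


rows = m_rows(4)
print("\n".join(rows))
```

The nested {1} inside the format spec is what lets the width be passed as an argument rather than hard-coded.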

Explaining the epilog of fnmatch.translate, \Z(?ms)

I was debugging a filtering directory walker (on which, more to follow) and I was trying to figure out the mysterious suffix that fnmatch.translate appends to its result, \Z(?ms).

fnmatch.translate takes a Unix-style glob, like *.py or test_*.py[cod], and translates it character-by-character into a regular expression. It then appends \Z(?ms). Hence the latter glob becomes r'test\_.*\.py[cod]\Z(?ms)', using Python's raw string notation to avoid the backslash plague. Also, the ? wildcard character becomes the . regex special character, while the * wildcard becomes the .* greedy regex.

A StackOverflow answer partially explains it, which set me on the right track. (?ms) is equivalent to compiling the regex with re.MULTILINE | re.DOTALL. The re.DOTALL modifier makes the . special character match any character, including …
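The effect of DOTALL and of \Z can be checked directly. This is a small sketch rather than part of the original post. Newer Python 3 releases emit the flags in a different form than the \Z(?ms) suffix described here, so the example tests behavior instead of the exact pattern text.

```python
import fnmatch
import re

# Compile whatever pattern the running Python generates for the glob.
regex = re.compile(fnmatch.translate("*.py"))

# An ordinary filename matches.
matches_plain = regex.match("foo.py") is not None

# DOTALL lets the translated "*" span an embedded newline.
matches_embedded_newline = regex.match("foo\nbar.py") is not None

# The end-of-string anchor does not tolerate a trailing newline...
matches_trailing_newline = regex.match("foo.py\n") is not None

# ...whereas "$" (without MULTILINE) also matches just before one.
dollar_matches_trailing_newline = (
    re.match(r".*\.py$", "foo.py\n") is not None)

print(matches_plain, matches_embedded_newline,
      matches_trailing_newline, dollar_matches_trailing_newline)
```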

RunSnakeRun (wxPython) apps in a Brew Virtualenv

I'm doing some Python profiling and I wanted to use the RunSnakeRun utility to view the profile data. Unfortunately, that's not straightforward on Mac OS X if you use a virtualenv, and it's even less easy if you're using the Python installed by the Homebrew (brew) package manager.

There are several problems:

Installing wxPython

I downloaded wxPython3.0-osx-3.0.2.0-cocoa-py2.7.dmg, released in November 2014.

If you open the DMG and attempt to run the PKG, you will likely get a misleading error message from OS X:

“wxPython3.0-osx-cocoa-py2.7.pkg” …

Python Egg Cache

Every so often, one of our Bamboo builds would break thus:

pkg_resources.ExtractionError: Can't extract file(s) to egg cache

The following error occurred while trying to extract file(s) to the Python egg
cache:

  [Errno 17] File exists: '/home/bamboo/.python-eggs'

The Python egg cache directory is currently set to:

  /home/bamboo/.python-eggs

Perhaps your account does not have write access to this directory?  You can
change the cache directory by setting the PYTHON_EGG_CACHE environment
variable to point to an accessible directory.

This occurred while trying to make use of PyCrypto.

After a little research, I decided that instead of installing PyCrypto as a zipped egg (as it does by default) into the …
