Strings, Lists, Joins and Python
Although ruby is my first choice for scripting and web programming these days, I recently had a chance to dabble with python at work. Reading and making basic modifications to the python code was straightforward enough until I needed to join a list of strings together.
Suppose you have a list containing the strings ‘snakes’, ‘eat’, and ‘gems’ and you want to join them together with a space between each pair.
Because I’m familiar with ruby and it’s the closest reference point I had to python, I expected to join the strings like so:
['snakes', 'eat', 'gems'].join(' ')
This seems natural enough to me since it is the list that is being joined and the space is acting as a modifier of this operation. I was quite surprised when I was told the way to do the same operation in python is:
' '.join(['snakes', 'eat', 'gems'])
A colleague with more snake-wrangling experience than I argued that this is a more natural way to join lists since the join is essentially a string operation and so it’s more natural to think of the list elements as arguments. I was intrigued but sceptical so I decided to see what the good citizens of the web had to say about the matter.
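One way to make the colleague's point concrete is the following minimal sketch, which assumes Python 3 and is not taken from the original discussion. It shows that the separator's join method accepts any iterable of strings, not just a list; the variable names are only illustrative.

```python
separator = ' '

# str.join takes any iterable of strings: a list, a tuple or a generator.
from_list = separator.join(['snakes', 'eat', 'gems'])
from_tuple = separator.join(('snakes', 'eat', 'gems'))
from_generator = separator.join(word.lower() for word in ['Snakes', 'Eat', 'Gems'])

print(from_list)
print(from_tuple)
print(from_generator)
```

Because the method lives on the string, each of these container types gets joining for free without having to define its own join.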
A Temper Flares on Python-Dev
A language cannot live without a community who speak it and it cannot thrive without at least some of those people critically discussing how the language should evolve. Mailing lists are where this sort of discussion usually takes place and, sure enough, Google led me straight to the python-dev mailing list when I searched for a discussion of python’s join method.
Here I found someone called Zack Weinberg asking the same question I had in 2003: “How about seq.join(‘,’) where seq is an instance of a sequence type?” There was even a reply in the same thread:
PLEASE! GET THIS DISCUSSION OFF PYTHON-DEV!
NO MORE COMMENTS ON JOIN()!
—Guido van Rossum (home page: http://www.python.org/~guido/)
I had read enough about python to know that the author of this response was the inventor of python himself. I nearly fell off my chair! Here was the leader of a large language community shutting down a discussion in all-caps with exclamation marks ending every sentence.
I’ve recently been involved in developing a language and have some appreciation of how many difficult decisions have to be made. Anyone who can develop a language that is as wildly popular as python has my enduring admiration. This is partly the reason I was so shocked to see a response like this.
To be fair, I have no idea what is considered an appropriate topic for python-dev and maybe this question about join was way out of line. Also, after a bit more searching I found a FAQ answer about join that addresses my and Zack’s question in depth. Asking a question on a mailing list that is already answered in a FAQ is a definite no-no but so, in my opinion, is Guido’s shouting.
Arguments and Inverses
I was quite unconvinced by the arguments put forward by the aforementioned FAQ answer in that they basically boiled down to: 1) strings can be Unicode or ASCII and the type of the separator should determine the result of the join, and 2) the split function (the inverse of join) is a method on strings so join should be too.
I find neither of these convincing. The type of the separator can be used to determine the return type of the joined string even if it’s passed in as an argument. Sure, it’s probably easier for the implementers of python to have separate methods on the Unicode and ASCII versions of string than to query the separator’s type when it’s an argument. But the join method will be implemented once and used in thousands of programs. A small bit of ugliness in python’s implementation is surely a small price to pay for a more intuitive interface.
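As a rough illustration of the first point, here is a sketch of a ruby-style join whose result type is still decided by the separator. It assumes Python 3, where the two string flavours are str and bytes rather than Unicode and ASCII strings, and JoinableList is a hypothetical class invented for this example, not part of python.

```python
class JoinableList(list):
    """Hypothetical list subclass offering a ruby-style join method."""

    def join(self, sep):
        # Delegate to the separator's own join, so the separator's type
        # (str or bytes) determines the type of the result.
        return sep.join(self)


text_result = JoinableList(['snakes', 'eat', 'gems']).join(' ')
bytes_result = JoinableList([b'snakes', b'eat', b'gems']).join(b' ')

print(type(text_result).__name__, text_result)
print(type(bytes_result).__name__, bytes_result)
```

This sketch sidesteps the type query by delegating to the separator. A built-in version would presumably have to do the equivalent dispatch internally, which is the implementation cost mentioned above.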
Secondly, inverses should act like inverses. In ruby I can write:
'snakes eat gems'.split(' ').join(' ')
['snakes','eat','gems'].join(' ').split(' ')
and get back the original string or list. In python the equivalent code is:
' '.join('snakes eat gems'.split(' '))
' '.join(['snakes', 'eat', 'gems']).split(' ')
which doesn’t make the inverse relationship between split and join clear at all.
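For what it's worth, the round trip does hold in python even if the syntax hides it. The check below is a minimal sketch assuming Python 3, and it relies on no list element containing the separator itself; the variable names are illustrative.

```python
text = 'snakes eat gems'
words = ['snakes', 'eat', 'gems']

# split then join returns the original string
round_trip_text = ' '.join(text.split(' '))

# join then split returns the original list
round_trip_words = ' '.join(words).split(' ')

print(round_trip_text == text)
print(round_trip_words == words)
```

Note that calling split() with no argument collapses runs of whitespace, so it is not an exact inverse of ' '.join for strings that contain repeated spaces.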
Please keep in mind that the above analysis is based on the very limited experience I’ve had with python so I may be missing some crucial aspect of python that makes the defence of ' '.join(list) hold more water. I’d love to hear about it. However, I’m not the only one who thinks join is a wart on a language that is otherwise extremely well thought out.
Developing a language is hard and I don’t think there is a language in existence that gets everything right, mainly because “right” is inherently subjective. My intention in this post was not to come off as a language pedant (even though I may be). I know and use many languages and, depending on the problem being solved, like and dislike them for various reasons. This post came about because I found this little dive into python through the join method an interesting peek into the philosophy and culture of a programming language.
four comments
comp.lang.python is where questions of this sort should go, and you will likely get a reasonable response (although you probably won’t be satisfied with it, if you’re not satisfied with the FAQ). You will also probably get some flames, as you’re taking up people’s bandwidth (both real and mental), since the reason why is well-explained in the FAQ, even if you don’t agree with it.
I don’t like the FAQ answer either, nor how Python handles this, nor that some functions in the string module (capwords, etc.) aren’t string methods.
I’ve been using Python for almost 10 years, and it’s still my favorite language. But I still don’t like everything about it.