Monthly Archives: August 2020

Code Golfing in Python (1)

Code Golf

I spend way too much time playing Clash of Code on CodinGame as you can see in the “featured image” above. I still haven’t caught up with Haskell, and I struggle to stay above Allis. Both are really strong, and seem to know Ruby very well!

My favorite mode is “Shortest”. The goal is to write a program against a given specification within 15 minutes. But, unlike the “Fastest” and “Reverse” modes, the main criterion is the number of characters in the submitted code.

So, if the problem is to write a program that asks for an integer n and then prints n! (the factorial of n), you are not going to submit the following snippet (69 characters):

from math import factorial
n = int(input())
f = factorial(n)
print(f)

At the very least, you are going to remove the extraneous spaces and avoid unnecessary declarations (47 characters):

import math
print(math.factorial(int(input())))

Or you might notice that importing a module and calling a method with a long name costs you many characters and that you are better off implementing it yourself (45 characters):

n=int(input())
p=1
while n:p*=n;n-=1
print(p)

This is less readable and less efficient, but it is terser. And that’s all that count!

Note: to be more specific, if two people get the same character count for their submissions, the one who submitted it first gets the advantage; however, this seldom happens

Uglifying Python

I am now going to list some of the techniques I commonly use when code golfing in Python.

1. Remove Extraneous Spaces

There is no need to put spaces after or before parentheses, brackets, quotes. You do not need spacing after a number either. Assume you are given an input of the form:

n
a b c d e…

And your program should print True if n+1 is among a, b, c, d, e… and False otherwise. If you follow the PEP 8, you could write:

print(int(input()) + 1 in [int(x) for x in input().split()])

Instead, we can do:

print(int(input())+1in[int(x)for x in input().split()])

Note that no space is needed between 1 and the keyword in.

2. Reduce Indentation

Indentation is semantic in Python. But there is no point in putting more than one space per level of indentation (88 characters):

t=0
for i in range(w):
    for j in range(h):
        t+=i+j
        print(i+j)
print(t)

This could just as well be written as (73 characters):

t=0
for i in range(w):
 for j in range(h):
  t+=i+j
  print(i+j)
print(t)

Better yet, since there are no further blocks (if, for, while) in the inner loop, we can put it in a single line (68 characters):

t=0
for i in range(w):
 for j in range(h):t+=i+j;print(i+j)
print(t)

Note: combining several lines of code with a semi-colon is normally useless, since we save the newline character but it pay the price of the semi-colon; however, when an indented block, we also save the characters of the indent, which makes it worth it to put several statements on a single line.

But it’s even better if we can get rid of the indent altogether (63 characters):

print(sum(print(i+j)or i+j for i in range(w)for j in range(h)))

Note: here, I combined the print statement with the elements of the same by exploiting the fact and print always returns None, which is falsy (that is, it evaluates as False when evaluated as a boolean), which means that the expression will always evaluate as i+j after printing the value.

3. Exploit Iterables

Earlier, I wrote the following to read a list of integers from the standard input (32 characters):

[int(x)for x in input().split()]

This is a list comprehension, which has the advantage of being easy to read quickly once you understand the syntax. But, before PEP 202, it was already possible to do it without a loop block, using the functional-style map() and filter() built-in methods. Both take as argument a method and an iterable, and apply the method to each element of the iterable. map() simply returns the result of this evaluation, while filter() returns the original element if the result is truthy, otherwise, it skips it. I have rarely found a use for filter() while code golfing, but map() lets me write the following (24 characters):

map(int,input().split())

This expression evaluates to a generator of integers corresponding to the numbers in the line. So, if all you want is their sum, you could just write (29 characters):

sum(map(int,input().split()))

Note that, if you need to iterate over the integers several times, or if you need to access them by index, then you need to exhaust the iterable into a list. It is still shorter than a list comprehension however (30 characters):

list(map(int,input().split()))

But many things are iterable in Python. If you need to count the number of unique characters of a string s, you can just write (11 characters):

len(set(s))

In fact, you can exploit it for when you want to iterate a fixed number of times and do not care about the loop index. For instance, instead of writing (40 characters):

for _ in range(99):print(int(input())%2)

You could just as well write (36 characters):

for i in[0]*99:print(int(input())%2)

Note: there used to be a reduce(), method in the built-ins of Python, that would let you aggregate a range in the fashion of sum(), max(), min(), any(), all(). With it, you could easily have computed the product of the elements of an iterable. However, it was moved into the functools module in Python 3 because comprehensions are deemed more Pythonic.

4. Use Shorthands

Suppose you need to read a single string, then a list of string (their count on a line, then a string per line), and count how many of the strings contain the initial string. You could do (58 characters):

s=input()
print(sum(s in input()for _ in[0]*int(input())))

Note: the input() of [0]*int(input()) will be evaluated before that of s in input() since it is necessary to start the (generator) comprehension in the first place.

Note: s in input() is a boolean expression but, in Python, True is equal to 1 and False is equal to 0; thus, we can easily count the number of time a predicate is met.

Now, you might notice that we have written input thrice. Since Python methods and classes are themselves objects, we can affect them to a variable and write (54 characters):

I=input
s=I()
print(sum(s in I()for _ in[0]*int(I())))

We save 4 characters for each call to input and assigning it to I costs us 8 characters, so we save 4 characters here. We would break even if there were only two calls to input. But it allows us another neat (horrible) trick (50 characters):

I=input
s=I()
I(sum(s in I()for _ in[0]*int(I())))

Seriously. Here, we are exploiting the fact that input takes an optional position argument, which will be used as the prompt for the user. For instance, if you are asking the name of the user, you would write: name=input('Name: '), which is essentially equivalent to print('Name: ', end='');name = input(). Since there is nothing more to read, the program will crash, but not before having written the result on the standard input.

Note: this trick only works as the final output of a program (you cannot use it several times in a loop for instance), and you need to give it a single argument (you might need to format the output first).

5. Use f-strings

I referred to the idea of passing several arguments to print at the end of the previous section. This is something very handy, even outside code golf, when you’d like to print several values separated by spaces:

x = 4
y = 6
name = 'Hector'
print(x, y, name)

This will result in printing 4 6 Hector to the standard output. You can easily apply it to an enumerable:

l = [1, 2, 3, 4]
print(*l)
# Output:
# 1 2 3 4

If you prefer to separate the elements with a comma, you can use an optional keyword parameter of print:

l = [1, 2, 3, 4]
print(*l, sep=',')
# Output:
# 1,2,3,4

However, if you have a generator comprehension that outputs strings, you could save writings extraneous parentheses with str.join():

l = [1, 2, 3, 4]
print(*(chr(96+i)for i in l),sep=',')
# versus
print(','.join(chr(96+i)for i in l))
# Output:
# a,b,c,d

Where are the f-strings you promised in all this?!

A reader annoyed by my prolixity

Well, at some point, you will want to output something in a particular manner, or you will need to format a string to pass it as a single argument to input() (60 characters):

print('Values are x= '+str(x)+' and y='+str(y)+' for '+name)

Or, using %-formatting (51 characters):

print('Values are x=%d and y=%d for %s'%(x,y,name))

This is advantageously rewritten with an f-string as (47 characters):

print(f'Values are x={x} and y={y} for {name}')

Note: since Python 3.8 (CodinGame uses 3.7.4 as of writing this article), you can write print(f'Values are {x=} and {y=} for {name}'), saving two characters.

But f-string are much more than string interpolation. They are %-formatting on steroids. So, you can still pad left or right, with spaces or zeroes, toggle positive sign, manage decimal precision, convert to hexadecimal (lower or upper case). But you can also convert to octal and decimal, center strings and call repr(). So instead of the well-known way to convert to binary (10 characters):

bin(x)[2:]

You can now write (8 characters):

f'{x:b}'