Programming Puzzle: Ticket to Ride First Journey
My son’s new favorite game is called Ticket to Ride First Journey, which is a junior version of the full game Ticket to Ride. The goal is to connect cities with train tracks to complete a certain number of routes the fastest. My son had the obvious question of how to finish the game the fastest. Before we could tackle that problem, however, we first looked at how to connect all the cities using the least number of tracks. This turned out to be a perfect opportunity to study the minimum spanning tree problem:
The minimum spanning tree problem is related to graphs, which are sets of nodes (or vertices) that are connected by edges, each of which has a weight. The MST problem is to find a set of edges that connect all the nodes in the graph while having the lowest sum of weights over those edges. The Ticket to Ride board is a graph where the vertices are cities, the edges are the routes between them, and the weights are the number of track slots between each city. So a minimum spanning tree on this board is a route that connects every city while using the least number of tracks to do so.
Any route that visits every node on a graph once is a spanning tree; the minimum spanning tree is the one that uses the least weights or length to do so. This obviously has huge practical applications—just the fact that our motivating example is of international rail network should show how broad. But it’s easy to come up with a bad spanning tree. Take this one:
By connecting many nodes in inefficient ways, this path uses 50 track slots to connect all the cities. Obviously we need an algorithmic solution, but the MST problem doesn’t yield to the simple depth- and breadth-first search algorithms that we examined before. Instead we’re going to have to use a special graph traversal algorithm.
Prim’s Algorithm
There are several fancy solutions to the MST problem, but the one I’m most familiar with is Prim’s algorithm. It’s a basic greedy algorithm that starts from one node in the graph (here one city) and searches progressively outwards one node at a time, examining all the edges that connect to that new node and using them to adjust the route of the minimum spanning tree. Wikipedia provides handy pseudocode:
This may look complicated, but we’re going to implement it by directly translating it line-for-line into Python.
Building the Graph
Before we can use Prim’s algorithm, however, the first—and most tedious—step of the problem is initializing the data structures we’re going to use. For a graph problem you’re generally going to need three things:
- The list of vertices
- The list of edges
- The graph represented in memory
Working in Python, I’m going to use string labels for each vertex, which I’ll initialize with constants:
ALBUQUERQUE = 'Albuquerque'
ATLANTA = 'Atlanta'
CALGARY = 'Calgary'
...
WINNIPEG = 'Winnipeg'VERTICES = [
ALBUQUERQUE,
ATLANTA,
CALGARY,
...
WINNIPEG,
]
Warning: DO NOT initialize the vertices like this:
VERTICES = [
'Albuquerque',
'Atlanta',
'Calgary',
...
'Winnipeg',
]
Using string literals means you’re going to have to retype the strings every time you want to use them, and it’s very easy to misspell the word “Albuquerque”—which will introduce a silent bug into your program. If you instead misspell the constant name, ALBUQUERQUE
, however, the interpreter will throw an error and prompt you to fix it.
Next we list the edges as tuples of (CITY_A, CITY_B, WEIGHT)
:
EDGES = [
(SEATTLE, CALGARY, 3),
(SEATTLE, HELENA, 2),
(SEATTLE, SALT_LAKE_CITY, 3),
...
(SAN_FRANCISCO, LOS_ANGELES, 1),
]
The edges are bidirectional, so the order of the city names doesn’t matter; we just need to remember that city B also connects back to city A.
Finally we build a representation of the graph in memory. In Python, we can make a dictionary with the city names as the keys, and the values will show that city’s connections to all other cities. These values will themselves be more dictionaries with the connected city names as the keys and the weight of the connections as the values:
graph = {v: {} for v in VERTICES}
for vertex1, vertex2, cost in EDGES:
graph[vertex1][vertex2] = cost
graph[vertex2][vertex1] = cost
A quick sanity check on this code is to verify that you have used both the VERTICES
and the EDGES
data structures to build the graph
. Also, since we have bidirectional edges, we have to connect vertex1
to vertex2
and vice versa. Once constructed, the "Seattle"
component of the graph map will look like this (this can be crosschecked with the image of the board above):
"Seattle": {
"Calgary": 3,
"Helena": 2,
"Salt Lake City": 3,
"San Francisco": 3
},
Implementing the Algorithm
With all of our initialization done, we can start implementing Prim’s algorithm from the first step:
1. Associate with each vertex v of the graph a number C[v] (the cheapest cost of a connection to v) and an edge E[v] (the edge providing that cheapest connection). To initialize these values, set all values of C[v] to +∞ (or to any number larger than the maximum edge weight) and set each E[v] to a special flag value indicating that there is no edge connecting v to earlier vertices.
The variable v refers to a vertex, for which we can use our string labels in VERTICES
. To represent the weights in C and E we can use dictionaries where the keys are vertex labels. The pseudocode also tells us to initialize C
to all positive infinites, and E
to a flag value, for which we’ll use None
:
C = {v: float('Inf') for vin VERTICES}
E = {v: None for vin VERTICES}
Surprising but that’s it. The longest step in the pseudocode was only two short lines of initialization.
Now the second step:
2. Initialize an empty forest F and a set Q of vertices that have not yet been included in F (initially, all vertices).
The “empty forest” just means another set of unconnected nodes, so a set. We can model both of these as set
s in Python. Q
starts off empty, but F
starts off initialized with all vertices:
F = set()
Q = set(VERTICES)
Again, dense text that turns into very simple code.
Moving on to the third step:
Repeat the following steps until Q is empty:
— a. Find and remove a vertex v from Q having the minimum possible value of C[v]
— b. Add v to F
— c. Loop over the edges vw connecting v to other vertices w. For each such edge, if w still belongs to Q and vw has smaller weight than C[w], perform the following steps:
—— i.Set C[w] to the cost of edge vw
—— ii. Set E[w] to point to edge vw.
We want to find the vertex with minimum cost value in C
, which we’ll accomplish with a sort. Then we remove that vertex from Q
and add it to F
(marking the vertex as visited). Finally we check the edges connecting to this newly visited vertex and use any lower edge weights it has to alter the minimum spanning tree:
while Q:
v = list(sorted([(C[v], v) for v in Q]))[0][1]
Q.remove(v) F.add(v) for w, vw in graph[v].items():
if w in Q and vw < C[w]:
C[w] = vw
E[w] = v
I’ve used a little idiomatic Python here, but for the most part it matches one-to-one with the pseudocode. Here’s even a color-coded version showing the exact correspondences between the pseudocode and the real code:
If we wanted to make the code more readable, we could just substitute the algorithm’s single-letter variable names with more-informative ones:
while Q:
vertex_cheapest = list(sorted([(C[v], v) for v in Q]))[0][1]
Q.remove(vertex_cheapest) F.add(vertex_cheapest) for vertex_w, weight_vw in graph[vertex_cheapest].items():
if vertex_w in Q and weight_vw < C[vertex_w]:
C[vertex_w] = weight_vw
E[vertex_w] = vertex_cheapest
Running It
For a small graph of only 19 vertices like this, Prim’s algorithm runs instantaneously. The results:
>>> print(json.dumps(E, indent=2))
{
"Albuquerque": null,
"Atlanta": "Chicago",
"Calgary": "Helena",
"Chicago": "Kansas City",
"Dallas": "Albuquerque",
"Denver": "Albuquerque",
"Duluth": "Chicago",
"Helena": "Denver",
"Kansas City": "Dallas",
"Los Angeles": "Salt Lake City",
"Miami": "Atlanta",
"Montreal": "New York",
"New Orleans": "Atlanta",
"New York": "Washington",
"Salt Lake City": "Helena",
"San Francisco": "Los Angeles",
"Seattle": "Helena",
"Washington": "Atlanta",
"Winnipeg": "Duluth"
}
For a graph with 19 vertices, there should only be 18 edges needed to connect them into a minimum spanning tree. That’s why we have 18 string-to-string correspondences and one null
(which was a None
in Python before the JSON conversion). Plotting it, our MST is this:
If we count manually this uses only 25 tracks—far better than the 50 we started out with. But we can also have the computer count the relevant edge weights just to be sure:
>>> print(sum(graph[v1][v2] for v1, v2 in E.items() if v2))
25
Summary
Graph problems may seem intimidating, but they have simple underpinnings and many practical applications. The takeaways from this exercise are:
- Every graph problem starts with a set of vertices and a list of edges.
- Building the graph in memory makes solutions easier to code.
- Graph algorithms usually process nodes one-by-one, keeping track of them in sets (especially
visited
andunvisited
sets). - As long as you have good
VERTICES
,EDGES
, andgraph
data structures, many pseudocode algorithms have straightforward implementations.
Completed code available here.