Recently I wrote a little program that plays Dots and Boxes while maximizing its chances at a perfect score. I was inspired by mneoneo's expectimax optimization for the game 2048 [1]. Using my bot, I was able to post perfect scores in several popular online versions of Dots and Boxes (Figure 1).
Fig. 1: Scores for 8x8 (64-box) and 14x7 (98-box) versions of the game |
Background
Dots and Boxes is a popular paper and pencil game in which players alternate connecting dots in a rectangular grid using horizontal or vertical line segments. If a player claims a box by adding the fourth line segment in a square, then she must move again. The goal is to obtain more boxes than the opponent.
One interesting way to play the game is to seek to win every single box. Against most players, even those who have never played before, such a feat is near impossible on sufficiently large boards. However, against a perfectly random player (whose moves follow no pattern), there is a non negligible probability of achieving such a win, when using the right strategy. That was the goal of my program.
The version of the game that will be used for demonstrative purposes in this post can be found here. Figure 2 shows what the board looks like when the game starts.
Fig. 2: The starting board: a 14x7 grid of unfilled boxes |
Methods
The Dots game board can be represented as a Graph. Each box, dot, and line is represented as a vertex, and each vertex has edges to the box, dot, or line located to its North, East, South, and West. The line vertices are progressively marked as "closed" as the game progresses. This representation enables a straightforward implementation of depth first search, which forms a crucial part of my program's path finding algorithm.
In order to obtain a perfect score in Dots, a player must create a "chain" of boxes that visits every square of the board. In a chain, each box has exactly two closed sides, and shares its two open sides with the adjacent box(es) in the chain. When the other player moves, he or she will be forced to add a third edge to a box belonging to the chain, leading to a cascade of moves that will grant the winning player every single box on the board.
There are three types of such chains:
1. Open chain
An open chain consists of a non-cyclical Hamiltonian Path; a chain that visits every box but with two distinct endpoints (Figure 3)
Fig. 3: An example of an open chain |
In addition, each box must have two closed sides, so this leaves 2WH - (2W + 2H - 2) sides that must be closed. However, non-boundary sides are shared by two boxes, so filling these sides would take half the number of moves: WH - (W + H - 1)
Adding these two numbers together gives a total of WH + W + H - 1 moves.
This number is odd if W and H are both even, and even otherwise.
2. Closed chain
A closed chain consists of a Hamiltonian Cycle; a cyclical chain that visits every box (Figure 4)
Fig. 4: An example of a closed chain |
This number is even if W and H are both even, and odd otherwise.
3. Closed chain with tail
The last type of chain is a bit of an oddity: it consists of a closed chain, but with an open tail (Figure 5). As a result, the player whose turn it is still has an escape; she can fill in a line in the tail portion, giving away only the boxes in the tail. If, however, she has the misfortune of playing within the closed chain, every single box will be lost.
Fig. 5: An example of a closed chain with a tail. The closed chain is in green, the tail is in red. |
This time, one of the boxes only has one side filled in; the one located at the junction between the closed chain and the tail. As a result, the number of moves required for this kind of configuration is the same as for an open chain: WH + W + H - 1.
For a grid where both the width and height are odd, it is impossible to win a perfect score, because one cannot create a closed chain in a grid with odd height and width, so the only available chains take an even number of moves to create.
For a grid where both the width and the height are even, one must create an open chain, or a closed chain with a tail, in order to win a perfect score.
For a grid whose dimensions add up to an odd number (such as a 14x7 grid, where 14+7 = 21), one must create a closed chain. The parity of the moves of any other ending configuration would require the first player to set off the final chain, resulting in a loss with 0 points (!) Here is an example of one such configuration:
Since the version of the game being used does indeed have a 14x7 grid, my optimization algorithm had to check for available closed chains. I designed and implemented my own polynomial time algorithm to check for Hamiltonian Cycles in Grid Graphs, though others have already been discovered [2].
My bot used a Monte Carlo Tree Search with a depth of four to discover the best move to play at each stage of the game. The best set of four moves was defined as the least "risky" - that is the one that minimized the computer's likelihood of playing a move that ruined chances of a closed chain being created. If such a "game-ending" move (Figure 6) was played, the bot would restart the game and try again.
Risk was simply calculated as the ratio of legal game-ending moves to legal moves. Legal moves are defined as open lines that are adjacent to boxes with at most one closed side.
My bot used a Monte Carlo Tree Search with a depth of four to discover the best move to play at each stage of the game. The best set of four moves was defined as the least "risky" - that is the one that minimized the computer's likelihood of playing a move that ruined chances of a closed chain being created. If such a "game-ending" move (Figure 6) was played, the bot would restart the game and try again.
Fig. 6: An example of a game-ending move. If this move is played at the start, there is no way to create a closed chain. |
Results
When the final algorithm runs, it's able to create closed chains vs. a random opponent, as captured by this sequence of game screenshots:
Over 1000 simulations, a closed chain occurred 9 times. We can thus estimate the likelihood of such a chain occurring at any given iteration of the game to be 0.90% +/- 0.03%.
Since we're using a 14x7 board, this means the game must be able to reach 14x7 + 14 + 7 = 119 moves. Figure 7 shows a break down of how often the game ended on each given move during the 1000 simulations.
Fig. 7: How often each move resulted in a restarted game |
There's a mode at move 49, which can be explained by increasing risk as the game moves on (Figure 8).
Fig. 8: The average risk of a game-ending move being played at each move |
The riskiest move, on average, was move 76. This means that once the bot played, it was the move at which the computer had the highest chance of playing a game-ending move. Most of the time, the game ends before this move is ever reached. As Figure 9 illustrates, move 76 was only reached 40 times.
Fig. 9: How often each move was reached |
But by the time move 76 is reached, even though the risk of failure at each move is high, the bot already has on average a one in four chance of reaching a closed chain. As evidenced by the flattening at around move 100, there eventually comes a point at which reaching move 118 - and thus the closed chain - becomes a shoo-in.
References
2. Umans, C. Lenhart, W. "Hamiltonian Cycles in Solid Grid Graphs" IEEE 1997
No comments:
Post a Comment