Author: E. Kao. Download the program here!
3-Dimensional Tic-Tac-Toe is derived from 2-Dimensional Tic-Tac-Toe (2D TTT), a very simple game in which a player wins by making a line of three stones. The number of possible combinations in 2D TTT is so limited that hard-coding all of them yields a practically unbeatable AI. There are only 8 lines on the 3x3 board, and players often cannot complete even a single line by the end of the game. Extending the 3x3 board to a 3x3x3 cube makes the game much more exciting. With 49 possible lines (including the diagonals of the crossing planes), players often achieve more than 5 lines by the end of the game. The objective is no longer just to get a line but to get multiple lines. Black moves first, so to compensate, the player who plays black in the first game plays white in the second. In other words, a match consists of two games, and the winner is the player with the higher total score across both games. A game ends automatically when all 27 spots have been played.
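The line counts quoted above (8 on the 3x3 board, 49 in the 3x3x3 cube) can be checked by brute force. The sketch below is only an illustration and is not taken from the original program; the function name `winning_lines` and the coordinate representation are assumptions.

```python
from itertools import product

def winning_lines(n=3, dims=3):
    """Return every straight line of n cells in an n^dims board.

    Hypothetical helper: cells are coordinate tuples, and each line is a
    frozenset of cells, so a line and its reverse count only once.
    """
    lines = set()
    # A direction is any non-zero vector with components in {-1, 0, 1}.
    for direction in product((-1, 0, 1), repeat=dims):
        if not any(direction):
            continue
        for start in product(range(n), repeat=dims):
            cells = [tuple(s + k * d for s, d in zip(start, direction))
                     for k in range(n)]
            # Keep the line only if all n cells stay on the board.
            if all(0 <= c < n for cell in cells for c in cell):
                lines.add(frozenset(cells))
    return lines

print(len(winning_lines(3, 2)))  # 2D board
print(len(winning_lines(3, 3)))  # 3D cube
```

The two printed counts should match the 8 and 49 given in the text.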
The first thing to do when writing a game AI is to ask what thinking process
a human player goes through when playing the game. Normally, we come up with several strategies
for any game. The next step is to express these strategies as subprograms (functions).
The score in 3-Dimensional Tic-Tac-Toe is the number of lines achieved. Thus, the strategy
of the game is to make as many lines as possible while preventing the opponent from
making lines. So the best spots for the next move are those that complete a line or add
to the potential of making one, and those that block the opponent's (potential) line. When
a human player is playing the game, he or she compares different spots and
chooses the best one to play. This amounts to evaluating the priority of different
spots. To simulate this thinking process in the AI, we give a priority score to each spot and let
the AI play the spot that receives the highest score. So how do we assign the scores? According to the
strategies mentioned above, a spot that helps make lines or prevents the opponent from making lines
should receive a high priority score. To express it more specifically, let's look at the following example:
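A minimal sketch of such a priority evaluation is given here. It is an illustration only: the five cases, their names, and the point values in `WEIGHTS` are assumptions made for this sketch and are not the original program's cases or numbers. Each empty spot is scored by summing, over every line through it, the points of the case that line falls into.

```python
from itertools import product

# Hypothetical cases and point values; not taken from the original program.
WEIGHTS = {
    "complete_own": 20,     # two own stones already in the line
    "block_opponent": 15,   # two opponent stones already in the line
    "extend_own": 3,        # one own stone, the other cell empty
    "hinder_opponent": 2,   # one opponent stone, the other cell empty
    "open_line": 1,         # both other cells empty
}

def all_lines(n=3):
    """Enumerate all straight lines of the n x n x n cube."""
    lines = set()
    for d in product((-1, 0, 1), repeat=3):
        if any(d):
            for s in product(range(n), repeat=3):
                cells = [tuple(a + k * b for a, b in zip(s, d)) for k in range(n)]
                if all(0 <= c < n for cell in cells for c in cell):
                    lines.add(frozenset(cells))
    return lines

LINES = all_lines()

def classify(line, spot, board, me, opponent):
    """Name the case a line falls into if `me` plays `spot`, or None."""
    others = [board.get(c) for c in line if c != spot]
    mine, theirs = others.count(me), others.count(opponent)
    if mine and theirs:
        return None  # mixed line: nobody can complete it
    if mine == 2:
        return "complete_own"
    if theirs == 2:
        return "block_opponent"
    if mine == 1:
        return "extend_own"
    if theirs == 1:
        return "hinder_opponent"
    return "open_line"

def priority(spot, board, me, opponent, weights=WEIGHTS):
    """Sum the points of every line passing through an empty spot."""
    total = 0
    for line in LINES:
        if spot in line:
            case = classify(line, spot, board, me, opponent)
            if case is not None:
                total += weights[case]
    return total

def best_move(board, me, opponent, weights=WEIGHTS):
    """Pick the empty spot with the highest priority score."""
    empty = [c for c in product(range(3), repeat=3) if c not in board]
    return max(empty, key=lambda c: priority(c, board, me, opponent, weights))

# Usage: black ("B") already holds two cells of one line.
board = {(0, 0, 0): "B", (0, 0, 1): "B"}
print(best_move(board, "B", "W"))
```

Here the board is a dictionary from occupied cells to stone colors; passing a different `weights` dictionary changes how the same position is evaluated.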
The number of points given for each of the 5 possibilities is constant in this version, but
the design makes it convenient to modify these parameters. Modifying
the parameters changes how priority is scored, so the AI will play
differently from before. In other words, the AI "learns" when the five score parameters are
modified. So when should the AI increase or decrease a parameter, and what is the logic behind the
modification?
Increasing the parameters for the aggressive possibilities, or decreasing the parameters for the defensive possibilities, makes the AI play more aggressively (valuing making lines more than blocking lines). The parameters should be tuned in this direction when the AI achieves too few lines in a match. Conversely, when the human player achieves too many lines in a match, the parameters should be tuned in the opposite direction. This kind of parameter modification is a very common form of learning.
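The tuning rule described above could be sketched as follows. This is an assumption-laden illustration, not the original program's update rule: the parameter names, the thresholds `too_few` and `too_many`, and the step size are all hypothetical, since the text does not say how many lines count as "too few" or "too many".

```python
# Hypothetical grouping of score parameters into aggressive and defensive ones.
AGGRESSIVE = ("complete_own", "extend_own")
DEFENSIVE = ("block_opponent", "hinder_opponent")

def tune(weights, ai_lines, human_lines, too_few=6, too_many=6, step=1):
    """Return adjusted weights after a match (two games).

    ai_lines and human_lines are the total lines each side achieved.
    The thresholds and step are assumed values for illustration.
    """
    new = dict(weights)  # leave the caller's weights untouched
    if ai_lines < too_few:
        # The AI made too few lines: value making lines more.
        for name in AGGRESSIVE:
            new[name] += step
    if human_lines > too_many:
        # The human made too many lines: value blocking more.
        for name in DEFENSIVE:
            new[name] += step
    return new

weights = {"complete_own": 20, "block_opponent": 15,
           "extend_own": 3, "hinder_opponent": 2, "open_line": 1}
print(tune(weights, ai_lines=3, human_lines=4))
```

Returning a new dictionary rather than modifying the old one makes it easy to compare how the AI plays before and after a tuning step.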