Jag kommer att implementera en minimax-algoritm med alfa-beta-beskärning och avklippning i denna handledning. Denna algoritm kommer att implementeras i Python och jag skapar en AI-agent som spelar Tre-i-rad och liknande spel mot dig eller någon annan människa.
Minimax används för att lösa kontradiktoriska sökproblem där agenternas mål är i konflikt, detta är fallet för de flesta spel. I ett spel med två spelare antar algoritmen att en spelare vill maximera (MAX) spelets poängsumma och att den andra spelaren vill minimera (MIN) spelets poängsumma. Algoritmen kan utökas till att omfatta fler än två spelare.
Minimax utvärderar alla möjliga drag utifrån det aktuella speltillståndet och kopierar sedan slutliga poängvärden upp till det aktuella tillståndet i spelträdet. Algoritmen antar att båda agenterna spelar optimalt, MAX-spelaren kommer att välja det drag som ger högst poäng och MIN-spelaren kommer att välja det drag som ger lägst poäng. I ett Tre-i-rad-spel med 3×3 rutor finns det 362 880 (9!) olika tillstånd att undersöka när spelet börjar.
Alpha-beta-beskärning och avklippning används för att minska söktiden för minimax-algoritmen. Alpha-beta-beskärning innebär att vi kan ignorera noder som aldrig kommer att väljas med tanke på den information vi redan har, båda spelarna antas spela optimalt. Avklippning innebär att vi kan använda en snabb heuristisk utvärderingsalgoritm när vi har nått ett visst djup i minimax-sökningen. Avklippning används i de tidiga faserna av ett spel, när det finns ett stort antal möjliga drag.
Tre-i-rad och liknande spel
Den här koden kan användas för Tre-i-rad, Fyra-i-rad och liknande spel. AI-spelaren använder en minimax-algoritm för att fatta beslut när det är hennes tur, en mänsklig spelare får en rekommendation inför varje drag. En utvärderingsfunktion (get_best_move
) används vid avklippning, utvärderingsfunktionen använder vinnande positioner och heuristik för att fatta beslut. Ett Fyra-i-rad-spel skulle vara omöjligt att spela utan utvärderingsfunktionen.
# Import libraries
import sys
import random
# This class represent a tic tac to game
class TicTacToeGame:
# Create a new game
def __init__(self, rows:int, columns:int, goal:int, max_depth:int=4):
# Create the game state
self.state = []
self.tiles = {}
self.inverted_tiles = {}
tile = 0
for y in range(rows):
row = []
for x in range(columns):
row += '.'
tile += 1
self.tiles[tile] = (y, x)
self.inverted_tiles[(y, x)] = tile
self.state.append(row)
# Set the number of noughts and crosses in a row that is needed to win the game
self.goal = goal
# Create vectors
self.vectors = [(1,0), (0,1), (1,1), (-1,1)]
# Set lengths
self.rows = rows
self.columns = columns
self.max_row_index = rows - 1
self.max_columns_index = columns - 1
self.max_depth = max_depth
# Heuristics for cutoff
self.winning_positions = []
self.get_winning_positions()
# Set the starting player at random
#self.player = 'O'
self.player = random.choice(['X', 'O'])
# Get winning positions
def get_winning_positions(self):
# Loop the board
for y in range(self.rows):
for x in range(self.columns):
# Loop vectors
for vector in self.vectors:
# Get the start position
sy, sx = (y, x)
# Get vector deltas
dy, dx = vector
# Create a counter
counter = 0
# Loop until we are outside the board
positions = []
while True:
# Add the position
positions.append(self.inverted_tiles.get((sy, sx)))
# Check if we have a winning position
if (len(positions) == self.goal):
# Add winning positions
self.winning_positions.append(positions)
# Break out from the loop
break
# Update the position
sy += dy
sx += dx
# Check if the loop should terminate
if(sy < 0 or abs(sy) > self.max_row_index or sx < 0 or abs(sx) > self.max_columns_index):
break
# Play the game
def play(self):
# Variables
result = None
# Create an infinite loop
print('Starting board')
while True:
# Draw the state
self.print_state()
# Get a move from a player
if (self.player == 'X'): # AI
# Print AI move
print('Player X moving (AI) ...')
# Get the best move
max, py, px, depth = self.max(-sys.maxsize, sys.maxsize)
# Get a heuristic move at cutoff
print('Depth: {0}'.format(depth))
if(depth > self.max_depth):
py, px = self.get_best_move()
# Make a move
self.state[py][px] = 'X'
# Check if the game has ended, break out from the loop in that case
result = self.game_ended()
if(result != None):
break
# Change turn
self.player = 'O'
elif (self.player == 'O'): # Human player
# Print turn
print('Player O moving (Human) ...')
# Get a recommended move
min, py, px, depth = self.min(-sys.maxsize, sys.maxsize)
# Get a heuristic move at cutoff
print('Depth: {0}'.format(depth))
if(depth > self.max_depth):
py, px = self.get_best_move()
# Print a recommendation
print('Recommendation: {0}'.format(self.inverted_tiles.get((py, px))))
# Get input
number = int(input('Make a move (tile number): '))
tile = self.tiles.get(number)
# Check if the move is legal
if(tile != None):
# Make a move
py, px = tile
self.state[py][px] = 'O'
# Check if the game has ended, break out from the loop in that case
result = self.game_ended()
if(result != None):
break
# Change turn
self.player = 'X'
else:
print('Move is not legal, try again.')
# Print result
self.print_state()
print('Winner is player: {0}'.format(result))
# An evaluation function to get the best move based on heuristics
def get_best_move(self):
# Create an heuristic dictionary
heuristics = {}
# Get all empty cells
empty_cells = []
for y in range(self.rows):
for x in range(self.columns):
if (self.state[y][x] == '.'):
empty_cells.append((y, x))
# Loop empty positions
for empty in empty_cells:
# Get numbered position
number = self.inverted_tiles.get(empty)
# Loop winning positions
for win in self.winning_positions:
# Check if number is in a winning position
if(number in win):
# Calculate the number of X:s and O:s in the winning position
player_x = 0
player_o = 0
start_score = 1
for box in win:
# Get the position
y, x = self.tiles[box]
# Count X:s and O:s
if(self.state[y][x] == 'X'):
player_x += start_score if self.player == 'X' else start_score * 2
start_score *= 10
elif (self.state[y][x] == 'O'):
player_o += start_score if self.player == 'O' else start_score * 2
start_score *= 10
# Save heuristic
if(player_x == 0 or player_o == 0):
# Calculate a score
score = max(player_x, player_o) + start_score
# Update the score
if(heuristics.get(number) != None):
heuristics[number] += score
else:
heuristics[number] = score
# Get the best move from the heuristic dictionary
best_move = random.choice(empty_cells)
best_count = -sys.maxsize
for key, value in heuristics.items():
if(value > best_count):
best_move = self.tiles.get(key)
best_count = value
# Return the best move
return best_move
# Check if the game has ended
def game_ended(self) -> str:
# Check if a player has won
result = self.player_has_won()
if(result != None):
return result
# Check if the board is full
for y in range(self.rows):
for x in range(self.columns):
if (self.state[y][x] == '.'):
return None
# Return a tie
return 'It is a tie!'
# Check if a player has won
def player_has_won(self) -> str:
# Loop the board
for y in range(self.rows):
for x in range(self.columns):
# Loop vectors
for vector in self.vectors:
# Get the start position
sy, sx = (y, x)
# Get vector deltas
dy, dx = vector
# Create counters
steps = 0
player_x = 0
player_o = 0
# Loop until we are outside the board or have moved the number of steps in the goal
while steps < self.goal:
# Add steps
steps += 1
# Check if a player has a piece in the tile
if(self.state[sy][sx] == 'X'):
player_x += 1
elif(self.state[sy][sx] == 'O'):
player_o += 1
# Update the position
sy += dy
sx += dx
# Check if the loop should terminate
if(sy < 0 or abs(sy) > self.max_row_index or sx < 0 or abs(sx) > self.max_columns_index):
break
# Check if we have a winner
if(player_x >= self.goal):
return 'X'
elif(player_o >= self.goal):
return 'O'
# Return None if no winner is found
return None
# Get a min value (O)
def min(self, alpha:int=-sys.maxsize, beta:int=sys.maxsize, depth:int=0):
# Variables
min_value = sys.maxsize
by = None
bx = None
# Check if the game has ended
result = self.game_ended()
if(result != None):
if result == 'X':
return 1, 0, 0, depth
elif result == 'O':
return -1, 0, 0, depth
elif result == 'It is a tie!':
return 0, 0, 0, depth
elif(depth > self.max_depth):
return 0, 0, 0, depth
# Loop the board
for y in range(self.rows):
for x in range(self.columns):
# Check if the tile is empty
if (self.state[y][x] == '.'):
# Make a move
self.state[y][x] = 'O'
# Get max value
max, max_y, max_x, depth = self.max(alpha, beta, depth + 1)
# Set min value to max value if it is lower than curren min value
if (max < min_value):
min_value = max
by = y
bx = x
# Reset the tile
self.state[y][x] = '.'
# Do an alpha test
if (min_value <= alpha):
return min_value, bx, by, depth
# Do a beta test
if (min_value < beta):
beta = min_value
# Return min value
return min_value, by, bx, depth
# Get max value (X)
def max(self, alpha:int=-sys.maxsize, beta:int=sys.maxsize, depth:int=0):
# Variables
max_value = -sys.maxsize
by = None
bx = None
# Check if the game has ended
result = self.game_ended()
if(result != None):
if result == 'X':
return 1, 0, 0, depth
elif result == 'O':
return -1, 0, 0, depth
elif result == 'It is a tie!':
return 0, 0, 0, depth
elif(depth > self.max_depth):
return 0, 0, 0, depth
# Loop the board
for y in range(self.rows):
for x in range(self.columns):
# Check if the current tile is empty
if (self.state[y][x] == '.'):
# Add a piece to the board
self.state[y][x] = 'X'
# Set max value to min value if min value is greater than current max value
min, min_y, min_x, depth = self.min(alpha, beta, depth + 1)
# Adjust the max value
if (min > max_value):
max_value = min
by = y
bx = x
# Reset the tile
self.state[y][x] = '.'
# Do a beta test
if (max_value >= beta):
return max_value, bx, by, depth
# Do an alpha test
if (max_value > alpha):
alpha = max_value
# Return max value
return max_value, by, bx, depth
# Print the current game state
def print_state(self):
for y in range(self.rows):
print('| ', end='')
for x in range(self.columns):
if (self.state[y][x] != '.'):
print(' {0} | '.format(self.state[y][x]), end='')
else:
digit = str(self.inverted_tiles.get((y,x))) if len(str(self.inverted_tiles.get((y,x)))) > 1 else ' ' + str(self.inverted_tiles.get((y,x)))
print('{0} | '.format(digit), end='')
print()
print()
# The main entry point for this module
def main():
# Create a game
#game = TicTacToeGame(7, 6, 4, 1000)
game = TicTacToeGame(3, 3, 3, 1000)
# Play the game
game.play()
# Tell python to run main method
if __name__ == "__main__": main()
Resultat
Starting board
| 1 | 2 | 3 |
| 4 | 5 | 6 |
| 7 | 8 | 9 |
Player X moving (AI) ...
Depth: 1029
| 1 | 2 | 3 |
| 4 | X | 6 |
| 7 | 8 | 9 |
Player O moving (Human) ...
Depth: 1012
Recommendation: 1
Make a move (tile number): 1
| O | 2 | 3 |
| 4 | X | 6 |
| 7 | 8 | 9 |
Player X moving (AI) ...
Depth: 702
| O | X | 3 |
| 4 | X | 6 |
| 7 | 8 | 9 |
Player O moving (Human) ...
Depth: 268
Recommendation: 8
Make a move (tile number): 8
| O | X | 3 |
| 4 | X | 6 |
| 7 | O | 9 |
Player X moving (AI) ...
Depth: 104
| O | X | 3 |
| X | X | 6 |
| 7 | O | 9 |
Player O moving (Human) ...
Depth: 32
Recommendation: 6
Make a move (tile number): 6
| O | X | 3 |
| X | X | O |
| 7 | O | 9 |
Player X moving (AI) ...
Depth: 11
| O | X | X |
| X | X | O |
| 7 | O | 9 |
Player O moving (Human) ...
Depth: 4
Recommendation: 7
Make a move (tile number): 7
| O | X | X |
| X | X | O |
| O | O | 9 |
Player X moving (AI) ...
Depth: 1
| O | X | X |
| X | X | O |
| O | O | X |
Winner is player: It is a tie!