CSCI4511W Problem Set 3 AI Game Playing, Minimax, Alpha-Beta Pruning & Expectimax

Problem 1. (30 points)

For Problem 1, put your answers to parts a-c in your writeup. Also make the required alterations to the given Python module.

a. Run the ps3.py code. You will be prompted to play tic-tac-toe against a minimax agent. Common wisdom suggests that the best first move is the center square. The agent doesn’t begin by playing in the center square. Why do you think the agent plays where it does instead of the center? Give an explanation (4-5 sentences) in your writeup.

The agent does not open in the center because minimax does not value the center any higher than the other squares. Minimax picks the move with the best guaranteed outcome, assuming that the opponent also plays optimally. Tic-tac-toe is a draw under optimal play no matter which square is taken first, so every opening move backs up the same minimax value and the agent has no reason to prefer the center. With all moves tied, the choice most likely comes down to tie-breaking: the agent takes the first best-valued move in the order in which the moves are generated. The common wisdom favors the center because it gives more winning chances against a player who makes mistakes, but minimax does not account for opponent mistakes.
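One way to check how minimax values the opening moves is to compute them directly. The sketch below is a self-contained tic-tac-toe minimax, not the code in ps3.py; the board encoding (a 9-character string) and the names `minimax_value`, `opening_values`, and `first_best` are assumptions made for illustration. The tie-break shown (Python's `max` returning the first maximal item) is also only an assumption about how an agent might choose among equal moves.

```python
from functools import lru_cache

# The eight winning lines on a 3x3 board stored as a 9-character string.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None

@lru_cache(maxsize=None)
def minimax_value(board, to_move):
    """Value from X's point of view: +1 X wins, 0 draw, -1 O wins."""
    w = winner(board)
    if w is not None:
        return 1 if w == "X" else -1
    if " " not in board:
        return 0  # full board, nobody won
    nxt = "O" if to_move == "X" else "X"
    values = [minimax_value(board[:i] + to_move + board[i + 1:], nxt)
              for i, cell in enumerate(board) if cell == " "]
    return max(values) if to_move == "X" else min(values)

empty = " " * 9
# Minimax value of each of X's nine possible opening moves.
opening_values = [minimax_value(empty[:i] + "X" + empty[i + 1:], "O")
                  for i in range(9)]
# With every value tied, max() simply returns the first square it sees.
first_best = max(range(9), key=lambda i: opening_values[i])
print(opening_values, first_best)
```

Every opening square gets the same value, so under this tie-break the first square in the move list is chosen rather than the center.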

b. Alter the ps3.py code's main function so that it calls problem_1b instead of problem_1a. You will be prompted to play connect-4 against an alpha-beta with cutoff agent that is using a silly evaluation function. See if you can defeat the agent in a few games. Describe (3-4 sentences) your plan to defeat it every time. Explain briefly (2-3 sentences) why your plan works.

I noticed that the agent always places its first token in the bottom-right corner, keeps stacking vertically until that column is full, and then moves to the next column over and repeats. My plan is to let it stack three tokens while I place two of my own, either stacked in another column or side by side in a row. My third move blocks the agent by placing a token on top of its column of three. My next two moves then extend my own column or row, so I reach four in a row before the agent does. This plan works because, once its chain of three is cut off, the agent needs at least 7 turns to connect four, while I need only 5. The agent's evaluation function does not look at how close I am to winning, so it never tries to block me. As long as I place my tokens in the right spots, I win every time.

CSCI4511W - Problem Set 3 October 18, 2024

c. In the function c4_good_eval, write a new evaluation function. Using this function, write code in problem_1c that will play an alpha-beta agent with depth cutoff 4 using your evaluation function against a random agent. Play once as X and once as O.

(Note: For this problem and also 1(d) and also 2(b), it may help to write a version of your eval function to evaluate as ‘X’ and another version of your eval function to evaluate as ‘O’. This is fine.) Have the function return a tuple (nx, no), where nx is the number of games that your agent wins as X, and no is the number of games that your agent wins as O. So if you win both games, you return (1,1). Describe (2-3 sentences) your evaluation function in your writeup and explain (2-3 sentences) why you think it should do well. If your agent doesn’t usually win against a random agent, then you probably did something wrong.

My c4_good_eval function scores a Connect Four board from the chains (runs of consecutive X's or O's) that belong to the player and to the opponent. It uses a helper function, eval_chains, to find chains of two, three, or four in a row in the horizontal, vertical, and diagonal directions. Each chain adds a weighted amount to the total score depending on its length: the player's longer chains earn more, and the opponent's longer chains cost more. I think this evaluation function should do well because it rewards building toward four in a row and penalizes positions where the opponent's chains are getting long. That penalty pushes the agent to play defense and stop the opponent from reaching four in a row first, so its moves focus on both winning and blocking.
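A chain-weighted evaluation of this kind could look roughly like the sketch below. It is not the author's c4_good_eval or eval_chains: the names `chain_score`, `sketch_eval`, and `WEIGHTS`, the weight values, and the board representation (a dict mapping (column, row) to 'X' or 'O' on a 7x6 grid) are all assumptions made for illustration. It also scores every window of four cells rather than strictly contiguous chains, which is a simplification.

```python
# Directions to scan: horizontal, vertical, and the two diagonals.
DIRECTIONS = [(1, 0), (0, 1), (1, 1), (1, -1)]
# Assumed weights by number of a player's pieces in a window of four.
WEIGHTS = {2: 10, 3: 100, 4: 10000}

def chain_score(board, player, cols=7, rows=6, k=4):
    """Sum weights over every k-cell window that holds no opponent piece."""
    opponent = "O" if player == "X" else "X"
    score = 0
    for x in range(1, cols + 1):
        for y in range(1, rows + 1):
            for dx, dy in DIRECTIONS:
                cells = [(x + i * dx, y + i * dy) for i in range(k)]
                # Skip windows that run off the board.
                if not all(1 <= cx <= cols and 1 <= cy <= rows
                           for cx, cy in cells):
                    continue
                marks = [board.get(cell) for cell in cells]
                if opponent in marks:
                    continue  # a blocked window can never become four in a row
                score += WEIGHTS.get(marks.count(player), 0)
    return score

def sketch_eval(board, player):
    """Reward the player's chains and penalize the opponent's."""
    opponent = "O" if player == "X" else "X"
    return chain_score(board, player) - chain_score(board, opponent)

two_in_row = {(1, 1): "X", (2, 1): "X"}
three_in_row = {(1, 1): "X", (2, 1): "X", (3, 1): "X"}
print(sketch_eval(two_in_row, "X"), sketch_eval(three_in_row, "X"))
```

Subtracting the opponent's score is what produces the blocking behavior: a move that breaks up an opponent window removes its penalty from the total.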

d. In the function problem_1d, run 8 games of an alpha-beta-cutoff agent using my evaluation function versus an alpha-beta-cutoff agent using your evaluation function, with each agent playing 4 times as X and 4 times as O. Run four games at depth limit 2 for both agents, then four games at depth limit 3 for both agents. Return a tuple (nx, no), where nx is the number of games that your agent wins as X, and no is the number of games that your agent wins as O. So if you win all 8 games, you return (4,4).

Problem 2. (20 points)

For Problem 2, put your answer to part a in your writeup. Also make the required alterations to the given Python module.

a. Find the definition for Gomoku (5-in-a-row) in the aima code. Build an evaluation function for gomoku. In your writeup, briefly explain what your evaluation function is doing.

My gomoku_eval function scores the Gomoku board from the player's and the opponent's chains, in the same way as my c4_good_eval. It penalizes the player when the opponent starts forming chains, which prompts the agent to block and play defense. In practice, the agent starts building a chain right away and ends up placing five X's or O's in a horizontal row in the top-left corner of the board. Because the opponent places its pieces at random, it is unlikely to block that row or to reach five in a row first, so my agent should win nearly every game.

b. In the function problem_2b, run 8 games of a random agent versus an alpha-beta-cutoff agent using your evaluation function, with each agent playing 4 times as X and 4 times as O. Choose a depth cutoff that will allow each game to complete in under 20 seconds. Have the function return a tuple (nx, no), where nx is the number of games that your agent wins as X, and no is the number of games that your agent wins as O. So if your agent wins every single game, it will return (4, 4). (Note: As mentioned above, it may help to write a version of your eval function to evaluate as ‘X’ and another version of your eval function to evaluate as ‘O’. This is fine.)

Problem 3. (10 points)

Consider a balanced game tree with branching factor of 3 and exactly 40 nodes. Suppose that this tree would be maximally pruned by alpha-beta pruning, based on the evaluation of its leaf nodes. How many nodes would be pruned in this situation?

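Answer: 20 nodes would be pruned. A balanced tree with branching factor 3 and 40 nodes has four levels, 1 + 3 + 9 + 27 = 40, so the leaves are at depth 3. With the best possible move ordering, only the root's first child needs all of its children examined; each of the other two depth-1 nodes is cut off after its first child, so 3 + 1 + 1 = 5 of the 9 depth-2 nodes are visited. Among those five, the first one examines all 3 of its leaves, its two siblings are cut off after 1 leaf each, and the remaining two must examine all 3 leaves to justify the cutoff above them, giving 3 + 2 + 6 = 11 of the 27 leaves. In total alpha-beta visits 1 + 3 + 5 + 11 = 20 nodes, so the other 40 - 20 = 20 nodes (4 at depth 2 and 16 leaves) are pruned.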
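The count of 20 can be checked by running alpha-beta on a tree whose children are already in best-first order. The sketch below is only an illustration under stated assumptions: the tree is a nested list built by a hypothetical helper `build`, which gives each node's first child the node's own minimax value and makes every other child slightly worse for the player choosing at that node.

```python
import math

B, DEPTH = 3, 3  # branching factor 3, leaves at depth 3 -> 40 nodes

def build(depth, is_max, value):
    """Build a nested-list tree with minimax value `value`, best child first."""
    if depth == 0:
        return value
    worse = value - 1 if is_max else value + 1
    children = [build(depth - 1, not is_max, value)]
    children += [build(depth - 1, not is_max, worse) for _ in range(B - 1)]
    return children

def alphabeta(node, is_max, alpha, beta, counter):
    """Standard alpha-beta; counter[0] counts every node that is visited."""
    counter[0] += 1
    if not isinstance(node, list):
        return node  # leaf value
    if is_max:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, False, alpha, beta, counter))
            if best >= beta:
                break  # beta cutoff
            alpha = max(alpha, best)
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, True, alpha, beta, counter))
        if best <= alpha:
            break  # alpha cutoff
        beta = min(beta, best)
    return best

total_nodes = sum(B ** d for d in range(DEPTH + 1))
counter = [0]
root_value = alphabeta(build(DEPTH, True, 0), True, -math.inf, math.inf, counter)
visited = counter[0]
pruned = total_nodes - visited
print(total_nodes, visited, pruned)
```

With a less favorable ordering of the same leaf values, fewer cutoffs would occur and the pruned count would be smaller.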

Suppose you have an oracle, OM(s), that correctly predicts the opponent’s move in any state. Using this, formulate the definition of a game as a (single-agent) search problem. What algorithm could be used to find an optimal move?

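Because the oracle correctly predicts the opponent's move in every state, the opponent's replies are deterministic and we no longer need to consider alternative responses. The game can therefore be formulated as a single-agent search problem. The states are the positions in which it is our turn to move, and the actions are our legal moves. Writing Result(s, a) for the state after move a is made in state s, the successor of s under our action a is the state reached after our move followed by the oracle's predicted reply: Result(Result(s, a), OM(Result(s, a))). The goal test checks whether a state is a terminal state that we win (more generally, we look for the reachable terminal state with the highest utility). Any standard search algorithm, such as depth-first or breadth-first search, can then find a sequence of moves that reaches such a state. A* search could also be used if step costs are defined along with a heuristic that estimates the distance to a winning state.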
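As a rough illustration of this formulation, the sketch below applies it to a toy game that is not part of the problem set: players alternately remove 1 to 3 stones from a pile, and whoever takes the last stone wins. The oracle `OM` here is a hypothetical stand-in (an opponent that always takes up to 2 stones), and `successors` and `find_winning_plan` are names invented for this example. Breadth-first search is used because no step costs are defined.

```python
from collections import deque

def actions(pile):
    """Our legal moves: take 1, 2, or 3 stones."""
    return [n for n in (1, 2, 3) if n <= pile]

def OM(pile):
    """Hypothetical oracle: this opponent always takes up to 2 stones."""
    return min(2, pile)

def successors(pile):
    """Single-agent transitions: our move followed by the predicted reply."""
    for a in actions(pile):
        after_us = pile - a
        if after_us == 0:
            yield a, "WIN"  # we took the last stone
            continue
        after_opp = after_us - OM(after_us)
        if after_opp == 0:
            continue  # opponent takes the last stone: dead end for us
        yield a, after_opp

def find_winning_plan(start):
    """Breadth-first search over our-turn states; returns our moves or None."""
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        state, plan = frontier.popleft()
        for a, nxt in successors(state):
            if nxt == "WIN":
                return plan + [a]
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, plan + [a]))
    return None

print(find_winning_plan(4))
```

A pile of 4 is a loss for the player to move against an optimal opponent, yet the search finds a winning plan here because the oracle reveals that this particular opponent does not play optimally.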

Problem 4. (20 points)

An “expectimax” tree consists of a max node at the root with alternating layers of chance and max nodes. At chance nodes, all outcome probabilities are nonzero, and the sum of all probabilities from a given chance node is 1. The goal is to find the value of the root with a bounded-depth search.

If leaf values are all nonnegative, is some version of pruning ever possible in an expectimax tree? (That is, can we ignore nodes and yet guarantee the best choice is still made?) Give an example, or explain why not.

No. When the leaf values are only known to be nonnegative, pruning is not possible. The value of a chance node is the probability-weighted average of all of its children, and every probability is nonzero, so each unexamined child can still change that average. Nonnegativity gives a lower bound of 0 on an unexamined child but no upper bound, so an unexamined child could be arbitrarily large and could raise the chance node's value above any alternative seen so far. For example, suppose the root has already found a chance node worth 5, and a second chance node has two equally likely children, the first of which evaluates to 0. If the second child were worth 100, this chance node would be worth 50 and would be the better choice, so the second child cannot be skipped.

If leaf values are all in the range [0,1], is some version of pruning (same definition as last question) ever possible in an expectimax tree?

Yes. Pruning is possible when leaf values lie in the range [0,1], because every unexamined subtree is now known to be worth at most 1. After evaluating some of a chance node's children, we can compute an upper bound on its value by assuming that each remaining child is worth 1. If that upper bound is already below the value of the best chance node found so far at the max node above, the remaining children cannot change the max node's choice and can be pruned. For example, suppose the root has already found a chance node worth 0.8, and a second chance node has two equally likely children, the first of which evaluates to 0.2. The second chance node can then be worth at most 0.5(0.2) + 0.5(1) = 0.6 < 0.8, so its second child can be pruned.
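The bounding argument can be sketched in a few lines. The function name `chance_value` and its parameters are assumptions made for this illustration, and each outcome's value is given directly rather than computed by a deeper search. Setting `max_leaf` to infinity models the nonnegative-only case, where the bound never permits a cutoff.

```python
import math

def chance_value(outcomes, best_so_far, max_leaf=1.0):
    """Evaluate a chance node given as a list of (probability, value) pairs.

    Returns (expected_value, outcomes_evaluated). The value is None when the
    node was pruned because it cannot beat `best_so_far` even if every
    remaining outcome is worth `max_leaf`.
    """
    expected = 0.0
    remaining = 1.0  # probability mass not yet evaluated
    for i, (p, value) in enumerate(outcomes):
        expected += p * value
        remaining -= p
        is_last = i == len(outcomes) - 1
        # Upper bound: assume every unevaluated outcome is worth max_leaf.
        if not is_last and expected + remaining * max_leaf < best_so_far:
            return None, i + 1
    return expected, len(outcomes)

node_a = [(0.5, 0.9), (0.5, 0.7)]  # worth 0.8
node_b = [(0.5, 0.2), (0.5, 0.6)]  # at most 0.6 after the first outcome

value_a, seen_a = chance_value(node_a, best_so_far=0.0)
value_b, seen_b = chance_value(node_b, best_so_far=value_a)
# Without an upper bound on leaf values, nothing can be skipped.
value_b_unbounded, seen_b_unbounded = chance_value(
    node_b, best_so_far=value_a, max_leaf=math.inf)
print(value_a, seen_a, value_b, seen_b, value_b_unbounded, seen_b_unbounded)
```

In the bounded case the second outcome of `node_b` is never looked at, while the unbounded call has to evaluate both outcomes before it can reject the node.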
