Lab Sheet 3: Mixing it with MixMeta4 - Simple Agents

In the previous labs we have developed an agent framework which can be applied to many problem domains. In this lab we will begin applying this framework to a more involved and interesting domain: the MixMeta4 environment and particularly the game CheX. The aim of this lab is to gain familiarity with the MixMeta4 environment, and to develop some simple agents that are able to interact with that environment. This lab continues to build towards the project, which will be carried out in the MixMeta4 environment.

Getting to know MixMeta4

The best way to get started is by playing a game or two. Start by downloading the latest packages from the Java Code link on the unit website. You will need the packages agent.jar, mixmeta4.jar and samplePlayers.jar, as well as the archives player.zip and search.zip and the file config.txt (some of these you will have already). Three sample agents, Arnie, Marvin (the Paranoid Android) and Hal, are provided to get you started. You can run games by executing the class Play in package player. Give this a try to make sure you have everything working. Make sure you are in the directory containing the packages and execute: java -classpath ".:agent.jar:etc.." player.Play. Note that you can leave the classpath out of the command if you set it in your shell resource (eg .zshrc) file.

The MixMeta4 system is highly configurable and can support a wide range of strategy games determined by the board configuration, the pieces, and the success criteria. This year it is preconfigured with 3 games:

- Finish First - the winner is the first player to get all remaining pieces to the far side of the board (note that this means that if all of a player's pieces are taken, that player also wins).
- TakeAll - the winner is the first player to take all the opponent's pieces.
- CheX (pronounced like TeX, in honour of Donald Knuth's typesetting wonder) - the winner is the first player to take the opponent's king.
We will primarily use the game CheX. You can try out different games in two ways.

- You can directly edit the file Play.java. This allows you to choose the game and clock times. You can also choose whatever board configuration and pieces you like by editing the string setup in this file. You will need to recompile after editing the file. Try a few variations.
- It is more efficient to use the version PlayCF, which gets its configuration information from the file config.txt. After editing the file you need to rerun PlayCF, but you do not need to recompile it. With this version you can also enter as many players as you like, and choose new games between them from the running program.
Try some games between Arnie, Marvin and Hal. Can you guess what strategy Marvin is using? Hint: It is a one-step look-ahead utility-based strategy (see below).

Building Agents

Agents that can interact with the MixMeta4 environment can be developed by subclassing the Player class (which in turn implements the Agent interface), and then editing Play or config.txt to use your agent. Note that you should not need to alter any provided classes other than Play (or config.txt) to run your agents. (The other classes should not be altered, as this would prevent your agent playing against others.)

As discussed in lectures, the interface between the environment and an agent is via the methods getPercept(), getAction(Percept p) and update(Action a). The first and third methods are required by the Environment interface, and provided by its implementation Board. The second is required by the Agent interface, and must be provided by your implementation of an agent. This is the "agent function" referred to in the text by Russell & Norvig. The top-level calling of these methods is handled by the Game class and its graphical user interface. You simply need to edit the Play class (or config.txt) to pass the appropriate players to Game and it does the rest.

The percept that is passed to your agent's getAction method is a deep clone of the Board (which in turn contains Squares, Pieces, etc.), excluding the graphical user interface. The Board class also implements the State interface, and provides a range of methods that you can use in your agent code, for example for examining the outcome of moves. You should use accessor methods rather than instance variables, as direct instance variable access is deprecated and the instance variables will be made private in later releases. (If you need accessor methods that are not provided, please send a request to Cara.) In addition, to allow you to concentrate on the agent's strategies rather than lower-level details, I have provided a method board.getActions() that returns all the valid actions for any given state of the board.

In case that was hard to follow in prose, here are the first few lines of Arnie's code:

package samplePlayer;

import mixmeta4.*;
import agent.*;
import search.*;
import java.util.*;
public class Arnie extends Player implements Agent {
    public Arnie (boolean isRed) { super(isRed, "Arnie"); }

    public Action getAction (Percept percept) {
        Board board = (Board) percept;
        Actions moves = board.getActions();

        ...code to select and return an action...
    }
Aside: For those familiar with the Model-View-Controller (MVC) software engineering principle, the software is designed according to this principle. The environment (in particular the board and its subcomponents) defines the model, and exists independently of any particular view - this is what your program interacts with. A view is provided by the GUI, but this is not necessary for the program to work. It can also be run, for example, with output to the terminal (shell). You can think of your agents as the controllers.

Exercises

Note: It is a good idea to keep a record of your work (copies of your agents etc.) as you may wish to refer to work you have done in the labs in your practical assignment.

- A simple reflex agent
A simple reflex agent simply chooses a valid action for a given percept. The maximal set of valid "condition-action" pairs is provided implicitly by the board.getActions() method, since only actions which are valid (that is, whose "condition" is satisfied) are returned.

- Implement a simple reflex player by implementing the getAction method to randomly choose a valid action (a sketch of one possible implementation appears after this exercise). You can add your player to the player package.
- Try your agent out against one of the other agents to check that it works.
This simple reflex agent clearly has no skills that will help it win the game. It can be improved by refining the conditions under which actions are applied. An example strategy for Finish First (which may or may not help) would be to move the piece that has the longest distance to go. An example strategy for TakeAll would be to move the more mobile pieces (in the hope of taking before being taken).

- Try refining the conditions that select which action is chosen. See if you can develop an agent that beats the random reflex agent.
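As a concrete starting point, here is a minimal sketch of a random reflex player. It assumes that Actions can be treated as a java.util.List of Action objects and that your player is constructed the same way as Arnie; the class name RandomReflex is just an illustration, so check the provided classes for the exact types and adapt as needed.

package player;

import mixmeta4.*;
import agent.*;
import java.util.*;

public class RandomReflex extends Player implements Agent {

    private final Random rng = new Random();

    public RandomReflex (boolean isRed) {
        super(isRed, "RandomReflex");
    }

    public Action getAction (Percept percept) {
        // The percept is a (deep clone of the) Board.
        Board board = (Board) percept;

        // All valid moves for the current position; the "conditions"
        // of the condition-action rules have already been checked for us.
        Actions moves = board.getActions();

        // Pick one of the valid moves uniformly at random.
        // Assumes Actions can be indexed like a List; adjust the
        // selection code if it is a different collection type.
        return (Action) moves.get(rng.nextInt(moves.size()));
    }
}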
- A simple utility-based agent
The simple reflex agent has limited scope for developing skills that will help it win the game. In subsequent labs we will be developing utility-based agents that plan ahead. However, even a fairly simple one-step look-ahead can provide a significant improvement in performance. A one-step look-ahead agent will try each of the possible moves using a model of the environment - in this case the Board that has been provided to you as the percept - and evaluate the resulting positions to select the most promising one. This evaluation function is an example of a utility function (hence this is a simple utility-based agent). For example, one strategy would be to move to a position that maximises your options prior to the opponent's move. In this case the utility function returns the number of options.

- Implement an agent with this evaluation function (see the sketch below). You can do this by using the Board as your implementation of State - cloning a new copy of the Board for each available move (so you don't change the actual game), and trying the move out by passing it to the cloned board's update function. The resulting position can be examined in an implementation of NodeInfo.
- Try the new agent out against your reflex agent to see if it is more successful. The extra competency may be more apparent with some board sizes, configurations, and numbers of pieces than others, so you may wish to try a few.

- What other one-step strategies can you think of? Note that you have access to the current pieces through the board.getRedPieces() and board.getBlackPieces() methods. Try to develop the optimal one-step strategy/evaluation function. Try your proposals out by building a player, playing it off against the players provided and other students' players, and see if they appear successful.
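Here is a minimal sketch of the mobility-based one-step look-ahead described above. It assumes Board provides a clone() method returning a deep copy (the percept itself is produced by deep cloning) and that Actions can be iterated as a collection of Action objects; the class name OneStepUtility and the use of getActions().size() as the utility are illustrative choices only, so check the provided classes and adapt as needed.

package player;

import mixmeta4.*;
import agent.*;
import java.util.*;

public class OneStepUtility extends Player implements Agent {

    public OneStepUtility (boolean isRed) {
        super(isRed, "OneStepUtility");
    }

    public Action getAction (Percept percept) {
        Board board = (Board) percept;

        Action best = null;
        int bestUtility = Integer.MIN_VALUE;

        // One-step look-ahead: try each valid move on a cloned board
        // so that the actual game state is never modified.
        for (Object m : board.getActions()) {
            Action move = (Action) m;

            // Assumes Board.clone() returns a deep copy of the position.
            Board next = (Board) board.clone();
            next.update(move);

            // Illustrative utility: the number of valid actions in the
            // resulting position (a crude mobility measure). Whose moves
            // getActions() returns after the update depends on the Board
            // implementation, so treat this as a starting point for your
            // own evaluation function.
            int utility = next.getActions().size();

            if (utility > bestUtility) {
                bestUtility = utility;
                best = move;
            }
        }
        return best;
    }
}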
Note: Although this is only a short look-ahead, it is not a trivial problem. As we discuss in later lectures, most realistic strategy games have search spaces far too large for goal-directed planning, and therefore agents are restricted to trying to achieve the best behaviour possible with an n-step look-ahead. The quality of the agent depends critically on the quality of the evaluation function. In other words, we are seeking to achieve the best long-term behaviour from limited information and resources. This equally applies to "real-world" problems.