Week 5: The Testing Plan

This week, Team Tactica met to discuss the testing plan for our project!

Team Tactica in a meeting room.

Testing our reinforcement learning agent is a little tricky. We want our model to have a high “win rate”, but Catan is a multiplayer game, so the model needs at least one opponent to play against. Additionally, Catan is designed for 3-4 players; results from 1v1 games are less meaningful because the game is not normally played that way.

We plan to use many different types of players in our tests:

  • Random choice players
    • Choose randomly among the possible actions.
    • Useful for determining if our model has learned anything.
  • AlphaBeta players
    • The current best-performing player in Catanatron.
    • Useful for challenging our model to the highest standard.
  • Tactica Value Function players
    • Players that use a value function tuned to a specific strategy to decide on moves.
    • Useful for determining if our model outperforms any single strategy.
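
To make the plan above concrete, here is a minimal sketch of how a win rate could be measured against each opponent type. It is not our actual test code and does not call Catanatron: the line-up labels and the `play_game` function are hypothetical placeholders, and `play_game` simply picks a winner at random so the example runs on its own.

```python
import random

# Hypothetical line-ups: our model in seat 0 against three copies of one
# opponent type. The labels are placeholders, not Catanatron identifiers.
LINEUPS = {
    "vs_random": ["model", "random", "random", "random"],
    "vs_alphabeta": ["model", "alphabeta", "alphabeta", "alphabeta"],
    "vs_value_fn": ["model", "value_fn", "value_fn", "value_fn"],
}


def play_game(players, rng):
    """Stand-in for one real game: returns the winning seat, chosen uniformly.

    A real harness would run a full game here and report who won.
    """
    return rng.randrange(len(players))


def win_rate(players, seat, num_games, rng):
    """Fraction of num_games won by the player sitting in `seat`."""
    wins = sum(play_game(players, rng) == seat for _ in range(num_games))
    return wins / num_games


def evaluate(lineups, num_games=1000, seed=0):
    """Win rate of seat 0 (our model) for every line-up."""
    rng = random.Random(seed)  # fixed seed so a run can be repeated
    return {
        name: win_rate(players, 0, num_games, rng)
        for name, players in lineups.items()
    }


if __name__ == "__main__":
    for name, rate in evaluate(LINEUPS).items():
        print(f"{name}: {rate:.1%}")
```

Because the stand-in picks winners at random, every line-up lands near 25% here. That is also a useful reference point for real results: in a 4-player game between equally strong players, each one would be expected to win roughly a quarter of the time, so a model that has learned something should clear that bar against random players.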

Brian working on Catanatron at a table with a laptop.

Furthermore, whatever testing plan we devise, we need to be able to test the model quickly. Right now, we can run a command on our personal computers to play one set of 1,000 games. But we may need to run many sets of games against many different player types. As our models become more complex, this process will become more tedious and time-consuming. What we need is a way to automate our tests and run them on more powerful hardware, so that we can iterate on our designs quickly.
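
As a rough illustration of what automating "many sets against many player types" might look like, the sketch below builds one job per (opponent, seed) pair and runs them through a pluggable `map_fn`. Everything here is an assumption made for the example: the opponent labels are placeholders, and `run_set` simulates a 4-player game with equal odds instead of playing Catan.

```python
import itertools
import math
import random

OPPONENTS = ["random", "alphabeta", "value_fn"]  # placeholder labels


def run_set(job):
    """Play one set of games and return (opponent, seed, win rate).

    Stand-in only: each simulated game is won by our model with
    probability 1/4, as if all four players were equally strong.
    """
    opponent, seed, num_games = job
    rng = random.Random(f"{opponent}-{seed}")  # reproducible per job
    wins = sum(rng.randrange(4) == 0 for _ in range(num_games))
    return opponent, seed, wins / num_games


def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a win rate p measured over n games."""
    return z * math.sqrt(p * (1 - p) / n)


def run_all(opponents, seeds, num_games=1000, map_fn=map):
    """Run one set per (opponent, seed) pair.

    map_fn defaults to the built-in map (serial). Passing a pool's map
    method instead lets the same jobs run in parallel.
    """
    jobs = [(o, s, num_games) for o, s in itertools.product(opponents, seeds)]
    return list(map_fn(run_set, jobs))
```

Run serially, `run_all(OPPONENTS, range(3))` plays nine sets one after another. On a machine with more cores, passing the `map` method of a `concurrent.futures.ProcessPoolExecutor` as `map_fn` would spread the same jobs across processes. The `margin_of_error` helper is a reminder of why set size matters: for a win rate near 25% measured over 1,000 games, the 95% margin is about ±2.7 percentage points, so smaller differences between two models may just be noise.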

That’s all for now. See you next week!
