{"id":1107,"date":"2025-02-15T04:24:11","date_gmt":"2025-02-15T04:24:11","guid":{"rendered":"https:\/\/www.ippd.ufl.edu\/blogs\/ay2425team03\/?p=1107"},"modified":"2025-02-15T04:24:12","modified_gmt":"2025-02-15T04:24:12","slug":"week-5-the-testing-plan","status":"publish","type":"post","link":"https:\/\/www.ippd.ufl.edu\/blogs\/ay2425team03\/2025\/02\/15\/week-5-the-testing-plan\/","title":{"rendered":"Week 5: The Testing Plan"},"content":{"rendered":"\n<p>This week, Team Tactica met to discuss our upcoming testing plan for our project!<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"768\" src=\"https:\/\/www.ippd.ufl.edu\/blogs\/ay2425team03\/wp-content\/uploads\/sites\/177\/2025\/02\/20250211_172102-1024x768.jpg\" alt=\"Team Tactica in a meeting room.\" class=\"wp-image-1109\" srcset=\"https:\/\/www.ippd.ufl.edu\/blogs\/ay2425team03\/wp-content\/uploads\/sites\/177\/2025\/02\/20250211_172102-1024x768.jpg 1024w, https:\/\/www.ippd.ufl.edu\/blogs\/ay2425team03\/wp-content\/uploads\/sites\/177\/2025\/02\/20250211_172102-300x225.jpg 300w, https:\/\/www.ippd.ufl.edu\/blogs\/ay2425team03\/wp-content\/uploads\/sites\/177\/2025\/02\/20250211_172102-768x576.jpg 768w, https:\/\/www.ippd.ufl.edu\/blogs\/ay2425team03\/wp-content\/uploads\/sites\/177\/2025\/02\/20250211_172102-1536x1152.jpg 1536w, https:\/\/www.ippd.ufl.edu\/blogs\/ay2425team03\/wp-content\/uploads\/sites\/177\/2025\/02\/20250211_172102-2048x1536.jpg 2048w, https:\/\/www.ippd.ufl.edu\/blogs\/ay2425team03\/wp-content\/uploads\/sites\/177\/2025\/02\/20250211_172102-280x210.jpg 280w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><figcaption class=\"wp-element-caption\">Team Tactica.<\/figcaption><\/figure>\n\n\n\n<p>Testing our reinforcement learning agent is a little tricky. We want our model to have a high &#8220;win rate&#8221;, but <em>Catan<\/em> is a multiplayer game, meaning we need at least one more player to play against. 
Additionally, <em>Catan<\/em> is designed for 3-4 players; scores from 1v1 games are less meaningful since <em>Catan<\/em> is not normally played this way.<\/p>\n\n\n\n<p>We plan to use many different types of players in our tests:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Random choice players\n<ul class=\"wp-block-list\">\n<li>Choose randomly among the possible actions.<\/li>\n\n\n\n<li>Useful for determining if our model has learned anything.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>AlphaBeta players\n<ul class=\"wp-block-list\">\n<li>The current best-performing player in Catanatron.<\/li>\n\n\n\n<li>Useful for challenging our model to the highest standard.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Tactica Value Function players\n<ul class=\"wp-block-list\">\n<li>Players that use a value function tuned to a specific strategy to decide on moves.<\/li>\n\n\n\n<li>Useful for determining if our model outperforms any single strategy.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<figure class=\"wp-block-image size-large is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"2560\" height=\"1920\" src=\"https:\/\/www.ippd.ufl.edu\/blogs\/ay2425team03\/wp-content\/uploads\/sites\/177\/2025\/02\/20250213_145030-edited-scaled.jpg\" alt=\"Brian working at a table with a laptop.\" class=\"wp-image-1113\" style=\"width:651px;height:auto\" srcset=\"https:\/\/www.ippd.ufl.edu\/blogs\/ay2425team03\/wp-content\/uploads\/sites\/177\/2025\/02\/20250213_145030-edited-scaled.jpg 2560w, https:\/\/www.ippd.ufl.edu\/blogs\/ay2425team03\/wp-content\/uploads\/sites\/177\/2025\/02\/20250213_145030-edited-300x225.jpg 300w, https:\/\/www.ippd.ufl.edu\/blogs\/ay2425team03\/wp-content\/uploads\/sites\/177\/2025\/02\/20250213_145030-edited-1024x768.jpg 1024w, https:\/\/www.ippd.ufl.edu\/blogs\/ay2425team03\/wp-content\/uploads\/sites\/177\/2025\/02\/20250213_145030-edited-768x576.jpg 768w, 
https:\/\/www.ippd.ufl.edu\/blogs\/ay2425team03\/wp-content\/uploads\/sites\/177\/2025\/02\/20250213_145030-edited-1536x1152.jpg 1536w, https:\/\/www.ippd.ufl.edu\/blogs\/ay2425team03\/wp-content\/uploads\/sites\/177\/2025\/02\/20250213_145030-edited-2048x1536.jpg 2048w, https:\/\/www.ippd.ufl.edu\/blogs\/ay2425team03\/wp-content\/uploads\/sites\/177\/2025\/02\/20250213_145030-edited-280x210.jpg 280w\" sizes=\"(max-width: 2560px) 100vw, 2560px\" \/><figcaption class=\"wp-element-caption\">Brian working on Catanatron.<\/figcaption><\/figure>\n\n\n\n<p>Furthermore, whatever testing plan we devise, we need to be able to test the model quickly. Right now, we can run a command on our personal computers to play one set of 1,000 games. But we may need to run many sets of games against many different player types. As our models become more complex, this process will become more tedious and time-consuming. What we need is a way to automate our tests and have them run on more powerful hardware. This way, we can iterate on our designs quickly.<\/p>\n\n\n\n<p>That&#8217;s all for now. See you next week!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>This week, Team Tactica met to discuss our upcoming testing plan for our project! Testing our reinforcement learning agent is a little tricky. 
We want our model to have a high &#8220;win rate&#8221;, but Catan is a multiplayer game, meaning&hellip; <a href=\"https:\/\/www.ippd.ufl.edu\/blogs\/ay2425team03\/2025\/02\/15\/week-5-the-testing-plan\/\" aria-label=\"Read \\\"Week 5: The Testing Plan\\\" class=\"read-more\">Read&nbsp;More<\/a><\/p>\n","protected":false},"author":617,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"inline_featured_image":false,"footnotes":"","_links_to":"","_links_to_target":""},"categories":[12],"tags":[],"class_list":["post-1107","post","type-post","status-publish","format-standard","hentry","category-springsemester"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.ippd.ufl.edu\/blogs\/ay2425team03\/wp-json\/wp\/v2\/posts\/1107"}],"collection":[{"href":"https:\/\/www.ippd.ufl.edu\/blogs\/ay2425team03\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.ippd.ufl.edu\/blogs\/ay2425team03\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.ippd.ufl.edu\/blogs\/ay2425team03\/wp-json\/wp\/v2\/users\/617"}],"replies":[{"embeddable":true,"href":"https:\/\/www.ippd.ufl.edu\/blogs\/ay2425team03\/wp-json\/wp\/v2\/comments?post=1107"}],"version-history":[{"count":1,"href":"https:\/\/www.ippd.ufl.edu\/blogs\/ay2425team03\/wp-json\/wp\/v2\/posts\/1107\/revisions"}],"predecessor-version":[{"id":1115,"href":"https:\/\/www.ippd.ufl.edu\/blogs\/ay2425team03\/wp-json\/wp\/v2\/posts\/1107\/revisions\/1115"}],"wp:attachment":[{"href":"https:\/\/www.ippd.ufl.edu\/blogs\/ay2425team03\/wp-json\/wp\/v2\/media?parent=1107"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.ippd.ufl.edu\/blogs\/ay2425team03\/wp-json\/wp\/v2\/categories?post=1107"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.ippd.ufl.edu\/blogs\/ay2425team03\/wp-json\/wp\/v2\/tags?post=1107"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":
true}]}}