Course Home | Syllabus | Assignments | Schedule | Downloads | [print]
CS 4300: Artificial Intelligence
Assignment: Local Search for Agent Policy
Reflex agents can implemented with a lookup table, or policy, that maps states to actions. In this assignment, you will use local search to find a policy to use in a reflex agent. You are allowed to use any of the various local search algorithms discussed in class. If you would like to use a different local search algorithm, please clear it with the instructor first.
You will be searching for a policy that can be used in the Cliff Walking environment (cliffwalking.py).
Remember, you are searching for a policy that can be used in this environment. You are not searching for a path from the start to the goal.
Required Process/Files
Use the GitHub repository available for this course to store your
solution. Make a directory named cliff-policy-local-search
. Store the
local search program in cliff-policy-local-search/policy_search.py
.
This file must have a function policy_search()
that returns a
policy for the Cliff Walking environment. The policy must be a
Python list with 37 elements, one for each reachable state
in the environment. Each element is an integer, representing
the action that should be taken at that state.
You must also provide an agent class in the file cliff-policy-local-search/cliff_policy_agent.py
.
This class must be called PolicyAgent
, and have the following methods:
def __init__(self, policy):
stores the policy in a data member.def reset(self):
does nothing, but is required.def agent_function(self, state):
returns the action specified by the policy for the state.
The cliff_policy_search_runner.py
program can be used to run
your policy search, and save the policy to a file in JSON format.
It can also be used to measure the quality of the policy.
Use --help
to look at the command line options available.
For example, you could run these two commands to create a policy, and to measure it.
./cliff_policy_search_runner.py policy-search -p policy.json
./cliff_policy_search_runner.py policy-measure -p policy.json
Note that the policy_search()
function does not take any parameters.
It must be configured such that a call to it will find a good policy in less than
120 seconds, and return it.
Report
The report should describe which local search algorithm(s) was(were) used. It should also describe the objective function used. If the function has multiple components, explain each one, and how the components are combined to produce a useful objective function.
Scoring
Your agent will be scored by the following criteria:
- 5 policies will be found using your
policy_search()
function - all policies will be measured
- the highest and lowest scores will be ignored
- the average of the remaining three will be used as the score.
Required Submissions
- Code submitted to github.
- A PDF file containing your report submitted to Canvas.
Hints and Resources
You’ll likely spend more time coming up with ideas for objective function components and testing those ideas than writing the actual local search algorithm.
Remember, you’re searching for a policy. You are not searching for a path from the start to the goal.
The Cliff Walker environment gives an observation as a single integer, computed by (row*12+column). This makes for an index of a compact policy table (python list or np.array).
A CliffWalker model is recommended. It would have the methods necessary to support the generic local search algorithm you choose.
Last Updated 10/01/2024