### Abstract Architecture for Agents

• Assume the environment may be in any of a finite set E of discrete, instantaneous states:
• E = {e, e′, …}
• Agents are assumed to have a repertoire of possible actions available to them, which transform the state of the environment:
• Ac = {α, α′, …}
• A run, r, of an agent in an environment is a sequence of interleaved environment states and actions:
• r : e0 −α0→ e1 −α1→ e2 −α2→ ⋯ −αu−1→ eu

• Let:
• R be the set of all such possible finite sequences (over E and Ac)
• R^Ac be the subset of these that end with an action
• R^E be the subset of these that end with an environment state
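
A minimal sketch of these objects in Python, with illustrative state and action names (the formalism leaves E and Ac abstract) and a run encoded as a flat tuple interleaving states and actions:

```python
# E: a finite set of discrete, instantaneous environment states
E = {"cold", "ok", "hot"}

# Ac: the agent's repertoire of actions
Ac = {"heater_on", "heater_off"}

# A run r interleaves states and actions, here as a flat tuple
# (e0, a0, e1, a1, ...); this one lies in R^E since it ends with a state.
r = ("cold", "heater_on", "ok")

def ends_with_state(run):
    """True iff run is in R^E (odd length under this interleaved encoding)."""
    return len(run) % 2 == 1
```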

### State Transformer Functions

• A state transformer function represents the behavior of the environment:
• τ : R^Ac → ℘(E)
which maps a run ending with an action to the set of possible successor environment states
• Note that environments are…
• history dependent
• non-deterministic
• If τ(r) = ∅, then there are no possible successor states to r. In this case, we say that the system has ended its run
• Formally, we say an environment Env is a triple Env = ⟨E, e0, τ⟩ where: E is a set of environment states, e0 ∈ E is the initial state, and τ is a state transformer function
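
A sketch of an environment triple, assuming τ is encoded as a lookup on the final (state, action) pair of the run; in general τ may inspect the entire history. Returning a set of states captures non-determinism, and the empty set ends the run:

```python
E = {"cold", "ok", "hot"}
e0 = "cold"

def tau(run):
    """tau : R^Ac -> powerset(E); run is a tuple ending with an action."""
    table = {
        ("cold", "heater_on"):  {"ok"},
        ("cold", "heater_off"): {"cold"},
        ("ok",   "heater_on"):  {"ok", "hot"},   # non-deterministic choice
        ("ok",   "heater_off"): {"cold", "ok"},
        ("hot",  "heater_on"):  set(),           # tau(r) = {}: run has ended
        ("hot",  "heater_off"): {"ok"},
    }
    return table[(run[-2], run[-1])]

Env = (E, e0, tau)   # the triple <E, e0, tau>
```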

### Agents

• An agent is a function which maps runs (ending with an environment state) to actions:
• Ag : R^E → Ac
• An agent makes a decision about what action to perform based on the history of the system that it has witnessed to date. Let AG be the set of all agents

### Systems

• A system is a pair containing an agent and an environment
• Any system will have associated with it a set of possible runs; we denote the set of runs of agent Ag in environment Env by R(Ag, Env)
• (We assume R (Ag, Env) contains only terminated runs)

• Formally, a sequence
• (e0, α0, e1, α1, e2, …)
represents a run of an agent Ag in environment Env = ⟨E, e0, τ⟩ if:
1. e0 is the initial state of Env
2. α0 = Ag(e0); and
3. For u > 0, eu ∈ τ((e0, α0, …, αu−1)) and αu = Ag((e0, α0, …, eu))
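
A sketch of how one run of the system ⟨Ag, Env⟩ can be generated, checking the three conditions step by step. The toy τ table and agent are illustrative, and the step cutoff is a practical hedge for this toy environment, which never halts on its own (the text above assumes R(Ag, Env) contains only terminated runs):

```python
import random

TABLE = {
    ("cold", "heater_on"):  {"ok"},
    ("cold", "heater_off"): {"cold"},
    ("ok",   "heater_on"):  {"ok", "hot"},    # non-deterministic
    ("ok",   "heater_off"): {"cold", "ok"},
    ("hot",  "heater_on"):  set(),            # no successors: run ends
    ("hot",  "heater_off"): {"ok"},
}

def tau(run):                                 # tau : R^Ac -> powerset(E)
    return TABLE[(run[-2], run[-1])]

def Ag(run):                                  # Ag : R^E -> Ac
    return "heater_on" if run[-1] == "cold" else "heater_off"

def generate_run(Ag, e0, tau, max_steps=5, seed=0):
    rng = random.Random(seed)
    run = (e0,)                               # condition 1: starts in e0
    run += (Ag(run),)                         # condition 2: a0 = Ag(e0)
    for _ in range(max_steps):
        successors = tau(run)
        if not successors:                    # tau(r) = {}: run has ended
            break
        run += (rng.choice(sorted(successors)),)  # condition 3: e_u in tau(...)
        run += (Ag(run),)                         # ... and a_u = Ag(run so far)
    return run

print(generate_run(Ag, "cold", tau))
```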

### Purely Reactive Agents

• Some agents decide what to do without reference to their history — they base their decision making entirely on the present, with no reference at all to the past
• We call such agents purely reactive:
• action : E → Ac
• A thermostat is a purely reactive agent: it switches the heater on if the room is too cold, and off otherwise
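
A sketch of the thermostat as a purely reactive agent, i.e. a map action : E → Ac that ignores history entirely (state names are assumptions):

```python
def thermostat(state):
    # The decision depends only on the current state, never on the past.
    return "heater_on" if state == "too_cold" else "heater_off"

print(thermostat("too_cold"))   # heater_on
print(thermostat("ok"))         # heater_off
```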

### Perception

• Now introduce a perception system, splitting the agent into see and action subsystems:

• The see function is the agent’s ability to observe its environment, whereas the action function represents the agent’s decision-making process
• Output of the see function is a percept:
• see : E → Per
which maps environment states to percepts, and action is now a function
• action : Per* → Ac
which maps sequences of percepts to actions
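
A sketch of the see/action split, under the illustrative assumption that states are (temperature, humidity) pairs and only temperature is observable, so see may map two distinct environment states to the same percept:

```python
def see(state):
    """see : E -> Per."""
    temperature, humidity = state
    return temperature                  # humidity is never perceived

def action(percepts):
    """action : Per* -> Ac, deciding on the whole percept sequence."""
    return "heater_on" if percepts[-1] == "cold" else "heater_off"

p1, p2 = see(("cold", "dry")), see(("cold", "humid"))
assert p1 == p2                         # indistinguishable to the agent
print(action((p1, p2)))                 # heater_on
```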

### Agents with State

• We now consider agents that maintain state:

• These agents have some internal data structure, which is typically used to record information about the environment state and history. Let I be the set of all internal states of the agent.
• The perception function see for a state-based agent is unchanged:
• see : E → Per
The action-selection function action is now defined as a mapping
• action : I → Ac
from internal states to actions. An additional function next is introduced, which maps an internal state and percept to an internal state:
• next : I × Per → I

### Agent Control Loop

1. Agent starts in some initial internal state i0
2. Observes its environment state e, and generates a percept see(e)
3. Internal state of the agent is then updated via the next function, becoming next(i0, see(e))
4. The action selected by the agent is action(next(i0, see(e)))
5. Goto 2
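
A minimal sketch of this control loop for a state-based agent; the internal state (a count of consecutive "cold" percepts) and all names are illustrative assumptions, not part of the formalism:

```python
def see(e):
    """see : E -> Per (trivial perception here)."""
    return e

def next_state(i, percept):
    """next : I x Per -> I (renamed, since next() is a Python builtin)."""
    return i + 1 if percept == "cold" else 0

def action(i):
    """action : I -> Ac: heat only after two consecutive cold percepts."""
    return "heater_on" if i >= 2 else "heater_off"

def control_loop(observations, i0=0):
    i = i0                         # 1. start in initial internal state i0
    for e in observations:         # 2. observe e, generate percept see(e)
        i = next_state(i, see(e))  # 3. update internal state via next
        yield action(i)            # 4. select action(next(i, see(e)))
                                   # 5. go to 2

print(list(control_loop(["cold", "cold", "ok", "cold"])))
# ['heater_off', 'heater_on', 'heater_off', 'heater_off']
```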

### Tasks for Agents

• We build agents in order to carry out tasks for us
• The task must be specified by us…
• But we want to tell agents what to do without telling them how to do it

### Utility Functions over States

• One possibility: associate utilities with individual states; the task of the agent is then to bring about states that maximize utility
• A task specification is a function
• u : E → ℝ
which associates a real number with every environment state

• But what is the value of a run…
• minimum utility of state on run?
• maximum utility of state on run?
• sum of utilities of states on run?
• average?
• Disadvantage: it is difficult to specify a long-term view when assigning utilities to individual states (one possibility: discount the value of states that occur later in the run)
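
A sketch of the candidate ways to lift a state utility u : E → ℝ to a value for a whole run, with a discount factor gamma for the long-term-view problem just mentioned; all names and values are illustrative:

```python
def run_value(states, u, how="sum", gamma=1.0):
    """Aggregate (optionally discounted) state utilities along a run."""
    utilities = [gamma**t * u(e) for t, e in enumerate(states)]
    if how == "min":
        return min(utilities)
    if how == "max":
        return max(utilities)
    if how == "sum":
        return sum(utilities)
    if how == "avg":
        return sum(utilities) / len(utilities)
    raise ValueError(how)

u = {"cold": -1.0, "ok": 1.0, "hot": -2.0}.get   # a toy u : E -> R
states = ["cold", "ok", "ok", "hot"]
for how in ("min", "max", "sum", "avg"):
    print(how, run_value(states, u, how, gamma=0.9))
```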

### Utilities over Runs

• Another possibility: assign a utility not to individual states, but to runs themselves:
• u : R → ℝ
• Such an approach takes an inherently long term view
• Other variations: incorporate probabilities of different states emerging
• Difficulties with utility-based approaches:
• where do the numbers come from?
• we don’t think in terms of utilities!
• hard to formulate tasks in these terms

### Expected Utility & Optimal Agents

• Write P(r | Ag, Env) to denote the probability that run r occurs when agent Ag is placed in environment Env
• Note: Σ_{r ∈ R(Ag, Env)} P(r | Ag, Env) = 1
• Then the optimal agent Ag_opt in an environment Env is the one that maximizes expected utility:
• Ag_opt = arg max_{Ag ∈ AG} Σ_{r ∈ R(Ag, Env)} u(r) P(r | Ag, Env)    (1)

### Bounded Optimal Agents

• Some agents cannot be implemented on some computers
(a function Ag : R^E → Ac may need more memory than is available to implement)
• Write AG_m to denote the agents that can be implemented on machine (computer) m:
• AG_m = { Ag | Ag ∈ AG and Ag can be implemented on m }
• We can replace equation (1) with the following, which defines the bounded optimal agent Ag_opt:
• Ag_opt = arg max_{Ag ∈ AG_m} Σ_{r ∈ R(Ag, Env)} u(r) P(r | Ag, Env)
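
A sketch of equation (1) restricted to AG_m: choose, from a finite set of implementable candidate agents, the one maximizing expected utility. The runs, probabilities, and utility below are illustrative assumptions; in general R(Ag, Env) need not be enumerable like this:

```python
def expected_utility(runs, u):
    """EU(Ag, Env) = sum over runs r of u(r) * P(r | Ag, Env)."""
    return sum(u(r) * p for r, p in runs)

def bounded_optimal(AG_m, runs_of, u):
    """arg max over Ag in AG_m of expected utility."""
    return max(AG_m, key=lambda ag: expected_utility(runs_of[ag], u))

u = lambda run: run.count("ok")          # a toy utility over runs
runs_of = {                              # (run, P(r | Ag, Env)) pairs
    "agent_a": [(("cold", "on", "ok"), 0.5), (("cold", "on", "hot"), 0.5)],
    "agent_b": [(("cold", "off", "cold"), 1.0)],
}
print(bounded_optimal(["agent_a", "agent_b"], runs_of, u))  # agent_a
```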

### Predicate Task Specifications

• A special case of assigning utilities to histories is to assign 0 (false) or 1 (true) to a run
• If a run is assigned 1, then the agent succeeds on that run, otherwise it fails
• Call these predicate task specifications
• Denote a predicate task specification by Ψ. Thus Ψ : R → {0, 1}.

### Task Environments

• A task environment is a pair ⟨Env, Ψ⟩ where Env is an environment and
• Ψ : R → {0, 1}
is a predicate over runs. Let TE be the set of all task environments.
• A task environment specifies:
• the properties of the system the agent will inhabit
• the criteria by which an agent will be judged to have either failed or succeeded

• Write R_Ψ(Ag, Env) to denote the set of all runs of the agent Ag in environment Env that satisfy Ψ:
• R_Ψ(Ag, Env) = { r | r ∈ R(Ag, Env) and Ψ(r) = 1 }
• We then say that an agent Ag succeeds in task environment ⟨Env, Ψ⟩ if
• R_Ψ(Ag, Env) = R(Ag, Env)
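
A small sketch of this success criterion, assuming R(Ag, Env) is given as an explicit finite set of runs:

```python
def satisfying_runs(runs, psi):
    """R_Psi(Ag, Env) = { r in R(Ag, Env) | Psi(r) = 1 }."""
    return {r for r in runs if psi(r) == 1}

def succeeds(runs, psi):
    # Ag succeeds iff every possible run satisfies Psi.
    return satisfying_runs(runs, psi) == set(runs)

runs = {("cold", "on", "ok"), ("cold", "on", "hot")}
print(succeeds(runs, lambda r: 1 if "ok" in r else 0))   # False
```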

### Achievement & Maintenance Tasks

1. Achievement tasks are those of the form “achieve state of affairs Φ”
2. Maintenance tasks are those of the form “maintain state of affairs Ψ”

• An achievement task is specified by a set G of “good” or “goal” states: G ⊆ E

The agent succeeds if it is guaranteed to bring about at least one of these states (we do not care which one — they are all considered equally good).

• A maintenance goal is specified by a set B of “bad” states: B ⊆ E

The agent succeeds in a particular environment if it manages to avoid all states in B — if it never performs actions which result in any state in B occurring
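
Sketches of the two predicate forms, assuming runs are interleaved tuples as in the earlier sketches, so the visited environment states are every other element:

```python
def states_of(run):
    """The states e0, e1, e2, ... of an interleaved run (e0, a0, e1, ...)."""
    return run[::2]

def achieve(G):
    """Psi for an achievement task: some visited state lies in G."""
    return lambda run: 1 if any(e in G for e in states_of(run)) else 0

def maintain(B):
    """Psi for a maintenance task: no visited state lies in B."""
    return lambda run: 1 if all(e not in B for e in states_of(run)) else 0

run = ("cold", "heater_on", "ok")
print(achieve({"ok"})(run), maintain({"hot"})(run))   # 1 1
```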

### Agent Synthesis

• Agent synthesis is automatic programming: the goal is to have a program that will take a task environment, and from this task environment automatically generate an agent that succeeds in this environment:
• syn : TE → (AG ∪ {⊥})
(Think of ⊥ as being like null in Java.)
• Synthesis algorithm is:
• sound if, whenever it returns an agent, then this agent succeeds in the task environment that is passed as input
• complete if it is guaranteed to return an agent whenever there exists an agent that will succeed in the task environment given as input

• A synthesis algorithm syn is sound if it satisfies the following condition:
• syn(⟨Env, Ψ⟩) = Ag implies R(Ag, Env) = R_Ψ(Ag, Env)
and complete if:
• ∃Ag ∈ AG such that R(Ag, Env) = R_Ψ(Ag, Env) implies syn(⟨Env, Ψ⟩) ≠ ⊥
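
A brute-force synthesis sketch over a finite candidate set of agents, standing in for syn : TE → (AG ∪ {⊥}), with None playing the role of ⊥. It is sound by construction (it only returns agents it has verified) and complete relative to the candidate set, since every candidate is tried; all names and the finiteness assumptions are illustrative:

```python
def syn(candidates, runs_of, psi):
    for ag in candidates:
        runs = runs_of(ag)                    # R(Ag, Env), assumed finite
        if all(psi(r) == 1 for r in runs):    # R(Ag, Env) = R_Psi(Ag, Env)
            return ag                         # sound: returned agent succeeds
    return None                               # bottom: no candidate succeeds

# Example: two candidate agents judged against an "achieve ok" predicate.
runs_of = lambda ag: {"good": [("cold", "on", "ok")],
                      "bad":  [("cold", "off", "cold")]}[ag]
psi = lambda r: 1 if "ok" in r else 0
print(syn(["bad", "good"], runs_of, psi))     # good
```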