Goal Oriented Action Planning for a Smarter AI

by Brent Owens, 23 Apr 2014

Goal Oriented Action Planning (GOAP) is an AI system that will easily give your agents choices and the tools to make smart decisions without having to maintain a large and complex finite state machine.

View the Demo

In this demo, there are four character classes, each using tools that break after being used for a while:

  • Miner: Mines ore at rocks. Needs a tool to work.
  • Logger: Chops trees to produce logs. Needs a tool to work.
  • Wood Cutter: Cuts up trees into usable wood. Needs a tool to work.
  • Blacksmith: Forges tools at the forge. Everyone uses these tools.

Each class will figure out automatically, using goal oriented action planning, what actions they need to perform to reach their goals. If their tool breaks, they will go to a supply pile that has one made by the blacksmith.

What is GOAP?

Goal oriented action planning is an artificial intelligence system for agents that allows them to plan a sequence of actions to satisfy a particular goal. The particular sequence of actions depends not only on the goal but also on the current state of the world and the agent. This means that if the same goal is supplied for different agents or world states, you can get a completely different sequence of actions, which makes the AI more dynamic and realistic. Let’s look at an example, as seen in the demo above.

We have an agent, a wood chopper, that takes logs and chops them up into firewood. The chopper can be supplied with the goal MakeFirewood, and has the actions ChopLog, GetAxe, and CollectBranches.

The ChopLog action will turn a log into firewood, but only if the wood cutter has an axe. The GetAxe action will give the wood cutter an axe. Finally, the CollectBranches action will produce firewood as well, without requiring an axe, but the firewood will not be as high in quality.

When we give the agent the MakeFirewood goal, we get these two different action sequences:

  • Needs firewood -> GetAxe -> ChopLog = makes firewood
  • Needs firewood -> CollectBranches = makes firewood

If the agent can get an axe, then they can chop a log to make firewood. But maybe they cannot get an axe; then, they can just go and collect branches. Each of these sequences will fulfill the goal of MakeFirewood.

GOAP can choose the best sequence based on what preconditions are available. If there is no axe handy, then the wood cutter has to resort to picking up branches. Picking up branches can take a really long time and yield poor quality firewood, so we don’t want it to run all the time, only when it has to.

Who GOAP is For

You are, by now, probably familiar with Finite State Machines (FSM), but if not, then take a look at this terrific tutorial.

You might have run into very large and complex states for some of your FSM agents, where you eventually get to a point where you do not want to add new behaviours because they cause too many side effects and gaps in the AI.

GOAP turns this:

Finite State Machine states: connected everywhere.

Into this:

GOAP: nice and manageable.

By decoupling the actions from each other, we can now focus on each action individually. This makes the code modular, and easy to test and to maintain. If you want to add in another action, you can just plunk it in, and no other actions have to be changed. Try doing that with an FSM!

Also, you can add or remove actions on the fly to change the behaviour of an agent to make them even more dynamic. Have an ogre that suddenly started raging? Give them a new "rage attack" action that gets removed when they calm down. Simply adding the action to the list of actions is all you have to do; the GOAP planner will take care of the rest.

If you find you have a very complex FSM for your agents, then you should give GOAP a try. One sign your FSM is getting too complex is when every state has a myriad of if-else statements testing what state they should go to next, and adding in a new state makes you groan at all the implications it might have.

If you have a very simple agent that just performs one or two tasks, then GOAP might be a little heavy-handed and an FSM will suffice. However, it is worth looking at the concepts here and seeing whether they would be easy enough for you to plug into your agent.

Actions

An action is something that the agent does. Usually it is just playing an animation and a sound, and changing a little bit of state (for instance, adding firewood). Opening a door is a different action (and animation) than picking up a pencil. An action is encapsulated, and should not have to worry about what the other actions are.

To help GOAP determine what actions we want to use, each action is given a cost. All else being equal, the planner prefers a lower cost action over a higher cost one. When we sequence the actions together, we add up the costs and then choose the sequence with the lowest total cost.

Let’s assign some costs to the actions:

  • GetAxe Cost: 2
  • ChopLog Cost: 4
  • CollectBranches Cost: 8

If we look at the sequence of actions again and add up the total costs, we will see what the cheapest sequence is:

  • Needs firewood -> GetAxe (2) -> ChopLog (4) = makes firewood (total: 6)
  • Needs firewood -> CollectBranches (8) = makes firewood (total: 8)

Getting an axe and chopping a log produces firewood at the lower cost of 6, while collecting the branches produces wood at the higher cost of 8. So, our agent chooses to get an axe and chop wood.
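The arithmetic above can be sketched in a few lines of C#. This is only an illustration, not the planner from the demo: the class name CostDemo and its helper methods are hypothetical, and the candidate sequences are written out by hand rather than discovered by planning.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the demo project.
public static class CostDemo
{
    // Action costs taken from the list above.
    public static readonly Dictionary<string, float> Costs = new Dictionary<string, float>
    {
        { "GetAxe", 2f },
        { "ChopLog", 4f },
        { "CollectBranches", 8f },
    };

    // The cost of a sequence is the sum of the costs of its actions.
    public static float TotalCost(IEnumerable<string> sequence)
    {
        return sequence.Sum(name => Costs[name]);
    }

    // Picks the candidate sequence with the lowest total cost.
    public static string[] Cheapest(IEnumerable<string[]> candidates)
    {
        return candidates.OrderBy(seq => TotalCost(seq)).First();
    }

    public static void Main()
    {
        var candidates = new List<string[]>
        {
            new[] { "GetAxe", "ChopLog" },
            new[] { "CollectBranches" },
        };

        foreach (var seq in candidates)
            Console.WriteLine(string.Join(" -> ", seq) + " (total: " + TotalCost(seq) + ")");

        Console.WriteLine("Chosen: " + string.Join(" -> ", Cheapest(candidates)));
    }
}
```

With the costs above, the two-step sequence wins because 2 + 4 is less than 8. Raising the cost of ChopLog to 7 would tip the choice toward CollectBranches.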

But won’t this same sequence run all the time? Not if we introduce preconditions...

Preconditions and Effects

Actions have preconditions and effects. A precondition is the state that is required for the action to run, and the effects are the change to the state after the action has run.

For example, the ChopLog action requires the agent to have an axe handy. If the agent does not have an axe, it needs to find another action that can fulfill that precondition in order to let the ChopLog action run. Luckily, the GetAxe action does that—this is the effect of the action.
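As a rough sketch of the idea, a state can be modelled as a set of named flags, a precondition check as "every required flag has the required value", and an effect as "overwrite these flags". The Dictionary<string, bool> representation, the key names ("hasAxe", "axeAvailable"), and the StateDemo class are assumptions made for this example only.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the demo project.
public static class StateDemo
{
    // True when every precondition is present in the state with the same value.
    public static bool Satisfies(Dictionary<string, bool> state,
                                 Dictionary<string, bool> preconditions)
    {
        foreach (var pre in preconditions)
        {
            bool value;
            if (!state.TryGetValue(pre.Key, out value) || value != pre.Value)
                return false;
        }
        return true;
    }

    // Returns a copy of the state with the effects written over it.
    // The original state is left untouched.
    public static Dictionary<string, bool> Apply(Dictionary<string, bool> state,
                                                 Dictionary<string, bool> effects)
    {
        var next = new Dictionary<string, bool>(state);
        foreach (var effect in effects)
            next[effect.Key] = effect.Value;
        return next;
    }

    public static void Main()
    {
        var state = new Dictionary<string, bool> { { "hasAxe", false }, { "axeAvailable", true } };

        var chopLogPre = new Dictionary<string, bool> { { "hasAxe", true } };
        var getAxePre = new Dictionary<string, bool> { { "axeAvailable", true }, { "hasAxe", false } };
        var getAxeEffect = new Dictionary<string, bool> { { "hasAxe", true } };

        Console.WriteLine("ChopLog can run: " + Satisfies(state, chopLogPre));
        Console.WriteLine("GetAxe can run: " + Satisfies(state, getAxePre));

        var afterGetAxe = Apply(state, getAxeEffect);
        Console.WriteLine("ChopLog can run after GetAxe: " + Satisfies(afterGetAxe, chopLogPre));
    }
}
```

Returning a copy from Apply rather than mutating the state in place is a deliberate choice here: it lets a planner try an action "on paper" without disturbing the state it started from.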

The GOAP Planner

The GOAP planner is a piece of code that looks at actions’ preconditions and effects, and creates queues of actions that will fulfill a goal. That goal is supplied by the agent, along with a world state, and a list of actions the agent can perform. With this information the GOAP planner can order the actions, see which can run and which can’t, and then decide which actions are the best to perform. Luckily for you, I’ve written this code, so you don’t have to.

To set this up, let’s add preconditions and effects to our wood chopper’s actions:

  • GetAxe Cost: 2. Preconditions: "an axe is available", "doesn’t have an axe". Effect: "has an axe".
  • ChopLog Cost: 4. Preconditions: "has an axe". Effect: "make firewood".
  • CollectBranches Cost: 8. Preconditions: (none). Effect: "make firewood".

The GOAP planner now has the information needed to order the sequence of actions to make firewood (our goal).

We start by supplying the GOAP Planner with the current state of the world and the state of the agent. This combined world state is:

  • "doesn’t have an axe"
  • "an axe is available"
  • "the sun is shining"

Looking at our currently available actions, the only parts of the state that are relevant to them are "doesn’t have an axe" and "an axe is available"; the third might be used by other agents with other actions.

Okay, we have our current world state, our actions (with their preconditions and effects), and the goal. Let’s plan!


    GOAL: "make firewood"
    Current State: "doesn’t have an axe", "an axe is available"

    Can action ChopLog run?
        NO - requires precondition "has an axe"
        Cannot use it now, try another action.

    Can action GetAxe run?
        YES, preconditions "an axe is available" and "doesn’t have an axe" are true.
        PUSH action onto queue, update state with action’s effect

    New State
        "has an axe"
        Remove state "an axe is available" because we just took one.

    Can action ChopLog run?
        YES, precondition "has an axe" is true
        PUSH action onto queue, update state with action’s effect

    New State
        "has an axe", "make firewood"

    We have reached our GOAL of "make firewood"
    Action sequence: GetAxe -> ChopLog

The planner will run through the other actions, too, and it won’t just stop when it finds a solution to the goal. What if another sequence has a lower cost? It will run through all possibilities to find the best solution.

When it plans, it builds up a tree. Every time an action is applied, it is popped off the list of available actions, so we don’t have a string of 50 GetAxe actions back-to-back. The state is changed with that action’s effect.

The tree that the planner builds up looks like this:

We can see that it will actually find three paths to the goal with their total costs:

  • GetAxe -> ChopLog (total: 6)
  • GetAxe -> CollectBranches (total: 10)
  • CollectBranches (total: 8)

Although GetAxe -> CollectBranches works, the cheapest path is GetAxe -> ChopLog, so this one is returned.
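To make the tree search concrete, here is a minimal, self-contained sketch of an exhaustive planner for the wood cutter example. It is not the GoapPlanner from the demo project: the SketchAction, SketchPlan, and PlannerSketch types, the Dictionary<string, bool> state, and the key names are all assumptions for illustration. It follows the rules described above: an action that has been applied is removed from the actions available further down that branch, every branch that reaches the goal is recorded, and the cheapest one is returned.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified action record; the article's real class is GoapAction.
public class SketchAction
{
    public string Name;
    public float Cost;
    public Dictionary<string, bool> Preconditions = new Dictionary<string, bool>();
    public Dictionary<string, bool> Effects = new Dictionary<string, bool>();
}

public class SketchPlan
{
    public List<string> Steps;
    public float Cost;
}

public static class PlannerSketch
{
    static bool Satisfies(Dictionary<string, bool> state, Dictionary<string, bool> wanted)
    {
        return wanted.All(w => state.ContainsKey(w.Key) && state[w.Key] == w.Value);
    }

    // Depth-first walk of the tree. Every branch that reaches the goal is added to 'found'.
    static void Expand(Dictionary<string, bool> state, List<SketchAction> usable,
                       Dictionary<string, bool> goal, List<string> path, float cost,
                       List<SketchPlan> found)
    {
        foreach (var action in usable)
        {
            if (!Satisfies(state, action.Preconditions)) continue;

            // Apply the effects to a copy, so sibling branches see the old state.
            var next = new Dictionary<string, bool>(state);
            foreach (var e in action.Effects) next[e.Key] = e.Value;

            var nextPath = new List<string>(path) { action.Name };
            float nextCost = cost + action.Cost;

            if (Satisfies(next, goal))
            {
                found.Add(new SketchPlan { Steps = nextPath, Cost = nextCost });
            }
            else
            {
                // The applied action is no longer available further down this branch.
                var remaining = usable.Where(a => a != action).ToList();
                Expand(next, remaining, goal, nextPath, nextCost, found);
            }
        }
    }

    public static List<SketchPlan> AllPlans(Dictionary<string, bool> start,
                                            List<SketchAction> actions,
                                            Dictionary<string, bool> goal)
    {
        var found = new List<SketchPlan>();
        Expand(start, actions, goal, new List<string>(), 0f, found);
        return found;
    }

    // Returns null when no plan reaches the goal.
    public static SketchPlan Cheapest(List<SketchPlan> plans)
    {
        return plans.OrderBy(p => p.Cost).FirstOrDefault();
    }

    public static List<SketchAction> WoodCutterActions()
    {
        var getAxe = new SketchAction { Name = "GetAxe", Cost = 2f };
        getAxe.Preconditions["axeAvailable"] = true;
        getAxe.Preconditions["hasAxe"] = false;
        getAxe.Effects["hasAxe"] = true;

        var chopLog = new SketchAction { Name = "ChopLog", Cost = 4f };
        chopLog.Preconditions["hasAxe"] = true;
        chopLog.Effects["hasFirewood"] = true;

        var collect = new SketchAction { Name = "CollectBranches", Cost = 8f };
        collect.Effects["hasFirewood"] = true;

        return new List<SketchAction> { chopLog, getAxe, collect };
    }

    public static void Main()
    {
        var state = new Dictionary<string, bool>
        {
            { "hasAxe", false }, { "axeAvailable", true }, { "sunIsShining", true }
        };
        var goal = new Dictionary<string, bool> { { "hasFirewood", true } };

        var plans = AllPlans(state, WoodCutterActions(), goal);
        foreach (var p in plans)
            Console.WriteLine(string.Join(" -> ", p.Steps) + " (total: " + p.Cost + ")");

        Console.WriteLine("Chosen: " + string.Join(" -> ", Cheapest(plans).Steps));
    }
}
```

Because it tries every ordering, this brute-force approach is fine for a handful of actions but grows quickly as the action list gets longer. A production planner would typically prune branches that are already more expensive than the best plan found so far, or use a heuristic search.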

What do preconditions and effects actually look like in code? Well, that’s up to you, but I have found it easiest to store them as a key-value pair, where the key is always a String and the value is an object or primitive type (float, int, Boolean, or similar). In C#, that could look like this:

    HashSet< KeyValuePair<string,object> > preconditions;
    HashSet< KeyValuePair<string,object> > effects;

When the action is performing, what do these effects actually look like and what do they do? Well, they don’t have to do anything—they are really just used for planning, and don’t affect the real agent’s state until they run for real.

This is worth emphasising: planning actions is not the same as running them. When an agent performs the GetAxe action, it will probably be near a pile of tools, play a bend-down-and-pick-up animation, and then store an axe object in its backpack. This changes the state of the agent. But, during GOAP planning, the state change is just temporary, so that the planner can figure out the optimal solution.

Procedural Preconditions

Sometimes, actions need to do a little more to determine whether they can run. For instance, the GetAxe action has the precondition of "an axe is available" that will need to search the world, or the immediate vicinity, to see whether there is an axe the agent can take. It might determine that the nearest axe is just too far away or behind enemy lines, and will say that it cannot run. This precondition is procedural and needs to run some code; it’s not a simple Boolean flag that we can just toggle.

Obviously, some of these procedural preconditions can take a while to run, and should be performed on something other than the render thread, ideally as a background thread or as Coroutines (in Unity).

You could have procedural effects too, if you so desire. And if you want to introduce even more dynamic results, you can change the cost of actions on the fly!
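One way to picture a procedural precondition combined with a dynamic cost is the sketch below. It is engine-agnostic and purely illustrative: the Point struct, the GetAxeSketch class, the 20-unit search radius, and the 0.1 cost-per-unit factor are all made-up assumptions, not values from the demo.

```csharp
using System;
using System.Collections.Generic;

// Minimal 2D position type so the sketch does not depend on a game engine.
public struct Point
{
    public float X, Y;
    public Point(float x, float y) { X = x; Y = y; }
}

// Hypothetical action: it can only run if an axe lies within MaxDistance,
// and it becomes more expensive the further away the nearest axe is.
public class GetAxeSketch
{
    public float MaxDistance = 20f;   // assumed search radius
    public float BaseCost = 2f;       // cost from the article's example
    public float CostPerUnit = 0.1f;  // assumed travel penalty

    public float Cost;                // recomputed on every check
    public Point? Target;             // the axe we decided to fetch, if any

    public bool CheckProceduralPrecondition(Point agent, IEnumerable<Point> axes)
    {
        Target = null;
        float best = float.MaxValue;

        foreach (var axe in axes)
        {
            float dx = axe.X - agent.X;
            float dy = axe.Y - agent.Y;
            float dist = (float)Math.Sqrt(dx * dx + dy * dy);

            // Ignore axes that are too far away; keep the nearest of the rest.
            if (dist <= MaxDistance && dist < best)
            {
                best = dist;
                Target = axe;
            }
        }

        if (Target == null)
            return false; // nothing reachable, so the action cannot run

        Cost = BaseCost + best * CostPerUnit;
        return true;
    }

    public static void Main()
    {
        var action = new GetAxeSketch();
        var axes = new List<Point> { new Point(3f, 4f), new Point(30f, 0f) };

        bool canRun = action.CheckProceduralPrecondition(new Point(0f, 0f), axes);
        Console.WriteLine("Can run: " + canRun + ", cost: " + action.Cost);
    }
}
```

The planner would call the check before considering the action, then read Cost when totalling up a sequence, so a distant axe makes the GetAxe -> ChopLog path less attractive than it would otherwise be.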

GOAP and State

Our GOAP system will need to live in a small Finite State Machine (FSM), for the sole reason that, in many games, actions will need to be close to a target in order to perform. We end up with three states:

  • Idle
  • MoveTo
  • PerformAction

When idle, the agent will figure out which goal they want
to fulfill. This part is handled outside of GOAP; GOAP will just tell you which
actions you can run to perform that goal. When a goal is chosen it is passed to
the GOAP Planner, along with the world and agent starting state, and the
planner will return a list of actions (if it can fulfill that goal).

When the planner is done and the agent has its list of
actions, it will try to perform the first action. All actions will need to know
if they must be in range of a target. If they do, then the FSM will push on the
next state: MoveTo.

The MoveTo state will tell the agent that it needs to
move to a specific target. The agent will do the moving (and play the walk
animation), and then let the FSM know when it is within range of the target.
This state is then popped off, and the action can perform.

The PerformAction state will run the next action in the queue
of actions returned by the GOAP Planner. The action can be instantaneous or
last over many frames, but when it is done it gets popped off and then the next
action is performed (again, after checking whether that next action needs to be
performed within range of an object).

This all repeats until there are no actions left to
perform, at which point we go back to the Idle state,
get a new goal, and plan again.
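The three states described above can be sketched as a small stack-based state machine. This is a simplified stand-in rather than the FSM class from the demo project: QueuedAction, AgentState, AgentFsmSketch, and the planner delegate are hypothetical names, movement is assumed to finish in a single update, and each action is assumed to complete instantly.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, stripped-down action: just a name and a range requirement.
public class QueuedAction
{
    public string Name;
    public bool RequiresInRange;
    public bool InRange;
}

public enum AgentState { Idle, MoveTo, PerformAction }

public class AgentFsmSketch
{
    public readonly Stack<AgentState> States = new Stack<AgentState>();
    public readonly Queue<QueuedAction> Plan = new Queue<QueuedAction>();
    public readonly List<string> Log = new List<string>();

    // Stand-in for "pick a goal and ask the GOAP planner for a list of actions".
    private readonly Func<IEnumerable<QueuedAction>> planner;

    public AgentFsmSketch(Func<IEnumerable<QueuedAction>> planner)
    {
        this.planner = planner;
        States.Push(AgentState.Idle);
    }

    // Call once per frame.
    public void Update()
    {
        switch (States.Peek())
        {
            case AgentState.Idle:
                foreach (var planned in planner()) Plan.Enqueue(planned);
                if (Plan.Count > 0) States.Push(AgentState.PerformAction);
                break;

            case AgentState.MoveTo:
                // A real agent would walk over many frames; the sketch arrives at once.
                Log.Add("move to target of " + Plan.Peek().Name);
                Plan.Peek().InRange = true;
                States.Pop(); // back to PerformAction
                break;

            case AgentState.PerformAction:
                if (Plan.Count == 0)
                {
                    States.Pop(); // nothing left to do, back to Idle
                    break;
                }
                var action = Plan.Peek();
                if (action.RequiresInRange && !action.InRange)
                {
                    States.Push(AgentState.MoveTo);
                    break;
                }
                Log.Add("perform " + action.Name);
                Plan.Dequeue();
                break;
        }
    }

    public static void Main()
    {
        var agent = new AgentFsmSketch(() => new[]
        {
            new QueuedAction { Name = "GetAxe", RequiresInRange = true },
            new QueuedAction { Name = "ChopLog", RequiresInRange = true },
        });

        for (int frame = 0; frame < 8; frame++) agent.Update();
        foreach (var line in agent.Log) Console.WriteLine(line);
    }
}
```

Using a stack means MoveTo can be pushed on top of PerformAction and simply popped when the agent arrives, which returns control to the action that asked for the move without any explicit transition table.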

A Real Code Example

It’s time to take a look at a real example! Don’t worry; it isn’t that complicated, and I have provided a working copy in Unity and C# for you to try out. I will just talk about it briefly here so you get a feel for the architecture. The code uses some of the same WoodChopper examples as above.

If you want to dig right in, head here for the code: http://github.com/sploreg/goap

We have four labourers:

  • Blacksmith: turns iron ore into tools.
  • Logger: uses a tool to chop down trees to produce logs.
  • Miner: mines rocks with a tool to produce iron ore.
  • Wood cutter: uses a tool to chop logs to produce firewood.

Tools wear out over time and will need to be replaced.
Fortunately, the Blacksmith makes tools. But iron ore is needed to make tools;
that’s where the Miner comes in (who also needs tools). The Wood Cutter needs
logs, and those come from the Logger; both need tools as well.

Tools and resources are stored on supply piles. The
agents will collect the materials or tools they need from the piles, and
also drop off their product at them.

The code has six main GOAP classes:

  • GoapAgent: understands state and
    uses the FSM and GoapPlanner to operate.
  • GoapAction: actions that agents can
    perform.
  • GoapPlanner: plans the
    actions for the GoapAgent.
  • FSM: the finite state machine.
  • FSMState: a state in the FSM.
  • IGoap: the interface that our
    real Labourer actors use. Ties into events for GOAP and the FSM.

Let’s look at the GoapAction class, since that is the one you will subclass:

    using UnityEngine;
    using System.Collections.Generic;

    public abstract class GoapAction : MonoBehaviour {

        private HashSet<KeyValuePair<string,object>> preconditions;
        private HashSet<KeyValuePair<string,object>> effects;

        private bool inRange = false;

        /* The cost of performing the action.
         * Figure out a weight that suits the action.
         * Changing it will affect what actions are chosen during planning. */
        public float cost = 1f;

        /**
         * An action often has to perform on an object. This is that object. Can be null. */
        public GameObject target;

        public GoapAction() {
            preconditions = new HashSet<KeyValuePair<string, object>> ();
            effects = new HashSet<KeyValuePair<string, object>> ();
        }

        public void doReset() {
            inRange = false;
            target = null;
            reset ();
        }

        /**
         * Reset any variables that need to be reset before planning happens again.
         */
        public abstract void reset();

        /**
         * Is the action done?
         */
        public abstract bool isDone();

        /**
         * Procedurally check if this action can run. Not all actions
         * will need this, but some might.
         */
        public abstract bool checkProceduralPrecondition(GameObject agent);

        /**
         * Run the action.
         * Returns true if the action performed successfully or false
         * if something happened and it can no longer perform. In this case
         * the action queue should clear out and the goal cannot be reached.
         */
        public abstract bool perform(GameObject agent);

        /**
         * Does this action need to be within range of a target game object?
         * If not then the MoveTo state will not need to run for this action.
         */
        public abstract bool requiresInRange ();

        /**
         * Are we in range of the target?
         * The MoveTo state will set this and it gets reset each time this action is performed.
         */
        public bool isInRange () {
            return inRange;
        }

        public void setInRange(bool inRange) {
            this.inRange = inRange;
        }

        public void addPrecondition(string key, object value) {
            preconditions.Add (new KeyValuePair<string, object>(key, value) );
        }

        public void removePrecondition(string key) {
            KeyValuePair<string, object> remove = default(KeyValuePair<string,object>);
            foreach (KeyValuePair<string, object> kvp in preconditions) {
                if (kvp.Key.Equals (key))
                    remove = kvp;
            }
            // only remove if a matching entry was actually found
            if ( !default(KeyValuePair<string,object>).Equals(remove) )
                preconditions.Remove (remove);
        }

        public void addEffect(string key, object value) {
            effects.Add (new KeyValuePair<string, object>(key, value) );
        }

        public void removeEffect(string key) {
            KeyValuePair<string, object> remove = default(KeyValuePair<string,object>);
            foreach (KeyValuePair<string, object> kvp in effects) {
                if (kvp.Key.Equals (key))
                    remove = kvp;
            }
            // only remove if a matching entry was actually found
            if ( !default(KeyValuePair<string,object>).Equals(remove) )
                effects.Remove (remove);
        }

        public HashSet<KeyValuePair<string, object>> Preconditions {
            get {
                return preconditions;
            }
        }

        public HashSet<KeyValuePair<string, object>> Effects {
            get {
                return effects;
            }
        }
    }

Nothing too fancy here: it stores preconditions and
effects. It also knows whether it must be in range of a target, and, if so,
then the FSM knows to push the MoveTo state when needed. It knows when it is
done, too; that is determined by the implementing action class.

Here is one of the actions:

    using UnityEngine;

    public class MineOreAction : GoapAction
    {
        private bool mined = false;
        private IronRockComponent targetRock; // where we get the ore from

        private float startTime = 0;
        public float miningDuration = 2; // seconds

        public MineOreAction () {
            addPrecondition ("hasTool", true); // we need a tool to do this
            addPrecondition ("hasOre", false); // if we have ore we don't want more
            addEffect ("hasOre", true);
        }

        public override void reset ()
        {
            mined = false;
            targetRock = null;
            startTime = 0;
        }

        public override bool isDone ()
        {
            return mined;
        }

        public override bool requiresInRange ()
        {
            return true; // yes we need to be near a rock
        }

        public override bool checkProceduralPrecondition (GameObject agent)
        {
            // find the nearest rock that we can mine
            IronRockComponent[] rocks = FindObjectsOfType ( typeof(IronRockComponent) ) as IronRockComponent[];
            IronRockComponent closest = null;
            float closestDist = 0;

            foreach (IronRockComponent rock in rocks) {
                if (closest == null) {
                    // first one, so choose it for now
                    closest = rock;
                    closestDist = (rock.gameObject.transform.position - agent.transform.position).magnitude;
                } else {
                    // is this one closer than the last?
                    float dist = (rock.gameObject.transform.position - agent.transform.position).magnitude;
                    if (dist < closestDist) {
                        // we found a closer one, use it
                        closest = rock;
                        closestDist = dist;
                    }
                }
            }

            // no rock found: bail out before touching closest.gameObject,
            // which would otherwise throw a NullReferenceException
            if (closest == null)
                return false;

            targetRock = closest;
            target = targetRock.gameObject;

            return true;
        }

        public override bool perform (GameObject agent)
        {
            if (startTime == 0)
                startTime = Time.time;

            if (Time.time - startTime > miningDuration) {
                // finished mining
                BackpackComponent backpack = (BackpackComponent)agent.GetComponent(typeof(BackpackComponent));
                backpack.numOre += 2;
                mined = true;
                ToolComponent tool = backpack.tool.GetComponent(typeof(ToolComponent)) as ToolComponent;
                tool.use(0.5f);
                if (tool.destroyed()) {
                    Destroy(backpack.tool);
                    backpack.tool = null;
                }
            }
            return true;
        }
    }

The largest part of the action is the checkProceduralPrecondition method. It looks for the nearest game object with an IronRockComponent, and saves this target rock. Then, when it performs, it gets that saved target rock and will perform the action on it. When the action is re-used in planning again, all of its fields are reset so that they can be calculated again.

These are all components that are added to the Miner entity object in Unity:

In order for your agent to work, you must add the
following components to it:

  • GoapAgent.
  • A class that implements IGoap (in the above example, that’s Miner.cs).
  • Some actions.
  • A backpack (only because the actions use it; it is unrelated to
    GOAP).

You can add whatever actions you want, and this would change how the agent behaves. You could even give it all actions so it can mine ore, forge tools, and chop wood.

Here is the demo in action again:

Each labourer goes to the target that they need to
fulfill their action (tree, rock, chopping block, or whatever), performs the
action, and often returns to the supply pile to drop off their goods. The
Blacksmith will wait a little while until there is iron ore in one of the
supply piles (added to by the Miner). The Blacksmith then goes off and makes
tools, and will drop off the tools at the supply pile nearest him. When a
labourer’s tool breaks they will head off to the supply pile near the
Blacksmith where the new tools are.

You can grab the code and the full app here: http://github.com/sploreg/goap.

Conclusion

With GOAP, you can create a large series of actions without the headache of interconnected states that often comes with a Finite State Machine. Actions can be added and removed from an agent to produce dynamic results, as well as to keep you sane when maintaining the code. You will end up with a flexible, smart, and dynamic AI.
