Contents Menu Expand Light mode Dark mode Auto light/dark, in light mode Auto light/dark, in dark mode Skip to content
MARP 0.1 documentation
MARP 0.1 documentation

User guide

  • Quick Start
  • Basic Usage

Formulations

  • Reinforcement Learning
  • Search
  • Opponent Modelling

Benchmarks

  • MAPF
  • MAPD
  • Warehouse

API

  • marp.ma_env
  • marp.mapf
  • marp.mapd
  • marp.warehouse
  • marp.rl
  • marp.search
  • marp.animator
  • marp.utils
Back to top
View this page

Search¶

As we mentioned in the Quick Start, in addition to RL-related interfaces, we have also provide with search-related interfaces.

The search problem underneath the environment is to find collision-free paths for each agent, a path for an agent is a sequence of actions from her initial location to her designated goal location. To formulate a search problem for each agent, we again follow the Formulation as Wrappers principle by encoding wrappers.

Path-Finding for Individual Agents¶

Suppose a MARP environment has been intialized, one can induce a single-agent search problem by ingoring others.

env.reset()
from marp.search import SingleAgentSearchWrapper, astar

plans = {}
for agent in env.agents:
    plans[agent] = astar(SingleAgentSearchWrapper(env, agent))

while env.agents:
    actions = {}
    for agent in env.agents:
        path = plans[agent]
        if len(path):
            actions[agent] = path.pop(0)
        else:
            actions[agent] = 0
    observations, rewards, terminations, truncations, infos = env.step(actions)

The above code has done the following things:

  1. Formulate a single-agent search problem for each agent by the wrapper SingleAgentSearchWrapper

  2. Call A star search algorithm astar() to compute a plan (path) for each agent.

  3. Execute the plans.

One will probably get similar visual results as follows. Note that there might be collisions (agents turning red).

_images/sapf.gif

Joint Path-Finding¶

To deal with collisions explicitly, one can instead formulate a search problem over joint states and joint actions, i.e., view all agents as one single joint agent.

env.reset()
from marp.search import MultiAgentSearchWrapper, astar

joint_plan = astar(MultiAgentSearchWrapper(env))

while env.agents:
    actions = joint_plan.pop(0)
    env.step(actions)
env.render()

The above code has down the following things:

  1. Formulate a joint search problem by the wrapper MultiAgentSearchWrapper, where a state with any collision is specified as a dead end.

  2. Again, call A star search astar(), but this time the algorithm will return a sequence of joint-actions

  3. Execute the joint-plan.

One will probably get similar visual results as follows. Now there will not be collisions anymore.

_images/sapf.gif

Detailed Usage¶

Functions¶

marp.search.astar(env)¶

Given a search environment, return the optimal plan.

Parameters:

env (SearchEnv) – a search environment with the transition function provided.

Raises:

RuntimeError – If no plan can be found.

Returns:

The optimal plan.

Return type:

list[Action]

Classes¶

class marp.search.SingleAgentSearchWrapper(ma_env, agent)¶

Given a MARP environment, wrap it as inducing a single-agent search problem.

Parameters:
  • ma_env (MARP) – a multi-agent environment with the transition function provided.

  • agent (str) – the agent that the problem is induced for

get_state()¶

Obtain the underlying system state.

Returns:

the underlying state

Return type:

dict

heuristic(state)¶

A domain dependent heuristic

Parameters:

state (dict) – the state to enquire

Returns:

the Euclidean distance from the current state to the goal state

Return type:

float

is_goal_state(state)¶

Check whether it is a goal state

Parameters:

state (dict) – the state to enquire

Returns:

whether it is a goal state

Return type:

bool

transit(state, action)¶

Given a state and an action, return the successor state and the cost.

Parameters:
  • state (dict) – the state to enquire

  • action (Action) – the action to enquire

Returns:

(the successor state, the cost)

Return type:

(dict, float)

class marp.search.MultiAgentSearchWrapper(ma_env)¶

Given a MARP environment, wrap it as a multi-agent joint search problem.

Parameters:

ma_env (MARP) – a multi-agent environment with the transition function provided.

get_state()¶

Obtain the joint state.

Returns:

(system state, collision-free or not)

Return type:

(dict, bool)

heuristic(state)¶

A domain dependent heuristic: return 99999 if any collision, otherwise the sum of all agents’ costs.

Parameters:

state (dict) – the joint state to enquire

Returns:

the sum of the Euclidean distance of all agents’ locations to their goals

Return type:

float

is_goal_state(state)¶

Check whether it is a goal state

Parameters:

state (dict) – the joint state to enquire

Returns:

whether it is a goal state

Return type:

bool

transit(state, actions)¶

Given a joint state and a joint action, return the successor joint state and the aggreated cost.

Parameters:
  • state (dict) – the joint state to enquire

  • action (dict[str, Actions]) – the joint action to enquire

Returns:

(the successor state, the cost)

Return type:

(dict, cost)

Next
Opponent Modelling
Previous
Reinforcement Learning
Copyright © 2024, Fengming (Fernando) Zhu
Made with Sphinx and @pradyunsg's Furo
On this page
  • Search
    • Path-Finding for Individual Agents
    • Joint Path-Finding
    • Detailed Usage
      • Functions
        • astar()
      • Classes
        • SingleAgentSearchWrapper
          • SingleAgentSearchWrapper.get_state()
          • SingleAgentSearchWrapper.heuristic()
          • SingleAgentSearchWrapper.is_goal_state()
          • SingleAgentSearchWrapper.transit()
        • MultiAgentSearchWrapper
          • MultiAgentSearchWrapper.get_state()
          • MultiAgentSearchWrapper.heuristic()
          • MultiAgentSearchWrapper.is_goal_state()
          • MultiAgentSearchWrapper.transit()