Basic Usage

class marp.ma_env.MARP(N, layout, orthogonal_actions=True, one_shot=True, battery=False, render_mode=None, **kwargs)

The base MARP environment, which unifies the APIs of the supported tasks.

Parameters:
  • N (int) – the number of agents

  • layout (str) – the file name of the layout configuration

  • one_shot (bool) – if True, one-shot path finding; if False, lifelong path finding

  • render_mode (str or None) – visualizes the run if ‘human’; otherwise only prints to the console

get_state()

Summarize the current system state

is_goal_state(state)

Check if each agent has reached her designated goal.

render()

Render the history

reset(seed=None, options=None)

Reset agent locations

save(file_name, speed=1)

Save the visualized result

Note

If ffmpeg is not installed, run conda install conda-forge::ffmpeg first.

Parameters:
  • file_name (str) – output file path

  • speed (int) – speedup rate

step(actions)

Advance the environment by one step using the given action profile, according to the underlying task.

Parameters:

actions (dict[str, Action]) – the action profile, i.e., joint actions.

Returns:
  • observations (dict) – observation profile

  • rewards (dict) – rewards for each agent

  • terminations (dict) – whether the task is accomplished for each agent

  • truncations (dict) – True if timeout, otherwise False

  • infos (dict) – auxiliary information, including collisions and action masks
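The five return values can be consumed in an ordinary rollout loop. The sketch below is only an illustration: ToyEnv is a hypothetical stand-in that mimics the shape of the return value described above, not the MARP class. The agents attribute, the per-agent keys of each dictionary, and the string actions are assumptions made for this example.

```python
class ToyEnv:
    """Hypothetical stand-in (NOT the MARP class) with a step() of the same shape."""

    def __init__(self, agents, horizon):
        self.agents = list(agents)  # assumed attribute, for illustration only
        self.horizon = horizon
        self.t = 0

    def reset(self, seed=None, options=None):
        self.t = 0

    def step(self, actions):
        self.t += 1
        timeout = self.t >= self.horizon
        observations = {a: self.t for a in self.agents}
        rewards = {a: -1.0 for a in self.agents}         # unit cost per step
        terminations = {a: False for a in self.agents}   # the toy never reaches a goal
        truncations = {a: timeout for a in self.agents}  # True once the horizon is hit
        infos = {a: {} for a in self.agents}
        return observations, rewards, terminations, truncations, infos


def rollout(env, policy):
    """Run one episode and return the accumulated reward of each agent."""
    env.reset()
    totals = {a: 0.0 for a in env.agents}
    while True:
        actions = {a: policy(a) for a in env.agents}  # the action profile
        _obs, rewards, terminations, truncations, _infos = env.step(actions)
        for agent, reward in rewards.items():
            totals[agent] += reward
        # Stop once every agent is done, either by success or by timeout.
        if all(terminations[a] or truncations[a] for a in env.agents):
            return totals


totals = rollout(ToyEnv(["a0", "a1"], horizon=3), policy=lambda agent: "stop")
```

With the real environment, the policy would typically also consult the action masks reported in infos before choosing an action.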

transit(state, actions)

Given a state and an action profile, return the successor state.

Note

Calls to this method will not change any internal state of the environment

Parameters:
  • state (dict) – system state obtained by calling get_state()

  • actions (dict[str, Action]) – an action profile

Returns:

succ_state (dict) – the successor state
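Because transit() leaves the environment untouched, it can be used to enumerate successor states for lookahead or search. The following is a minimal sketch under stated assumptions: ToyLineEnv is a hypothetical stand-in (not the MARP class) in which agents move along a line, and integer moves replace the library's Action type.

```python
from itertools import product


class ToyLineEnv:
    """Hypothetical stand-in (NOT the MARP class): agents move along a 1-D line."""

    def __init__(self, positions, length):
        self.positions = dict(positions)
        self.length = length

    def get_state(self):
        # Return a copy so callers cannot mutate the internal state by accident.
        return dict(self.positions)

    def transit(self, state, actions):
        # Builds a fresh dict; neither `state` nor `self.positions` is modified.
        return {
            agent: min(max(pos + actions[agent], 0), self.length - 1)
            for agent, pos in state.items()
        }


def successors(env, state, agent_moves=(-1, 0, 1)):
    """Yield (action profile, successor state) for every joint action."""
    agents = sorted(state)
    for joint in product(agent_moves, repeat=len(agents)):
        actions = dict(zip(agents, joint))
        yield actions, env.transit(state, actions)


env = ToyLineEnv({"a0": 0, "a1": 3}, length=5)
state = env.get_state()
expanded = list(successors(env, state))
```

The number of joint actions grows exponentially with the number of agents, so exhaustive enumeration of this kind is only practical for small instances.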

Formulation as Wrappers

By implementing a base environment, we provide standard interfaces and the minimal set of information needed. The base environment only provides a multi-agent simulation; it does not restrict the exact problem that one may want to solve. We therefore propose formulating problems as wrappers. That is, a downstream problem can be simulated and investigated by implementing an appropriate lightweight wrapper around the basic MARP environment. For example, if one wants to simulate and solve a centralized multi-agent search problem, then she can write a customized formulation wrapper as follows:

class CustomizedFormulationWrapper:

    def __init__(self, ma_env, options=None):
        self.ma_env = ma_env
        self.options = options

    def get_state(self):
        """
        State enquiry
        """

    def transit(self, state, action):
        """
        Returns the successor state and the associated cost
        """

    def is_goal_state(self, state):
        """
        Check whether it is a goal state
        """

    def heuristic(self, state):
        """
        A domain dependent heuristic
        """
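As an illustration of how such a wrapper might be filled in and consumed, here is a minimal sketch of an A* search over joint actions. Everything in it is an assumption made for this example: ToyLineEnv is a hypothetical stand-in for the base environment (not the MARP class), moves are plain integers, every joint step costs 1, and collisions between agents are ignored.

```python
import heapq
from itertools import count, product


class ToyLineEnv:
    """Hypothetical stand-in for a base environment (NOT the MARP class)."""

    def __init__(self, starts, goals, length):
        self.starts, self.goals, self.length = dict(starts), dict(goals), length

    def get_state(self):
        return dict(self.starts)

    def transit(self, state, actions):
        # Side-effect free: returns a new dict with clamped positions.
        return {a: min(max(p + actions[a], 0), self.length - 1)
                for a, p in state.items()}

    def is_goal_state(self, state):
        return all(state[a] == g for a, g in self.goals.items())


class LineSearchWrapper:
    """One possible way to fill in the wrapper skeleton for the toy environment."""

    def __init__(self, ma_env, options=None):
        self.ma_env = ma_env
        self.options = options

    def get_state(self):
        return self.ma_env.get_state()

    def transit(self, state, action):
        # Successor state plus its cost; unit cost per joint step is an assumption.
        return self.ma_env.transit(state, action), 1

    def is_goal_state(self, state):
        return self.ma_env.is_goal_state(state)

    def heuristic(self, state):
        # Largest remaining distance of any agent. Each joint step moves every
        # agent by at most one cell, so this never overestimates the true cost.
        return max(abs(state[a] - g) for a, g in self.ma_env.goals.items())


def astar(problem, agent_moves=(-1, 0, 1)):
    """A* over joint actions; returns (plan, cost) or (None, inf)."""
    start = problem.get_state()
    agents = sorted(start)
    key = lambda s: tuple(s[a] for a in agents)  # dict states are not hashable
    tie = count()                                # tie-breaker: dicts are never compared
    frontier = [(problem.heuristic(start), next(tie), 0, start, [])]
    best = {key(start): 0}
    while frontier:
        _, _, g, state, plan = heapq.heappop(frontier)
        if problem.is_goal_state(state):
            return plan, g
        if g > best.get(key(state), float("inf")):
            continue  # stale queue entry
        for joint in product(agent_moves, repeat=len(agents)):
            action = dict(zip(agents, joint))
            succ, cost = problem.transit(state, action)
            new_g = g + cost
            if new_g < best.get(key(succ), float("inf")):
                best[key(succ)] = new_g
                f = new_g + problem.heuristic(succ)
                heapq.heappush(frontier, (f, next(tie), new_g, succ, plan + [action]))
    return None, float("inf")


problem = LineSearchWrapper(ToyLineEnv({"a0": 0, "a1": 4}, {"a0": 2, "a1": 3}, length=5))
plan, cost = astar(problem)
```

The search routine only touches the four wrapper methods, so the same routine could in principle be reused with a different wrapper as long as that wrapper supplies a matching action set and heuristic.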