Quick Start
===========

Initializing an environment in MARP is very similar to doing so in PettingZoo and Gym.

.. code-block:: python

    from marp.ma_env import MARP

    env = MARP(N=3, layout='small', orthogonal_actions=True, one_shot=True, render_mode='human')

This creates a multi-agent environment in which all agents act simultaneously. In other words, at every step the environment takes an action profile (i.e., a joint action) as input and proceeds to the next step. MARP provides interfaces similar to those of PettingZoo:

.. code-block:: python

    from marp.ma_env import MARP

    env = MARP(N=3, layout='small', orthogonal_actions=True, one_shot=True, render_mode='human')
    observations, infos = env.reset()

    while env.agents:
        # Sample a random action for each agent, restricted by its action mask.
        actions = {
            agent: env.action_space(agent).sample(infos[agent]['action_mask'])
            for agent in env.agents
        }
        observations, rewards, terminations, truncations, infos = env.step(actions)

In addition to the conventional ``step()`` interface commonly used in the RL community, MARP also provides interfaces that expose the explicit transition between (global, or system) states.

.. code-block:: python

    from marp.ma_env import MARP

    env = MARP(N=3, layout='small', orthogonal_actions=True, one_shot=True, render_mode='human')
    # Keep ``infos`` from reset(); the action masks below are read from it.
    observations, infos = env.reset()

    curr_state = env.get_state()
    actions = {
        agent: env.action_space(agent).sample(infos[agent]['action_mask'])
        for agent in env.agents
    }
    # Compute the successor state without advancing the environment.
    succ_state = env.transit(curr_state, actions)

Compared to ``step()``, the ``transit()`` interface explicitly takes a state, which can be acquired in advance with ``get_state()``, together with an action profile, and returns the successor state. Note that calls to ``transit()`` do not change the internal state of the environment, so it can be used to implement search algorithms that plan ahead.
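
The planning use case can be sketched as a breadth-first search that only ever calls ``transit()``. This is a minimal illustration under stated assumptions, not part of MARP: ``ToyEnv`` is a hypothetical stand-in that mimics only the call shape of ``get_state()`` and ``transit(state, actions)``, its states are plain hashable tuples, and the ``plan()`` helper and goal test are invented for the example. With a real MARP environment, the state representation, the set of valid joint actions, and the goal condition would likely differ.

.. code-block:: python

    from collections import deque
    from itertools import product


    class ToyEnv:
        """Hypothetical stand-in exposing get_state() and transit().

        Two agents move on a line of cells 0..4; a state is a tuple of
        positions. This is NOT the MARP implementation.
        """

        agents = ["a0", "a1"]
        moves = {"left": -1, "stay": 0, "right": 1}

        def __init__(self):
            self._state = (0, 4)

        def get_state(self):
            return self._state

        def transit(self, state, actions):
            # Depends only on the arguments; self._state is never modified.
            return tuple(
                min(4, max(0, pos + self.moves[actions[agent]]))
                for agent, pos in zip(self.agents, state)
            )


    def plan(env, start, is_goal, max_depth=10):
        """Breadth-first search over joint actions using only env.transit()."""
        frontier = deque([(start, [])])
        visited = {start}
        while frontier:
            state, path = frontier.popleft()
            if is_goal(state):
                return path
            if len(path) >= max_depth:
                continue
            # Enumerate every joint action (one move per agent).
            for joint in product(env.moves, repeat=len(env.agents)):
                actions = dict(zip(env.agents, joint))
                succ = env.transit(state, actions)
                if succ not in visited:
                    visited.add(succ)
                    frontier.append((succ, path + [actions]))
        return None


    env = ToyEnv()
    start = env.get_state()
    path = plan(env, start, is_goal=lambda s: s == (2, 3))

Because the search never calls ``step()``, the toy environment is still in its start state once ``plan()`` returns, and the resulting list of action profiles can then be executed one by one.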