Abmarl API Specification

Abmarl Simulations

class abmarl.sim.PrincipleAgent(id=None, seed=None, **kwargs)

Principle Agent class for agents in a simulation.

property configured

All agents must have an id.

finalize(**kwargs)
property id
property seed

Seed for random number generation.

class abmarl.sim.ObservingAgent(observation_space=None, **kwargs)

ObservingAgents can observe the state of the simulation.

The agent’s observation must be in its observation space. The SimulationManager will send the observation to the Trainer, which will use it to produce actions.

property configured

Observing agents must have an observation space.

finalize(**kwargs)

Wrap all the observation spaces with a Dict and seed it if the agent was created with a seed.

property observation_space
class abmarl.sim.ActingAgent(action_space=None, **kwargs)

ActingAgents can act in the simulation.

The Trainer will produce actions for the agents and send them to the SimulationManager, which will process those actions in its step function.

property action_space
property configured

Acting agents must have an action space.

finalize(**kwargs)

Wrap all the action spaces with a Dict if applicable and seed it if the agent was created with a seed.

class abmarl.sim.Agent(observation_space=None, **kwargs)

Bases: abmarl.sim.agent_based_simulation.ObservingAgent, abmarl.sim.agent_based_simulation.ActingAgent

An Agent that can both observe and act.
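
For illustration, an Agent can be configured at construction. This is a minimal sketch, assuming gym spaces; the id, spaces, and seed shown here are hypothetical values, not part of the API.

    from gym.spaces import Box, Discrete
    from abmarl.sim import Agent

    # A hypothetical agent with a continuous observation and a discrete action.
    agent = Agent(
        id='agent0',
        observation_space=Box(low=-1.0, high=1.0, shape=(2,)),
        action_space=Discrete(4),
        seed=7,
    )
    assert agent.configured  # id, observation space, and action space are all set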

class abmarl.sim.AgentBasedSimulation

AgentBasedSimulation interface.

Under this design model, the observations, rewards, and done conditions of the agents are treated as part of the simulation's internal state instead of as output from reset and step. Thus, it is the simulation's responsibility to manage rewards and dones as part of its state (e.g. via a self.rewards dictionary).

This interface supports both single- and multi-agent simulations by treating the single-agent simulation as a special case of the multi-agent, where there is only a single agent in the agents dictionary.

property agents

A dict that maps the Agent’s id to the Agent object. An Agent must be an instance of PrincipleAgent. A multi-agent simulation is expected to have multiple entries in the dictionary, whereas a single-agent simulation should only have a single entry in the dictionary.

finalize()

Finalize the initialization process. At this point, every agent should be configured with action and observation spaces, which we convert into Dict spaces for interfacing with the trainer.

abstract get_all_done(**kwargs)

Return the simulation’s done status.

abstract get_done(agent_id, **kwargs)

Return the agent’s done status.

abstract get_info(agent_id, **kwargs)

Return the agent’s info.

abstract get_obs(agent_id, **kwargs)

Return the agent’s observation.

abstract get_reward(agent_id, **kwargs)

Return the agent’s reward.

abstract render(**kwargs)

Render the simulation for visualization.

abstract reset(**kwargs)

Reset the simulation to a start state, which may be randomly generated.

abstract step(action, **kwargs)

Step the simulation forward one discrete time-step. The action is a dictionary that contains the action of each agent in this time-step.
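
As an illustration of this interface, the following sketch implements a toy single-agent corridor simulation. The CorridorSim class and its dynamics are hypothetical; only the overridden methods and the agents dictionary come from the interface documented above.

    from gym.spaces import Box, Discrete
    from abmarl.sim import Agent, AgentBasedSimulation

    class CorridorSim(AgentBasedSimulation):
        def __init__(self):
            self.agents = {
                'agent0': Agent(
                    id='agent0',
                    observation_space=Box(low=0, high=10, shape=(1,)),
                    action_space=Discrete(2),
                )
            }

        def reset(self, **kwargs):
            # Rewards and dones are internal state, not return values.
            self.position = 0
            self.rewards = {'agent0': 0}

        def step(self, action_dict, **kwargs):
            # Move right on action 1, left on action 0, staying in the corridor.
            move = 1 if action_dict['agent0'] == 1 else -1
            self.position = min(10, max(0, self.position + move))
            self.rewards['agent0'] = -1

        def render(self, **kwargs):
            print(self.position)

        def get_obs(self, agent_id, **kwargs):
            return [self.position]

        def get_reward(self, agent_id, **kwargs):
            return self.rewards[agent_id]

        def get_done(self, agent_id, **kwargs):
            return self.position >= 10

        def get_all_done(self, **kwargs):
            return self.position >= 10

        def get_info(self, agent_id, **kwargs):
            return {}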

Abmarl Simulation Managers

class abmarl.managers.SimulationManager(sim)

Control interaction between Trainer and AgentBasedSimulation.

A Manager implements the reset and step API, through which it calls the AgentBasedSimulation API, using the getters within reset and step to accomplish the desired control flow.

sim

The AgentBasedSimulation.

agents

The agents that are in the AgentBasedSimulation.

render(**kwargs)
abstract reset(**kwargs)

Reset the simulation.

Returns

The first observation of the agent(s).

abstract step(action_dict, **kwargs)

Step the simulation forward one discrete time-step.

Parameters

action_dict – Dictionary mapping agent(s) to their actions in this time step.

Returns

The observations, rewards, done status, and info for the agent(s) whose actions we expect to receive next.

Note: We do not necessarily return anything for the agent whose actions we just received in this time-step. This behavior is defined by each Manager.

class abmarl.managers.TurnBasedManager(sim)

The TurnBasedManager allows agents to take turns. The order of the agents is stored, and the observation of the first agent is returned at reset. Each step returns the output of the next agent “in line”. Agents who are done are removed from this line. Once all the agents are done, the manager returns all done.

reset(**kwargs)

Reset the simulation and return the observation of the first agent.

step(action_dict, **kwargs)

Assert that the incoming action does not come from an agent who is recorded as done. Step the simulation forward and return the observation, reward, done, and info of the next agent. If that next agent finished in this turn, then include the observation for the following agent, and so on until an agent is found that is not done. If all agents are done in this turn, then the manager returns all done.
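
A hypothetical interaction loop, reusing the CorridorSim sketched earlier. It assumes the returned done dictionary carries an “__all__” key, as RLlib-style multi-agent interfaces do; actions are produced only for the keys of the latest observation, since those are the agents whose turn is next.

    from abmarl.managers import TurnBasedManager

    sim = TurnBasedManager(CorridorSim())
    obs = sim.reset()  # observation of the first agent in line
    while True:
        # Act only for the agents whose observations were just returned.
        actions = {
            agent_id: sim.agents[agent_id].action_space.sample()
            for agent_id in obs
        }
        obs, reward, done, info = sim.step(actions)
        if done['__all__']:
            break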

class abmarl.managers.AllStepManager(sim)

The AllStepManager gets the observations of all agents at reset. At step, it gets the observations of all the agents that are not done. Once all the agents are done, the manager returns all done.

reset(**kwargs)

Reset the simulation and return the observation of all the agents.

step(action_dict, **kwargs)

Assert that the incoming action does not come from an agent who is recorded as done. Step the simulation forward and return the observation, reward, done, and info of all the non-done agents, including the agents that became done in this step. If all agents are done in this turn, then the manager returns all done.
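
The same loop shape works with the AllStepManager; the difference is that the observation dictionary returned by reset and step covers every non-done agent, so every such agent acts each step. A brief sketch, again using the hypothetical CorridorSim:

    from abmarl.managers import AllStepManager

    sim = AllStepManager(CorridorSim())
    obs = sim.reset()  # observations for all agents
    obs, reward, done, info = sim.step(
        {agent_id: sim.agents[agent_id].action_space.sample() for agent_id in obs}
    )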

Abmarl External Integration

class abmarl.external.GymWrapper(sim)

Wrap an AgentBasedSimulation object that has only a single agent with the gym.Env interface. This wrapper exposes the single agent's observation and action space directly as the simulation's spaces.

render(**kwargs)

Forward render calls to the composed simulation.

reset(**kwargs)

Return the observation from the single agent.

step(action, **kwargs)

Wrap the action by storing it in a dict that maps the agent’s id to the action. Pass to sim.step. Return the observation, reward, done, and info from the single agent.

property unwrapped

Fall through all the wrappers and obtain the original, completely unwrapped simulation.
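
For instance, a single-agent simulation can be exposed to gym-based tooling as follows. This sketch assumes the simulation is first wrapped in a SimulationManager so that reset and step return per-agent dictionaries, and it reuses the hypothetical CorridorSim:

    from abmarl.external import GymWrapper
    from abmarl.managers import AllStepManager

    env = GymWrapper(AllStepManager(CorridorSim()))
    obs = env.reset()  # the single agent's observation, unwrapped from the dict
    obs, reward, done, info = env.step(env.action_space.sample())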

class abmarl.external.MultiAgentWrapper(sim)

Enable connection between SimulationManager and RLlib Trainer.

Wraps a SimulationManager and forwards all calls to the manager. This class is boilerplate and needed because RLlib checks that the simulation is an instance of MultiAgentEnv.

sim

The SimulationManager.

render(*args, **kwargs)

See SimulationManager.

reset()

See SimulationManager.

step(actions)

See SimulationManager.

property unwrapped

Fall through all the wrappers to the SimulationManager.

Returns

The wrapped SimulationManager.
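
For example, the wrapped manager can be registered with RLlib through Ray Tune's registry. This is a hypothetical sketch; the 'CorridorSim' name and the choice of AllStepManager are illustrative, not prescribed by the API.

    from abmarl.external import MultiAgentWrapper
    from abmarl.managers import AllStepManager
    from ray.tune.registry import register_env

    # Register a creator function so an RLlib Trainer can construct the simulation.
    sim = MultiAgentWrapper(AllStepManager(CorridorSim()))
    register_env('CorridorSim', lambda config: sim)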