Abmarl API Specification

Abmarl Simulations

class abmarl.sim.PrincipleAgent(id=None, seed=None, **kwargs)

Principle Agent class for agents in a simulation.

property active

True if the agent is still active in the simulation.

Active means that the agent is in a valid state. For example, suppose agents in our Simulation can die. Then active is True if the agents are alive or False if they’re dead.

property configured

All agents must have an id.

finalize(**kwargs)
property id

The agent’s unique identifier.

property seed

Seed for random number generation.

class abmarl.sim.ObservingAgent(observation_space=None, **kwargs)

ObservingAgents can observe the state of the simulation.

The agent’s observation must be in its observation space. The SimulationManager will send the observation to the Trainer, which will use it to produce actions.

property configured

Observing agents must have an observation space.

finalize(**kwargs)

Wrap all the observation spaces with a Dict and seed it if the agent was created with a seed.

property observation_space
class abmarl.sim.ActingAgent(action_space=None, **kwargs)

ActingAgents can act in the simulation.

The Trainer will produce actions for the agents and send them to the SimulationManager, which will process those actions in its step function.

property action_space
property configured

Acting agents must have an action space.

finalize(**kwargs)

Wrap all the action spaces with a Dict if applicable and seed it if the agent was created with a seed.

class abmarl.sim.Agent(observation_space=None, action_space=None, **kwargs)

Bases: ObservingAgent, ActingAgent

An Agent that can both observe and act.

class abmarl.sim.AgentBasedSimulation

AgentBasedSimulation interface.

Under this design model, the observations, rewards, and done conditions of the agents are treated as part of the simulation's internal state instead of as output from reset and step. Thus, it is the simulation's responsibility to manage rewards and dones as part of its state (e.g. via a self.rewards dictionary).

This interface supports both single- and multi-agent simulations by treating the single-agent simulation as a special case of the multi-agent, where there is only a single agent in the agents dictionary.

property agents

A dict that maps the Agent’s id to the Agent object. An Agent must be an instance of PrincipleAgent. A multi-agent simulation is expected to have multiple entries in the dictionary, whereas a single-agent simulation should only have a single entry in the dictionary.

finalize()

Finalize the initialization process. At this point, every agent should be configured with action and observation spaces, which we convert into Dict spaces for interfacing with the trainer.

abstract get_all_done(**kwargs)

Return the simulation’s done status.

abstract get_done(agent_id, **kwargs)

Return the agent’s done status.

abstract get_info(agent_id, **kwargs)

Return the agent’s info.

abstract get_obs(agent_id, **kwargs)

Return the agent’s observation.

abstract get_reward(agent_id, **kwargs)

Return the agent’s reward.

abstract render(**kwargs)

Render the simulation for visualization.

abstract reset(**kwargs)

Reset the simulation to a start state, which may be randomly generated.

abstract step(action, **kwargs)

Step the simulation forward one discrete time-step. The action is a dictionary that contains the action of each agent in this time-step.
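The contract above can be sketched with a toy simulation. This is an illustrative sketch only, with hypothetical names; a real simulation would subclass abmarl.sim.AgentBasedSimulation and populate its agents dictionary with PrincipleAgent instances. The key point it demonstrates is that rewards and dones live in the simulation's state and are exposed through getters rather than returned from reset and step.

```python
# Schematic sketch of the AgentBasedSimulation contract (illustrative only;
# class and attribute names here are assumptions, not the real abmarl API).
class CountdownSim:
    """Each agent holds a counter; an agent is done when its counter hits zero."""

    def __init__(self, agent_ids):
        self.agents = {agent_id: None for agent_id in agent_ids}

    def reset(self, **kwargs):
        # Rewards and dones are part of the simulation's internal state.
        self.counters = {agent_id: 3 for agent_id in self.agents}
        self.rewards = {agent_id: 0 for agent_id in self.agents}

    def step(self, action, **kwargs):
        # action maps each acting agent's id to its action this time-step.
        for agent_id, decrement in action.items():
            self.counters[agent_id] -= decrement
            self.rewards[agent_id] = -1  # small per-step penalty

    def get_obs(self, agent_id, **kwargs):
        return self.counters[agent_id]

    def get_reward(self, agent_id, **kwargs):
        return self.rewards[agent_id]

    def get_done(self, agent_id, **kwargs):
        return self.counters[agent_id] <= 0

    def get_all_done(self, **kwargs):
        return all(self.get_done(agent_id) for agent_id in self.agents)

    def get_info(self, agent_id, **kwargs):
        return {}
```

A Manager would then drive this object by calling step and querying the getters in whatever order its control flow requires.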

Abmarl Simulation Managers

class abmarl.managers.SimulationManager(sim)

Control interaction between Trainer and AgentBasedSimulation.

A Manager implements the reset and step API, by which it calls the AgentBasedSimulation API, using the getters within reset and step to accomplish the desired control flow.

sim

The AgentBasedSimulation.

agents

The agents that are in the AgentBasedSimulation.

render(**kwargs)
abstract reset(**kwargs)

Reset the simulation.

Returns:

The first observation of the agent(s).

abstract step(action_dict, **kwargs)

Step the simulation forward one discrete time-step.

Parameters:

action_dict – Dictionary mapping agent(s) to their actions in this time step.

Returns:

The observations, rewards, done status, and info for the agent(s) whose actions we expect to receive next.

Note: We do not necessarily return anything for the agent whose actions we just received in this time-step. This behavior is defined by each Manager.

class abmarl.managers.TurnBasedManager(sim)

The TurnBasedManager allows agents to take turns. The order of the agents is stored, and the observation of the first agent is returned at reset. Each step returns the observation, reward, done, and info of the next agent “in line”. Agents who are done are removed from this line. Once all the agents are done, the manager returns all done.

reset(**kwargs)

Reset the simulation and return the observation of the first agent.

step(action_dict, **kwargs)

Assert that the incoming action does not come from an agent who is recorded as done. Step the simulation forward and return the observation, reward, done, and info of the next agent. If that next agent finished in this turn, then include the obs for the following agent, and so on until an agent is found that is not done. If all agents are done in this turn, then the manager returns all done.
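The skip-done turn logic can be sketched as a small helper. This is an illustrative sketch, not the real manager: the function name and arguments are assumptions, and the real TurnBasedManager also forwards the selected agent's observation, reward, done, and info.

```python
def next_in_line(turn_order, start, done):
    """Find the next non-done agent at or after position start, wrapping around.

    Illustrative sketch of the TurnBasedManager's skip-done behavior.
    Returns None when every agent is done, i.e. when the manager
    would return all done.
    """
    n = len(turn_order)
    for offset in range(n):
        agent_id = turn_order[(start + offset) % n]
        if not done[agent_id]:
            return agent_id
    return None  # all agents done
```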

class abmarl.managers.AllStepManager(sim)

The AllStepManager gets the observations of all agents at reset. At step, it gets the observations of all the agents that are not done. Once all the agents are done, the manager returns all done.

reset(**kwargs)

Reset the simulation and return the observation of all the agents.

step(action_dict, **kwargs)

Assert that the incoming action does not come from an agent who is recorded as done. Step the simulation forward and return the observation, reward, done, and info of every agent that was not done at the start of this step, including any agents that became done during this step. If all agents are done in this turn, then the manager returns all done.
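The per-step return semantics can be sketched as follows. This is an illustrative sketch with hypothetical names; the real AllStepManager queries the wrapped simulation's getters directly.

```python
def gather_returns(agent_ids, was_done, now_done, get_obs, get_reward):
    """Sketch of the AllStepManager's per-step return: include every agent
    that was not done at the start of the step, even if it became done
    during the step. Illustrative only; names are assumptions.
    """
    returns = {}
    for agent_id in agent_ids:
        if not was_done[agent_id]:
            returns[agent_id] = {
                'obs': get_obs(agent_id),
                'reward': get_reward(agent_id),
                'done': now_done[agent_id],
            }
    return returns
```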

Abmarl External Integration

class abmarl.external.GymWrapper(sim)

Wrap an AgentBasedSimulation object with only a single agent to the gym.Env interface. This wrapper exposes the single agent’s observation and action space directly as the environment’s spaces.

property action_space

The agent’s action space is the environment’s action space.

property observation_space

The agent’s observation space is the environment’s observation space.

render(**kwargs)

Forward render calls to the composed simulation.

reset(**kwargs)

Return the observation from the single agent.

step(action, **kwargs)

Wrap the action by storing it in a dict that maps the agent’s id to the action. Pass to sim.step. Return the observation, reward, done, and info from the single agent.

property unwrapped

Fall through all the wrappers and obtain the original, completely unwrapped simulation.
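The wrapping and unwrapping that GymWrapper performs can be sketched with a minimal stand-in class. This is an illustrative sketch: the class name is hypothetical, and the real GymWrapper subclasses gym.Env and composes an AgentBasedSimulation rather than the generic manager object sketched here.

```python
# Sketch of the GymWrapper's dict-wrapping behavior (illustrative only).
class SingleAgentView:
    """Present a single-agent dict-based simulation as a flat env."""

    def __init__(self, manager, agent_id):
        self.manager = manager
        self.agent_id = agent_id

    def reset(self, **kwargs):
        obs = self.manager.reset(**kwargs)   # dict: {agent_id: obs}
        return obs[self.agent_id]            # unwrap to a bare observation

    def step(self, action, **kwargs):
        # Wrap the bare action into the dict form the manager expects,
        # then unwrap each returned dict down to the single agent's entry.
        obs, reward, done, info = self.manager.step(
            {self.agent_id: action}, **kwargs
        )
        return (obs[self.agent_id], reward[self.agent_id],
                done[self.agent_id], info[self.agent_id])
```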

class abmarl.external.MultiAgentWrapper(sim)

Enable connection between SimulationManager and RLlib Trainer.

Wraps a SimulationManager and forwards all calls to the manager. This class is boilerplate and needed because RLlib checks that the simulation is an instance of MultiAgentEnv.

sim

The SimulationManager.

render(*args, **kwargs)

See SimulationManager.

reset()

See SimulationManager.

step(actions)

See SimulationManager.

property unwrapped

Fall through all the wrappers to the SimulationManager.

Returns:

The wrapped SimulationManager.

Abmarl GridWorld Simulation Framework

Base

class abmarl.sim.gridworld.base.GridWorldSimulation

GridWorldSimulation interface.

Extends the AgentBasedSimulation interface for the GridWorld. We provide builders for streamlining the building process.

classmethod build_sim(rows, cols, **kwargs)

Build a GridSimulation.

Specify the number of rows, the number of cols, a dictionary of agents, and any additional parameters.

Parameters:
  • rows – The number of rows in the grid. Must be a positive integer.

  • cols – The number of cols in the grid. Must be a positive integer.

  • agents – The dictionary of agents in the grid.

Returns:

A GridSimulation configured as specified.

classmethod build_sim_from_file(file_name, object_registry, **kwargs)

Build a GridSimulation from a text file.

Parameters:
  • file_name – Name of the file that specifies the initial grid setup. In the file, each cell should be a single alphanumeric character indicating which agent will be at that position (from the perspective of looking down on the grid). That agent will be given that initial position. 0’s are reserved for empty space.

  • object_registry – A dictionary that maps characters from the file to a function that generates the agent. This must be a function because each agent must have a unique id, which is generated here.

Returns:

A GridSimulation built from the file.
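The file format and the object_registry idea can be sketched with a small parser. This is an illustrative sketch, not the real builder: the function name, the dict-based agents, and the registry signature are assumptions chosen to show why the registry must hold functions (each call mints a fresh, unique id).

```python
# Sketch of parsing a grid file with an object_registry (illustrative only;
# the real entry point is GridWorldSimulation.build_sim_from_file).
def parse_grid_file(lines, object_registry):
    """Each character is one cell; '0' is reserved for empty space.

    The registry maps a character to a function taking a counter,
    so that every generated agent receives a unique id.
    """
    agents = {}
    counter = 0
    for row, line in enumerate(lines):
        for col, char in enumerate(line.strip()):
            if char == '0':
                continue
            agent = object_registry[char](counter)
            agent['initial_position'] = (row, col)
            agents[agent['id']] = agent
            counter += 1
    return agents
```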

render(fig=None, **kwargs)

Draw the grid and all active agents in the grid.

Agents are drawn at their positions using their respective shape and color.

Parameters:

fig – The figure on which to draw the grid. It’s important to provide this figure because the same figure must be used when drawing each state of the simulation; otherwise, a new figure window will be opened for every rendered frame.

class abmarl.sim.gridworld.base.GridWorldBaseComponent(agents=None, grid=None, **kwargs)

Component base class from which all components will inherit.

Every component has access to the dictionary of agents and the grid.

property agents

A dict that maps the Agent’s id to the Agent object. All agents must be GridWorldAgents.

property cols

The number of columns in the grid.

property grid

The grid indexes the agents by their position.

For example, an agent whose position is (3, 2) can be accessed through the grid with self.grid[3, 2]. Components are responsible for maintaining the connection between agent position and grid index.

property rows

The number of rows in the grid.

class abmarl.sim.gridworld.grid.Grid(rows, cols, overlapping=None, **kwargs)

A Grid stores the agents at indices in a numpy array.

Components can interface with the Grid. Each index in the grid is a dictionary that maps the agent id to the agent object itself. If agents can overlap, then there may be more than one agent per cell.

Parameters:
  • rows – The number of rows in the grid.

  • cols – The number of columns in the grid.

  • overlapping – Dictionary that maps the agents’ encodings to a list of encodings with which they can occupy the same cell. To avoid undefined behavior, the overlapping should be symmetric, so that if 2 can overlap with 3, then 3 can also overlap with 2.

property cols

The number of columns in the grid.

place(agent, ndx)

Place an agent at an index.

If the cell is available, the agent will be placed at that index in the grid and the agent’s position will be updated. The placement is successful if the new position is unoccupied or if the agent already occupying that position is overlappable AND this agent is overlappable.

Parameters:
  • agent – The agent to place.

  • ndx – The new index for this agent.

Returns:

The successfulness of the placement.

query(agent, ndx)

Query a cell in the grid to see if it is available to this agent.

The cell is available for the agent if it is empty or if both the occupying agent and the querying agent are overlappable.

Parameters:
  • agent – The agent for which we are checking availability.

  • ndx – The cell to query.

Returns:

The availability of this cell.
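The availability rule behind place and query can be sketched with encodings and a symmetric overlapping mapping. This is an illustrative sketch; the real logic lives in abmarl.sim.gridworld.grid.Grid and operates on agent objects rather than bare encodings.

```python
# Sketch of the cell-availability rule (illustrative only).
def cell_available(cell_encodings, querying_encoding, overlapping):
    """A cell is available if it is empty or if every occupant's encoding
    may overlap with the querying agent's encoding.

    overlapping maps an encoding to the list of encodings it can share
    a cell with; it should be symmetric to avoid undefined behavior.
    """
    if not cell_encodings:
        return True
    allowed = overlapping.get(querying_encoding, [])
    return all(occupant in allowed for occupant in cell_encodings)
```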

remove(agent, ndx)

Remove an agent from an index.

Parameters:
  • agent – The agent to remove.

  • ndx – The old index for this agent.

reset(**kwargs)

Reset the grid to an empty state.

property rows

The number of rows in the grid.

Agents

class abmarl.sim.gridworld.agent.GridWorldAgent(initial_position=None, blocking=False, encoding=None, render_shape='o', render_color='gray', **kwargs)

The base agent in the GridWorld.

property blocking

Specify if this agent blocks other agents’ observations and actions.

property configured

All agents must have an id.

property encoding

The numerical value that identifies the type of agent.

The value does not necessarily identify the agent itself; rather, other agents who observe this agent will see this value.

property initial_position

The agent’s initial position at reset.

property position

The agent’s position in the grid.

property render_color

The agent’s color in the rendered grid.

property render_shape

The agent’s shape in the rendered grid.

class abmarl.sim.gridworld.agent.GridObservingAgent(view_range=None, **kwargs)

Observe the grid up to view range cells away.

property configured

Observing agents must have an observation space.

property view_range

The number of cells away this agent can observe in each step.

class abmarl.sim.gridworld.agent.MovingAgent(move_range=None, **kwargs)

Move up to move_range cells.

property configured

Acting agents must have an action space.

property move_range

The maximum number of cells away that the agent can move.

class abmarl.sim.gridworld.agent.HealthAgent(initial_health=None, **kwargs)

Agents have health points and can die.

Health is bounded between 0 and 1.

property active

The agent is active if its health is greater than 0.

property health

The agent’s health throughout the simulation trajectory.

The health will always be between 0 and 1.

property initial_health

The agent’s initial health between 0 and 1.

class abmarl.sim.gridworld.agent.AttackingAgent(attack_range=None, attack_strength=None, attack_accuracy=None, **kwargs)

Agents that can attack other agents.

property attack_accuracy

The effective accuracy of the agent’s attack.

Should be between 0 and 1. To make deterministic attacks, use 1.

property attack_range

The maximum range of the attack.

property attack_strength

The strength of the attack.

Should be between 0 and 1.

property configured

Acting agents must have an action space.

State

class abmarl.sim.gridworld.state.StateBaseComponent(agents=None, grid=None, **kwargs)

Abstract State Component base from which all state components will inherit.

abstract reset(**kwargs)

Resets the part of the state for which it is responsible.

class abmarl.sim.gridworld.state.PositionState(agents=None, grid=None, **kwargs)

Manage the agents’ positions in the grid.

reset(**kwargs)

Give agents their starting positions.

We use the agent’s initial position if it exists. Otherwise, we randomly place the agents in the grid.

class abmarl.sim.gridworld.state.HealthState(agents=None, grid=None, **kwargs)

Manage the state of the agents’ healths.

Every HealthAgent has a health. If that health falls to zero, that agent dies and is removed from the grid.

reset(**kwargs)

Give HealthAgents their starting healths.

We use the agent’s initial health if it exists. Otherwise, we randomly assign a value between 0 and 1.

Actors

class abmarl.sim.gridworld.actor.ActorBaseComponent(agents=None, grid=None, **kwargs)

Abstract Actor Component class from which all Actor Components will inherit.

abstract property key

The key in the action dictionary.

The action space of all acting agents in the gridworld framework is a dict. We can build up complex action spaces with multiple components by assigning each component an entry in the action dictionary. Actions will be a dictionary even if your simulation only has one Actor.

abstract process_action(agent, action_dict, **kwargs)

Process the agent’s action.

Parameters:
  • agent – The acting agent.

  • action_dict – The action dictionary for this agent in this step. The dictionary may have different entries, each of which will be processed by different Actors.

abstract property supported_agent_type

The type of Agent that this Actor works with.

If an agent is this type, the Actor will add its entry to the agent’s action space and will process actions for this agent.

class abmarl.sim.gridworld.actor.MoveActor(**kwargs)

Agents can move to unoccupied nearby squares.

property key

This Actor’s key is “move”.

process_action(agent, action_dict, **kwargs)

The agent can move to nearby squares.

The agent’s new position must be within the grid and the cell-occupation rules must be met.

Parameters:
  • agent – Move the agent if it is a MovingAgent.

  • action_dict – The action dictionary for this agent in this step. If the agent is a MovingAgent, then the action dictionary will contain the “move” entry.

Returns:

True if the move is successful, False otherwise.

property supported_agent_type

This Actor works with MovingAgents.

class abmarl.sim.gridworld.actor.AttackActor(attack_mapping=None, **kwargs)

Agents can attack other agents.

property attack_mapping

Dict that dictates which agents the attacking agent can attack.

The dictionary maps the attacking agents’ encodings to a list of encodings that they can attack.

property key

This Actor’s key is “attack”.

process_action(attacking_agent, action_dict, **kwargs)

If the agent has chosen to attack, then we process their attack.

The processing goes through a series of checks. The attack is possible if there is an attacked agent such that:

  1. The attacked agent is active.

  2. The attacked agent is within range.

  3. The attacked agent is valid according to the attack_mapping.

If the attack is possible, then we determine the success of the attack based on the attacking agent’s accuracy. If the attack is successful, then the attacked agent’s health is depleted by the attacking agent’s strength, possibly resulting in its death.
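The check sequence and its resolution can be sketched as follows. This is an illustrative sketch with several assumptions: the function name, the dict-based agents, and the Chebyshev (square-window) range test are all hypothetical stand-ins, not the real abmarl implementation.

```python
import random

# Sketch of the AttackActor's checks and resolution (illustrative only).
def resolve_attack(attacker, candidates, attack_mapping, rng=random.random):
    """Return the id of the attacked agent, or None if no attack lands."""
    for target in candidates:
        in_range = (
            abs(target['position'][0] - attacker['position'][0]) <= attacker['attack_range']
            and abs(target['position'][1] - attacker['position'][1]) <= attacker['attack_range']
        )
        if (
            target['active']                                                # 1. active
            and in_range                                                    # 2. within range
            and target['encoding'] in attack_mapping[attacker['encoding']]  # 3. valid per mapping
        ):
            if rng() <= attacker['attack_accuracy']:          # accuracy roll
                target['health'] -= attacker['attack_strength']
                target['active'] = target['health'] > 0       # possible death
                return target['id']
    return None
```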

property supported_agent_type

This Actor works with AttackingAgents.

Observers

class abmarl.sim.gridworld.observer.ObserverBaseComponent(agents=None, grid=None, **kwargs)

Abstract Observer Component base from which all observer components will inherit.

abstract get_obs(agent, **kwargs)

Observe the state of the simulation.

Parameters:

agent – The agent for which we return an observation.

Returns:

This agent’s observation.

abstract property key

The key in the observation dictionary.

The observation space of all observing agents in the gridworld framework is a dict. We can build up complex observation spaces with multiple components by assigning each component an entry in the observation dictionary. Observations will be a dictionary even if your simulation only has one Observer.

abstract property supported_agent_type

The type of Agent that this Observer works with.

If an agent is this type, the Observer will add its entry to the agent’s observation space and will produce observations for this agent.

class abmarl.sim.gridworld.observer.SingleGridObserver(observe_self=True, **kwargs)

Observe a subset of the grid centered on the agent’s position.

The observation is centered around the observing agent’s position. Each agent in the “observation window” is recorded in the relative cell using its encoding. If there are multiple agents on a single cell with different encodings, the agent will observe only one of them chosen at random.
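Extracting the observation window with out-of-bounds padding can be sketched as below. This is an illustrative sketch: the function name, the plain nested-list grid of encodings, and the -1 out-of-bounds marker are assumptions, and the real observer additionally handles blocking agents and masked cells.

```python
# Sketch of the centered observation window (illustrative only).
def observation_window(grid, position, view_range, out_of_bounds=-1):
    """Return a (2*view_range+1)-square window of encodings centered on
    position. Cells outside the grid are filled with the out-of-bounds
    marker."""
    rows, cols = len(grid), len(grid[0])
    window = []
    for r in range(position[0] - view_range, position[0] + view_range + 1):
        window_row = []
        for c in range(position[1] - view_range, position[1] + view_range + 1):
            if 0 <= r < rows and 0 <= c < cols:
                window_row.append(grid[r][c])
            else:
                window_row.append(out_of_bounds)
        window.append(window_row)
    return window
```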

get_obs(agent, **kwargs)

The agent observes a sub-grid centered on its position.

The observation may include other agents, empty spaces, out of bounds, and masked cells, which can be blocked from view by other blocking agents.

Returns:

The observation as a dictionary.

property key

This Observer’s key is “grid”.

property observe_self

Agents observe themselves by default, which may hide important information about overlapping agents. This can be turned off by setting observe_self to False.

property supported_agent_type

This Observer works with GridObservingAgents.

class abmarl.sim.gridworld.observer.MultiGridObserver(**kwargs)

Observe a subset of the grid centered on the agent’s position.

The observation is centered around the observing agent’s position. The observing agent sees a stack of observations, one for each positive encoding, where the number of agents of each encoding is given rather than the encoding itself. Out of bounds and masked indicators appear in every grid.

get_obs(agent, **kwargs)

The agent observes one or more sub-grids centered on its position.

The observation may include other agents, empty spaces, out of bounds, and masked cells, which can be blocked from view by other blocking agents. Each grid records the number of agents on a particular cell correlated to a specific encoding.

Returns:

The observation as a dictionary.

property key

This Observer’s key is “grid”.

property supported_agent_type

This Observer works with GridObservingAgents.

Done

class abmarl.sim.gridworld.done.DoneBaseComponent(agents=None, grid=None, **kwargs)

Abstract Done Component class from which all Done Components will inherit.

abstract get_all_done(**kwargs)

Determine if all the agents are done and/or if the simulation is done.

Returns:

True if all agents are done or if the simulation is done. Otherwise False.

abstract get_done(agent, **kwargs)

Determine if an agent is done in this step.

Parameters:

agent – The agent we are querying.

Returns:

True if the agent is done, otherwise False.

class abmarl.sim.gridworld.done.ActiveDone(agents=None, grid=None, **kwargs)

Inactive agents are indicated as done.

get_all_done(**kwargs)

Return True if all agents are inactive. Otherwise, return False.

get_done(agent, **kwargs)

Return True if the agent is inactive. Otherwise, return False.

class abmarl.sim.gridworld.done.OneTeamRemainingDone(agents=None, grid=None, **kwargs)

Inactive agents are indicated as done.

If the only active agents are those who are all of the same encoding, then the simulation ends.

get_all_done(**kwargs)

Return True if all active agents have the same encoding. Otherwise, return False.

Wrappers

class abmarl.sim.gridworld.wrapper.ComponentWrapper(agents=None, grid=None, **kwargs)

Wraps GridWorldBaseComponent.

Every wrapper must be able to wrap the respective space and points to/from that space. Agents and Grid are referenced directly from the wrapped component rather than received as initialization parameters.

property agents

The agent dictionary is directly taken from the wrapped component.

abstract check_space(space)

Verify that the space can be wrapped.

property grid

The grid is directly taken from the wrapped component.

property unwrapped

Fall through all the wrappers and obtain the original, completely unwrapped component.

abstract wrap_point(space, point)

Wrap a point to the space.

Parameters:
  • space – The space into which to wrap the point.

  • point – The point to wrap.

abstract wrap_space(space)

Wrap the space.

Parameters:

space – The space to wrap.

abstract property wrapped_component

Get the first-level wrapped component.

class abmarl.sim.gridworld.wrapper.ActorWrapper(component)

Wraps an ActorComponent.

Modify the action space of the agents involved with the Actor, namely the specific actor’s channel. The actions received from the trainer are in the wrapped space, so we need to unwrap them to send them to the actor. This is the opposite of how we wrap and unwrap observations.

property key

The key is the same as the wrapped actor’s key.

process_action(agent, action_dict, **kwargs)

Unwrap the action and pass it to the wrapped actor to process.

Parameters:
  • agent – The acting agent.

  • action_dict – The action dictionary for this agent in this step. The action in this channel comes in the wrapped space.

property supported_agent_type

The supported agent type is the same as the wrapped actor’s supported agent type.

property wrapped_component

Get the wrapped actor.

class abmarl.sim.gridworld.wrapper.RavelActionWrapper(component)

Use numpy’s ravel capabilities to convert space and points to Discrete.

check_space(space)

Ensure that the space is of a type that can be ravelled to a Discrete value.

wrap_point(space, point)

Unravel a single discrete point to a value in the space.

Recall that the action from the trainer arrives in the wrapped discrete space, so we need to unravel it so that it is in the unwrapped space before giving it to the actor.

wrap_space(space)

Convert the space into a Discrete space.
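The ravelling idea can be sketched with plain mixed-radix arithmetic. This is an illustrative sketch: the function names are hypothetical, and the real wrapper uses numpy's ravel capabilities over gym spaces rather than bare dimension lists.

```python
# Sketch of ravelling a multi-dimensional point to a single Discrete value
# and back (illustrative only).
def ravel(point, dims):
    """Flatten a point in a multi-dimensional space to a single integer."""
    index = 0
    for coordinate, size in zip(point, dims):
        index = index * size + coordinate
    return index

def unravel(index, dims):
    """Invert ravel: recover the multi-dimensional point from the integer."""
    point = []
    for size in reversed(dims):
        point.append(index % size)
        index //= size
    return list(reversed(point))
```

A wrapped space with dimensions [3, 5] becomes Discrete(15): the trainer samples a single integer, and wrap_point unravels it back into the unwrapped space before the actor sees it.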