citylearn.agents.marlisa module
- class citylearn.agents.marlisa.MARLISA(*args, regression_buffer_capacity: int = None, start_regression_time_step: int = None, regression_frequency: int = None, information_sharing: bool = None, pca_compression: float = None, iterations: int = None, **kwargs)[source]
Bases:
SAC
- property batch_size: int
Batch size.
- property coordination_variables_history: List[float]
- get_exploration_prediction(observations: List[List[float]]) List[List[float]] [source]
Return randomly sampled actions from action_space multiplied by action_scaling_coefficient.
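As a rough illustration of the sampling described above, the sketch below draws one uniform sample per action-space bound and scales it by the coefficient. The function name, the bounds structure, and the standalone `random.uniform` draw are assumptions for illustration, not CityLearn's actual implementation.

```python
import random
from typing import List, Tuple


def sample_exploration_actions(
    action_bounds: List[List[Tuple[float, float]]],
    action_scaling_coefficient: float,
) -> List[List[float]]:
    """Sample one random action per (low, high) bound for each agent,
    scaled down by ``action_scaling_coefficient``.

    ``action_bounds[i]`` holds the (low, high) pairs for agent i's
    action space (hypothetical layout for this sketch).
    """

    return [
        [action_scaling_coefficient * random.uniform(low, high) for low, high in bounds]
        for bounds in action_bounds
    ]
```

With a coefficient below 1, exploration actions stay within a shrunken copy of each action space, which keeps early random control conservative.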
- get_exploration_prediction_with_information_sharing(observations: List[List[float]]) Tuple[List[List[float]], List[List[float]]] [source]
- get_exploration_prediction_without_information_sharing(observations: List[List[float]]) Tuple[List[List[float]], List[List[float]]] [source]
- get_post_exploration_prediction(observations: List[List[float]], deterministic: bool) List[List[float]] [source]
Sample actions from the policy after the exploration time step.
- get_post_exploration_prediction_with_information_sharing(observations: List[List[float]], deterministic: bool) Tuple[List[List[float]], List[List[float]]] [source]
- get_post_exploration_prediction_without_information_sharing(observations: List[List[float]], deterministic: bool) Tuple[List[List[float]], List[List[float]]] [source]
- get_regression_variables(index: int, observations: List[float], actions: List[float]) List[float] [source]
- property hidden_dimension: int
Hidden dimension.
- property information_sharing: bool
- property iterations: int
- property pca_compression: float
- property regression_buffer_capacity: int
- property regression_frequency: int
- reset()[source]
Reset environment to initial state.
Calls reset_time_step.
Notes
Override in a subclass for a custom implementation when resetting the environment.
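The override pattern the note describes can be sketched as below. The base class here is a minimal stand-in for the agent's reset behaviour, and the `episode_rewards` attribute is a hypothetical piece of custom state, not part of CityLearn's API.

```python
class Agent:
    """Minimal stand-in for the base agent's reset behaviour (hypothetical)."""

    def __init__(self) -> None:
        self.time_step = 0

    def reset_time_step(self) -> None:
        self.time_step = 0

    def reset(self) -> None:
        # Base reset only rewinds the time step counter.
        self.reset_time_step()


class CustomAgent(Agent):
    def reset(self) -> None:
        super().reset()            # keep the base time-step reset
        self.episode_rewards = []  # hypothetical custom per-episode state
```

Calling `super().reset()` first preserves the base behaviour before any subclass-specific state is cleared.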
- set_regression_encoders() List[List[Encoder]] [source]
Get observation value transformers/encoders for use in the MARLISA agent's internal regression model.
The encoder classes are defined in the preprocessing.py module and include PeriodicNormalization for cyclic observations, OnehotEncoding for categorical observations, RemoveFeature for observations that do not apply given the available storage systems and devices, and Normalize for observations with known minimum and maximum boundaries.
- Returns:
encoders – Encoder classes for observations ordered with respect to active_observations.
- Return type:
List[Encoder]
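To make the encoder roles concrete, here are simplified stand-ins for three of the named encoder classes. These are sketches of the transformations the names suggest, not the implementations from citylearn.preprocessing.

```python
import math
from typing import Optional, Tuple


class PeriodicNormalization:
    """Map a cyclic observation x in [0, x_max] to (sin, cos) components,
    so e.g. hour 23 and hour 0 end up close together (simplified sketch)."""

    def __init__(self, x_max: float) -> None:
        self.x_max = x_max

    def transform(self, x: float) -> Tuple[float, float]:
        angle = 2.0 * math.pi * x / self.x_max
        return math.sin(angle), math.cos(angle)


class Normalize:
    """Scale an observation with known bounds into [0, 1] (simplified sketch)."""

    def __init__(self, x_min: float, x_max: float) -> None:
        self.x_min, self.x_max = x_min, x_max

    def transform(self, x: float) -> float:
        return (x - self.x_min) / (self.x_max - self.x_min)


class RemoveFeature:
    """Drop an observation that does not apply to the building (simplified sketch)."""

    def transform(self, x: float) -> Optional[float]:
        return None
```

In this picture, set_regression_encoders would return one such encoder per active observation, matched by observation name.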
- property start_regression_time_step: int
- update(observations: List[List[float]], actions: List[List[float]], reward: List[float], next_observations: List[List[float]], terminated: bool, truncated: bool)[source]
Update replay buffer.
- Parameters:
observations (List[List[float]]) – Previous time step observations.
actions (List[List[float]]) – Previous time step actions.
reward (List[float]) – Current time step reward.
next_observations (List[List[float]]) – Current time step observations.
terminated (bool) – Indication that episode has ended.
truncated (bool) – Indication that the episode was truncated due to a time limit or another reason not defined as part of the task MDP.
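The replay-buffer bookkeeping behind update can be sketched with a fixed-capacity store like the one below. The class name, transition layout, and deque-based eviction are assumptions for illustration; CityLearn's own buffer may differ.

```python
import random
from collections import deque
from typing import Deque, List, Tuple

# (observations, actions, reward, next_observations, done) for one time step.
Transition = Tuple[List[float], List[float], float, List[float], bool]


class ReplayBuffer:
    """Fixed-capacity experience store; oldest transitions are evicted first."""

    def __init__(self, capacity: int) -> None:
        self.buffer: Deque[Transition] = deque(maxlen=capacity)

    def push(
        self,
        observations: List[float],
        actions: List[float],
        reward: float,
        next_observations: List[float],
        done: bool,
    ) -> None:
        self.buffer.append((observations, actions, reward, next_observations, done))

    def sample(self, batch_size: int) -> List[Transition]:
        # Uniform sampling without replacement, as in standard off-policy training.
        return random.sample(self.buffer, batch_size)

    def __len__(self) -> int:
        return len(self.buffer)
```

Each call to update would push one transition per agent; once the buffer holds at least batch_size transitions, minibatches can be sampled for gradient steps.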
- class citylearn.agents.marlisa.MARLISARBC(env: CityLearnEnv, rbc: RBC = None, **kwargs: Any)[source]
Bases:
MARLISA
Uses
citylearn.agents.rbc.RBC
to select actions during exploration before using
citylearn.agents.marlisa.MARLISA
.
- Parameters:
env (CityLearnEnv) – CityLearn environment.
rbc (RBC) –
citylearn.agents.rbc.RBC
or child class, used to select actions during exploration.
**kwargs (Any) – Other keyword arguments used to initialize super class.
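The exploration-then-policy switch that MARLISARBC implements can be sketched with a toy agent like the one below. Everything here (class name, `predict` signature, the hour-based rule, the placeholder policy) is hypothetical and only mirrors the idea of rule-based control during exploration followed by learned control.

```python
from typing import List


class ExplorationSwitchAgent:
    """Toy agent mirroring the MARLISARBC idea (hypothetical, not the
    CityLearn API): follow a rule-based schedule during exploration,
    then hand control to a learned policy."""

    def __init__(self, end_exploration_time_step: int) -> None:
        self.end_exploration_time_step = end_exploration_time_step
        self.time_step = 0

    def rbc_action(self, hour: int) -> List[float]:
        # Hypothetical rule: charge storage at night, discharge during the day.
        return [0.5] if hour < 7 else [-0.5]

    def policy_action(self, observations: List[float]) -> List[float]:
        return [0.0]  # placeholder for a learned policy's output

    def predict(self, observations: List[float], hour: int) -> List[float]:
        action = (
            self.rbc_action(hour)
            if self.time_step < self.end_exploration_time_step
            else self.policy_action(observations)
        )
        self.time_step += 1
        return action
```

Replacing uniform random exploration with a sensible rule-based controller gives the regression model and replay buffer more realistic early data than pure noise would.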