SafeRoboticsLab/TreeReach

TreeReach: A Tree-Based Forward Reachability Approach for Enhanced Safety Monitoring

Examples

ISAACS Robust Safety Filter


Intervention rate: 0.4846

TreeReach Filter (PLS & furthest)


Intervention rate: 0.1765

TreeReach Filter with bad control policy
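
The intervention rates above are reported without a definition. A common reading, assumed here and not confirmed by the repository, is the fraction of time steps at which the filter overrides the task policy's action. Under that assumption, a minimal sketch looks like this (the function name and the boolean-flag input are hypothetical):

```python
from typing import Sequence


def intervention_rate(overridden: Sequence[bool]) -> float:
    """Fraction of steps where the filter replaced the task action.

    Assumed definition; the repository may compute this differently.
    """
    if not overridden:
        raise ValueError("need at least one step")
    return sum(bool(flag) for flag in overridden) / len(overridden)


# Hypothetical per-step flags: True means the filter overrode the task policy.
flags = [True, False, False, True, False]
print(f"Intervention rate: {intervention_rate(flags):.4f}")
```

Read this way, a lower rate means the filter lets the task policy act more often.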

Installation

  1. Create a conda environment. If you are using Linux, run the following command:
bash install_packages.sh

Alternatively, if your computer is not running Linux, run the following commands:

conda create -n gameplay python=3.8 -y
conda activate gameplay
conda install cuda -c nvidia/label/cuda-11.8.0 -y
conda install pytorch pytorch-cuda=11.8 -c pytorch -c nvidia/label/cuda-11.8.0 -y
conda install -c conda-forge suitesparse jupyter notebook omegaconf numpy tqdm gym dill plotly shapely wandb matplotlib pybullet pandas -y
pip install --upgrade jax==0.4.8 jaxlib==0.4.7+cuda11.cudnn86 -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
pip install -e .
  2. Install pyspline, optimizedDP and quickzonoreach.

For reference, a conda environment file is provided in gameplay_conda_env.yaml.
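
After installation, a quick check that the main dependencies are importable can save a failed training run. The sketch below uses only the standard library. The module names in REQUIRED_MODULES are assumptions about how each package is imported and may need adjusting, and the helper name is hypothetical:

```python
import importlib.util

# Import names are assumptions; adjust them if a package exposes a different module name.
REQUIRED_MODULES = ["torch", "jax", "omegaconf", "pybullet", "quickzonoreach"]


def missing_modules(names):
    """Return the names that cannot be located in the active environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Not found:", ", ".join(missing))
else:
    print("All listed modules were found.")
```

Run it inside the activated gameplay environment. It only confirms that a module can be located, not that CUDA or a GPU is usable.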

Usage

Note: To visualize the training progress, you need to set up a Weights & Biases API key in your conda environment.

  1. Create a Weights & Biases account
  2. Get your API key from your account
  3. Run the following command with your API key:
conda env config vars set WANDB_API_KEY=<your api key>

Alternatively, you can also run:

wandb login

And paste your API key when prompted.
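
To confirm that the key set through conda is visible to Python, a check like the one below can be run before training. It is a sketch with a hypothetical helper name. Note that a variable set with conda env config vars set only takes effect after the environment is reactivated, and that wandb login typically stores the key in a local credentials file rather than in this variable, so the check covers only the conda route:

```python
import os


def wandb_key_configured(env=None) -> bool:
    """Return True if WANDB_API_KEY is set to a non-empty value."""
    env = os.environ if env is None else env
    return bool(env.get("WANDB_API_KEY", "").strip())


if wandb_key_configured():
    print("WANDB_API_KEY is set.")
else:
    print("WANDB_API_KEY is not set; reactivate the environment or use wandb login.")
```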

If you do not want to log your training progress, open the .yaml file in config/ that you will use for training and set USE_WANDB to False.

Synthesize Learned Control and Disturbance Policies

Run the following command:

python3 script/train_isaacs.py -cf config/isaacs_race_car.yaml

Training produces three models: the safety fallback policy, the adversarial policy, and the critic. They are saved in the folder train_result/OUT_FOLDER, where OUT_FOLDER is set in the solver section of the config file. The training progress is logged to wandb under the project PROJECT_NAME with the run name NAME, both as stated in the config file.

Synthesize Ground Truth Control and Disturbance Policies using OptimizedDP

Run the following command:

python3 script/generate_odp.py -cf config/isaacs_race_car.yaml

The result will be put in the folder train_result/OUT_FOLDER as stated in OUT_FOLDER in the config file (odp section).

Synthesize Best-Response Disturbance Policies to the Learned Control Policy using OptimizedDP

Run the following command:

python3 script/generate_br_odp.py -cf config/isaacs_race_car.yaml

The result will be put in the folder train_result/OUT_FOLDER as stated in OUT_FOLDER in the config file (odp.exploiter section).

Evaluate TreeReach Filter

Run the following command:

python3 script/eval_filters.py -cf config/isaacs_race_car.yaml

The result will be put in the folder train_result/OUT_FOLDER as stated in OUT_FOLDER in the config file (eval_filter section).

Different evaluations are implemented in eval_filters.py:

  • test_safety_policy: Simulate and visualize trajectories of all the different control policies against all the different disturbance policies, and compare their safety performance.
  • confusion_matrix: Obtain the confusion matrix between the safety fallback policy and the optimal control policy, and visualize it.
  • control_matrix: Obtain the control matrix of the different control policies, and visualize it.
  • initial_states: Compute initial states that are safe under the optimal control policy but unsafe under the task policy, and visualize them. These states are used to evaluate the safety monitors and filters.
  • test_safety_monitors_off_policy: Off-Policy Safety Monitor test. Evaluate the ability of the safety monitors to validate the safety of the task policy under the initial states obtained from initial_states. The monitors are only evaluated on the initial states and no steps are taken in the environment.
  • test_safety_monitors_on_policy: On-Policy Safety Monitor test. Evaluate the ability of the safety monitors to validate the safety of the task policy under the initial states obtained from initial_states. The monitors are evaluated on the trajectories rolled out in the environment starting from the initial states.
  • test_rollout_filter: Simulate trajectories with the gameplay rollout filter, and evaluate the safety performance of the filter.
  • test_frs_filter: Simulate trajectories with the ISAACS Robust Safety Filter, and evaluate the safety performance of the filter.
  • test_frs_tree_filter: Simulate trajectories with the TreeReach filter, and evaluate the safety performance of the filter.

To run a specific evaluation, set eval to True in the corresponding section in the config file.
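
To see at a glance which evaluations a config would run, the eval flags can be collected as sketched below. The layout is an assumption based on the description above (one mapping per evaluation, each holding an eval key). The plain dict stands in for the parsed YAML section, and the helper name is hypothetical:

```python
# Evaluation names as listed for eval_filters.py.
EVALUATIONS = [
    "test_safety_policy",
    "confusion_matrix",
    "control_matrix",
    "initial_states",
    "test_safety_monitors_off_policy",
    "test_safety_monitors_on_policy",
    "test_rollout_filter",
    "test_frs_filter",
    "test_frs_tree_filter",
]


def enabled_evaluations(eval_cfg: dict) -> list:
    """Return the evaluations whose (assumed) eval flag is truthy."""
    return [
        name for name in EVALUATIONS
        if eval_cfg.get(name, {}).get("eval", False)
    ]


# Hypothetical stand-in for the relevant section of the config file.
example_cfg = {
    "test_frs_filter": {"eval": False},
    "test_frs_tree_filter": {"eval": True},
}
print(enabled_evaluations(example_cfg))
```

Evaluations missing from the mapping are treated as disabled, which keeps the check safe on partial configs.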

Visualize trajectories

To visualize trajectories generated with test_rollout_filter, test_frs_filter or test_frs_tree_filter, run the following command:

python3 script/visualize.py -cf config/isaacs_race_car.yaml

Select the trajectory you want to visualize in the section visualization of the config file.
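
The commands in this section all take the same config file, so they can be chained from a small driver. The sketch below only builds and prints the commands by default. The ordering follows this section and assumes that later steps rely on the outputs of earlier ones, which is not stated explicitly. The helper names are hypothetical:

```python
import shlex
import subprocess
import sys

CONFIG = "config/isaacs_race_car.yaml"

# Scripts in the order they appear in this section.
SCRIPTS = [
    "script/train_isaacs.py",
    "script/generate_odp.py",
    "script/generate_br_odp.py",
    "script/eval_filters.py",
    "script/visualize.py",
]


def build_commands(config=CONFIG, python=sys.executable):
    """Return one argument list per script, all sharing the same config."""
    return [[python, script, "-cf", config] for script in SCRIPTS]


def run_all(config=CONFIG, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in build_commands(config):
        print(shlex.join(cmd))
        if not dry_run:
            # check=True stops the chain at the first failing step.
            subprocess.run(cmd, check=True)


run_all(dry_run=True)
```

Call run_all(dry_run=False) from the repository root to execute the steps in sequence.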

Citation

  • The original source code of ISAACS can be found here.
  • The original source code of Gameplay-Filters can be found here.

If you find our paper or code useful, please consider citing us with:

@inproceedings{nguyen2024gameplayfiltersrobustzeroshot,
    title={Gameplay Filters: Robust Zero-Shot Safety through Adversarial Imagination}, 
    author={Duy P. Nguyen and Kai-Chieh Hsu and Wenhao Yu and Jie Tan and Jaime F. Fisac},
    year={2024},
    eprint={2405.00846},
    archivePrefix={arXiv},
    primaryClass={cs.RO},
    url={https://arxiv.org/abs/2405.00846}, 
}
@inproceedings{hsunguyen2023isaacs,
    title={ISAACS: Iterative Soft Adversarial Actor-Critic for Safety},
    author={Kai-Chieh Hsu and Duy P. Nguyen and Jaime F. Fisac},
    booktitle={Proceedings of the 5th Conference on Learning for Dynamics and Control},
    year={2023},
}
