
HC-Driving

High-Conflict Driving Scenarios

Learning to Robustly Negotiate Bi-Directional Lane Usage in High-Conflict Driving Scenarios

Paper - Conference video - CMU press release

Recently, autonomous driving has made substantial progress in addressing the most common traffic scenarios like intersection navigation and lane changing. However, most of these successes have been limited to scenarios with well-defined traffic rules and require minimal negotiation with other vehicles.

In this work, we introduce a previously unconsidered, yet everyday, high-conflict driving scenario requiring negotiation between agents of equal rights and priorities. There exists no centralized control structure, and we do not allow communication. Therefore, it is unknown whether other drivers are willing to cooperate, and if so, to what extent. We show an example of such a scenario below.

[Video: high-conflict driving scenario]

We train policies to robustly negotiate with opposing vehicles with an unobservable degree of cooperativeness using multi-agent reinforcement learning. We show that we are able to successfully negotiate and traverse the considered scenario over 99% of the time.

Cooperativeness and varying driver behaviour

We use a reward function parametrized by a cooperativeness parameter c ∈ [0, 0.5]. Uncooperative agents (c = 0) are rewarded only for their own progress, while highly cooperative agents (c = 0.5) are indifferent to which vehicle makes progress first (compare the reward function in Eq. 3 of the paper). By tuning the cooperativeness parameter, we achieve a corresponding and interpretable change in an agent's behavior: highly cooperative agents drive with more anticipation, while less cooperative ones tend to prioritize their own progress.
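One way to read this parametrization is as a convex blend of own and opponent progress, which is a simplified sketch based on the description above (the paper's actual reward, Eq. 3, contains additional terms we omit here; the function name and arguments are our own):

```python
def negotiation_reward(own_progress, opponent_progress, c):
    """Sketch of a cooperativeness-blended progress reward.

    c = 0.0 -> the agent is rewarded only for its own progress;
    c = 0.5 -> own and opponent progress are weighted equally, so the
               agent is indifferent to which vehicle progresses first.
    """
    assert 0.0 <= c <= 0.5, "cooperativeness must lie in [0, 0.5]"
    return (1.0 - c) * own_progress + c * opponent_progress
```

At c = 0.5 the reward is symmetric in the two progress terms, which matches the stated indifference of highly cooperative agents.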

Below, we show three examples of resulting interactions. Note that the environment (parking pattern) is identical in all cases and that all vehicles are controlled by the same policy, resulting in a self-play situation. The only varying parameter is the cooperativeness assigned to each vehicle. As you can observe, the exhibited behaviors are intuitively explainable, with more cooperative agents yielding earlier.
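The self-play setup described above can be sketched as follows; note that the `env`/`policy` interfaces here are hypothetical stand-ins, not the paper's actual code, and only illustrate that a single shared policy controls every vehicle while each vehicle privately receives its own cooperativeness:

```python
import random

def run_selfplay_episode(env, policy, num_vehicles=2):
    """Self-play rollout sketch: one shared policy drives all vehicles.

    The only difference between vehicles is a privately assigned
    cooperativeness c in [0, 0.5]; no vehicle observes its opponent's c.
    """
    coops = [random.uniform(0.0, 0.5) for _ in range(num_vehicles)]
    obs = env.reset()
    done = False
    while not done:
        # Each agent conditions on its own observation and its own c only.
        actions = [policy.act(obs[i], c=coops[i]) for i in range(num_vehicles)]
        obs, rewards, done = env.step(actions)
    return rewards
```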

[Video: Influence of Cooperativeness]

What could possibly go wrong?

Challenges arise in pairings of very uncooperative and very cooperative agents.

Playing a game of chicken

Consider two uncooperative agents. Each is interested only in maximizing its own progress. Some human passengers might be rather unhappy with the resulting interactions.

[Video: two uncooperative agents]

Indecisive behaviour

Consider two highly cooperative agents. Neither has a preference as to which agent makes progress first. The resulting extremely cautious and indecisive behavior is equally inefficient.

[Video: two highly cooperative agents]

Final thoughts

In conclusion, our agents are robust to an unknown timing of opponent decisions, an unobservable degree of cooperativeness of the opposing vehicle, and previously unencountered policies. Furthermore, they learn to exhibit human-like behaviors such as defensive driving, anticipating solution options and interpreting the behavior of other agents.

If you found this helpful for your work, please consider citing us:

@INPROCEEDINGS{killing-hc-driving,
  author={Killing, Christoph and Villaflor, Adam and Dolan, John M.},
  booktitle={2021 IEEE International Conference on Robotics and Automation (ICRA)},
  title={Learning to Robustly Negotiate Bi-Directional Lane Usage in High-Conflict Driving Scenarios},
  year={2021},
  pages={8090-8096},
  doi={10.1109/ICRA48506.2021.9561071}}