Teaching Cars to Drive with Foresight

Good drivers anticipate dangerous situations and adjust their driving before things get dicey. Researchers at the University of Bonn now want to teach this skill to self-driving cars. They will present a corresponding algorithm at the International Conference on Computer Vision in Seoul on Friday, November 1. They will also present a data set that they used to train and test their approach. It will make it much easier to develop and improve such processes in the future.

A new approach provides dense point-wise annotations for the complete field-of-view of an employed automotive Lidar. (Source: AG Computer Vision, U. Bonn)

An empty street, a row of parked cars at the side: nothing to indicate that you should be careful. But wait: Isn’t there a side street up ahead, half hidden by the parked cars? Maybe I’d better take my foot off the gas – who knows if someone’s coming from the side. We constantly encounter situations like these when driving. Interpreting them correctly and drawing the right conclusions requires a lot of experience. In contrast, self-driving cars sometimes behave like a learner driver in their first lesson. “Our goal is to teach them a more anticipatory driving style,” explains computer scientist Jürgen Gall. “This would then allow them to react much more quickly to dangerous situations.”

Gall chairs the Computer Vision working group at the University of Bonn, which is researching a solution to this problem in cooperation with colleagues from the university’s Institute of Photogrammetry and the Autonomous Intelligent Systems working group. The scientists are now presenting a first step toward this goal at the leading symposium of Gall’s discipline. “We have refined an algorithm that completes and interprets Lidar data,” he explains. “This allows the car to anticipate potential hazards at an early stage.”

A rotating laser is mounted on the roof of most self-driving cars. The laser beam is reflected by the surroundings. The Lidar system measures how long the reflected light takes to reach the sensor and uses this time to calculate the distance. “The system detects the distance to around 120,000 points around the vehicle per revolution,” says Gall. The problem: the measuring points thin out as the distance increases – the gap between them widens. This is like painting a face on a balloon: When you inflate it, the eyes move further and further apart. Even for a human being it is therefore almost impossible to obtain a correct understanding of the surroundings from a single Lidar scan.
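The distance calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the sensor's actual firmware: the function names and the 0.2-degree angle between neighboring beams are assumptions made for the example. The first function halves the round-trip path of the light pulse; the second shows why neighboring points drift apart with range, since a fixed angle between beams spans a longer arc farther out.

```python
import math

# Speed of light in vacuum, in metres per second.
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def distance_from_round_trip(round_trip_seconds: float) -> float:
    """Distance to the reflecting surface from the pulse's round-trip time.

    The light travels to the object and back, so the one-way distance
    is half of the total path.
    """
    if round_trip_seconds < 0:
        raise ValueError("round-trip time must be non-negative")
    return SPEED_OF_LIGHT_M_PER_S * round_trip_seconds / 2.0


def point_spacing(range_m: float, angular_step_deg: float) -> float:
    """Arc length between two neighboring beams at a given range.

    angular_step_deg is a hypothetical beam spacing, not a value
    taken from the article.
    """
    return range_m * math.radians(angular_step_deg)


if __name__ == "__main__":
    # A pulse that returns after one microsecond.
    print(distance_from_round_trip(1e-6))
    # The gap between neighboring points grows linearly with range.
    for range_m in (10.0, 50.0, 100.0):
        print(range_m, point_spacing(range_m, angular_step_deg=0.2))
```

Under these assumptions, a point five times farther away sits in a gap five times wider, which is the balloon effect in numbers.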

“A few years ago, the University of Karlsruhe recorded large amounts of Lidar data, a total of 43,000 scans,” explains Jens Behley of the Institute of Photogrammetry. “We have now taken sequences of several dozen scans and superimposed them.” The data obtained in this way also contain points that the sensor recorded only after the car had driven a few dozen yards further down the road. Put simply, they show not only the present, but also the future.
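Superimposing scans means expressing every point in one shared coordinate frame. The sketch below shows the idea in two dimensions, assuming the vehicle's position and heading are known for each scan; the article does not say how the scans were aligned, so the pose format and all function names here are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]        # (x, y) in the sensor frame, metres
Pose = Tuple[float, float, float]  # vehicle (x, y, yaw in radians) in a world frame


def to_world(point: Point, pose: Pose) -> Point:
    """Rotate and translate a sensor-frame point into the world frame."""
    x, y, yaw = pose
    c, s = math.cos(yaw), math.sin(yaw)
    return (x + c * point[0] - s * point[1], y + s * point[0] + c * point[1])


def to_local(point: Point, pose: Pose) -> Point:
    """Inverse of to_world: express a world-frame point relative to a pose."""
    x, y, yaw = pose
    dx, dy = point[0] - x, point[1] - y
    c, s = math.cos(yaw), math.sin(yaw)
    return (c * dx + s * dy, -s * dx + c * dy)


def superimpose(
    scans: Sequence[Sequence[Point]],
    poses: Sequence[Pose],
    reference_index: int = 0,
) -> List[Point]:
    """Merge several scans into the frame of one reference scan."""
    reference_pose = poses[reference_index]
    merged: List[Point] = []
    for points, pose in zip(scans, poses):
        for point in points:
            merged.append(to_local(to_world(point, pose), reference_pose))
    return merged


if __name__ == "__main__":
    # Two scans, each seeing one point 5 m ahead; the car moved 10 m between them.
    scans = [[(5.0, 0.0)], [(5.0, 0.0)]]
    poses = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
    print(superimpose(scans, poses))
```

In the usage example, the point from the second scan ends up farther ahead of the first scan's position than anything the first scan itself recorded. That is the sense in which the merged cloud shows "the future" relative to the first scan.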

“These superimposed point clouds contain important information such as the geometry of the scene and the spatial dimensions of the objects it contains, which are not available in a single scan,” emphasizes Martin Garbade, who is currently doing his doctorate at the Institute of Computer Science. “Additionally, we have labeled every single point in them, for example: There’s a sidewalk, there’s a pedestrian and back there’s a motorcyclist.” The scientists fed their software with a data pair: a single Lidar scan as input and the associated overlay data including semantic information as desired output. They repeated this process for several thousand such pairs.

“During this training phase, the algorithm learned to complete and interpret individual scans,” explains Gall. “This meant that it could plausibly add missing measurements and interpret what was seen in the scans.” The scene completion already works relatively well: The process can complete about half of the missing data correctly. The semantic interpretation, i.e. deducing which objects are hidden behind the measuring points, does not work quite as well: Here, the computer achieves a maximum accuracy of 18 percent.
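The article does not say exactly how these percentages were computed. One common way to score this kind of output is intersection-over-union (IoU) on a grid of labeled cells, and the sketch below uses it purely as an assumed stand-in: the label encoding, the `EMPTY` constant and the function names are hypothetical and not taken from the researchers' code.

```python
from typing import List, Sequence

EMPTY = 0  # hypothetical label for a cell that contains nothing


def completion_iou(predicted: Sequence[int], truth: Sequence[int]) -> float:
    """IoU of occupied cells, ignoring which class each cell has."""
    pairs = list(zip(predicted, truth))
    intersection = sum(1 for p, t in pairs if p != EMPTY and t != EMPTY)
    union = sum(1 for p, t in pairs if p != EMPTY or t != EMPTY)
    return intersection / union if union else 1.0


def mean_class_iou(predicted: Sequence[int], truth: Sequence[int]) -> float:
    """Average of the per-class IoU over all non-empty classes present."""
    pairs = list(zip(predicted, truth))
    classes = sorted({label for pair in pairs for label in pair if label != EMPTY})
    scores: List[float] = []
    for label in classes:
        intersection = sum(1 for p, t in pairs if p == label and t == label)
        union = sum(1 for p, t in pairs if p == label or t == label)
        scores.append(intersection / union)
    return sum(scores) / len(scores) if scores else 1.0


if __name__ == "__main__":
    # Six cells; 1 and 2 stand for two arbitrary object classes.
    truth = [1, 1, 2, 0, 0, 2]
    predicted = [1, 2, 2, 0, 1, 0]
    print(completion_iou(predicted, truth))
    print(mean_class_iou(predicted, truth))
```

Scoring the two tasks separately mirrors the distinction drawn above: a method can fill in the right cells (completion) and still assign many of them the wrong object class (semantic interpretation), so the second score is usually the lower one.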

However, the scientists consider this branch of research to still be in its infancy. “Until now, there has simply been a lack of extensive data sets with which to train corresponding artificial intelligence methods,” stresses Gall. “We are closing a gap here with our work. I am optimistic that we will be able to significantly increase the accuracy rate in semantic interpretation in the coming years.” He considers 50 percent to be quite realistic, which could have a huge influence on the quality of autonomous driving. (Source: U. Bonn)

Reference: J. Behley et al.: SemanticKITTI: A Dataset for Semantic Scene Understanding of LiDAR Sequences, arXiv:1904.01416 [cs.CV] (2019)

Links: International Conference on Computer Vision 2019, Seoul, Korea Project 
