Research & Development
Industrial AI blog

Data-driven perception and planning methodologies for autonomous vehicles

16 March 2020

By Ioannis Souflas, Ph.D.
European R&D Centre, Hitachi Europe Ltd.


Accepting autonomous vehicles as a reliable and safe transportation service requires the realization of smooth and natural vehicle control. The plethora of driving data captured from modern cars is a key enabler for solving this problem. Our work on data-driven perception and planning systems for autonomous vehicles was recently presented at the AutoSens conference in Brussels.[1] In this article, we summarize the key points of that presentation to highlight the benefits and limitations of data-driven solutions for autonomous vehicles.

Autonomous Driving Software Paradigms

Autonomous vehicle software has drawn a lot of attention over the last decade. Although several software architecture approaches can be found in the literature, two major paradigms[2] dominate the research and development community (Fig. 1):

  • Mediated Perception: the problem is decomposed into separate logical sub-modules that solve the perception, localization, planning and control tasks. The functionality of each sub-module is usually derived from first principles and pre-defined rules, which might require significant calibration effort depending on the use case. This approach is advantageous for software reconfigurability and debuggability, while providing the transparency needed to comply with well-known automotive-grade functional safety standards. On the other hand, high software complexity and fixed driving behavior are key disadvantages.
  • Behavior Reflex: the aim is to construct a direct mapping from sensory inputs (e.g. cameras, lidars, GPS) to a driving action such as acceleration/braking or steering. The input/output mapping is mainly done using a machine/deep learning model trained on labelled expert-demonstration data.[3] The key advantages of this approach are low software complexity and easy reconfiguration of driving behavior based on the needs of the passengers. However, the underlying functionality is ambiguous, which makes software debugging and safety assurance challenging.

Figure 1: Autonomous driving software paradigms

Considering the benefits and limitations of each approach, and based on our experimentation and recent findings from other researchers in the field[4,5], we concluded that combining the two paradigms yields better reconfigurability, transparency, debuggability and maintainability, all of which are necessary to produce high-quality, safe software for autonomous driving. In more detail, and relating this to how humans drive, Perception and Planning are the two key sub-modules responsible for “Environmental Cognition” and “Decision Making”. Hence, we have combined the Mediated Perception and Behavior Reflex paradigms by enriching the Perception and Planning sub-modules with state-of-the-art AI and data science techniques (Fig. 2).

Figure 2: Proposed autonomous driving software paradigm combining the benefits of Mediated Perception and Behavior Reflex approaches

Reliable Data-Driven Autonomous Driving Software

Our work on data-driven autonomous driving software was part of the HumanDrive project undertaken in the UK where we led the prototyping of pioneering AI technology to develop natural human-like vehicle control using machine learning.[6] As part of this activity we have developed a full software stack that enabled us to deploy data-driven perception and planning solutions. At the top level (Fig. 3), the software is comprised of the following sub-modules:

  • Localization, which is responsible for localizing the vehicle on a given map by fusing sensor data, e.g. GPS, IMU, wheel odometry, cameras etc.,
  • Perception, which perceives the environment using deep learning based on various perception sensors, e.g. camera, LiDAR etc.,
  • Planning, which uses deep learning to produce a collision-free, smooth, human-inspired trajectory towards the desired goal location,
  • Safety, which is responsible for checking the validity of the planning module's output against predefined boundaries,
  • Control, which converts the desired trajectory and speed demand into the corresponding actuator demands (i.e. steering, accelerator, brake),
  • Diagnostic, which is responsible for monitoring the health status of each sub-module.

Figure 3: Top-level system architecture - block diagram
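
To make the role of the Safety sub-module more concrete, the following is a minimal sketch of a boundary check on the planner's output. It assumes the plan is a sequence of (yaw rate, speed) demands sampled at a fixed time step; the `SafetyLimits` class, the `plan_is_valid` function and all limit values are hypothetical and chosen for illustration only, not taken from the project.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class SafetyLimits:
    # Hypothetical limits, for illustration only.
    max_speed: float = 20.0         # m/s
    max_abs_yaw_rate: float = 0.5   # rad/s
    max_abs_accel: float = 3.0      # m/s^2


def plan_is_valid(plan: List[Tuple[float, float]], dt: float,
                  limits: SafetyLimits = SafetyLimits()) -> bool:
    """Return True if every (yaw_rate, speed) demand stays within the limits."""
    prev_speed = None
    for yaw_rate, speed in plan:
        # Speed must be non-negative and below the allowed maximum.
        if speed < 0.0 or speed > limits.max_speed:
            return False
        # Yaw rate must stay within the symmetric bound.
        if abs(yaw_rate) > limits.max_abs_yaw_rate:
            return False
        # The speed change between consecutive steps implies an acceleration.
        if prev_speed is not None:
            accel = (speed - prev_speed) / dt
            if abs(accel) > limits.max_abs_accel:
                return False
        prev_speed = speed
    return True
```

A plan that fails such a check could be rejected before it reaches the Control sub-module; what happens next (for example a fallback manoeuvre) is outside the scope of this sketch.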

Focusing on the perception system, we have split the software into three distinct layers: the Machine/Deep Learning layer, the Engineering layer and the Output layer (Fig. 4). The Machine/Deep Learning layer consists of multiple deep neural networks that take raw sensor data from cameras, LiDARs or both, and output information about the surrounding objects, occupied area and drivable road. The Engineering layer is responsible for fusing the information provided by the different deep neural networks, filtering and tracking the movement of the detected objects based on prior knowledge of their kinematics/physics, and providing a short-term prediction of the future state of the objects based on their history. Finally, the Output layer converts the processed information into the appropriate data representations, e.g. grid format, vector format etc., for use by the planning module.

Figure 4: Data-driven perception system architecture
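
The filtering, tracking and short-term prediction performed by the Engineering layer can be illustrated with a very small example. The sketch below uses an alpha-beta filter with a constant-velocity motion model, which is one common way of encoding prior kinematic knowledge; it is an assumption made for illustration, not the filter used in the project. The names `Track`, `update_track` and `predict_positions`, and the gain values, are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Track:
    x: float          # position [m]
    y: float          # position [m]
    vx: float = 0.0   # velocity [m/s]
    vy: float = 0.0   # velocity [m/s]


def update_track(track: Track, meas: Tuple[float, float], dt: float,
                 alpha: float = 0.5, beta: float = 0.1) -> Track:
    """One alpha-beta filter step under a constant-velocity motion model."""
    # Predict where the object would be if it kept its current velocity.
    px = track.x + track.vx * dt
    py = track.y + track.vy * dt
    # Residual between the measured and the predicted position.
    rx, ry = meas[0] - px, meas[1] - py
    # Correct position and velocity with the (hypothetical) gains.
    return Track(px + alpha * rx, py + alpha * ry,
                 track.vx + beta * rx / dt, track.vy + beta * ry / dt)


def predict_positions(track: Track, horizon: float,
                      dt: float) -> List[Tuple[float, float]]:
    """Extrapolate future positions over the horizon at constant velocity."""
    steps = int(round(horizon / dt))
    return [(track.x + track.vx * k * dt, track.y + track.vy * k * dt)
            for k in range(1, steps + 1)]
```

In this sketch, `update_track` would be called once per fused detection, and `predict_positions` would supply the short-term prediction that is passed on to the Output layer.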

Moving to the planning system (Fig. 5), the core of the solution is a multiple-input, multiple-output Recurrent Convolutional Neural Network (RCNN) that is responsible for imitating and predicting human driving behavior in terms of future yaw rate and speed demands. The planning network, named PlanNet, uses current and historical information about the perceived environment, in the form of an occupancy grid sequence, together with information about the desired route ahead of the ego-vehicle, to predict the best set of yaw rate and speed demands. Following this step, a Trajectory Generator converts the predicted sequence of yaw rates and speeds into a trajectory using a physics-based vehicle model.[7]

Figure 5: Data-driven planning system architecture
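
As an illustration of how a sequence of yaw rate and speed demands can be turned into a trajectory, the sketch below integrates a simple kinematic (unicycle) model with forward-Euler steps. This model is an assumption made here for clarity; the vehicle model actually used by the Trajectory Generator is not described in this article. The function name `generate_trajectory` and the `Pose` type are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Pose = Tuple[float, float, float]  # x [m], y [m], heading [rad]


def generate_trajectory(yaw_rates: Sequence[float], speeds: Sequence[float],
                        dt: float,
                        start: Pose = (0.0, 0.0, 0.0)) -> List[Pose]:
    """Integrate yaw rate and speed demands into a sequence of poses."""
    if len(yaw_rates) != len(speeds):
        raise ValueError("yaw_rates and speeds must have the same length")
    x, y, heading = start
    trajectory = [start]
    for yaw_rate, speed in zip(yaw_rates, speeds):
        # Forward-Euler step of the kinematic unicycle model:
        # move along the current heading, then rotate by the yaw rate.
        x += speed * math.cos(heading) * dt
        y += speed * math.sin(heading) * dt
        heading += yaw_rate * dt
        trajectory.append((x, y, heading))
    return trajectory


# Example: a constant speed of 5 m/s with a gentle left turn for 2 s.
path = generate_trajectory([0.1] * 20, [5.0] * 20, dt=0.1)
```

A smaller time step reduces the integration error of the Euler scheme, at the cost of a longer demand sequence for the same planning horizon.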

As the above explanation of the data-driven perception and planning sub-modules shows, the backend of these systems consists of deep neural networks that need to be “fueled” by data. For this purpose, we have developed a complete pre-processing pipeline of analysis, synthesis and labelling[8] tools that allow us to create unbiased, balanced datasets with high information content prior to any machine learning training and validation activities.

Figure 6: Pipeline for accessing and improving machine learning models
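
One simple way to reduce bias in a driving dataset is to cap how many samples any single driving condition contributes, since straight-line driving usually dominates recorded data. The sketch below bins samples by yaw rate and undersamples over-represented bins. It is only an illustration of the balancing idea, not the pipeline described above; the function name `balance_by_yaw_rate`, the `"yaw_rate"` sample key and the parameter values are hypothetical.

```python
import math
import random
from collections import defaultdict
from typing import Dict, List


def balance_by_yaw_rate(samples: List[dict], bin_width: float,
                        max_per_bin: int, seed: int = 0) -> List[dict]:
    """Undersample so that no yaw-rate bin holds more than max_per_bin samples."""
    rng = random.Random(seed)  # fixed seed keeps the selection reproducible
    bins: Dict[int, List[dict]] = defaultdict(list)
    for sample in samples:
        # Assign each sample to a bin according to its yaw rate.
        bins[math.floor(sample["yaw_rate"] / bin_width)].append(sample)
    balanced: List[dict] = []
    for key in sorted(bins):
        group = bins[key]
        if len(group) > max_per_bin:
            # Randomly keep only max_per_bin samples from a crowded bin.
            group = rng.sample(group, max_per_bin)
        balanced.extend(group)
    return balanced


# Example: 100 straight-driving samples and 5 cornering samples.
data = [{"yaw_rate": 0.0}] * 100 + [{"yaw_rate": 0.3}] * 5
balanced = balance_by_yaw_rate(data, bin_width=0.1, max_per_bin=10)
```

Undersampling discards data, so in practice it might be combined with synthesis of rare scenarios, which the article mentions as part of the pre-processing pipeline.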

Software Integration and Testing

Finally, a software architecture that has state-of-the-art deep learning and data science tools at its core requires efficient software development, dependency management and testing practices. For software development and dependency management, we have used container technologies, which are essential for seamless integration, compatibility and maintainability of the software. For testing, we have followed a pipeline that starts with early functionality prototyping and testing using Software-in-the-Loop (SiL) to ensure bug-free deployment, and then moves to real-world testing for refinement, calibration and system verification (Fig. 7).

Figure 7: Software-in-the-Loop (SiL) and real-world testing for system verification


Our research and real-world experimentation with data-driven solutions for autonomous vehicles allowed us to identify some of the key benefits and limitations of this approach. Below we summarize the main lessons from this activity:

  • The performance/accuracy of data-driven perception and planning solutions for autonomous vehicles is linked to the quality, and is proportional to the quantity, of the training data
  • Not all data are equal: finding rich data with high information content is essential for the successful deployment of data-driven solutions
  • Data-driven solutions can be scaled without major software reconfigurations: “fueling” machine learning models with appropriate data leads to the desired software updates
  • Data science and engineering is a necessary step prior to any machine learning activity
  • Machine learning does not replace rigorous system engineering – it is an enabler rather than a disruptor
  • System safety cannot be guaranteed with pure machine-learning-based approaches; rigorous system engineering and rule-based systems are a vital part of the overall solution
  • Data-driven engineering has the potential to unlock personalized autonomous vehicles


References

[1] I. Souflas, “Data-Driven Perception and Planning Methodologies for Autonomous Vehicles,” AutoSens 2019, Brussels, Belgium, Sep. 2019, URL:
[2] C. Chen, A. Seff, A. Kornhauser and J. Xiao, “DeepDriving: Learning Affordance for Direct Perception in Autonomous Driving,” arXiv:1505.00256, 2015.
[3] M. Bojarski, D. D. Testa, D. Dworakowski, B. Firner, B. Flepp, P. Goyal, L. D. Jackel, M. Monfort, U. Muller, J. Zhang, X. Zhang, J. Zhao and K. Zieba, “End to End Learning for Self-Driving Cars,” arXiv:1604.07316, 2016.
[4] M. Bansal, A. Krizhevsky and A. Ogale, “ChauffeurNet: Learning to Drive by Imitating the Best and Synthesizing the Worst,” arXiv:1812.03079v1, 2018.
[5] Y. Glassner, L. Gispan, A. Ayash and T. F. Shohet, “Closing the gap towards end-to-end autonomous vehicle system,” arXiv:1901.00114v2, 2019.
[6] “HumanDrive,” 2019. [Online]. Available:
[7] Patent Pending: I. Souflas, et al., “Autonomous driving control system and Apparatus for determining vehicle travelling data usable in said autonomous driving control system,” Patent EP19178108.7, 4 June 2019.
[8] Hitachi open source semantic segmentation editor: