Learning to Drive a Real Car in 20 Minutes

FBIT 2007

Martin Riedmiller
Neuroinformatics Group, Univ. of Osnabrueck

Mike Montemerlo, Hendrik Dahlkamp
AI Lab, Stanford University
Email: montemerlo, dahlkamp @

Abstract

The paper describes our first experiments on using Reinforcement Learning to steer a real robot car. The applied method, Neural Fitted Q Iteration (NFQ), is purely data-driven: it is based on data collected directly from real-life experiments; no transition model and no simulation is used. The RL approach is based on learning a neural Q value function, which means that no prior selection of the structure of the control law is required. We demonstrate that the controller is able to learn a steering task in less than 20 minutes directly on the real car. We consider this an important step towards the competitive application of neural Q-function based RL methods in real-life environments.

Figure 1. The car used is a VW Passat equipped with additional sensors.

1 Introduction

The interest in applying Reinforcement Learning (RL) methods to real-life control applications is growing rapidly [7, 14, 9, 5]. In this paper we focus on situations where the controller should learn by interacting with the real system only. In particular, for the design of the controller we will not assume that a system model is available, neither in the form of system equations nor in the form of a simulator (the latter approach was successfully applied in a number of applications; see [4, 7]). In contrast, here we only assume that the controller is able to collect state-action transitions by observing the behaviour of the real system while controlling it. Learning by interacting with the real system directly has an important advantage: the controller is tailored exactly to the behaviour of the real system at hand, rather than to a more or less exact model of it. The big challenge in learning with real systems lies in the fact that learning must occur in a reasonable amount of time.
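
To make the NFQ idea concrete, the following is a minimal sketch of one batch NFQ iteration, not the authors' implementation. It assumes a small discrete action set and uses scikit-learn's MLPRegressor as a stand-in for the paper's Rprop-trained multilayer perceptron; the names nfq_iteration and q_values, the discount factor, and the action values are purely illustrative.

    # Minimal NFQ-style sketch (illustrative; not the authors' code).
    # Transitions (s, a, c, s_next) are assumed to come from driving the
    # real car; here we use random placeholder data to keep it runnable.
    import numpy as np
    from sklearn.neural_network import MLPRegressor

    GAMMA = 0.95                 # discount factor (assumed value)
    ACTIONS = [-1.0, 0.0, 1.0]   # e.g. steer left / straight / right (assumed)

    def q_values(net, states):
        # Evaluate Q(s, a) for every action; the net maps (state, action) -> Q.
        qs = []
        for a in ACTIONS:
            x = np.hstack([states, np.full((len(states), 1), a)])
            qs.append(net.predict(x))
        return np.stack(qs, axis=1)          # shape: (n_samples, n_actions)

    def nfq_iteration(net, transitions):
        # One batch NFQ step: build a supervised pattern set over ALL stored
        # transitions, then refit the Q-network on it. Costs are minimised,
        # hence the min over actions; terminal-state handling is omitted.
        s, a, c, s_next = transitions
        targets = c + GAMMA * q_values(net, s_next).min(axis=1)
        X = np.hstack([s, a.reshape(-1, 1)])
        net.fit(X, targets)      # retrains from scratch, as in batch NFQ
        return net

    # Usage sketch with placeholder data (a real run would replay logged
    # state-action transitions from the car):
    rng = np.random.default_rng(0)
    n, dim = 200, 3              # 3 state variables, purely illustrative
    s, s2 = rng.normal(size=(n, dim)), rng.normal(size=(n, dim))
    a, c = rng.choice(ACTIONS, size=n), rng.uniform(size=n)
    net = MLPRegressor(hidden_layer_sizes=(20, 20), max_iter=400)
    net.fit(np.hstack([s, a.reshape(-1, 1)]), c)   # initial fit so predict() works
    for _ in range(10):
        net = nfq_iteration(net, (s, a, c, s2))

Because the pattern set is rebuilt from the entire stored experience at every iteration, each real-life transition is reused many times, which is what makes the data-efficient, 20-minute learning regime plausible.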
