Reinforcement Learning Agents for Human Navigation Models




4: Architecture

Faculty Supervisor:

Takehiko Nagakura

Faculty email:


Apply by:

September 15 (September 13 to allow time to prepare a funding application)


Paloma Gonzalez: palomagr@mit.edu

Project Description

Three UROP students needed. We are using human trajectory data from security camera videos of the Media Lab building to feed a reinforcement learning model in Unity3D. The project requires data processing: extracting the trajectories from the aerial videos with OpenCV tools and preparing them for the model. First, the videos will be classified by trajectory validity: full trajectories, half trajectories, etc. Next, the trajectories will be extracted as x, y coordinates with a timestamp. The data will then be visualized for inspection, and indicators will be extracted from it. For example, if a person stays in the same position for a certain amount of time, that position counts as a hot spot in the plan. Finally, the data needs to be discretized so that it can be fed to the reinforcement learning model in Unity3D, following a model provided by me. A parallel goal of the project is to co-create a routine that automates the data processing.
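As a rough illustration of the hot-spot indicator and the discretization step described above, here is a minimal Python sketch. It assumes a trajectory is already available as a list of (x, y, t) samples in pixels and seconds. The function names and the radius, min_duration, and cell_size parameters are hypothetical choices for illustration only; they are not taken from the project's actual model.

```python
import math
from typing import List, Tuple

# One trajectory sample: (x, y, t). Units are assumed: pixels and seconds.
Sample = Tuple[float, float, float]


def find_hot_spots(traj: List[Sample], radius: float = 10.0,
                   min_duration: float = 5.0) -> List[Tuple[float, float, float]]:
    """Return (x, y, duration) for each stretch in which the person stays
    within `radius` of the stretch's first sample for at least
    `min_duration` seconds. Both thresholds are illustrative assumptions."""
    hot_spots = []
    i, n = 0, len(traj)
    while i < n:
        x0, y0, t0 = traj[i]
        j = i
        # Extend the stretch while the next sample is still near the start.
        while j + 1 < n and math.hypot(traj[j + 1][0] - x0,
                                       traj[j + 1][1] - y0) <= radius:
            j += 1
        duration = traj[j][2] - t0
        if duration >= min_duration:
            hot_spots.append((x0, y0, duration))
        i = j + 1
    return hot_spots


def discretize(traj: List[Sample], cell_size: float = 50.0) -> List[Tuple[int, int]]:
    """Map each sample to a (column, row) grid cell and collapse consecutive
    repeats, giving a sequence of visited cells. The grid size is assumed."""
    cells: List[Tuple[int, int]] = []
    for x, y, _t in traj:
        cell = (int(x // cell_size), int(y // cell_size))
        if not cells or cells[-1] != cell:
            cells.append(cell)
    return cells


# Made-up trajectory: walk right, pause near x = 80, then walk on.
example = [(0, 0, 0), (40, 0, 1), (80, 0, 2), (82, 1, 3), (81, 2, 4),
           (83, 0, 5), (80, 1, 6), (82, 0, 7), (84, 1, 8),
           (130, 0, 9), (180, 0, 10)]

print(find_hot_spots(example))  # one hot spot near (80, 0)
print(discretize(example))      # sequence of grid cells visited
```

In this sketch the hot-spot test is anchored on the first sample of each stretch, which keeps the logic simple; a centroid-based or clustering approach might suit noisy tracking data better.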


Intermediate Python skills for use with OpenCV. Experience with OpenCV and other computer vision algorithms is desirable. Basic video editing and documentation skills.