Deep Reinforcement Learning for Visual Semantic Navigation with Memory

Published in Dissertação (Mestrado em Ciências de Computação e Matemática Computacional) - Instituto de Ciências Matemáticas e de Computação, Universidade de São Paulo, São Carlos, 2020

Recommended citation: SANTOS, Iury Batista de Andrade. "Aprendizado por reforço profundo para navegação visual semântica com memória." Dissertação (Mestrado em Ciências de Computação e Matemática Computacional) - Instituto de Ciências Matemáticas e de Computação, Universidade de São Paulo, São Carlos, 2020. 1(1). https://www.teses.usp.br/teses/disponiveis/55/55134/tde-16122020-164714/pt-br.php

Mobile robot navigation has been widely studied in recent decades and is crucial for deploying robots in diverse scenarios. However, complex and changing environments, such as home interiors, still pose challenges to be overcome, and they are the subject of several works that adopt approaches such as computer vision without topological or metric maps. This work proposes an architecture for mobile robot navigation aimed at target-object search in indoor home environments, using computer vision methods and semantic information with memory. The proposed architecture can generalize through prior knowledge of the objects detected in scenes and reinforce relationships based on past experience, in a learning-based navigation approach. To this end, the following machine learning models are adopted: convolutional neural networks, graph neural networks, recurrent neural networks, and deep reinforcement learning, in a target-object approach. The architecture was trained in several domestic environments using a photo-realistic simulated environment. It was evaluated through qualitative analysis, running agent episodes in the simulated environment with visual inspection, and through quantitative analysis, using metrics such as success rate and success rate weighted by path length. Policies learned by the proposed architecture were compared with agents using random policies, agents using only reinforcement learning, and, finally, agents with semantic navigation policies without memory. The experiments showed more exploratory behavior from the proposed architecture than from the non-memory approaches, with better results on both metrics. When exposed to more restrictive, and consequently more difficult, scenarios, the policies learned by these models achieved better results, with a smaller drop in performance relative to the less restrictive runs than the other models showed.
Thus, the proposed model presented consistent results, with better policies learned by the agents, resulting in more successful behavior in the task of target-object search in indoor home environments.
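
The two quantitative metrics named in the abstract can be illustrated with a minimal sketch. It assumes the definition of success weighted by path length (SPL) commonly used in the embodied-navigation literature, in which each successful episode contributes the ratio of the shortest-path length to the longer of that length and the path actually taken; the dissertation may define the metric differently. The function names and the episode values below are hypothetical and serve only as an illustration.

```python
from typing import Sequence


def success_rate(successes: Sequence[bool]) -> float:
    """Fraction of episodes in which the agent reached the target object."""
    if not successes:
        raise ValueError("at least one episode is required")
    return sum(successes) / len(successes)


def spl(
    successes: Sequence[bool],
    shortest_lengths: Sequence[float],
    path_lengths: Sequence[float],
) -> float:
    """Success weighted by path length, assuming the commonly used definition:
    mean over episodes of S_i * l_i / max(p_i, l_i), where l_i is the
    shortest-path length and p_i is the length of the path the agent took."""
    if not (len(successes) == len(shortest_lengths) == len(path_lengths)):
        raise ValueError("all inputs must have the same number of episodes")
    if not successes:
        raise ValueError("at least one episode is required")
    total = 0.0
    for success, shortest, taken in zip(successes, shortest_lengths, path_lengths):
        if shortest <= 0:
            raise ValueError("shortest-path lengths must be positive")
        if success:
            # Failed episodes contribute zero; successful ones are
            # discounted by how much longer the agent's path was.
            total += shortest / max(taken, shortest)
    return total / len(successes)


# Hypothetical episodes: one optimal success, one success with a path twice
# as long as the shortest one, and one failure.
episode_success = [True, True, False]
episode_shortest = [4.0, 5.0, 3.0]
episode_taken = [4.0, 10.0, 6.0]

print(success_rate(episode_success))
print(spl(episode_success, episode_shortest, episode_taken))
```

Under this definition SPL never exceeds the success rate, so the two numbers together show both how often the agent finds the target and how efficient its paths are.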


Recommended citation: SANTOS, Iury Batista de Andrade. “Aprendizado por reforço profundo para navegação visual semântica com memória”. 2020. Dissertação (Mestrado em Ciências de Computação e Matemática Computacional) - Instituto de Ciências Matemáticas e de Computação, Universidade de São Paulo, São Carlos, 2020. doi:10.11606/D.55.2020.tde-16122020-164714.