Sai Akshay Menta

2023

Improving Reinforcement Learning Agent Training using Text based Guidance: A study using Commands in Dravidian Languages
Nikhil Chowdary Paleti | Sai Aravind Vadlapudi | Sai Aashish Menta | Sai Akshay Menta | Vishnu Vardhan Gorantla V N S L | Janakiram Chandu | Soman K P | Sachin Kumar S
Proceedings of the Third Workshop on Speech and Language Technologies for Dravidian Languages

Reinforcement learning (RL) agents have achieved remarkable success in various domains, such as game-playing and protein structure prediction. However, most RL agents rely on exploration to find optimal solutions without explicit guidance. This paper proposes a methodology for training RL agents using text-based instructions in Dravidian languages (Telugu, Tamil, and Malayalam) as well as English. The agents are trained in a modified Lunar Lander environment, where they must follow specific paths to successfully land the lander. The methodology involves collecting a dataset of human demonstrations and textual instructions, encoding the instructions into numerical representations using text-based embeddings, and training RL agents using state-of-the-art algorithms. The results demonstrate that the trained Soft Actor-Critic (SAC) agent can effectively understand and generalize instructions in different languages, outperforming other RL algorithms such as Proximal Policy Optimization (PPO) and Deep Deterministic Policy Gradient (DDPG).

Co-authors

Sachin Kumar S 1

Sai Aravind Vadlapudi 1

Venues
