Nikhil Chowdary Paleti
2023
Improving Reinfocement Learning Agent Training using Text based Guidance: A study using Commands in Dravidian Languages
Nikhil Chowdary Paleti
|
Sai Aravind Vadlapudi
|
Sai Aashish Menta
|
Sai Akshay Menta
|
Vishnu Vardhan Gorantla V N S L
|
Janakiram Chandu
|
Soman K P
|
Sachin Kumar S
Proceedings of the Third Workshop on Speech and Language Technologies for Dravidian Languages
Reinforcement learning (RL) agents have achieved remarkable success in various domains, such as game-playing and protein structure prediction. However, most RL agents rely on exploration to find optimal solutions without explicit guidance. This paper proposes a methodology for training RL agents using text-based instructions in Dravidian Languages, including Telugu, Tamil, and Malayalam along with using the English language. The agents are trained in a modified Lunar Lander environment, where they must follow specific paths to successfully land the lander. The methodology involves collecting a dataset of human demonstrations and textual instructions, encoding the instructions into numerical representations using text-based embeddings, and training RL agents using state-of-the-art algorithms. The results demonstrate that the trained Soft Actor-Critic (SAC) agent can effectively understand and generalize instructions in different languages, outperforming other RL algorithms such as Proximal Policy Optimization (PPO) and Deep Deterministic Policy Gradient (DDPG).
Search