Frontier Research Overview
Cognitive abilities enable robots to work in uncontrolled environments and alongside humans.
We study and design cognitive systems that enable a robot to behave robustly in a variety of tasks and environments. The research topics include (i) Meta-learning, (ii) Collaborative agents, (iii) Representation and Reasoning, (iv) Reinforcement Learning, and (v) Explainable AI (XAI).
We study the problem of creating agents that can cooperate with other agents even though they have not been trained with each other before. We found that better diversification during training gives the agent better generalisation ability. We also found that cooperative skills can be learned via meta-reinforcement learning.
Generating Diverse Cooperative Agents by Learning Incompatible Policies 
In this work, we propose to learn diverse behaviors via policy compatibility. Conceptually, policy compatibility measures whether policies of interest can coordinate effectively. We theoretically show that incompatible policies are not similar. Thus, policy compatibility, which has been used exclusively as a measure of robustness, can be used as a proxy for learning diverse behaviors. We then incorporate the proposed objective into a population-based training scheme to allow concurrent training of multiple agents. Additionally, we use state-action information to induce local variations of each policy. Empirically, the proposed method, LIPO (Learning Incompatible Policies), consistently discovers more solutions than baseline methods across various multi-goal cooperative environments. In multi-recipe Overcooked, we show that our method produces populations of behaviorally diverse agents, which enables generalist agents trained with such a population to be more robust. Finally, in high-dimensional complex SMAC (StarCraft Multi-Agent Challenge) environments, LIPO learns diverse winning strategies.
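To make the notion of policy compatibility concrete, the following minimal sketch uses a toy two-goal coordination game rather than any environment from the paper. The payoff table, the policy values, and the function name `expected_return` are illustrative assumptions. The sketch only shows that two policies which each coordinate well with themselves can still coordinate poorly with each other.

```python
# Toy two-goal coordination game (an illustrative assumption, not from the paper):
# both agents receive reward 1 if they pick the same goal, otherwise 0.
PAYOFF = [[1.0, 0.0],
          [0.0, 1.0]]


def expected_return(policy_a, policy_b, payoff=PAYOFF):
    """Expected joint return when agent A follows policy_a and agent B follows policy_b.

    Each policy is a list of action probabilities over the two goals.
    """
    return sum(
        pa * pb * payoff[i][j]
        for i, pa in enumerate(policy_a)
        for j, pb in enumerate(policy_b)
    )


# Two hypothetical conventions: one prefers goal 0, the other prefers goal 1.
prefers_goal_0 = [0.9, 0.1]
prefers_goal_1 = [0.1, 0.9]

# Each policy paired with a copy of itself versus the two policies paired together.
self_play = expected_return(prefers_goal_0, prefers_goal_0)
cross_play = expected_return(prefers_goal_0, prefers_goal_1)

print(f"self-play return:  {self_play:.2f}")
print(f"cross-play return: {cross_play:.2f}")
```

In this toy game, a cross-play return well below the self-play return can only arise when the two policies place their probability mass on different goals, which is the intuition behind using incompatibility as a proxy for diversity.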
NeuroVis: Real-Time Neural Information Measurement and Visualization of Embodied Neural Systems 
We present, for the first time, a tool called “NeuroVis” that reads a robot's brain and shows what the robot is thinking. NeuroVis measures and visualizes neural spatial-temporal information in real time. It can measure temporal neural activities and their propagation throughout the network. Using this neural information together with connection strength and plasticity, NeuroVis can visualize neural structure (NS), neural dynamics (ND), neural plasticity (NP), and neural memory (NM). We have demonstrated the use of NeuroVis to analyze and visualize a “robot brain” while the robot is behaving. NeuroVis offers the opportunity to better understand embodied dynamic neural information processes, boost efficient neural technology development, and enhance user trust. It can be used as a tool for explainable AI and cognitive robotics research.
Learning to Cooperate with Unseen Agents Through Meta-Reinforcement Learning 
The ad hoc teamwork problem describes situations in which an agent has to cooperate with previously unseen agents to achieve a common goal. To succeed in these scenarios, an agent needs cooperative skills. One could build cooperative skills into an agent by using domain knowledge (e.g., goals, roles, and protocols) to design the agent’s behaviours. However, in complex domains, such domain knowledge might not be available. It is therefore worth exploring how to learn cooperative skills directly from data. In this work, we apply a meta-reinforcement learning (meta-RL) formulation to the ad hoc teamwork problem. Our experiments show that this method can produce cooperative agents in two cooperative environments with different cooperative circumstances.
Investigating Partner Diversification Methods in Cooperative Multi-agent Deep Reinforcement Learning 
Overfitting to learning partners is a known problem in multi-agent reinforcement learning (MARL), caused by the co-evolution of learning agents. Previous works mitigate this problem by explicitly adding diversity to the learning partners. However, since there are many approaches for introducing diversity, it is not clear which one should be used under what circumstances. In this work, we clarify the situation and reveal that widely used methods such as partner sampling and population-based training are unreliable at introducing diversity in fully cooperative multi-agent Markov decision processes. We find that generating pre-trained partners is a simple yet effective procedure for achieving diversity. Finally, we highlight the impact of diversified learning partners on the generalization of learning agents, using cross-play and ad hoc team performance as evaluation metrics.
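Cross-play evaluation can be sketched as follows: every agent in a population is paired with every other agent, and the average return of matched pairs (self-play) is compared with that of mismatched pairs (cross-play). The return matrix below is made-up toy data, and `cross_play_summary` is a hypothetical helper name, not something taken from the paper.

```python
from statistics import mean


def cross_play_summary(returns):
    """Summarize a square matrix where returns[i][j] is the return of agent i paired with agent j."""
    n = len(returns)
    # Diagonal entries: each agent paired with the partner it was trained with.
    self_play = mean(returns[i][i] for i in range(n))
    # Off-diagonal entries: each agent paired with a partner it has not seen.
    cross_play = mean(returns[i][j] for i in range(n) for j in range(n) if i != j)
    return {"self_play": self_play, "cross_play": cross_play, "gap": self_play - cross_play}


# Made-up returns for three agents, for illustration only.
toy_returns = [
    [1.0, 0.2, 0.4],
    [0.2, 1.0, 0.0],
    [0.4, 0.0, 1.0],
]

summary = cross_play_summary(toy_returns)
print(summary)
```

A large gap between the two averages would suggest that agents have overfitted to their training partners, which is the failure mode that partner diversification aims to reduce.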
For more details, see:
Srisuchinnawong, A., Homchanthanakul, J., Manoonpong, P. (2021) NeuroVis: Real-time Neural Information Measurement and Visualization of Embodied Neural Systems, Front. Neural Circuits. doi: 10.3389/fncir.2021.743101 (JIF = 3.492, SJR = 1.91, SCIE, Q1)
Charakorn, R., Manoonpong, P., Dilokthanakul, N. (2023) Generating Diverse Cooperative Agents by Learning Incompatible Policies, The Eleventh International Conference on Learning Representations (ICLR, A* conf.)