DSpace at KIST: Learning disentangled skills for hierarchical reinforcement learning through trajectory autoencoder with weak labels

Full metadata record

DC Field	Value	Language
dc.contributor.author	Song, Wonil	-
dc.contributor.author	Jeon, Sangryul	-
dc.contributor.author	Choi, Hyesong	-
dc.contributor.author	Sohn, Kwanghoon	-
dc.contributor.author	Min, Dongbo	-
dc.date.accessioned	2024-01-19T08:30:54Z	-
dc.date.available	2024-01-19T08:30:54Z	-
dc.date.created	2023-09-07	-
dc.date.issued	2023-11	-
dc.identifier.issn	0957-4174	-
dc.identifier.uri	https://pubs.kist.re.kr/handle/201004/113150	-
dc.description.abstract	Typically, hierarchical reinforcement learning (RL) requires skills that are applicable to various downstream tasks. Although several recent studies have proposed the supervised and unsupervised learning of such skills, the learned skills are often entangled, which hinders their interpretation. To alleviate this, we propose a novel method to use weak labels for learning disentangled skills from the continuous latent representations of trajectories. To this end, we extended a trajectory variational autoencoder (VAE) to impose an inductive bias using weak labels, which explicitly enforces the disentangling of the trajectory representations into factors of interest intended for the model to learn. Using the latent representations as skills, a skill-based policy network is trained to generate trajectories similar to the learned decoder of the trajectory VAE. Furthermore, using the disentangled skill, we propose a skill repetition that can expand the entire trajectories generated by the policy at test time, resulting in an effective planning strategy. Experiments were performed on several challenging navigation tasks in mazes, and the results demonstrate the effectiveness of our method at solving hierarchical RL problems even with a long horizon and sparse rewards.	-
dc.language	English	-
dc.publisher	Pergamon Press Ltd.	-
dc.title	Learning disentangled skills for hierarchical reinforcement learning through trajectory autoencoder with weak labels	-
dc.type	Article	-
dc.identifier.doi	10.1016/j.eswa.2023.120625	-
dc.description.journalClass	1	-
dc.identifier.bibliographicCitation	Expert Systems with Applications, v.230	-
dc.citation.title	Expert Systems with Applications	-
dc.citation.volume	230	-
dc.description.isOpenAccess	N	-
dc.description.journalRegisteredClass	scie	-
dc.description.journalRegisteredClass	scopus	-
dc.identifier.wosid	001054200000001	-
dc.identifier.scopusid	2-s2.0-85161724253	-
dc.relation.journalWebOfScienceCategory	Computer Science, Artificial Intelligence	-
dc.relation.journalWebOfScienceCategory	Engineering, Electrical & Electronic	-
dc.relation.journalWebOfScienceCategory	Operations Research & Management Science	-
dc.relation.journalResearchArea	Computer Science	-
dc.relation.journalResearchArea	Engineering	-
dc.relation.journalResearchArea	Operations Research & Management Science	-
dc.type.docType	Article	-
dc.subject.keywordPlus	EMERGENCY-DEPARTMENT VISITS	-
dc.subject.keywordPlus	DEEP	-
dc.subject.keywordAuthor	Deep reinforcement learning	-
dc.subject.keywordAuthor	Hierarchical reinforcement learning	-
dc.subject.keywordAuthor	Skill learning	-
dc.subject.keywordAuthor	Variational autoencoder	-
dc.subject.keywordAuthor	Disentangled representation	-
dc.subject.keywordAuthor	Weak label	-
dc.subject.keywordAuthor	Planning	-

Appears in Collections: KIST Article > 2023

KIST Library Institutional Repository
