Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Song, Wonil | - |
dc.contributor.author | Jeon, Sangryul | - |
dc.contributor.author | Choi, Hyesong | - |
dc.contributor.author | Sohn, Kwanghoon | - |
dc.contributor.author | Min, Dongbo | - |
dc.date.accessioned | 2024-01-19T08:30:54Z | - |
dc.date.available | 2024-01-19T08:30:54Z | - |
dc.date.created | 2023-09-07 | - |
dc.date.issued | 2023-11 | - |
dc.identifier.issn | 0957-4174 | - |
dc.identifier.uri | https://pubs.kist.re.kr/handle/201004/113150 | - |
dc.description.abstract | Typically, hierarchical reinforcement learning (RL) requires skills that are applicable to various downstream tasks. Although several recent studies have proposed the supervised and unsupervised learning of such skills, the learned skills are often entangled, which hinders their interpretation. To alleviate this, we propose a novel method that uses weak labels to learn disentangled skills from the continuous latent representations of trajectories. To this end, we extend a trajectory variational autoencoder (VAE) to impose an inductive bias using weak labels, which explicitly enforces the disentangling of the trajectory representations into the factors of interest that the model is intended to learn. Using the latent representations as skills, a skill-based policy network is trained to generate trajectories similar to those produced by the learned decoder of the trajectory VAE. Furthermore, using the disentangled skills, we propose a skill repetition that can expand the entire trajectories generated by the policy at test time, resulting in an effective planning strategy. Experiments were performed on several challenging navigation tasks in mazes, and the results demonstrate the effectiveness of our method at solving hierarchical RL problems even with a long horizon and sparse rewards. | - |
dc.language | English | - |
dc.publisher | Pergamon Press Ltd. | - |
dc.title | Learning disentangled skills for hierarchical reinforcement learning through trajectory autoencoder with weak labels | - |
dc.type | Article | - |
dc.identifier.doi | 10.1016/j.eswa.2023.120625 | - |
dc.description.journalClass | 1 | - |
dc.identifier.bibliographicCitation | Expert Systems with Applications, v.230 | - |
dc.citation.title | Expert Systems with Applications | - |
dc.citation.volume | 230 | - |
dc.description.isOpenAccess | N | - |
dc.description.journalRegisteredClass | scie | - |
dc.description.journalRegisteredClass | scopus | - |
dc.identifier.wosid | 001054200000001 | - |
dc.identifier.scopusid | 2-s2.0-85161724253 | - |
dc.relation.journalWebOfScienceCategory | Computer Science, Artificial Intelligence | - |
dc.relation.journalWebOfScienceCategory | Engineering, Electrical & Electronic | - |
dc.relation.journalWebOfScienceCategory | Operations Research & Management Science | - |
dc.relation.journalResearchArea | Computer Science | - |
dc.relation.journalResearchArea | Engineering | - |
dc.relation.journalResearchArea | Operations Research & Management Science | - |
dc.type.docType | Article | - |
dc.subject.keywordPlus | EMERGENCY-DEPARTMENT VISITS | - |
dc.subject.keywordPlus | DEEP | - |
dc.subject.keywordAuthor | Deep reinforcement learning | - |
dc.subject.keywordAuthor | Hierarchical reinforcement learning | - |
dc.subject.keywordAuthor | Skill learning | - |
dc.subject.keywordAuthor | Variational autoencoder | - |
dc.subject.keywordAuthor | Disentangled representation | - |
dc.subject.keywordAuthor | Weak label | - |
dc.subject.keywordAuthor | Planning | - |