Learning to Run with Actor-Critic Ensemble

Huang, Zhewei; Zhou, Shuchang; Zhuang, BoEr; Zhou, Xinyu

arXiv:1712.08987 (cs)

[Submitted on 25 Dec 2017]

Abstract: We introduce an Actor-Critic Ensemble (ACE) method for improving the performance of the Deep Deterministic Policy Gradient (DDPG) algorithm. At inference time, our method uses a critic ensemble to select the best action from the proposals of multiple actors running in parallel. By having a larger candidate set, our method can avoid actions that have fatal consequences while staying deterministic. Using ACE, we won 2nd place in the NIPS'17 Learning to Run competition under the name "Megvii-hzwer".
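
The inference-time selection described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `ace_select_action`, the toy actors and critics, and the choice to combine critic scores by averaging are all assumptions made for the example. In practice the actors and critics would be trained DDPG networks.

```python
from statistics import mean
from typing import Callable, List, Sequence, Tuple

State = Sequence[float]
Action = Sequence[float]
Actor = Callable[[State], Action]            # deterministic policy: state -> action
Critic = Callable[[State, Action], float]    # Q-value estimate: (state, action) -> score


def ace_select_action(
    state: State, actors: Sequence[Actor], critics: Sequence[Critic]
) -> Tuple[Action, List[float]]:
    """Pick the actor proposal with the highest mean critic score.

    Averaging the critics is an assumption of this sketch; the abstract
    only says that a critic ensemble selects the best proposal.
    """
    # Each actor proposes one candidate action for the current state.
    proposals = [actor(state) for actor in actors]
    # Every critic scores every proposal; scores are averaged per proposal.
    scores = [mean(critic(state, a) for critic in critics) for a in proposals]
    # max() returns the first maximum, so ties are broken deterministically.
    best = max(range(len(proposals)), key=scores.__getitem__)
    return proposals[best], scores


# Hypothetical stand-ins for trained networks (one-dimensional actions).
actors = [
    lambda s: [0.0],
    lambda s: [0.5],
    lambda s: [1.0],
]
critics = [
    lambda s, a: -(a[0] - 0.4) ** 2,
    lambda s, a: -(a[0] - 0.6) ** 2,
]

action, scores = ace_select_action([0.0], actors, critics)
print(action)  # the proposal the critic ensemble rates highest
```

Because every actor and critic is a deterministic function of the state, the selected action is deterministic too, which matches the property the abstract highlights. A larger set of actors widens the candidate pool without adding exploration noise at inference time.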

Comments:	3 pages, 4 figures
Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:1712.08987 [cs.LG]
	(or arXiv:1712.08987v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1712.08987

Submission history

From: Zhewei Huang
[v1] Mon, 25 Dec 2017 02:03:12 UTC (213 KB)
