- Why does MADDPG use action log prob for Q (Critic) instead of sampled action? https://arxiv.org/pdf/1706.02275.pdf 4 comments reinforcementlearning
Linking pages
- GitHub - openai/multiagent-particle-envs: Code for a multi-agent particle environment used in the paper "Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments" https://github.com/openai/multiagent-particle-envs 4 comments
- Model-Based RL for Decentralized Multi-agent Navigation – Google AI Blog https://ai.googleblog.com/2021/04/model-based-rl-for-decentralized-multi.html 0 comments
- GitHub - openai/maddpg: Code for the MADDPG algorithm from the paper "Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments" https://github.com/openai/maddpg/ 0 comments
- GitHub - opendilab/DI-engine: OpenDILab Decision AI Engine https://github.com/opendilab/DI-engine 0 comments