Hana's Blog
Tags:
#reinforce
Dec 18, 2025
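RL Notes (9): REINFORCE
From value to policy: a complete mathematical derivation of the Policy Gradient theorem, followed by an introduction to the most basic policy gradient algorithm, REINFORCE.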
7 min read
Reinforcement Learning
RL Notes
reinforce