Last active
May 2, 2018 18:55
-
-
Save awjuliani/86ae316a231bceb96a3e2ab3ac8e646a to your computer and use it in GitHub Desktop.
Reinforcement Learning Tutorial 2 (Cart Pole problem)
log(P(y|x)) = (1-input_y)log(probability) + input_ylog(1-probability) , and the loss function in above code happened to the same result as this maximum likelihood.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
tGrad,bggg = sess.run([newGrads,begining],feed_dict={observations: epx, input_y: epy, advantages: discounted_epr})The variable
beginingdid not defined.I change the code like this.
tGrad = sess.run(newGrads, feed_dict={observations: epx, input_y: epy, advantages: discounted_epr})And it works well.