pinocchio  2.7.1
A fast and flexible implementation of Rigid Body Dynamics algorithms and their analytical derivatives
qtable Namespace Reference

Functions

def rendertrial (maxiter=100)
 

Variables

float DECAY_RATE = 0.99
 
 env = DPendulum()
 — Environment
 
list h_rwd = []
 
float LEARNING_RATE = 0.85
 
int NEPISODES = 500
 — Hyperparameters
 
int NSTEPS = 50
 
 NU = env.nu
 
 NX = env.nx
 
 Q = np.zeros([env.nx,env.nu])
 
float Qref = reward + DECAY_RATE*np.max(Q[x2,:])
 
 RANDOM_SEED = int((time.time()%10)*1000)
 — Random seed
 
 reward
 
float rsum = 0.0
 
 u = np.argmax(Q[x,:] + np.random.randn(1,NU)/episode)
 
 x = env.reset()
 
 x2
 

Detailed Description

Example of Q-table learning with a simple discretized 1-pendulum environment.
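
As a rough illustration of how the variables listed above fit together, the sketch below runs a Q-table learning loop. The expressions for `u` and `Qref` are the ones shown in the Variables list. Everything else is an assumption made for the example: `ToyChain` is a hypothetical stand-in for `DPendulum`, `train` is a hypothetical wrapper, `step(u)` is assumed to return `(x2, reward)`, the table update is the standard tabular Q-learning rule rather than something stated on this page, and the episode is assumed to end when the reward equals 1.

```python
import numpy as np


class ToyChain:
    """Hypothetical stand-in for DPendulum: a 1-D chain of discrete states."""

    def __init__(self, nx=6, nu=2, seed=0):
        self.nx, self.nu = nx, nu
        self.rng = np.random.default_rng(seed)
        self.x = 0

    def reset(self):
        # Start in any state except the goal (the last state).
        self.x = int(self.rng.integers(0, self.nx - 1))
        return self.x

    def step(self, u):
        # u == 1 moves right, any other u moves left; reward 1 at the goal.
        if u == 1:
            self.x = min(self.x + 1, self.nx - 1)
        else:
            self.x = max(self.x - 1, 0)
        reward = 1.0 if self.x == self.nx - 1 else 0.0
        return self.x, reward


def train(env, nepisodes=500, nsteps=50, learning_rate=0.85,
          decay_rate=0.99, seed=0):
    """Hypothetical helper: tabular Q-learning with decaying exploration noise."""
    np.random.seed(seed)
    Q = np.zeros([env.nx, env.nu])   # Q-table initialized to zero
    h_rwd = []                       # total reward collected in each episode
    # Episodes start at 1 (assumption) so that the noise term never divides by 0.
    for episode in range(1, nepisodes):
        x = env.reset()
        rsum = 0.0
        for _ in range(nsteps):
            # Greedy action on the noisy Q-row; the noise shrinks as 1/episode.
            u = int(np.argmax(Q[x, :] + np.random.randn(1, env.nu) / episode))
            x2, reward = env.step(u)
            # Reference value: immediate reward plus discounted best next value.
            Qref = reward + decay_rate * np.max(Q[x2, :])
            # Standard Q-learning update (assumed): move Q[x, u] toward Qref.
            Q[x, u] += learning_rate * (Qref - Q[x, u])
            x = x2
            rsum += reward
            if reward == 1:          # assumed termination condition
                break
        h_rwd.append(rsum)
    return Q, h_rwd


env = ToyChain()
Q, h_rwd = train(env)
```

On this toy chain the learned table should prefer `u = 1` (move toward the goal) in every non-goal state. With the real `DPendulum` environment the same loop structure would apply, but its `step` signature and reward convention should be checked in its own source.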

Function Documentation

rendertrial()

def qtable.rendertrial (   maxiter = 100)
Roll out from a random state using the greedy policy, for at most maxiter steps.

Definition at line 29 of file qtable.py.
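
A greedy roll-out of this kind can be sketched as follows. This is not the body of `rendertrial`; it is a minimal illustration under stated assumptions: `ToyChain` is a hypothetical stand-in for `DPendulum`, `greedy_rollout` is a hypothetical name, `step(u)` is assumed to return `(x2, reward)`, the loop is assumed to stop when the reward equals 1, and rendering is left out because the environment's display call is not documented on this page.

```python
import numpy as np


class ToyChain:
    """Hypothetical stand-in for DPendulum: a 1-D chain of discrete states."""

    def __init__(self, nx=6, nu=2, seed=0):
        self.nx, self.nu = nx, nu
        self.rng = np.random.default_rng(seed)
        self.x = 0

    def reset(self):
        # Random non-goal start state.
        self.x = int(self.rng.integers(0, self.nx - 1))
        return self.x

    def step(self, u):
        # u == 1 moves right, any other u moves left; reward 1 at the goal.
        if u == 1:
            self.x = min(self.x + 1, self.nx - 1)
        else:
            self.x = max(self.x - 1, 0)
        return self.x, (1.0 if self.x == self.nx - 1 else 0.0)


def greedy_rollout(env, Q, maxiter=100):
    """Follow argmax(Q[x, :]) from a random state; return the visited states."""
    x = env.reset()
    path = [x]
    for _ in range(maxiter):
        u = int(np.argmax(Q[x, :]))   # purely greedy: no exploration noise
        x, reward = env.step(u)
        path.append(x)
        if reward == 1:               # assumed stopping condition
            break
    return path


env = ToyChain()
Q = np.zeros([env.nx, env.nu])
Q[:, 1] = 1.0                         # hand-built table that always prefers u = 1
path = greedy_rollout(env, Q)
```

Unlike the training loop, the roll-out adds no noise to the Q-row, so it shows what the current table has actually learned.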