Distributions

Distributions for Stochastic Policy in RLzoo

Definition of parametrized distributions. Adapted from openai/baselines.

class rlzoo.common.distributions.Categorical(ndim, logits=None)[source]

Bases: rlzoo.common.distributions.Distribution

Creates a categorical distribution

entropy()[source]

Calculate the entropy of the distribution.
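As an illustration of what the entropy of a categorical distribution over logits amounts to, here is a minimal pure-Python sketch. The function name `categorical_entropy` and the list-based interface are assumptions made for this example. They are not part of RLzoo, whose implementation operates on tensors.

```python
import math

def categorical_entropy(logits):
    """Entropy of softmax(logits): H = -sum_i p_i * log(p_i)."""
    m = max(logits)  # subtract the max so exp() cannot overflow
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    log_p = [l - log_z for l in logits]  # log-softmax
    return -sum(math.exp(lp) * lp for lp in log_p)
```

Uniform logits give the maximum entropy, log(n). A sharply peaked logit vector gives an entropy close to zero.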

get_param()[source]

Get parameters

greedy_sample()[source]

Get actions greedily

kl(logits)[source]

Args:
    logits (tensor): logits variables of another distribution

logp(x)[source]

Calculate log probability of a sample.
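For a categorical distribution parametrized by logits, the log probability of a class index is the logit of that class minus the log-sum-exp of all logits. The sketch below shows this with hypothetical helper names (`categorical_logp`, `categorical_neglogp`) and plain Python lists. It is an illustration of the math, not the RLzoo code.

```python
import math

def categorical_logp(logits, x):
    """log p(x) = logits[x] - logsumexp(logits) for a class index x."""
    m = max(logits)  # stabilise the exponentials
    lse = m + math.log(sum(math.exp(l - m) for l in logits))
    return logits[x] - lse

def categorical_neglogp(logits, x):
    """Negative log probability, the usual policy-gradient loss term."""
    return -categorical_logp(logits, x)
```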

ndim

neglogp(x)[source]

Calculate negative log probability of a sample.

sample()[source]

Sample actions from distribution, using the Gumbel-Softmax trick
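A common way to sample a discrete action from logits with Gumbel noise is to add independent Gumbel(0, 1) noise to each logit and take the argmax, a form often called the Gumbel-max trick. The sketch below assumes that argmax form. Whether RLzoo uses exactly this variant is not confirmed by this page, and the names `gumbel_max_sample` and `greedy_action` are hypothetical.

```python
import math
import random

def gumbel_max_sample(logits, rng=random):
    """Return index i with probability softmax(logits)[i]."""
    noisy = []
    for l in logits:
        u = rng.random()
        while u == 0.0:  # avoid log(0); random() can return exactly 0.0
            u = rng.random()
        gumbel = -math.log(-math.log(u))  # Gumbel(0, 1) noise
        noisy.append(l + gumbel)
    return max(range(len(noisy)), key=noisy.__getitem__)

def greedy_action(logits):
    """Deterministic counterpart: pick the largest logit."""
    return max(range(len(logits)), key=logits.__getitem__)
```

Passing a seeded `random.Random` instance as `rng` makes the draws reproducible.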

set_param(logits)[source]

Args:
    logits (tensor): logits variables to set

class rlzoo.common.distributions.DiagGaussian(ndim, mean_logstd=None)[source]

Bases: rlzoo.common.distributions.Distribution

Creates a diagonal Gaussian distribution

entropy()[source]

Calculate the entropy of the distribution.
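The entropy of a diagonal Gaussian depends only on the log standard deviations: each dimension contributes logstd + 0.5 * log(2 * pi * e). A minimal sketch, assuming the log standard deviations are given as a plain list (the function name is hypothetical):

```python
import math

def diag_gaussian_entropy(logstd):
    """H = sum_i (logstd_i + 0.5 * log(2 * pi * e))."""
    const = 0.5 * math.log(2.0 * math.pi * math.e)
    return sum(ls + const for ls in logstd)
```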

get_param()[source]

Get parameters

greedy_sample()[source]

Get actions greedily/deterministically

kl(mean_logstd)[source]

Args:
    mean_logstd (tensor): mean and logstd of another distribution

logp(x)[source]

Calculate log probability of a sample.
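For a diagonal Gaussian, the log density of a sample factorises over dimensions. The sketch below takes the mean and log standard deviation as separate lists, which is an assumption made for readability. How RLzoo packs them into the single `mean_logstd` tensor is not described on this page.

```python
import math

def diag_gaussian_logp(x, mean, logstd):
    """Sum over dimensions of log N(x_i; mean_i, exp(logstd_i)^2)."""
    total = 0.0
    for xi, m, ls in zip(x, mean, logstd):
        z = (xi - m) / math.exp(ls)  # standardised residual
        total += -0.5 * z * z - ls - 0.5 * math.log(2.0 * math.pi)
    return total

def diag_gaussian_neglogp(x, mean, logstd):
    """Negative log density of the same sample."""
    return -diag_gaussian_logp(x, mean, logstd)
```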

ndim

neglogp(x)[source]

Calculate negative log probability of a sample.

sample()[source]

Get actions in deterministic or stochastic manner
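Stochastic sampling from a diagonal Gaussian is commonly written as the mean plus scaled standard normal noise, and the deterministic choice is the mean itself (the mode of the distribution). The sketch below assumes that convention. The function names and the separate `mean` and `logstd` arguments are illustrative, not RLzoo's interface.

```python
import math
import random

def diag_gaussian_sample(mean, logstd, rng=random):
    """Stochastic action: mean_i + exp(logstd_i) * eps_i, eps_i ~ N(0, 1)."""
    return [m + math.exp(ls) * rng.gauss(0.0, 1.0)
            for m, ls in zip(mean, logstd)]

def diag_gaussian_greedy(mean, logstd):
    """Deterministic action: the mean, which is also the mode."""
    return list(mean)
```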

set_param(mean_logstd)[source]

Args:
    mean_logstd (tensor): mean and log std

class rlzoo.common.distributions.Distribution[source]

Bases: object

A particular probability distribution

entropy()[source]

Calculate the entropy of the distribution.

kl(*parameters)[source]

Calculate Kullback–Leibler divergence
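Both concrete distribution types on this page have closed-form KL divergences. The sketch below computes KL(p || q) for two categorical distributions given as logits and for two diagonal Gaussians given as mean and log standard deviation. The direction (current distribution as p, the argument as q), the function names, and the list-based arguments are all assumptions for illustration.

```python
import math

def kl_categorical(logits_p, logits_q):
    """KL(p || q) = sum_i p_i * (log p_i - log q_i)."""
    def log_softmax(logits):
        m = max(logits)
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        return [l - log_z for l in logits]
    log_p, log_q = log_softmax(logits_p), log_softmax(logits_q)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))

def kl_diag_gaussian(mean_p, logstd_p, mean_q, logstd_q):
    """Per-dimension closed form, summed over dimensions."""
    total = 0.0
    for mp, lsp, mq, lsq in zip(mean_p, logstd_p, mean_q, logstd_q):
        var_p, var_q = math.exp(2.0 * lsp), math.exp(2.0 * lsq)
        total += lsq - lsp + (var_p + (mp - mq) ** 2) / (2.0 * var_q) - 0.5
    return total
```

The divergence is zero when both arguments describe the same distribution, and it is not symmetric in p and q.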

logp(x)[source]

Calculate log probability of a sample.

neglogp(x)[source]

Calculate negative log probability of a sample.

sample(*args, **kwargs)[source]

Sample from the distribution. Exploration parameters may be passed as arguments.

set_param(*args, **kwargs)[source]

rlzoo.common.distributions.expand_dims(func)[source]

rlzoo.common.distributions.make_dist(ac_space)[source]

Get distribution based on action space

Parameters: ac_space – gym.spaces.Space
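One plausible reading of "get distribution based on action space" is a dispatch on the space type: a discrete space maps to a categorical distribution and a one-dimensional continuous box maps to a diagonal Gaussian. That mapping is an assumption and is not confirmed by this page. The sketch below uses hypothetical stand-in classes instead of `gym.spaces` so that it runs without gym installed, and it returns a description tuple rather than a real distribution object.

```python
class Discrete:
    """Hypothetical stand-in for gym.spaces.Discrete."""
    def __init__(self, n):
        self.n = n

class Box:
    """Hypothetical stand-in for gym.spaces.Box (shape only)."""
    def __init__(self, shape):
        self.shape = tuple(shape)

def describe_dist(ac_space):
    """Return (distribution name, ndim) for an action space (illustrative)."""
    if isinstance(ac_space, Discrete):
        return ("Categorical", ac_space.n)
    if isinstance(ac_space, Box):
        if len(ac_space.shape) != 1:
            raise ValueError("this sketch only handles 1-D Box spaces")
        return ("DiagGaussian", ac_space.shape[0])
    raise NotImplementedError("unsupported action space: %r" % (ac_space,))
```

With real gym spaces, the equivalent call on this module would be `make_dist(env.action_space)`, where `env` is an environment created elsewhere.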