During training, randomly zeroes some of the elements of the input tensor with probability p.
The zeroed elements are chosen independently for each forward call and are sampled from a Bernoulli distribution.
Each channel will be zeroed out independently on every forward call.
This has proven to be an effective technique for regularization and preventing the co-adaptation of neurons, as described in the paper Improving neural networks by preventing co-adaptation of feature detectors.
Furthermore, the outputs are scaled by a factor of $\frac{1}{1-p}$ during training. This means that during evaluation the module simply computes an identity function.
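As a minimal sketch of this train/eval behaviour (p = 0.5 and an input of ones are arbitrary choices, picked only so the scaling is easy to see):

>>> m = nn.Dropout(p=0.5)
>>> x = torch.ones(8)
>>> m.train()
>>> m(x)  # surviving entries equal 1 / (1 - 0.5) = 2.0; zeroed entries are 0
>>> m.eval()
>>> m(x)  # identity: equal to x during evaluation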
Input: $(*)$. Input can be of any shape
Output: $(*)$. Output is of the same shape as input
Examples:
>>> m = nn.Dropout(p=0.2)
>>> input = torch.randn(20, 16)
>>> output = m(input)
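To make the Bernoulli sampling described above concrete, here is a hypothetical hand-rolled equivalent (a sketch for illustration only; manual_dropout is not part of the library):

>>> def manual_dropout(x, p=0.2, training=True):
...     # sample an independent Bernoulli keep-mask (each element kept with
...     # probability 1 - p) and rescale survivors by 1/(1-p), matching the
...     # train-mode behaviour documented above
...     if not training or p == 0.0:
...         return x
...     keep = torch.bernoulli(torch.full_like(x, 1.0 - p))
...     return x * keep / (1.0 - p)
...
>>> out = manual_dropout(torch.randn(20, 16), p=0.2)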