Doc styling (#8067)
* Important files
* Styling them all
* Revert "Styling them all"
  This reverts commit 7d029395fdae8513b8281cbc2a6c239f8093503e.
* Styling them for realsies
* Fix syntax error
* Fix benchmark_utils
* More fixes
* Fix modeling auto and script
* Remove new line
* Fixes
* More fixes
* Fix more files
* Style
* Add FSMT
* More fixes
* More fixes
* More fixes
* More fixes
* Fixes
* More fixes
* More fixes
* Last fixes
* Make sphinx happy
@@ -35,7 +35,8 @@ logger = logging.get_logger(__name__)

 @jax.jit
 def gelu(x):
-    r"""Gaussian error linear unit activation function.
+    r"""
+    Gaussian error linear unit activation function.

     Computes the element-wise function:

@@ -43,9 +44,8 @@ def gelu(x):
       \mathrm{gelu}(x) = \frac{x}{2} \left(1 + \mathrm{tanh} \left(
         \sqrt{\frac{2}{\pi}} \left(x + 0.044715 x^3 \right) \right) \right)

-    We explicitly use the approximation rather than the exact formulation for
-    speed. For more information, see `Gaussian Error Linear Units (GELUs)
-    <https://arxiv.org/abs/1606.08415>`_, section 2.
+    We explicitly use the approximation rather than the exact formulation for speed. For more information, see
+    `Gaussian Error Linear Units (GELUs) <https://arxiv.org/abs/1606.08415>`_, section 2.
     """
     return x * 0.5 * (1.0 + jax.lax.erf(x / jnp.sqrt(2.0)))

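Note on the hunk above: the docstring formula is the tanh approximation of GELU, while the `return` line shown in the diff evaluates the erf form. The sketch below puts the two expressions side by side on scalars so the gap between them can be measured. It is an illustration only, not the library's code: the names `gelu_exact` and `gelu_tanh` are hypothetical, and plain Python's `math` module stands in for `jax.lax.erf` and `jnp` so the snippet runs without extra dependencies.

```python
import math


def gelu_exact(x: float) -> float:
    """Erf form of GELU, mirroring the expression on the diff's return line."""
    return x * 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x: float) -> float:
    """Tanh approximation of GELU, following the formula in the docstring."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


# Largest absolute difference between the two forms on a grid over [-5, 5].
grid = [i / 100.0 for i in range(-500, 501)]
max_gap = max(abs(gelu_exact(x) - gelu_tanh(x)) for x in grid)

if __name__ == "__main__":
    print(f"max |exact - tanh| on [-5, 5]: {max_gap:.2e}")
```

Running the script prints the measured gap, which gives a rough sense of how much the docstring's description and the returned expression can differ numerically.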