diff --git a/README.md b/README.md
index a5234bd9ba..4c5b45ea2d 100644
--- a/README.md
+++ b/README.md
@@ -1116,22 +1116,22 @@ An overview of the implemented schedules:
 - `ConstantLR`: always returns learning rate 1.
 - `WarmupConstantSchedule`: Linearly increases learning rate from 0 to 1 over `warmup` fraction of training steps.
     Keeps learning rate equal to 1. after warmup.
-    ![](docs/imgs/warmup_constant_schedule.png)
+    ![](docs/source/imgs/warmup_constant_schedule.png)
 - `WarmupLinearSchedule`: Linearly increases learning rate from 0 to 1 over `warmup` fraction of training steps.
     Linearly decreases learning rate from 1. to 0. over remaining `1 - warmup` steps.
-    ![](docs/imgs/warmup_linear_schedule.png)
+    ![](docs/source/imgs/warmup_linear_schedule.png)
 - `WarmupCosineSchedule`: Linearly increases learning rate from 0 to 1 over `warmup` fraction of training steps.
     Decreases learning rate from 1. to 0. over remaining `1 - warmup` steps following a cosine curve.
     If `cycles` (default=0.5) is different from default, learning rate follows cosine function after warmup.
-    ![](docs/imgs/warmup_cosine_schedule.png)
+    ![](docs/source/imgs/warmup_cosine_schedule.png)
 - `WarmupCosineWithHardRestartsSchedule`: Linearly increases learning rate from 0 to 1 over `warmup` fraction of training steps.
     If `cycles` (default=1.) is different from default, learning rate follows `cycles` times a cosine decaying learning rate (with hard restarts).
-    ![](docs/imgs/warmup_cosine_hard_restarts_schedule.png)
+    ![](docs/source/imgs/warmup_cosine_hard_restarts_schedule.png)
 - `WarmupCosineWithWarmupRestartsSchedule`: All training progress is divided in `cycles` (default=1.) parts of equal length.
     Every part follows a schedule with the first `warmup` fraction of the training steps linearly increasing from 0. to 1.,
     followed by a learning rate decreasing from 1. to 0. following a cosine curve.
     Note that the total number of all warmup steps over all cycles together is equal to `warmup` * `cycles`
-    ![](docs/imgs/warmup_cosine_warm_restarts_schedule.png)
+    ![](docs/source/imgs/warmup_cosine_warm_restarts_schedule.png)
 ## Examples
diff --git a/docs/source/_static/css/code-snippets.css b/docs/source/_static/css/code-snippets.css
new file mode 100644
index 0000000000..4d525e95d7
--- /dev/null
+++ b/docs/source/_static/css/code-snippets.css
@@ -0,0 +1,12 @@
+
+.highlight .c1{
+    color: #999
+}
+
+.highlight .nn, .highlight .k, .highlight .s1, .highlight .nb, .highlight .bp {
+    color: #FB8D68;
+}
+
+.highlight .kn, .highlight .nv, .highlight .s2 {
+    color: #6670FF;
+}
\ No newline at end of file
diff --git a/docs/source/_static/css/huggingface.css b/docs/source/_static/css/huggingface.css
new file mode 100644
index 0000000000..f50726b57d
--- /dev/null
+++ b/docs/source/_static/css/huggingface.css
@@ -0,0 +1,144 @@
+/* The literal code blocks */
+.rst-content tt.literal, .rst-content tt.literal, .rst-content code.literal {
+    color: #6670FF;
+}
+
+/* To keep the logo centered */
+.wy-side-scroll {
+    width: auto;
+}
+
+/* The div that holds the Hugging Face logo */
+.HuggingFaceDiv {
+    width: 100%
+}
+
+/* The research field on top of the toc tree */
+.wy-side-nav-search{
+    background-color: #6670FF;
+}
+
+/* The toc tree */
+.wy-nav-side{
+    background-color: #6670FF;
+}
+
+/* The selected items in the toc tree */
+.wy-menu-vertical li.current{
+    background-color: #A6B0FF;
+}
+
+/* When a list item that does belong to the selected block from the toc tree is hovered */
+.wy-menu-vertical li.current a:hover{
+    background-color: #FB8D68;
+}
+
+/* When a list item that does NOT belong to the selected block from the toc tree is hovered. */
+.wy-menu-vertical li a:hover{
+    background-color: #FB8D68;
+}
+
+/* The text items on the toc tree */
+.wy-menu-vertical a {
+    color: #FFFFDD;
+    font-family: Calibre-Light;
+}
+.wy-menu-vertical header, .wy-menu-vertical p.caption{
+    color: white;
+    font-family: Calibre-Light;
+}
+
+/* The color inside the selected toc tree block */
+.wy-menu-vertical li.toctree-l2 a, .wy-menu-vertical li.toctree-l3 a, .wy-menu-vertical li.toctree-l4 a {
+    color: black;
+}
+
+/* Inside the depth-2 selected toc tree block */
+.wy-menu-vertical li.toctree-l2.current>a {
+    background-color: #B6C0FF
+}
+.wy-menu-vertical li.toctree-l2.current li.toctree-l3>a {
+    background-color: #C6D0FF
+}
+
+/* Inside the depth-3 selected toc tree block */
+.wy-menu-vertical li.toctree-l3.current li.toctree-l4>a{
+    background-color: #D6E0FF
+}
+
+/* Inside code snippets */
+.rst-content dl:not(.docutils) dt{
+    font-size: 15px;
+}
+
+/* Links */
+a {
+    color: #6670FF;
+}
+
+/* Content bars */
+.rst-content dl:not(.docutils) dt {
+    background-color: rgba(251, 141, 104, 0.1);
+    border-right: solid 2px #FB8D68;
+    border-left: solid 2px #FB8D68;
+    color: #FB8D68;
+    font-family: Calibre-Light;
+    border-top: none;
+    font-style: normal !important;
+}
+
+/* Expand button */
+.wy-menu-vertical li.toctree-l2 span.toctree-expand,
+.wy-menu-vertical li.on a span.toctree-expand, .wy-menu-vertical li.current>a span.toctree-expand,
+.wy-menu-vertical li.toctree-l3 span.toctree-expand{
+    color: black;
+}
+
+/* Max window size */
+.wy-nav-content{
+    max-width: 1200px;
+}
+
+/* Mobile header */
+.wy-nav-top{
+    background-color: #6670FF;
+}
+
+/* FONTS */
+body{
+    font-family: Calibre;
+    font-size: 20px;
+}
+
+h1 {
+    font-family: Calibre-Thin;
+    font-size: 70px;
+}
+
+h2, .rst-content .toctree-wrapper p.caption, h3, h4, h5, h6, legend{
+    font-family: Calibre-Medium;
+}
+
+@font-face {
+    font-family: Calibre-Medium;
+    src: url(./Calibre-Medium.otf);
+    font-weight:400;
+}
+
+@font-face {
+    font-family: Calibre;
+    src: url(./Calibre-Regular.otf);
+    font-weight:400;
+}
+
+@font-face {
+    font-family: Calibre-Light;
+    src: url(./Calibre-Light.ttf);
+    font-weight:400;
+}
+
+@font-face {
+    font-family: Calibre-Thin;
+    src: url(./Calibre-Thin.otf);
+    font-weight:400;
+}
diff --git a/docs/source/_static/js/custom.js b/docs/source/_static/js/custom.js
new file mode 100644
index 0000000000..9ddbbb7c49
--- /dev/null
+++ b/docs/source/_static/js/custom.js
@@ -0,0 +1,18 @@
+function addIcon() {
+    const huggingFaceLogo = "http://lysand.re/huggingface_logo.svg";
+    const image = document.createElement("img");
+    image.setAttribute("src", huggingFaceLogo)
+
+
+    const div = document.createElement("div")
+    div.appendChild(image);
+    div.style.textAlign = 'center';
+    div.style.paddingTop = '30px';
+    div.style.backgroundColor = '#6670FF'
+
+    const scrollDiv = document.getElementsByClassName("wy-side-scroll")[0];
+    scrollDiv.prepend(div)
+}
+
+window.addEventListener("load", addIcon)
+
diff --git a/docs/source/_static/js/huggingface_logo.svg b/docs/source/_static/js/huggingface_logo.svg
new file mode 100644
index 0000000000..84974866ce
--- /dev/null
+++ b/docs/source/_static/js/huggingface_logo.svg
@@ -0,0 +1,47 @@
+
+
+    icon
+    Created with Sketch.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
\ No newline at end of file
diff --git a/docs/source/conf.py b/docs/source/conf.py
index 7675393807..978b204466 100644
--- a/docs/source/conf.py
+++ b/docs/source/conf.py
@@ -176,5 +176,9 @@ epub_title = project
 # A list of files that should not be packed into the epub file.
 epub_exclude_files = ['search.html']
+def setup(app):
+    app.add_stylesheet('css/huggingface.css')
+    app.add_stylesheet('css/code-snippets.css')
+    app.add_js_file('js/custom.js')
 # -- Extension configuration -------------------------------------------------
diff --git a/docs/imgs/warmup_constant_schedule.png b/docs/source/imgs/warmup_constant_schedule.png
similarity index 100%
rename from docs/imgs/warmup_constant_schedule.png
rename to docs/source/imgs/warmup_constant_schedule.png
diff --git a/docs/imgs/warmup_cosine_hard_restarts_schedule.png b/docs/source/imgs/warmup_cosine_hard_restarts_schedule.png
similarity index 100%
rename from docs/imgs/warmup_cosine_hard_restarts_schedule.png
rename to docs/source/imgs/warmup_cosine_hard_restarts_schedule.png
diff --git a/docs/imgs/warmup_cosine_schedule.png b/docs/source/imgs/warmup_cosine_schedule.png
similarity index 100%
rename from docs/imgs/warmup_cosine_schedule.png
rename to docs/source/imgs/warmup_cosine_schedule.png
diff --git a/docs/imgs/warmup_cosine_warm_restarts_schedule.png b/docs/source/imgs/warmup_cosine_warm_restarts_schedule.png
similarity index 100%
rename from docs/imgs/warmup_cosine_warm_restarts_schedule.png
rename to docs/source/imgs/warmup_cosine_warm_restarts_schedule.png
diff --git a/docs/imgs/warmup_linear_schedule.png b/docs/source/imgs/warmup_linear_schedule.png
similarity index 100%
rename from docs/imgs/warmup_linear_schedule.png
rename to docs/source/imgs/warmup_linear_schedule.png
diff --git a/docs/source/index.rst b/docs/source/index.rst
index d7b60bd660..49df768561 100644
--- a/docs/source/index.rst
+++ b/docs/source/index.rst
@@ -1,4 +1,4 @@
-Pytorch-Transformers: The Big & Extending Repository of pretrained Transformers
+Pytorch-Transformers
 ================================================================================================================================================
diff --git a/docs/source/model_doc/overview.rst b/docs/source/model_doc/overview.rst
index 8f5e94baf1..7c426aa798 100644
--- a/docs/source/model_doc/overview.rst
+++ b/docs/source/model_doc/overview.rst
@@ -39,10 +39,8 @@ configuration files. The respective configuration classes are:
 These configuration classes contains a few utilities to load and save configurations:
-* ``from_dict(cls, json_object)``\ : A class method to construct a configuration from a Python dictionary of parameters.
-  Returns an instance of the configuration class.
-* ``from_json_file(cls, json_file)``\ : A class method to construct a configuration from a json file of parameters.
-Returns an instance of the configuration class.
+* ``from_dict(cls, json_object)``\ : A class method to construct a configuration from a Python dictionary of parameters. Returns an instance of the configuration class.
+* ``from_json_file(cls, json_file)``\ : A class method to construct a configuration from a json file of parameters. Returns an instance of the configuration class.
 * ``to_dict()``\ : Serializes an instance to a Python dictionary. Returns a dictionary.
 * ``to_json_string()``\ : Serializes an instance to a JSON string. Returns a string.
 * ``to_json_file(json_file_path)``\ : Save an instance to a json file.
@@ -247,40 +245,44 @@ An overview of the implemented schedules:
 * ``ConstantLR``\ : always returns learning rate 1.
-* ``WarmupConstantSchedule``\ : Linearly increases learning rate from 0 to 1 over ``warmup`` fraction of training steps.
+* ``WarmupConstantSchedule`` : Linearly increases learning rate from 0 to 1 over ``warmup`` fraction of training steps.
   Keeps learning rate equal to 1. after warmup.
-  .. image:: docs/imgs/warmup_constant_schedule.png
-     :target: docs/imgs/warmup_constant_schedule.png
+  .. image:: /imgs/warmup_constant_schedule.png
+     :target: /imgs/warmup_constant_schedule.png
      :alt:
-* ``WarmupLinearSchedule``\ : Linearly increases learning rate from 0 to 1 over ``warmup`` fraction of training steps.
+
+* ``WarmupLinearSchedule`` : Linearly increases learning rate from 0 to 1 over ``warmup`` fraction of training steps.
   Linearly decreases learning rate from 1. to 0. over remaining ``1 - warmup`` steps.
-  .. image:: docs/imgs/warmup_linear_schedule.png
-     :target: docs/imgs/warmup_linear_schedule.png
+  .. image:: /imgs/warmup_linear_schedule.png
+     :target: /imgs/warmup_linear_schedule.png
      :alt:
-* ``WarmupCosineSchedule``\ : Linearly increases learning rate from 0 to 1 over ``warmup`` fraction of training steps.
+
+* ``WarmupCosineSchedule`` : Linearly increases learning rate from 0 to 1 over ``warmup`` fraction of training steps.
   Decreases learning rate from 1. to 0. over remaining ``1 - warmup`` steps following a cosine curve.
   If ``cycles`` (default=0.5) is different from default, learning rate follows cosine function after warmup.
-  .. image:: docs/imgs/warmup_cosine_schedule.png
-     :target: docs/imgs/warmup_cosine_schedule.png
+  .. image:: /imgs/warmup_cosine_schedule.png
+     :target: /imgs/warmup_cosine_schedule.png
      :alt:
-* ``WarmupCosineWithHardRestartsSchedule``\ : Linearly increases learning rate from 0 to 1 over ``warmup`` fraction of training steps.
+
+* ``WarmupCosineWithHardRestartsSchedule`` : Linearly increases learning rate from 0 to 1 over ``warmup`` fraction of training steps.
   If ``cycles`` (default=1.) is different from default, learning rate follows ``cycles`` times a cosine decaying learning rate (with hard restarts).
-  .. image:: docs/imgs/warmup_cosine_hard_restarts_schedule.png
-     :target: docs/imgs/warmup_cosine_hard_restarts_schedule.png
+  .. image:: /imgs/warmup_cosine_hard_restarts_schedule.png
+     :target: /imgs/warmup_cosine_hard_restarts_schedule.png
      :alt:
-* ``WarmupCosineWithWarmupRestartsSchedule``\ : All training progress is divided in ``cycles`` (default=1.) parts of equal length.
+
+* ``WarmupCosineWithWarmupRestartsSchedule`` : All training progress is divided in ``cycles`` (default=1.) parts of equal length.
   Every part follows a schedule with the first ``warmup`` fraction of the training steps linearly increasing from 0. to 1.,
   followed by a learning rate decreasing from 1. to 0. following a cosine curve.
   Note that the total number of all warmup steps over all cycles together is equal to ``warmup`` * ``cycles``
-  .. image:: docs/imgs/warmup_cosine_warm_restarts_schedule.png
-     :target: docs/imgs/warmup_cosine_warm_restarts_schedule.png
+  .. image:: /imgs/warmup_cosine_warm_restarts_schedule.png
+     :target: /imgs/warmup_cosine_warm_restarts_schedule.png
      :alt:
\ No newline at end of file
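
Reviewer note: the schedule descriptions touched by this patch can be read as learning-rate multipliers over training progress. The sketch below is only an illustration of that reading, not the library's implementation. The function names, the `progress` argument (fraction of training completed, in [0, 1]) and the exact formulas are assumptions derived from the prose above; `warmup` is assumed to satisfy 0 < warmup < 1.

```python
import math


def warmup_constant(progress: float, warmup: float) -> float:
    """Hypothetical helper: ramp linearly from 0 to 1 during warmup, then stay at 1."""
    if progress < warmup:
        return progress / warmup
    return 1.0


def warmup_linear(progress: float, warmup: float) -> float:
    """Hypothetical helper: linear warmup, then linear decay from 1 to 0."""
    if progress < warmup:
        return progress / warmup
    # Fraction of the post-warmup phase still remaining, clamped at 0.
    return max((1.0 - progress) / (1.0 - warmup), 0.0)


def warmup_cosine(progress: float, warmup: float, cycles: float = 0.5) -> float:
    """Hypothetical helper: linear warmup, then cosine decay.

    With cycles=0.5 (the default named in the docs) the multiplier follows
    half a cosine period, going from 1 to 0 over the remaining steps.
    """
    if progress < warmup:
        return progress / warmup
    # Rescale the post-warmup phase to [0, 1].
    decay_progress = (progress - warmup) / (1.0 - warmup)
    return 0.5 * (1.0 + math.cos(math.pi * cycles * 2.0 * decay_progress))
```

Multiplying a base learning rate by one of these values at each step would reproduce the shapes shown in the renamed `warmup_*_schedule.png` images, under the assumptions stated above.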