diff --git a/.circleci/TROUBLESHOOT.md b/.circleci/TROUBLESHOOT.md
new file mode 100644
index 0000000000..c662a921ba
--- /dev/null
+++ b/.circleci/TROUBLESHOOT.md
@@ -0,0 +1,7 @@
+# Troubleshooting
+
+This document explains how to deal with various issues on CircleCI. Entries may include actual solutions or pointers to issues that cover them.
+
+## Circle CI
+
+* pytest worker runs out of resident RAM and gets killed by `cgroups`: https://github.com/huggingface/transformers/issues/11408
diff --git a/.github/workflows/TROUBLESHOOT.md b/.github/workflows/TROUBLESHOOT.md
new file mode 100644
index 0000000000..616ba8e55b
--- /dev/null
+++ b/.github/workflows/TROUBLESHOOT.md
@@ -0,0 +1,9 @@
+# Troubleshooting
+
+This document explains how to deal with various issues on the GitHub Actions self-hosted CI. Entries may include actual solutions or pointers to issues that cover them.
+
+## GitHub Actions (self-hosted CI)
+
+* DeepSpeed
+
+  - if the JIT build hangs, clear the extensions cache with `rm -rf ~/.cache/torch_extensions/`; reference: https://github.com/huggingface/transformers/pull/12723
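
The DeepSpeed entry amounts to deleting the directory where PyTorch caches JIT-built extensions. Below is a minimal sketch of that cleanup as a reusable shell function. The function name `clear_torch_extensions_cache` is hypothetical and not part of this change. The sketch assumes the cache lives at `~/.cache/torch_extensions/`, the path given in the entry, unless the `TORCH_EXTENSIONS_DIR` environment variable points elsewhere.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of the PR): remove the PyTorch JIT extension cache.
# Assumption: the cache is at ~/.cache/torch_extensions/ as stated in the entry,
# unless TORCH_EXTENSIONS_DIR is set, in which case that directory is used instead.
clear_torch_extensions_cache() {
    local cache_dir="${TORCH_EXTENSIONS_DIR:-$HOME/.cache/torch_extensions}"

    if [ -d "$cache_dir" ]; then
        # "--" guards against a path that happens to start with a dash.
        rm -rf -- "$cache_dir"
        echo "removed $cache_dir"
    else
        echo "nothing to remove at $cache_dir"
    fi
}
```

On a self-hosted runner this could be called once before the DeepSpeed tests start. The next JIT build would then compile from scratch, so the first run after a cleanup is likely to be slower.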