fix for negative learning rate with warmup_linear in BertAdam (happens when t_total is specified incorrectly)

+ copied BERT optimization warmup functions to OpenAI optimization file + added comments
lukovnikov
2019-02-26 17:16:06 +01:00
parent e04bab59e1
commit da2d8ca265
2 changed files with 2 additions and 2 deletions

View File

@@ -37,7 +37,7 @@ def warmup_linear(x, warmup=0.002):
         After `t_total`-th training step, learning rate is zero. """
     if x < warmup:
         return x/warmup
-    return max(1.0 - x, 0)
+    return max((x-1.)/(warmup-1.), 0)
 
 SCHEDULES = {
     'warmup_cosine':warmup_cosine,

View File

@@ -37,7 +37,7 @@ def warmup_linear(x, warmup=0.002):
         After `t_total`-th training step, learning rate is zero. """
     if x < warmup:
         return x/warmup
-    return max(1.0 - x, 0)
+    return max((x-1.)/(warmup-1.), 0)
 
 SCHEDULES = {
     'warmup_cosine':warmup_cosine,
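
For reviewers, a minimal standalone sketch of the two return expressions in this diff. It assumes that `max(1.0 - x, 0)` is the removed line and `max((x-1.)/(warmup-1.), 0)` is the added one, and that `x` is the training progress `step / t_total`. The function names `warmup_linear_before` and `warmup_linear_after` are hypothetical and exist only for this comparison. A larger `warmup` than the default is used to make the difference visible.

```python
def warmup_linear_before(x, warmup=0.002):
    # Assumed pre-change version: linear warmup, then 1 - x, clamped at 0.
    if x < warmup:
        return x / warmup
    return max(1.0 - x, 0)


def warmup_linear_after(x, warmup=0.002):
    # Assumed post-change version: linear warmup, then a line that goes
    # from 1 at x == warmup to 0 at x == 1, clamped at 0.
    if x < warmup:
        return x / warmup
    return max((x - 1.) / (warmup - 1.), 0)


if __name__ == "__main__":
    # x > 1 models a t_total that was set too small for the actual run.
    for x in (0.05, 0.1, 0.5, 1.0, 1.5):
        before = warmup_linear_before(x, warmup=0.1)
        after = warmup_linear_after(x, warmup=0.1)
        print(f"x={x:<4} before={before:.4f} after={after:.4f}")
```

In this sketch both versions stay at zero once `x` exceeds 1, so the multiplier cannot go negative when `t_total` is too small. The added expression also reaches exactly 1 at `x == warmup`, whereas `1.0 - x` drops to `1 - warmup` at that point.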