Benjamin Badger
ff689f57aa
Extend save_pretrained to offloaded models (#27412)
* added hidden subset
* debugged hidden subset contrastive search
* added contrastive search compression
* debugged compressed contrastive search
* memory reduction for contrastive search
* debugged mem red
* added low memory option feature
* debugged mem optimization output stack
* debugged mem optimization output stack
* debugged low mem
* added low mem cache
* fixed 2047 tensor view
* debugged 2042 past key val inputs
* reformatted tensors
* changed low mem output
* final clean
* removed subset hidden csearch
* fixed hidden device
* fixed hidden device
* changed compressor dtype
* removed hstate compression
* integrated csearch in generate
* test csearch integration into generation
* fixed csearch kwarg integration with generation
* final wrap and added doc
* Update src/transformers/generation/utils.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update src/transformers/generation/utils.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update src/transformers/generation/utils.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* added debug print
* direct hstate cat
* direct hstate cat
* direct hstate cat debug
* direct hstate cat debug
* expanded full hidden state stack
* expanded full hidden state stack
* matched dims for hstates
* matched dims for hstates
* logits fix
* equality test
* equality hidden debug
* debug
* added prints for debug
* added prints for debug
* equality check
* switched squeeze dim
* input format debug
* tracing top_k_ids
* removed trace
* added test context
* added jitter
* added jitter
* added jitter
* returned state
* rebuilt past key value reconstruction
* debugged
* cleaned traces
* added selection for pkv
* changed output to dict
* cleaned
* cleaned
* cleaned up contrastive search test
* moved low_memory kwarg
* debugged
* changed low mem test batch size to 1
* removed output
* debugged test input shape
* reformatted csearch test
* added trace
* removed unsqueeze on final forward pass
* replaced unsqueeze with view
* removed traces
* cleaned
* debugged model kwargs
* removed special models from test
* ran make quality
* Update src/transformers/generation/configuration_utils.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update src/transformers/generation/configuration_utils.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* refactored
* refactored
* refactored
* make fixup
* renamed flag sequential
* renamed flag sequential
* iterative onloading
* black style and test utils
* added traces for integrated test
* debugged
* added traces
* make style
* removed traces, make style
* included suggestions and added test
* debugged test
* added offload module check and make style
* is_accelerate_available and make style
* added test decorator
* changed test model and config spec
* added offload condition
* added lazy loading for each shard
* debugged
* modified sharding
* debugged
* added traces
* removed safe serialization
* no index overload
* trace on safe save ptrs
* added ptr condition
* debugged
* debugged ptr
* moved module map init
* remake shard only for offloaded modules
* refactored
* debugged
* refactored
* debugged
* cleaned and make style
* cleaned and make style
* added trace
* sparse module map
* debugged
* removed module map conditional
* refactored
* debug
* debugged
* added traces
* added shard mem trace
* added shard mem trace
* removed underlying storage check
* refactored
* memory leak removal and make style
* cleaned
* swapped test decs and make style
* added mem checks and make style
* added free mem warning
* implemented some suggestions
* moved onloading to accelerate
* refactored for accelerate integration
* cleaned test
* make style
* debugged offload map name
* cleaned and make style
* replaced meta device check for sharding
* cleaned and make style
* implemented some suggestions
* more suggestions
* update warning
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* more suggestions
* make style
* new make style
* Update src/transformers/modeling_utils.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* Update src/transformers/modeling_utils.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* Update src/transformers/modeling_utils.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* Update src/transformers/modeling_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-06-07 07:50:35 -04:00