* weights only compatibility
* better tests from code review
* pin torch version
* add weights_only check
batch_size
max_batch_size
FbgemmFp8Linear
use_parallel_residual
qkv_bias