Device: cuda
Loaded pretrained from output_v5/checkpoints/epoch_0120.pt (epoch 120)
Trainable params: 9.81M (predictor + text encoder + start_emb)
CodecDataset: 6879 samples (max 900 frames)
Training: 6879 samples, 26 batches/epoch
[1/200] step 20 | pred=0.4082 | grad=0.27 | 29.2s
Epoch 1/200 | pred=0.4050 | 37.5s
New best (pred=0.4050)
[2/200] step 40 | pred=0.3979 | grad=0.17 | 19.9s
Epoch 2/200 | pred=0.4019 | 36.9s
New best (pred=0.4019)
[3/200] step 60 | pred=0.4054 | grad=0.10 | 11.4s
Epoch 3/200 | pred=0.4001 | 36.7s
New best (pred=0.4001)
[4/200] step 80 | pred=0.4018 | grad=0.08 | 3.3s
[4/200] step 100 | pred=0.3983 | grad=0.08 | 31.3s
Epoch 4/200 | pred=0.3998 | 37.0s
New best (pred=0.3998)
[5/200] step 120 | pred=0.3939 | grad=0.05 | 23.0s
Epoch 5/200 | pred=0.3993 | 36.8s
New best (pred=0.3993)
[6/200] step 140 | pred=0.4009 | grad=0.05 | 14.1s
Epoch 6/200 | pred=0.3993 | 36.6s
New best (pred=0.3993)
[7/200] step 160 | pred=0.4047 | grad=0.06 | 6.2s
[7/200] step 180 | pred=0.4005 | grad=0.13 | 34.0s
Epoch 7/200 | pred=0.3982 | 36.7s
New best (pred=0.3982)
[8/200] step 200 | pred=0.4074 | grad=0.07 | 25.8s
Epoch 8/200 | pred=0.3979 | 36.9s
New best (pred=0.3979)
[9/200] step 220 | pred=0.4003 | grad=0.07 | 17.3s
Epoch 9/200 | pred=0.3974 | 36.6s
New best (pred=0.3974)
[10/200] step 240 | pred=0.3957 | grad=0.08 | 8.9s
[10/200] step 260 | pred=0.3976 | grad=0.08 | 36.8s
Epoch 10/200 | pred=0.3969 | 36.8s
Saved: output_v5_pred/checkpoints/epoch_0010.pt
New best (pred=0.3969)