artificial foundation ResNet
HuggingFace-based implementation of artificial foundation ResNet with weight dropout.
- Input
- 4316-dim embedding
- Encoder
- 113 x ResNet with 14 heads
- Output
- recall projection
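
The layout above can be captured in a small configuration object. This is a minimal sketch, not the actual model code: the class name `ArchConfig` and all field names are hypothetical, and only the numeric values and the "recall projection" label come from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArchConfig:
    """Hypothetical container for the listed architecture values."""

    embed_dim: int = 4316          # input: 4316-dim embedding
    num_blocks: int = 113          # encoder: 113 x ResNet
    num_heads: int = 14            # encoder: 14 heads, as listed
    output_head: str = "recall projection"  # output stage, as listed

    def summary(self) -> str:
        # One-line description mirroring the Input / Encoder / Output list.
        return (
            f"input: {self.embed_dim}-dim embedding | "
            f"encoder: {self.num_blocks} x ResNet, {self.num_heads} heads | "
            f"output: {self.output_head}"
        )


arch_cfg = ArchConfig()
print(arch_cfg.summary())
```

Keeping the values in one frozen dataclass makes it harder for the three stages to drift apart when the numbers are changed.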
Training config
optimizer=NAdam, lr=0.964, scheduler=plateau, warmup=1923
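
The config line above can be parsed into typed fields as sketched below. The names `TrainConfig`, `parse_train_config`, and `warmup_scale` are hypothetical. The sketch assumes that `warmup` counts optimizer steps and that the warmup is linear; the source states neither. In PyTorch, `NAdam` and `plateau` would presumably map to `torch.optim.NAdam` and `torch.optim.lr_scheduler.ReduceLROnPlateau`, but that mapping is also an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical typed view of the training config line."""

    optimizer: str
    lr: float
    scheduler: str
    warmup: int


def parse_train_config(line: str) -> TrainConfig:
    # Split "key=value, key=value" pairs into a dict, then convert types.
    pairs = dict(item.strip().split("=", 1) for item in line.split(","))
    return TrainConfig(
        optimizer=pairs["optimizer"],
        lr=float(pairs["lr"]),
        scheduler=pairs["scheduler"],
        warmup=int(pairs["warmup"]),
    )


def warmup_scale(step: int, warmup: int) -> float:
    # ASSUMPTION: linear warmup measured in steps; ramps 0 -> 1, then stays at 1.
    if warmup <= 0:
        return 1.0
    return min(1.0, step / warmup)


train_cfg = parse_train_config(
    "optimizer=NAdam, lr=0.964, scheduler=plateau, warmup=1923"
)
print(train_cfg)
```

Under these assumptions, the effective learning rate at a given step during warmup would be `train_cfg.lr * warmup_scale(step, train_cfg.warmup)`, after which the plateau scheduler takes over.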