Compared the DFW (Deep Frank-Wolfe) neural network optimizer against SGD and Adam on both vision and language tasks
pytorch densenet dfw cifar10 back-propagation wide-resnet bilstm infersent optimizer-algorithms deep-frank-wolfie
Updated May 21, 2021 - Jupyter Notebook