Computer Vision News - January 2017
Here is one step of an LSTM optimizer: all LSTMs have shared parameters, but each keeps its own separate hidden state. Communication between coordinates is handled by global averaging cells: in each LSTM layer, a subset of the cells is designated for communication, and their activations are averaged at each step across all coordinates. This allows low-rank updates, similar to L-BFGS, while the optimizer itself operates coordinate-wise. (A minimal sketch of this coordinate-wise step appears at the end of this article.)

Now, the tool itself. What is notable in this work is that the DeepMind team has released it as open source. You can access it here. Since the computational graph of the architecture could be huge on MNIST and CIFAR-10, the current implementation only deals with the task on quadratic functions as described in section 3.1 of the paper. The implementation is based on TensorFlow and written in Python; after downloading the package, you can test it by training a MultiLayer Perceptron (MLP) network on the MNIST dataset. You do that by simply running:

train.py --problem=mnist --save_path=./mnist

The save_path parameter saves the optimizer parameters for later evaluation.

Question: Can I optimize the learning rate of my own network with this method?

Yes! If you want to add your own network, all you have to do is implement it in the problem.py file. To give you a taste of how easy that is, we attached the MultiLayer Perceptron (MLP) MNIST implementation from the problem.py file. As you can see, it is a single function with a build() which returns the loss of the objective function to be optimized.
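To make the pattern concrete, here is a minimal, hedged sketch of what such a problem definition might look like, written against the TensorFlow 1.x API that was current at the time. It is our own illustration rather than the code from the repository, and it uses a random stand-in batch instead of real MNIST data so that it is self-contained: a function returning a build() closure that constructs the network and returns the loss the learned optimizer will minimize.

import tensorflow as tf

def mlp_mnist_problem(batch_size=128, hidden_size=20):
    """Returns a build() function whose loss the learned LSTM optimizer minimizes."""

    def build():
        # Stand-in batch: in a real problem definition a genuine MNIST batch
        # would be used; here 28x28 images are replaced by random 784-dim
        # vectors with random labels so the sketch runs on its own.
        images = tf.random_normal([batch_size, 784])
        labels = tf.one_hot(
            tf.random_uniform([batch_size], maxval=10, dtype=tf.int32), 10)

        # One-hidden-layer MLP: these variables are the "optimizee" parameters
        # that the LSTM optimizer learns to update.
        w1 = tf.Variable(tf.random_normal([784, hidden_size], stddev=0.1))
        b1 = tf.Variable(tf.zeros([hidden_size]))
        w2 = tf.Variable(tf.random_normal([hidden_size, 10], stddev=0.1))
        b2 = tf.Variable(tf.zeros([10]))

        hidden = tf.sigmoid(tf.matmul(images, w1) + b1)
        logits = tf.matmul(hidden, w2) + b2

        # build() returns the scalar loss of the objective to be optimized.
        return tf.reduce_mean(
            tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits))

    return build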
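Finally, as promised above, here is a minimal sketch of one coordinate-wise optimizer step in plain NumPy. All names are our own illustration, not the DeepMind code: a single LSTM cell with shared weights is applied independently to each coordinate's gradient, every coordinate carries its own hidden and cell state, and the hidden state is projected to a scalar update. The global averaging cells of the released architecture are omitted for brevity, and the weights here are random rather than meta-trained, so the snippet only illustrates the data flow.

import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def lstm_cell(x, h, c, W):
    # One standard LSTM step; W packs the input, forget, output and candidate
    # weights. x, h and c have one row per coordinate of the optimizee.
    z = x @ W["Wx"] + h @ W["Wh"] + W["b"]
    i, f, o, g = np.split(z, 4, axis=-1)
    c_new = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
    h_new = sigmoid(o) * np.tanh(c_new)
    return h_new, c_new

def optimizer_step(theta, grad, state, W, out_W):
    # theta_{t+1} = theta_t + g_t, where g_t is produced by running the same
    # LSTM (shared parameters W) on every coordinate's gradient separately,
    # with a separate hidden/cell state per coordinate.
    h, c = state                               # each of shape (n_coords, hidden)
    x = grad.reshape(-1, 1)                    # each coordinate sees only its own gradient
    h, c = lstm_cell(x, h, c, W)
    update = (h @ out_W).reshape(theta.shape)  # project hidden state to a scalar step
    return theta + update, (h, c)

# Usage on a toy quadratic f(theta) = ||theta||^2 / 2 (so grad = theta),
# with random, untrained LSTM weights:
hidden, n = 8, 5
W = {"Wx": np.random.randn(1, 4 * hidden) * 0.1,
     "Wh": np.random.randn(hidden, 4 * hidden) * 0.1,
     "b": np.zeros(4 * hidden)}
out_W = np.random.randn(hidden, 1) * 0.1
theta = np.random.randn(n)
state = (np.zeros((n, hidden)), np.zeros((n, hidden)))
theta, state = optimizer_step(theta, theta, state, W, out_W)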