Initial implementation of autograd #30
Conversation

@botev @jramapuram @itsnarsi This has been a long time coming, but I'd appreciate it if you guys had any feedback as well.

CC @arrayfire/core-devel

@Reithan too

Awesome work @pavanky. Will take a look in more detail when I get to a terminal. Quick question: can you take second derivatives with your implementation?

@jramapuram Not yet, I wanted to get the first-order case working first :)

@jramapuram I went ahead and changed the gradients to be Variables too. This should make it easy to perform higher-order derivatives.
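A purely illustrative sketch of what that enables: since a gradient is itself a Variable, a second backward pass over it is the natural route to second derivatives. The names below are made up, and second derivatives are not yet supported at this point in the thread.

```cpp
// Illustrative only: names are made up and second derivatives do not work
// yet in the PR as reviewed here.
y.backward(dy, /*retain_grad_graph=*/true);  // keep the graph built during backward
Variable dx = x.grad();                      // the gradient is itself a Variable
dx.backward(d2x);                            // so it can be differentiated again
```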

@pavanky I just tested it on my laptop and it looks pretty neat. Unlike Python, I did not see any initial delay. This might be because there is no JIT, I guess.

@itsnarsi This is still very nascent. I want to incorporate some of the stuff mentioned here to make it more efficient:
examples/FFNet.cpp (Outdated)

    using namespace af;
    using namespace afml;
    using namespace afml::nn;
    using namespace af;

Do you have a tool for detecting this or a really good eye :D

A tool would be great. Unfortunately, I'm just an irritating nitpicker. 😇
include/af/autograd/Variable.hpp (Outdated)

    {
        if (m_grads.size() == 1) return;
        Variable grad = m_grads[0];
        for (int i = 1; i < (int)m_grads.size(); i++) {

I would prefer an unsigned loop index to avoid clang's -Wconversion signedness warnings when indexing into std::vector.
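A minimal sketch of the suggested change, assuming the loop simply accumulates the remaining gradients (the loop body is not shown in the excerpt above):

```cpp
// Use an unsigned index so the comparison with m_grads.size() does not
// trigger -Wconversion; the accumulation in the body is an assumption
// since the excerpt cuts off at the loop header.
Variable grad = m_grads[0];
for (std::size_t i = 1; i < m_grads.size(); ++i) {
    grad = grad + m_grads[i];
}
```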

Decreased the scope of the PR to get a minimum viable thing going. The additional functions and operators can be added once this PR gets merged.

- autograd::Variable::Shared is now a thin layer without methods
- Variable::BackwardFunc_t renamed to Variable::GradFunc_t
- Variable::getData renamed to Variable::array
- Variable::getGrad renamed to Variable::grad
- Variable::backward renamed to Variable::calcGradInputs

@jramapuram I think enabling support for higher-order derivatives by default will increase memory usage. I am going to add a flag that enables it during the backward pass. By default, only the gradient values will be stored.

- Disabled by default
- Can be enabled by passing true as the second argument to backward
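A usage sketch of that flag, based on the description above (the x and dx names are illustrative only):

```cpp
// Default: only gradient values are stored after the backward pass.
x.backward(dx);

// Opt in to retaining the gradient graph, e.g. for higher-order derivatives.
x.backward(dx, true);
```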
umar456 left a comment:

Minor preliminary comments. Everything looks great. We can refactor it later as long as we have a clean user-facing API.

    find_package(ArrayFire REQUIRED)

    add_library(afml SHARED "")

If you don't add SHARED, then you can control the type of library you build with the BUILD_SHARED_LIBS variable.
    Variable operator +(const Variable &lhs, const Variable &rhs)
    {
        auto result = lhs.array() + rhs.array();
        auto grad_func = [](std::vector<Variable> &inputs, const Variable &grad_output) {

Don't we usually have outputs then inputs?

It looks like you know the number of inputs for each function. I would use something like std::array<Variable, N> for that.

Both of these are inputs. grad_output is an input coming from a different place.

And using std::array is not an option. All functions need to share the same signature so they can be stored as GradFunc_t inside Variable.
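To illustrate the constraint, here is a sketch of what that shared signature might look like. The exact parameter types are assumptions based on the excerpt above, and in the PR the alias lives inside Variable as Variable::GradFunc_t; it is shown at namespace scope here for brevity.

```cpp
#include <functional>
#include <vector>

// A single callable type, whatever the arity of the forward operation,
// so every grad lambda can be stored inside Variable as GradFunc_t.
typedef std::function<void(std::vector<Variable> &inputs,
                           const Variable &grad_output)> GradFunc_t;
```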
- Implemented base class nn::Module
- Added basic modules: nn::Linear, nn::Sigmoid, nn::Tanh
- Added container modules: nn::Container, nn::Sequential
- Deleted unnecessary examples, cleaned up perceptron.cpp
- Trying to solve for the entire batch was a bad idea
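A hypothetical sketch of how the nn modules introduced above could be composed; nn::Sequential, nn::Linear and nn::Sigmoid come from the commit message, while the add() call and the constructor arguments are assumptions, not the PR's actual API.

```cpp
// Hypothetical composition of the new modules; add() and the
// (numInputs, numOutputs) arguments are assumptions.
nn::Sequential perceptron;
perceptron.add(nn::Linear(numInputs, numOutputs));
perceptron.add(nn::Sigmoid());
```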
umar456 left a comment:

A couple of minor issues. This is looking great!
examples/perceptron.cpp (Outdated)

    // Update parameters
    // TODO: Should use optimizer
    for (auto param : perceptron.parameters()) {
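The TODO above refers to replacing a manual update loop with an optimizer. A sketch of the kind of manual SGD step it stands in for might look like this; learning_rate and the in-place update are assumptions, while array() and grad() are the accessors introduced in this PR.

```cpp
// Manual SGD step (sketch): walk the parameters and subtract the scaled gradient.
for (auto param : perceptron.parameters()) {
    param.array() -= learning_rate * param.grad().array();  // assumes array() is writable
    param.array().eval();
}
```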
examples/perceptron.cpp (Outdated)

    @@ -0,0 +1,88 @@
    /*******************************************************
     * Copyright (c) 2015, ArrayFire
include/af/autograd/Variable.hpp (Outdated)

        GradFunc_t m_grad_func;
    };

    public:

Needs to be aligned with the other access qualifiers.
src/nn/Modules/Module.cpp (Outdated)

    @@ -0,0 +1,61 @@
    /*******************************************************
     * Copyright (c) 2015, ArrayFire
src/nn/Modules/Module.cpp (Outdated)

    void Module::eval()
    {
        for (auto parameter : m_parameters) {
include/af/autograd/Variable.hpp (Outdated)

    private:
        void evalGrad(bool retain_grad_graph = false);

        std::vector<Variable> getInputs() const;

Does this need to return by value?
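The alternative the question points at, as a sketch; it assumes the inputs are kept in a member such as m_inputs, which is not shown in the excerpt.

```cpp
// Returning a const reference avoids copying the vector on every call.
const std::vector<Variable> &getInputs() const { return m_inputs; }
```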
What is done so far:

- Added autograd::Variable and autograd::backward.
- A Variable wraps an af::array from the user.
- When var.backward(grad_var) is invoked, it builds a DAG as a vector starting with the current variable and propagates gradients down the graph to all the Variables in the graph using the grad function specified at each variable.
- Gradient calculation can be disabled per variable with var.setCalcGrad(false).

Functions

- Take Variable parameters and return a Variable.
- The returned Variable is constructed using the following as parameters:
  - af::array: the result calculated earlier
  - vector<Variable>: containing the inputs to the function
  - BackwardFunction_t: a function pointer to the backward pass, usually implemented as a lambda function.

Example function:
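A sketch of such a function, modeled on the operator+ excerpt reviewed above; the addGrad accumulation helper is an assumption, while the three constructor arguments follow the list above.

```cpp
Variable operator +(const Variable &lhs, const Variable &rhs)
{
    auto result = lhs.array() + rhs.array();
    auto grad_func = [](std::vector<Variable> &inputs, const Variable &grad_output) {
        // d(lhs + rhs)/d(lhs) = 1 and d(lhs + rhs)/d(rhs) = 1, so the incoming
        // gradient is passed through to both inputs (addGrad is assumed).
        inputs[0].addGrad(grad_output);
        inputs[1].addGrad(grad_output);
    };
    return Variable(result, {lhs, rhs}, grad_func);
}
```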
Example:
A simple example showcasing how this can be done currently
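A hedged reconstruction of a minimal end-to-end use: the af::autograd namespace and the Variable(af::array) constructor are assumptions, while array(), grad() and backward(grad_var) are the accessors named in this PR.

```cpp
#include <arrayfire.h>
#include <af/autograd/Variable.hpp>   // header path as added in this PR

using af::autograd::Variable;          // namespace assumed from the header layout

int main()
{
    Variable x(af::randu(5));          // wrapping an af::array is assumed to work like this
    Variable y(af::randu(5));
    Variable z = x + y;

    Variable dz(af::constant(1.0, 5)); // seed gradient for the backward pass
    z.backward(dz);

    // For addition, the gradient w.r.t. each input equals the seed gradient.
    af::print("dz/dx", x.grad().array());
    af::print("dz/dy", y.grad().array());
    return 0;
}
```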
TODO for this PR: