Predicting CO2 using PyTensorFlow
using DelimitedFiles
using PyPlot
using PyTensorFlow
using Statistics
reset_default_graph()

# Download the CMIP6 global CO2 concentration record if it is not already present
if !("co2.csv" in readdir("."))
    download("ftp://data.iac.ethz.ch/CMIP6/input4MIPs/UoM/GHGConc/CMIP/yr/atmos/UoM-CMIP-1-1-0/GHGConc/gr3-GMNHSH/v20160701/mole_fraction_of_carbon_dioxide_in_air_input4MIPs_GHGConcentrations_CMIP_UoM-CMIP-1-1-0_gr3-GMNHSH_0000-2014.csv", "co2.csv")
end

# Second column holds the CO2 concentrations; skip the first few header rows
D = readdlm("co2.csv", ',', header=true)[1][10:end, 2]
# m = mean(D)
# σ = std(D)
# D = (D .- m)/σ
# we will try to fit D
N = length(D)
num_layers = 4

# Normalize the inputs (time indices) to zero mean and unit variance so tanh does not saturate
x = reshape(1:N |> collect, N, 1)
m = mean(x)
σ = std(x)
x = (x .- m) / σ
x = constant(x, dtype=Float64)
is_training = placeholder(Bool)   # training flag (e.g. for batch normalization); unused in this script

# Fully connected network: (num_layers-1) hidden layers of 20 tanh units, then a linear output
variable_scope("co2", reuse=AUTO_REUSE, initializer=random_normal_initializer(0., 0.1)) do
    global net = x
    for i = 1:num_layers
        if i != num_layers
            net = dense(net, 20)
            net = tanh(net)
        else
            net = dense(net, 1)
            net = squeeze(net, axis=2)
        end
    end
end

# Least-squares loss between the network output and the data
loss = sum((net - D)^2)
sess = Session()
run(sess, global_variables_initializer())

# Print the loss every 100 iterations
__cnt = 0
function print_loss(l)
    global __cnt
    if mod(__cnt, 100) == 0
        println("iter $__cnt, current loss=", l)
    end
    __cnt += 1
end

# Minimize the loss with L-BFGS-B via the SciPy optimizer interface
opt = ScipyOptimizerInterface(loss, method="L-BFGS-B", options=Dict("maxiter"=>30000, "ftol"=>1e-12, "gtol"=>1e-12))
ScipyOptimizerMinimize(sess, opt, loss_callback=print_loss, fetches=[loss])

# Compare the data with the fitted curve
plot(D, label="real")
plot(run(sess, net), "--", label="fitted")
legend()
xlabel("years")
ylabel(L"CO_2")
kailaix commented Dec 28, 2018

We demonstrate curve fitting using TensorFlow. Note that it is very important to normalize the input data so that the nonlinear activation function does not saturate early in training. A better approach still is to add batch normalization in between the following two lines (a sketch is given after the snippet):

net = dense(net, 20)
net = tanh(net)
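Concretely, the body of the layer loop in the script would then look like the sketch below. This is only a sketch: it assumes PyTensorFlow exposes TensorFlow's 1.x-style batch normalization layer as batch_normalization and reuses the is_training placeholder already defined in the script; the actual wrapper name may differ.

net = dense(net, 20)
net = batch_normalization(net, training=is_training)   # hypothetical wrapper around tf.layers.batch_normalization
net = tanh(net)

With batch normalization in the graph, is_training has to be fed at run time, and in TensorFlow 1.x the moving-average update ops must be run together with the training step.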

The choice of the number of layers, the hidden layer sizes, and the architecture in general is governed by a tradeoff between how easily the network can be optimized and how much representation power it has.

Many training tricks are available online nowadays. Broadly, they fall into two classes: internal medicine and surgery. Surgery changes the neural network architecture itself, for example by adding batch normalization, changing the activation function, or adding skip connections. Internal medicine leaves the network untouched and instead tunes the optimization, for example the choice between L-BFGS and SGD, or how the step size evolves during training (a sketch of such a change follows below). Researchers have devoted considerable time and effort to pushing the cutting edge in both areas.
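As an illustration of an internal-medicine change, the L-BFGS-B call in the script could be swapped for a first-order optimizer such as Adam. This is only a sketch: it assumes PyTensorFlow mirrors TensorFlow's AdamOptimizer/minimize API under these names, which may not match the installed version.

# Hypothetical internal-medicine variant: replace L-BFGS-B with Adam
opt_op = minimize(AdamOptimizer(1e-3), loss)   # assumed TF-style optimizer wrapper
run(sess, global_variables_initializer())
for i = 1:10000
    run(sess, opt_op)                          # one gradient step
    if mod(i, 100) == 0
        println("iter $i, current loss=", run(sess, loss))
    end
end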
