TensorFlow

TensorFlow is a deep learning library developed by Google, with a user-friendly API for building machine learning models. On Knot, TensorFlow is available in CPU mode only (the GPUs on Knot are of an older generation that TensorFlow does not support).

TensorFlow runs inside a Singularity container built from an Ubuntu image. The container provides the Ubuntu userland; the kernel is that of the host node.

Instructions

To use TensorFlow, add the following lines to your .bashrc (or .profile):

export PATH=/sw/csc/singularity/bin/:$PATH
export LD_LIBRARY_PATH=/sw/csc/singularity/lib/singularity:$LD_LIBRARY_PATH

That is all the setup you need.
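To confirm that the two export lines took effect in a new login shell, a small check such as the following may help. It is only a sketch: it assumes a Python 3 interpreter is available on the login node (shutil.which needs Python 3.3 or later), and the helper name dir_on_path is illustrative, not part of any Knot tooling.

```python
import os
import shutil

# Directory taken from the PATH export line above.
SINGULARITY_BIN = "/sw/csc/singularity/bin"


def dir_on_path(directory, path_value):
    """Return True if `directory` is one of the entries of a PATH-style string."""
    wanted = os.path.normpath(directory)
    entries = [p for p in path_value.split(os.pathsep) if p]
    # normpath drops trailing slashes, so ".../bin/" and ".../bin" compare equal.
    return any(os.path.normpath(p) == wanted for p in entries)


if __name__ == "__main__":
    on_path = dir_on_path(SINGULARITY_BIN, os.environ.get("PATH", ""))
    print("singularity bin directory on PATH:", on_path)
    # Prints None if no executable named "singularity" can be found.
    print("singularity resolves to:", shutil.which("singularity"))
```

If the first line prints False, log out and back in (or source your .bashrc) and run the check again.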

Example

Below is a simple example adapted from A. Damien's repository. It builds a linear regression model using TensorFlow's computational graph scheme (placeholders and a session, i.e. the TensorFlow 1.x API):

from __future__ import print_function

import tensorflow as tf
import numpy
rng = numpy.random

# Parameters
learning_rate = 0.01
training_epochs = 1000
display_step = 50

# Training Data
train_X = numpy.asarray([3.3,4.4,5.5,6.71,6.93,4.168,9.779,6.182,7.59,2.167,
                         7.042,10.791,5.313,7.997,5.654,9.27,3.1])
train_Y = numpy.asarray([1.7,2.76,2.09,3.19,1.694,1.573,3.366,2.596,2.53,1.221,
                         2.827,3.465,1.65,2.904,2.42,2.94,1.3])
n_samples = train_X.shape[0]

# tf Graph Input
X = tf.placeholder("float")
Y = tf.placeholder("float")

# Set model weights
W = tf.Variable(rng.randn(), name="weight")
b = tf.Variable(rng.randn(), name="bias")

# Construct a linear model
pred = tf.add(tf.multiply(X, W), b)

# Mean squared error
cost = tf.reduce_sum(tf.pow(pred-Y, 2))/(2*n_samples)
# Gradient descent
#  Note, minimize() knows to modify W and b because Variable objects are trainable=True by default
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)

# Initializing the variables
init = tf.global_variables_initializer()

# Launch the graph
with tf.Session() as sess:
    sess.run(init)

    # Fit all training data
    for epoch in range(training_epochs):
        for (x, y) in zip(train_X, train_Y):
            sess.run(optimizer, feed_dict={X: x, Y: y})

        # Display logs per epoch step
        if (epoch+1) % display_step == 0:
            c = sess.run(cost, feed_dict={X: train_X, Y: train_Y})
            print("Epoch:", '%04d' % (epoch+1), "cost=", "{:.9f}".format(c),
                  "W=", sess.run(W), "b=", sess.run(b))

    print("Optimization Finished!")
    training_cost = sess.run(cost, feed_dict={X: train_X, Y: train_Y})
    print("Training cost=", training_cost, "W=", sess.run(W), "b=", sess.run(b), '\n')

    # Testing example, as requested (Issue #2)
    test_X = numpy.asarray([6.83, 4.668, 8.9, 7.91, 5.7, 8.7, 3.1, 2.1])
    test_Y = numpy.asarray([1.84, 2.273, 3.2, 2.831, 2.92, 3.24, 1.35, 1.03])

    print("Testing... (Mean square loss Comparison)")
    testing_cost = sess.run(
        tf.reduce_sum(tf.pow(pred - Y, 2)) / (2 * test_X.shape[0]),
        feed_dict={X: test_X, Y: test_Y})  # same function as cost above
    print("Testing cost=", testing_cost)
    print("Absolute mean square loss difference:", abs(
        training_cost - testing_cost))

Suppose this file is named linear.py. In the same folder, create a job submission script (here called submit.job):

#!/bin/bash

#PBS -l nodes=1:ppn=12
#PBS -l walltime=1:00:00
#PBS -N TFlinear
#PBS -o out.log
#PBS -V

# Make sure that you are in the job submission directory
cd $PBS_O_WORKDIR

singularity exec /sw/csc/SingularityImg/ubuntu_w_TFlowKeras.img python linear.py

A few points deserve attention:

  • We do not call the Python on the host; instead, we run Python inside the Singularity container we built (ubuntu_w_TFlowKeras.img, as in the script above). The job will fail without it.
  • We cannot use more than 1 node. This image does not contain the MPI-enabled version of TensorFlow (which has only recently been released and which we have not tested yet).
  • The container image uses Python 2.7.

Then submit your job to the queue with:

qsub submit.job
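
Once the job has finished, the W and b values written to out.log can be cross-checked against the closed-form least-squares solution for the same training data. The sketch below is not part of the original example: it uses plain Python with no TensorFlow or NumPy, so it runs on the host or inside the container, and the function names fit_line and half_mse are illustrative. Gradient descent starts from random weights and stops after a fixed number of epochs, so expect the job's values to be close to these (roughly W ≈ 0.25, b ≈ 0.80), not identical.

```python
# Training data copied from linear.py.
train_X = [3.3, 4.4, 5.5, 6.71, 6.93, 4.168, 9.779, 6.182, 7.59, 2.167,
           7.042, 10.791, 5.313, 7.997, 5.654, 9.27, 3.1]
train_Y = [1.7, 2.76, 2.09, 3.19, 1.694, 1.573, 3.366, 2.596, 2.53, 1.221,
           2.827, 3.465, 1.65, 2.904, 2.42, 2.94, 1.3]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = w*x + b; returns (w, b)."""
    n = float(len(xs))
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    w = sxy / sxx
    return w, mean_y - w * mean_x


def half_mse(xs, ys, w, b):
    """Same cost as in linear.py: sum of squared errors divided by 2*n."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / (2.0 * len(xs))


if __name__ == "__main__":
    w, b = fit_line(train_X, train_Y)
    print("W=%.4f b=%.4f cost=%.4f" % (w, b, half_mse(train_X, train_Y, w, b)))
```

A training cost in out.log that is far above the cost printed here suggests the optimization has not converged; increasing training_epochs in linear.py is one thing to try.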