Description
Goal
The goal of this assignment is to use TensorFlow to build some neural networks and to experiment with the options and flexibility that TensorFlow offers.
Data Set
For Part 1 of the assignment, you will use the room occupancy data set that you used for Assignment 2.
For Part 2 of the assignment, you will use the data set I provide here. This data set consists of the 104 relationships used in the Hinton Family Trees network. Here is some very embarrassing Python code I wrote to read in the data set and build an input/output representation that can be fed to TensorFlow.
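For orientation only (this is not my loader), here is an illustrative sketch of what a one-hot encoding of a single (person1, relationship, person2) triple might look like, assuming the 24 people and 12 relationship types of Hinton's original data set. The actual ordering of the units is defined in my code, so treat the indices below as placeholders.

```python
# Illustrative only -- not the actual loader. Encodes one
# (person1, relationship, person2) triple as one-hot vectors,
# assuming 24 people and 12 relationship types as in Hinton's paper.
import numpy as np

N_PEOPLE, N_RELATIONS = 24, 12

def encode_triple(person1_idx, relation_idx, person2_idx):
    x_person1 = np.zeros(N_PEOPLE)
    x_person1[person1_idx] = 1.0
    x_relation = np.zeros(N_RELATIONS)
    x_relation[relation_idx] = 1.0
    y_person2 = np.zeros(N_PEOPLE)
    y_person2[person2_idx] = 1.0
    return x_person1, x_relation, y_person2
```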
Part 1
Use TensorFlow to build a feedforward neural net to predict occupancy. This net should do exactly the same thing that your code for Part 2 of Assignment 2 does.
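As a starting point, here is a minimal sketch of such a net in tf.keras (one of several ways to build it in TensorFlow). The 5 inputs and single sigmoid output assume the same setup as Assignment 2, Part 2; the activation, loss, and optimizer shown here are placeholders that you should match to whatever you used there.

```python
import tensorflow as tf

def build_occupancy_model(n_hidden, n_inputs=5, learning_rate=0.01):
    """Feedforward net: n_inputs -> n_hidden -> 1 sigmoid output unit."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(n_inputs,)),
        tf.keras.layers.Dense(n_hidden, activation="tanh"),  # placeholder activation
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(
        optimizer=tf.keras.optimizers.SGD(learning_rate=learning_rate),
        loss="binary_crossentropy",  # swap for "mse" if that matches Assignment 2
        metrics=["accuracy"],
    )
    return model
```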
(1a) Run a simulation using TensorFlow that is identical to Assignment 2, part 2f, in which you vary the number of hidden units and make a plot. Superimpose the plot you made for Assignment 2, part 2f, so the two can be compared directly.
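The sweep could look something like the following sketch, assuming the build_occupancy_model helper above and placeholder arrays X_train, y_train, X_test, y_test from your occupancy loading code. The hidden-unit values and epoch count are placeholders; use the same values as in Assignment 2, part 2f, so the curves line up.

```python
import matplotlib.pyplot as plt

# X_train, y_train, X_test, y_test come from your occupancy loading code.
hidden_sizes = [1, 2, 5, 10, 20]   # placeholder; match Assignment 2, part 2f
tf_accuracy = []
for h in hidden_sizes:
    model = build_occupancy_model(n_hidden=h)
    model.fit(X_train, y_train, epochs=200, batch_size=32, verbose=0)
    _, acc = model.evaluate(X_test, y_test, verbose=0)
    tf_accuracy.append(acc)

plt.plot(hidden_sizes, tf_accuracy, marker="o", label="TensorFlow")
# plt.plot(hidden_sizes, assignment2_accuracy, marker="s", label="my own code")
plt.xlabel("number of hidden units")
plt.ylabel("test accuracy")
plt.legend()
plt.show()
```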
(1b) Discuss the results: are they the same as with your own code? If one works better than the other, explain why you think that is.
(1c) Add a second hidden layer, and train a few architectures with 2 hidden layers. Report what architectures you tried (expressed as 5-h1-h2-1, i.e., 5 input, h1 hidden in first layer, h2 hidden in second layer, and one output unit), and which ones, if any, outperform your single-hidden-layer network.
It will not be a big deal to add the second hidden layer once you have all the rest of your code in place.
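For example, a 5-h1-h2-1 network is just one more Dense layer than the sketch above (activations and optimizer are again placeholders):

```python
import tensorflow as tf

def build_two_layer_model(h1, h2, n_inputs=5):
    """5-h1-h2-1 architecture for part (1c)."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(n_inputs,)),
        tf.keras.layers.Dense(h1, activation="tanh"),
        tf.keras.layers.Dense(h2, activation="tanh"),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="sgd", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model
```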
Part 2
Replicate the Hinton Family Trees architecture in TensorFlow. My notes describe the architecture and the number of neurons in each layer. You can also refer to the original paper.
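Here is a sketch of one way to wire up the architecture with the tf.keras functional API. The layer sizes below follow the original paper (24 people, 12 relationship types, 6-unit encoding layers, a 12-unit central layer, and a 6-unit layer before the output); double-check them against my notes. The softmax output and cross-entropy loss are modern substitutions for Hinton's sigmoid output units and squared error, so treat this as a sketch, not a required implementation.

```python
import tensorflow as tf

N_PEOPLE, N_RELATIONS = 24, 12

def build_family_trees_model():
    person1_in = tf.keras.Input(shape=(N_PEOPLE,), name="person1")
    relation_in = tf.keras.Input(shape=(N_RELATIONS,), name="relationship")

    # Separate 6-unit distributed encodings for person1 and the relationship.
    person1_code = tf.keras.layers.Dense(
        6, activation="sigmoid", name="person1_encoding")(person1_in)
    relation_code = tf.keras.layers.Dense(
        6, activation="sigmoid", name="relation_encoding")(relation_in)

    # Central layer combining the two encodings, then a person2 encoding
    # layer, then the 24-way output over people.
    central = tf.keras.layers.Concatenate()([person1_code, relation_code])
    central = tf.keras.layers.Dense(12, activation="sigmoid")(central)
    person2_code = tf.keras.layers.Dense(6, activation="sigmoid")(central)
    person2_out = tf.keras.layers.Dense(
        N_PEOPLE, activation="softmax", name="person2")(person2_code)

    model = tf.keras.Model(inputs=[person1_in, relation_in],
                           outputs=person2_out)
    model.compile(optimizer="adam", loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```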
(2a) Randomly split the data set into 89 examples for training and 15 for testing. Train and evaluate 20 such random splits of the data and report the mean and standard deviation of the test set accuracy. (Report accuracy, not squared error. A response should be counted as correct if the most active output unit is the target person2.)
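A sketch of the evaluation loop, assuming the build_family_trees_model helper above and placeholder arrays X_person1, X_relation, Y_person2 holding the one-hot encodings of all 104 examples (use whatever names your loading code produces); the epoch count is a placeholder.

```python
import numpy as np

accuracies = []
for trial in range(20):
    # Random 89/15 split of the 104 examples.
    idx = np.random.permutation(104)
    train_idx, test_idx = idx[:89], idx[89:]

    model = build_family_trees_model()
    model.fit([X_person1[train_idx], X_relation[train_idx]],
              Y_person2[train_idx],
              epochs=1000, verbose=0)   # epoch count is a placeholder

    probs = model.predict([X_person1[test_idx], X_relation[test_idx]],
                          verbose=0)
    # Correct if the most active output unit is the target person2.
    correct = (np.argmax(probs, axis=1) ==
               np.argmax(Y_person2[test_idx], axis=1))
    accuracies.append(correct.mean())

print("mean test accuracy:", np.mean(accuracies))
print("std of test accuracy:", np.std(accuracies))
```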
(2b) Train a network on all 104 examples and examine the weights from the one-hot person1 input representation to the distributed person1 representation. Figure out a sensible way to graphically display the incoming weights of the 6 hidden units.
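One simple option, sketched below: pull the 24 x 6 kernel of the person1 encoding layer and show it as a heat map, with the person1 input units labeled by name so the positive and negative weights feeding each of the 6 hidden units are easy to scan. (Hinton's black-and-white weight diagrams are another fine choice.) This assumes the network trained on all 104 examples is stored in model, that the layer is named person1_encoding as in the sketch above, and that person_names is a placeholder list of the 24 names in input-unit order.

```python
import matplotlib.pyplot as plt

# kernel has shape (24, 6): one row per person1 input unit, one column per
# distributed hidden unit. person_names is a placeholder list of the 24
# names in the same order as the input units (see create_dataset.py).
kernel, bias = model.get_layer("person1_encoding").get_weights()

plt.figure(figsize=(5, 8))
plt.imshow(kernel, cmap="coolwarm", aspect="auto")
plt.colorbar(label="weight value")
plt.xlabel("distributed person1 unit")
plt.ylabel("person1 input unit")
plt.yticks(range(24), person_names)
plt.tight_layout()
plt.show()
```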
(2c) For at least 2 of the hidden units, interpret what the network has learned in its mapping from inputs. You’ll have to refer to my create_dataset.py code to determine the interpretation of the person1 input neurons. (I tried to keep the same ordering as Hinton uses.)
Part 3 (Extra Credit)
Compare the architecture Hinton describes for Family Trees with a generic feedforward architecture that has a single hidden layer of 12 neurons.
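Concretely, the generic net might look like the sketch below: the two one-hot inputs concatenated into a single 36-unit input vector, one 12-unit hidden layer, and a 24-way output, with the same placeholder loss and optimizer as the structured sketch above.

```python
import tensorflow as tf

def build_generic_model():
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(24 + 12,)),   # person1 + relationship one-hots
        tf.keras.layers.Dense(12, activation="sigmoid"),
        tf.keras.layers.Dense(24, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# Feed it np.concatenate([X_person1, X_relation], axis=1) as the input.
```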
(3a) Conduct an experiment like the one in (2a) using the generic architecture. Report the mean and standard deviation of the test set accuracy.
(3b) Do you see any difference in performance between the structured net (2a) and the generic net (3a)?
(3c) It wasn’t difficult to interpret at least some of the hidden units in the structured net. Can you interpret what any of the hidden units are doing for the generic net? Explain.