Finally, I’ve got some time to write about PyTorch, a popular deep learning tool. I assume you have a fundamental understanding of Anaconda Python, have created an Anaconda virtual environment (in my case, it’s named condaenv), and have successfully installed PyTorch under this Anaconda virtual environment condaenv.
Since I’m using Visual Studio Code to test my Python code (of course, you can use whichever coding tool you like), I assume you already have your own coding tool configured. Now, you are ready to go!
In my case, I’m giving a tutorial rather than coding for myself, so Jupyter Notebook is my presentation tool. I’ll provide everything both as .py files and as .ipynb files. All the code can be found at Longer Vision PyTorch_Examples; however, ONLY the Jupyter Notebook presentation is given in my blog. Therefore, I assume you’ve already installed Jupyter Notebook successfully, as well as any other required packages, under your Anaconda virtual environment condaenv.
Now, let’s start the Jupyter Notebook server.
Clearly, Anaconda comes with the NEWEST Python version, which at the time of writing is Python 3.6.4.
Our very FIRST test code is only 6 lines long, including an empty line:
After starting the Jupyter Notebook server and clicking Run, you will see this view:
Clearly, the current torch version is 0.3.1, and the torchvision version is 0.2.0.
We are NOT going to discuss the background details of Convolutional Neural Networks. A lot of online frameworks/courses are available for you to catch up:
Several fabulous blogs are strongly recommended, including:
One picture for all (cited from Convolutional Neural Network)
The ONLY concept in CNNs we want to emphasize here is back propagation, which has been widely used in traditional neural networks and applies in essentially the same way to the final Fully Connected Layer of a CNN. You are welcome to get some more details from https://brilliant.org/wiki/backpropagation/.
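As a toy illustration of back propagation, PyTorch’s autograd can compute a gradient automatically. This sketch uses the current PyTorch API (the 0.3.x version shown above instead wrapped tensors in `Variable`):

```python
import torch

# A scalar input that we want gradients with respect to.
x = torch.tensor(2.0, requires_grad=True)

# A simple function: y = x^2 + 3x
y = x ** 2 + 3 * x

# Back-propagate: computes dy/dx and stores it in x.grad.
y.backward()

print(x.grad)  # dy/dx = 2x + 3 = 7 at x = 2
```

The same mechanism, applied to every weight in a network via the chain rule, is what trains a CNN.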
- $\mathcal{D} = \{(\mathbf{x}_i, \mathbf{y}_i)\}_{i=1}^{N}$: the training dataset, composed of pairs of training samples, where
- $(\mathbf{x}_i, \mathbf{y}_i)$: the $i$th training sample pair
- $\mathbf{x}_i$: the $i$th input vector (it can be an original image, or a vector of extracted features, etc.)
- $\mathbf{y}_i$: the $i$th desired output vector (it can be a one-hot vector, or a scalar, which is a 1-element vector)
- $\hat{\mathbf{y}}_i$: the $i$th output vector produced by the neural network from the $i$th input vector
- $N$: the size of the dataset, namely, the number of training samples
- $w_{jk}^{l}$: in the neural network’s architecture, the weight at layer $l$ connecting the $k$th input node and the $j$th output node
- $\theta$: a generalized notation for any parameter inside the neural network, which can be viewed as any element of the parameter set $\Theta$
In such a case, the loss function can easily be derived as:
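The equation itself appears to have been lost in formatting; assuming the cross-entropy loss (consistent with the torch.nn.CrossEntropyLoss reference below), it takes the form:

$$ L(\theta) = -\frac{1}{N} \sum_{i=1}^{N} \mathbf{y}_i \cdot \log \hat{\mathbf{y}}_i $$

where the dot product sums over the output classes, and $\hat{\mathbf{y}}_i$ is the network’s predicted probability vector.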
Some PyTorch explanation can be found at torch.nn.CrossEntropyLoss.
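A minimal sketch of how torch.nn.CrossEntropyLoss is used (current PyTorch API; the sample sizes and values here are made up for illustration):

```python
import torch
import torch.nn as nn

# 4 samples, 3 classes: raw, unnormalized scores (logits) straight from a network.
logits = torch.randn(4, 3)
# The desired class index for each sample (NOT one-hot encoded).
targets = torch.tensor([0, 2, 1, 2])

# CrossEntropyLoss combines LogSoftmax and NLLLoss in a single step,
# so the network should output raw logits, not probabilities.
criterion = nn.CrossEntropyLoss()
loss = criterion(logits, targets)
print(loss.item())  # a single scalar, averaged over the batch
```

Note that the targets are class indices rather than one-hot vectors, which is a common point of confusion.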
The weight update rule is likewise determined as:
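The original equation also appears to have been dropped in formatting; the standard gradient-descent update, consistent with the notation above, is:

$$ w_{jk}^{l} \leftarrow w_{jk}^{l} - \eta \, \frac{\partial L}{\partial w_{jk}^{l}} $$

where $\eta$ is the learning rate, and the partial derivative is computed by back propagation.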