Are you fascinated by AI and want to learn about the software used to build deep neural networks like the ones that power ChatGPT and Stable Diffusion? Read on to hear about our WAT.ai team members' experiences working on our project comparing the performance of various deep learning frameworks, including TensorFlow, PyTorch, and more!
WAT.ai is a student design team focused on building undergraduate talent in artificial intelligence at the University of Waterloo. By participating in 8-month-long projects, team members build relevant and impactful experience working on research and industry applications of AI. The Deep Learning Frameworks Comparison (DLFC) project was one such project, running from September 2022 to May 2023.
Many of our core members joined the frameworks team because they were passionate about AI and wanted to develop skills in deep learning. People with different backgrounds and skill sets can join the teams; we had several new members who had never touched deep learning and were eager to learn! Team members get to collaborate with like-minded people and learn from each other's experiences. Plus, the project is a great thing to highlight to employers on a resume!
WAT.ai members also get a chance to attend cool events like the Canadian Undergraduate Conference for AI (CUCAI). WAT.ai also hosts its own education sessions and events at the University of Waterloo, open to anyone interested!
The goal of our project was to compare the speed of various software frameworks used to build neural networks. Currently, the two most popular deep learning frameworks are TensorFlow and PyTorch, both in the Python programming language. However, there are many other open-source alternatives in both Python and Julia that often claim speed advantages over the others. We built a few networks from scratch and evaluated their training and testing speed on standard datasets. Each team member was responsible for a different framework: TensorFlow, PyTorch, MXNet, and JAX in Python, and Flux and KNet in Julia.
We asked our team members about their experiences on the project; here's what they had to say! Note: the responses were edited for length and clarity.
Can you introduce yourself?
I'm a 1st year MASc student in Systems Design Engineering and a Technical Project Manager (TPM) of the DLFC team.
I'm a 4th year BMath student in Data Science and a TPM of the DLFC team.
I'm a 4th year BASc student in Nanotechnology Engineering and a Core Member of the DLFC team.
I'm a 4th year BASc student in Mechatronics Engineering and a Core Member of the DLFC team.
I'm a 3rd year BASc student in Management Engineering and a Core Member of the DLFC team.
Why did you join WAT.ai?
I joined WAT.ai because I wanted to be part of a new design team focused on developing skills in the AI field through unique projects.
I'm quite interested in working with data but lacked experience on the AI side of things. I wanted to get more experience in AI and ML, and I wanted a way to apply what I learned in my coursework.
I have mostly delved into AI projects and learning on my own, so this gave me the chance to work with a group on a project. I also saw this as a great learning opportunity and an introduction to the various frameworks that are seen in industry.
I joined WAT.ai because I was interested in joining a design team working specifically on AI, and to further enhance my knowledge in the field as I wrap up my studies.
I wanted to gain experience learning more about deep learning and developing my skill set in AI. I also wanted to gain more technical experience that I could put on my resume, which also contributed to why I joined WAT.ai.
How was your experience with your framework? Did you enjoy using it?
As a TPM, I worked with several frameworks to help people with their work. I enjoyed learning how each of the frameworks does things differently, and I gained an appreciation for how the popular frameworks like PyTorch and TensorFlow have made an effort to make their APIs easy to use.
I worked with Julia's Flux library for this project. This was my first experience with Julia, so it was a lot of fun picking up a new programming language. I also think I learned more about neural networks through this library because the models weren't as easy to implement as in some of the popular Python frameworks. Implementing everything at a lower level gave me a deeper understanding of why certain things work.
This was my first experience using MXNet, but I'll say it was a fun one. I was fortunate that it shares a lot of similarities with PyTorch, so I was able to carry over a lot of my knowledge and ideas and apply them to MXNet.
I focused on JAX. I have used other frameworks before, such as TensorFlow, Keras, and PyTorch, but never JAX. Overall, it was much tougher than any of the other frameworks, since you have to build all of the components from scratch: defining the vectors and matrices for all of the weights and parameters, manually updating them, and applying the activation functions. While I did enjoy working on the network at a lower level, JAX was difficult and harder to learn.
This was my first experience using KNet, a framework built on Julia. I found KNet interesting because it pushed me out of my comfort zone and got me to learn a new programming language. KNet is known to be geared towards research and isn't heavily used in industry. Overall, I had challenges using KNet, but I learned a lot and developed my skill set.
What were some of the struggles you found using your framework, and how did you resolve them?
I learned a lot about the functional perspective of creating deep learning models through working with Flux, KNet, and JAX. It was a very different way of thinking compared to the object-oriented viewpoint of most of the Python frameworks. JAX was probably the most difficult to use, and I had to consult its documentation a lot while developing the models and training code. However, it paid off to see the performance advantages of using the JAX framework!
Since Julia is mostly used in academia rather than industry, it was quite hard to find tutorials on doing specific things, unlike PyTorch and TensorFlow, which have plenty of tutorials. However, Julia, and specifically Flux, has extensive documentation, and that helped me resolve some of the problems. It also taught me the importance of having good documentation and of going through documentation in detail, as it can provide a lot of helpful information.
I think the main struggles I faced mostly had to do with running into outdated functions or documentation. Several things had either never been implemented in MXNet or were no longer present, and workarounds took a while to figure out.
As mentioned, the low-level nature of JAX made the initial learning curve much more difficult. Moreover, the documentation, although not bad, was lacking for certain types of models. That, plus the limited examples, made development difficult. Ultimately, the only way to resolve this problem was to gain a really strong understanding of the underpinnings of JAX and to be very knowledgeable about the details of neural networks. You just have to put in the grunt work.
When I encountered an issue, I immediately checked the KNet documentation for a possible solution. Most of the time, that worked. However, sometimes I would discover that the documentation was outdated. That led me to ask for help from peers who had experience using Julia and a similar framework. They helped out and provided feedback on my implementations for various tasks.
What were some key takeaways you got from being in WAT.ai?
There are a lot of keen students interested in learning about AI, and I'm excited to continue with this team next year to develop more exciting projects!
Firstly, I learned how to pick up a new programming language quickly. I also got to improve my skills at reading and understanding documentation, while learning the importance of having good documentation. Most importantly, I got to learn quite a bit about image processing and implementing neural networks!
My biggest takeaway was that there is quite a lot I still have to learn, and I'm excited for this learning journey.
For one, I got a better understanding of neural networks. I got exposed to JAX, which requires more work but ultimately gives you more precise control; if you want to build more performant solutions, which our findings also support, JAX appears to be an excellent choice. Overall, it reminded me of just how much more there is to learn.
I was able to learn and further develop my skill set in deep learning. I was also able to network with other individuals and meet a lot of cool people. Through WAT.ai, I was able to learn what kind of person I am, allowing me to grow and learn in machine learning.
Now that you've heard a bit more about our team, let's dive a bit deeper into our project. As a recap, each of our team members used a different framework to build neural networks from scratch. We then compared the training and evaluation speed of the networks. We also researched differences in the mechanics of the frameworks and compared how easy they were to use as a developer.
The Python-based frameworks TensorFlow and PyTorch are "beginner friendly" and good choices for someone taking their first step into deep learning. Our members also worked with more challenging Python frameworks like MXNet and JAX. JAX, in particular, takes a functional programming approach to writing neural network code, which is quite different from the object-oriented approach of TensorFlow and PyTorch.
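To give a flavour of that difference, here is a minimal sketch (not taken from our project code) of a single dense layer written in JAX's functional style: the parameters live in a plain dictionary and are passed explicitly to a pure function, whereas in TensorFlow or PyTorch they would typically live inside a layer object. The layer sizes here are illustrative only.

```python
import jax
import jax.numpy as jnp

def init_dense(key, in_dim, out_dim):
    """Create the parameters for one dense layer as a plain dict."""
    w_key, _ = jax.random.split(key)
    return {
        "w": jax.random.normal(w_key, (in_dim, out_dim)) * 0.01,
        "b": jnp.zeros(out_dim),
    }

def dense_apply(params, x):
    # A pure function: the output depends only on the inputs passed in.
    return jnp.dot(x, params["w"]) + params["b"]

params = init_dense(jax.random.PRNGKey(0), 784, 128)  # illustrative sizes
y = dense_apply(params, jnp.ones((1, 784)))
print(y.shape)  # (1, 128)
```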
Additionally, some of our members took on the extra challenge of learning frameworks based on Julia, namely Flux and KNet. Julia is a newer programming language that has many similarities to Python, MATLAB, and R, and is designed for performance, especially in numerical and scientific computing. So on top of learning about neural networks, those team members also learned a new programming language.
Each team member used their framework to build, train, and test a network on two tasks: a multi-layer perceptron (MLP) applied to the MNIST dataset and a ResNet convolutional neural network applied to the CIFAR-10 dataset. We'll break down those two tasks below!
Task 1: MLP on MNIST
The first network we built was a multi-layer perceptron (MLP) to do image classification on the MNIST handwritten digits dataset. MLPs are considered the simplest type of neural network, though the same structure is also used within more complex architectures. An MLP consists of an input layer, one or more hidden layers, and an output layer. Each layer consists of artificial neurons, which process their inputs by computing a weighted sum and then applying an activation function. The output of each neuron is sent to all neurons in the next layer. In our task, we use the MLP to classify a handwritten digit from the MNIST dataset into one of ten classes. Through building an MLP, we also learned about other key concepts in deep learning such as data splitting, loss functions, optimizers, and metrics.
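As an illustration, here is a minimal PyTorch sketch of the kind of MLP described above; the layer sizes and activation are illustrative, not necessarily the exact configuration we used in the project.

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Flatten(),         # 28x28 MNIST image -> 784-dimensional input vector
    nn.Linear(784, 128),  # hidden layer: weighted sum of the inputs
    nn.ReLU(),            # activation applied to each neuron's output
    nn.Linear(128, 10),   # output layer: one score per digit class
)

logits = model(torch.randn(1, 1, 28, 28))  # a fake batch of one image
print(logits.shape)  # torch.Size([1, 10])
```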
Task 2: ResNet CNN on CIFAR-10
The second network we built was a convolutional neural network (CNN) to do image classification on the CIFAR-10 tiny image dataset. CNNs use the convolution operation as a way to efficiently process images by applying shared filters across the spatial dimensions. Similar to how images have RGB color channels, CNNs also process images in multiple channels and can combine information across channels to create higher-level feature representations, which are fed into an output classification layer. For this network, we implemented the ResNetv2 architecture. ResNet is a popular architecture used for image processing and was the inspiration for many innovations in subsequent neural network architectures. Through this task, we also learned about data augmentation and how to use GPUs for neural network training.
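For illustration, here is a minimal PyTorch sketch of a pre-activation residual block, the building pattern that ResNetv2 is built from; our actual project architecture, channel counts, and hyperparameters may differ.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """A pre-activation (ResNetv2-style) residual block: BN -> ReLU -> Conv, twice."""
    def __init__(self, channels):
        super().__init__()
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)

    def forward(self, x):
        out = self.conv1(torch.relu(self.bn1(x)))
        out = self.conv2(torch.relu(self.bn2(out)))
        return out + x  # skip connection: add the input back to the output

block = ResidualBlock(16)
y = block(torch.randn(1, 16, 32, 32))  # a CIFAR-10-sized feature map
print(y.shape)  # torch.Size([1, 16, 32, 32])
```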
Results
After testing all of the frameworks, we found that JAX was one of the fastest frameworks on both the MLP and CNN tasks, with an impressive 8.5x to 17x speedup over the other frameworks in per-epoch CNN training time on GPU. For more details on the code, our methods, and results, please refer to our GitHub page and the paper we submitted to CUCAI.
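For context, the per-epoch training time metric simply measures how long one full pass over the training data takes. A hypothetical harness like the one below is one way to collect it; our actual benchmarking code is in the GitHub repository linked above.

```python
import time

def timed_epochs(train_one_epoch, num_epochs=5):
    """Time each call to train_one_epoch, a callable that runs one pass over the data."""
    times = []
    for _ in range(num_epochs):
        start = time.perf_counter()
        train_one_epoch()
        times.append(time.perf_counter() - start)
    return times  # per-epoch wall-clock times in seconds
```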
That's the end of our Project Talks post! Thank you so much for taking the time to read our blog. Feel free to contact our members with questions through email or LinkedIn. We hope you're eager to join the team and be our companions on the journey of deep learning! If you're interested in learning more about what we did, check out our project's GitHub and paper.
Lastly, if you want to stay up to date with WAT.ai, we encourage you to follow the WAT.ai LinkedIn page. Stay tuned for posts about how to join new projects and AI education sessions running in September 2023!