Designing a Modern CS Curriculum

How do we effectively teach computer science? It is not so simple. My university struggled to teach the fundamentals and skipped most of the implementation work. I think this is mostly due to poor planning and a lack of effort. In this post, I will detail my gripes with computer science education, drawing on my experience at Vanderbilt, and propose a set of alternatives that I believe would improve the current curriculum.

My experience learning computer science has been lackluster. Looking back, I would have majored in math. The core of my programming skill was developed outside the classroom, whereas I think mathematics requires the guidance of a professor in many more situations. For example, I believe it is much harder to become a good proof writer than a strong programmer. This is partly due to the open-source, collaborative environment so readily available in computer science. I can write code for an open-source GitHub project, get feedback from a senior engineer, and revise my code within a few hours, while it may take days to hear back on a mathematics question posted to Stack Exchange. This is an unbelievable advantage for computer science over other fields. Unfortunately, as a Vanderbilt student, I have been required to contribute to exactly zero open-source projects. Instead, my assignments revolved around filling in boilerplate code where most of the project was already written for me. Furthermore, as a teaching assistant, it devastated me to see many students get through the core CS classes and only explore the terminal for the first time as third-years. Additionally, we do not touch GitHub until the third course in the core sequence, meaning that a student who wanted to contribute to an open-source project would not know how to do so until at least their sophomore year (without learning it on their own). That absolutely disgusts me. So my point is that computer science curricula need to do three things: teach modern tools and bare-bones fundamentals, require students to contribute to open-source projects for feedback, and provide an environment where students are challenged.

Now that we have identified a set of goals, I would like to propose my version of a modern curriculum. Personally, I believe that students should understand a computer from the bare bones and build upward. I would like to reference the curriculum developed by one of the best modern hackers, George Hotz. In summary, here are the main ideas of his curriculum:

  • Start with transistors and FPGAs. One should know some Verilog and understand how a transistor works on a high level. Make an LED flash.
  • How do we get a computer processor to work? Write an ARM (open source) processor using Verilog.
  • How does our code run in that processor? Write an assembler and a simple compiler.
  • Let’s build a real compiler. Rewrite the compiler using a functional language. Build out a linker and maybe even some more complex ideas such as malloc.
  • Now, we can run code. Let’s start to build some useful tools for the user. Write an OS, a standard library (libc), a filesystem, and some basic commands (ls, cat, rm, touch, etc.).

I think that for the first two years, this is excellent. Fantastic, in fact. For any of these components to work together, they need to be really well designed. That means the student will need to learn modern languages, version control, hardware interfacing, design patterns, data structures, functional and imperative programming, Linux commands, etc. In addition to practical skills, this will also give students insight into the current tech stack and where new ideas can be built for the future of computing. Obviously, this is a challenging course load to undertake, but with the number of resources available, I believe it is feasible.
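
To make the "write an assembler" step from the list above a bit more concrete, here is a minimal sketch in Python. The three-instruction ISA, its opcodes, and the 16-bit encoding are all invented for illustration; they are not ARM and not taken from Hotz's curriculum. The point is only the shape of the problem: one pass to find labels, a second pass to encode.

```python
# Hypothetical toy ISA (an assumption for this sketch, not ARM):
#   LOADI rd, imm   load an 8-bit immediate into register rd
#   ADD   rd, rs    rd = rd + rs
#   JMP   label     jump to the instruction index of label
# Each instruction is one 16-bit word: 4-bit opcode, 4-bit rd, 8-bit operand.
OPCODES = {"LOADI": 0x1, "ADD": 0x2, "JMP": 0x3}


def parse_reg(token):
    """Turn a register name such as 'r3' into its number, 3."""
    if not token.startswith("r"):
        raise ValueError(f"expected a register, got {token!r}")
    return int(token[1:])


def assemble(source):
    """Assemble toy assembly text into a list of 16-bit words."""
    # Pass 1: strip comments and blank lines, record where each label points.
    labels = {}
    instructions = []
    for raw in source.splitlines():
        line = raw.split(";")[0].strip()
        if not line:
            continue
        if line.endswith(":"):
            labels[line[:-1]] = len(instructions)
            continue
        instructions.append(line)

    # Pass 2: encode each instruction, resolving labels recorded in pass 1.
    words = []
    for line in instructions:
        mnemonic, _, rest = line.partition(" ")
        mnemonic = mnemonic.upper()
        operands = [op.strip() for op in rest.split(",") if op.strip()]
        opcode = OPCODES[mnemonic]
        if mnemonic == "LOADI":
            rd = parse_reg(operands[0])
            operand = int(operands[1], 0) & 0xFF  # base 0 accepts 5 or 0x10
        elif mnemonic == "ADD":
            rd = parse_reg(operands[0])
            operand = parse_reg(operands[1])
        else:  # JMP
            rd = 0
            operand = labels[operands[0]]
        words.append((opcode << 12) | (rd << 8) | operand)
    return words


PROGRAM = """
start:
    LOADI r1, 5      ; r1 = 5
    LOADI r2, 0x10   ; r2 = 16
    ADD r1, r2       ; r1 = r1 + r2
    JMP start
"""
```

Calling assemble(PROGRAM) returns four machine words, one per instruction. Doing label resolution in a separate first pass is what lets a jump refer to a label defined later in the file, which is the first design decision a student hits when writing a real assembler.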

Great, now we have built a strong core for our computer science curriculum. How do we start to specialize? Personally, my expertise is in machine learning, so I will build a track for that. When a student graduates with a machine learning specialization, they should be able to do two things: 1) read modern machine learning papers and understand their techniques, and 2) implement those papers. From there, the student will be able to do almost anything in machine learning.

To achieve this, I propose a series of 5 courses. The first is a pure statistics course. It has no programming, but to interpret modern ML papers, one had better understand regression, support vector machines, neural networks, and gradient descent, as well as some deeper topics such as kernel reduction, Lipschitz bounds, and the universal approximation theorem, all from a mathematical perspective. Concurrently, I believe students should take a bare-bones implementation course. This means implementing all the topics above starting from just NumPy and moving toward integrating popular libraries such as PyTorch or scikit-learn. Third, we need to learn about modern GPUs. Like it or not, machine learning is about building models that can learn efficiently, so we should understand programming in CUDA, how matrix multiplication is computed, and ultimately why our model is fast or slow. Fourth, it is imperative to understand that neural-network-based approaches are the future. A modern machine learning expert should understand the leading approaches to vision, natural language processing, and robotics, and so should be familiar with the ideas behind CNNs, transformers, and actor-critic algorithms. This fourth course should be titled "Topics in Neural Network Techniques and Their Implementations." Lastly, I would offer a final course to seniors that puts everything together. Every two weeks, they look at a new top-tier paper in each domain and implement it to reproduce the results. This will teach them model organization, TensorBoard debugging, and, of course, the patience required for training. Every project or assignment in this final course should also be open source. Imagine how much better the ML community would be if there were 100 seniors every year verifying top-tier papers (and think about the exposure those students would receive).

That is all. 9 courses to create a machine learning expert. Now, obviously, those 9 courses are certainly challenging and require a great deal of work, but that is it.
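
As an illustration of what the first week of the implementation course might look like, here is a minimal sketch of gradient descent for one-variable linear regression. I have written it with plain Python lists rather than NumPy so that it runs with no dependencies; in the course itself the same loops would be vectorized with NumPy. The function name, learning rate, and step count are my own choices for the sketch, not anything prescribed.

```python
def fit_line(xs, ys, lr=0.05, steps=2000):
    """Fit y ~ w * x + b by batch gradient descent on mean squared error.

    lr and steps are illustrative defaults chosen for small, well-scaled data.
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Residual of the current prediction on every data point.
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradients of L = (1/n) * sum(error^2) with respect to w and b.
        grad_w = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2.0 / n) * sum(errors)
        # Step against the gradient.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Noise-free data generated from y = 2x + 1, so the fit should recover it.
XS = [0.0, 1.0, 2.0, 3.0, 4.0]
YS = [2.0 * x + 1.0 for x in XS]
w, b = fit_line(XS, YS)
```

On this data the loop converges to a slope close to 2 and an intercept close to 1. Writing the gradient by hand once is the whole point of the exercise: after that, a student knows exactly what a library optimizer is doing on their behalf.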

On a larger level, I would offer a series of tracks that repeat this same idea. One can develop similar tracks for systems and programming languages, cybersecurity and reverse engineering, networks, web design, and even an “I want to get a job at Google” track for software engineers. The central themes would again be open-source work and implementation of modern technologies (examples would be writing a web browser or a scaled-down form of Ghidra, depending on the course). From here, you have a strong computer science curriculum. It provides students with a core understanding of the tech stack and forces them to learn most of the modern paradigms. Then, it allows students to specialize while maintaining their fundamentals. I do understand that this is a very challenging undertaking, but if the goal is to create the best CS students in the world, hard work is required.

That is all for this post. Please feel free to send suggestions, or anything I missed, to luke.bhan@vanderbilt.edu.