Reflections on a year of reading Knuth

Want to feel stupid? Read Knuth! Want to get smarter? Read Knuth! For a long time, I've seen other programmers write in awed, hushed tones about the multi-volume series of books written by Donald Knuth titled "The Art of Computer Programming" (TAOCP). As somebody who genuinely enjoys reading computer programming books — I actually lost track of time reading Richard Steven's "TCP/IP Illustrated" many years ago — when I consistently hear so many positive things about a book or a series of books, it makes it onto my to-read list. I finally got around to picking up a copy of the first three volumes of TAOCP little over a year ago, and I just finished the first volume, having averaged about 30 minutes a day of reading or working problems.

These books are written college-textbook style: a few pages of exposition followed by some sample exercises to help you solidify the concepts presented. These books include a lot of sample exercises — in some cases, there are more pages of exercises than explanatory material. Almost all of them have answers included in the back of the book if you get stuck.

I worked, or at least attempted to work, every single problem in the first volume. In some cases I settled for just understanding what the question was trying to ask for. In some cases I failed even to accomplish that (don't judge me until you try it yourself). Each problem is assigned a difficulty rating from 0-50 where 0 is trivial and 50 is "unsolved research problem" (in the first edition, Fermat's last theorem was listed as a 50, but since Andrew Wiles proved it, it's bumped down to a 45 in the current edition). I think I was able to solve most of the problems rated < 20 — it was hit and miss beyond that. Most of the problems fell into the 20-30 difficulty range, but Knuth's idea of "difficult" is subjective, and problems that he considers to be of average difficulty ended up stretching my comparatively tiny brain painfully. I've never climbed Mount Everest, but I imagine the whole ordeal feels similar: painful while you're going through it, but triumphant when you reach the pinnacle.

The first third or so of the book is pure math: specifically, discrete math. As a computer programmer by trade and by education, I've always had a tumultuous relationship with math. When I was an undergraduate studying computer science, I dutifully took all the required math courses, including three semesters of brain-crushing calculus. At the time, I viewed advanced, college-level math as sort of a "dedication test": I didn't see it as practical, or useful, or applicable to my field of study, but instead as a weed-out program to see who was serious about proving their worthiness for this degree. The degree was a prize to be obtained, but you had to jump through some hoops to obtain it. Although I've since come around and accepted that math is useful — and I'm damned glad I did spend a year and a half learning calculus — Knuth manages in this book to underscore, over and over again, that math is practical and eminently applicable to the field of computer science. This was by far the hardest section of the book for me: Knuth considers a proof by induction, for example, a triviality, but I personally still have to stop and think to produce one. Some of the problems depend on what he calls "higher math", which is usually calculus with a bit of linear algebra sprinkled in, but always expertly applied. I muscled through, trying to solve each exercise. I got pretty lost in the chapter on generating functions, but I learned a lot from the section as a whole. There were plenty of topics in here that I considered myself familiar and proficient with, like number theory and binomial coefficients, but Knuth had more to teach me than I ever would have imagined.

After about a hundred or so pages of pure math, he finally gets around to the computer programming part and starts by presenting a hypothetical computer architecture and associated assembler language in which all of the code samples in the remainder of the book are written. It seems that a lot of the people who complain about this series of books complain specifically about his decision to present code samples in assembler — and not just any assembler, a pretend one that he made up! Knuth gets a little defensive about that decision in the preface, pointing out that if he had written the book using the popular high-level language of the time, all of the examples in the first edition would have been in Fortran, then re-written in C for the second edition and again in Java for the third. I'm actually glad that he presented everything in a made-up, "ideal" assembler language: after having spent time learning how to code in it and understanding how it worked, there was no "magic" behind any of the examples. There are freely available emulators for the MIX assembly language (I used MDK), so you can (and I did) execute the sample programs and test out your answers to the sample exercises.

The later parts of the book were easier going for me, once I got the hang of Knuth's MIX assembler, but that's not to say that they were easy! For one thing, MIX is based on ancient assembler concepts. I learned assembler in the late 80's on a 6502, so I had at least some notion of registers and overflow flags and such, but MIX is closer to, say, the IBM 700 series (that predated even System/360!). The first edition of the first book was written in 1963 and follows the state of the art for computer architecture at the time. There's a lot you're going to have to learn about this architecture and development style if you're younger than about 60. Knuth thinks nothing of writing self-modifying code, for instance, and doesn't even call it out as a "trick" when he does it; to him, it seems perfectly natural. Everything operates in "words" of five (six-bit!) bytes and a lot of the instruction set is dedicated to shifting and masking these words in ways that we would use AND/OR and XOR to accomplish in a modern computer. Knuth has since updated the architecture and created MMIX for the later volumes of the book, and has stated that he plans to go back and rewrite the first three volumes to be use MMIX as well, but for now, everything in the current edition of the first three volumes is still in MIX.

Finally, after spending half the book covering the requisite mathematics and presenting the fictitious platform on which the remaining examples are shown, he starts to cover what you might consider "basic" programming techniques like memory allocation, stacks, queues, linked lists, trees and garbage collection. I thought I knew these concepts inside and out and that there was nothing else I could possibly be taught — how wrong I was! Again, I was in awe of just how deep Knuth could manage to go down each rabbit hole to examine the limits to which all of these concepts can be pushed. Have you ever wondered how many different binary trees there are? Or how garbage collection works when there's almost no memory left for it to use? Here you'll find the distillation of 50 years of research on the boundaries of computer science.

I've seen it suggested that these books be treated as reference books rather than read cover to cover. I'm inclined to disagree. Supposing that you had a copy on your bookshelf, and you were trying to implement a garbage collection routine. You could glance through the table of contents, see that garbage collection is discussed in section 2.3.5, and flip to the corresponding page. But if you did, you'd discover that the text refers back to some of the mathematical results that were presented and proven in previous sections (sometimes even in the exercises of the previous sections), and all of the examples are written out in MIX assembler. To make much sense of it as a reference work, you'd have to at least work through the MIX assembler description and some of the math sections. All in all, I think this book was written to be read all the way through, but also in a way that you could easily refer back to it years later, after you'd absorbed it all once.

I've also seen people mention that they read it, but only skimmed the exercises. I guess everybody is different, but I can't fathom trying to ingest all or even most of this without working through the end-of-chapter exercises. In many cases — maybe even most — I found myself still confused after the end of the chapter, but that working through the exercises solidified the concepts that I was still unclear on in a very controlled way. Quite a few sections are nothing but proofs of complex theorems, and the exercises are there to show you what the theorems are actually useful for. Part of Knuth's brilliance is in leading you down a path of learning such that you walk away with a deep understanding of the material, and actually retain it months or even years later: but you have to follow along with him to get the most out of it. I'd go so far to say that most of the valuable content of the book is in the exercises: in spite of the page count, most of it isn't actually published!

I definitely plan to finish reading the series. It may well be 2020 before I'm finished, but in the end I think it will be well worth the time. No, there are no "marketable skills" to be obtained here. There's no discussion on threading or security. There's nothing that you'll learn from Knuth that you can list on your resume. Nobody is hiring MIX programmers (if they were, I wouldn't apply anyway). It may even well be that I could have just as profitably spent the time I spent reading Knuth's "Art of Computer Programming" doing sudoku puzzles, or crosswords, or reading novels or even reading comic books, although I do feel like I learned a lot and am a better programmer for having completed it.

Add a comment:

Completely off-topic or spam comments will be removed at the discretion of the moderator.

You may preserve formatting (e.g. a code sample) by indenting with four spaces preceding the formatted line(s)

Name: Name is required
Email (will not be displayed publicly):
Comment:
Comment is required
xerxenka, 2017-01-04
Vol 3 also has some interesting Tree Balancing tricks (6.2.3) I've never seen used before.

If you liked the Math preliminaries in Vol 1 there's a draft of 4B out now on Knuth's site with 'Mathematical Preliminaries Redux' another 100 or so pages of new material covering martingales and probability.
Josh, 2017-01-17
Thanks - I'm looking forward to it, although I'm a long, long way from volume 3. I'm about halfway through volume 2 now, just finishing up the floating point handling part. I imagine it will take me the rest of the year to finish.
xerxenka, 2017-05-28
15-859 Theoretical Cryptography at CMU still uses the tests in Vol 2 - Chapter 3.3.x as part as the course curriculum.

TAOCP is also the only book(s) I am aware of that cover historical references of all material and even has exercises for them, like 7.2.1.7 which has a very thorough chapter on the history of generating combinatorial patterns "Make a patch to Takakazu Seki's 17th century algorithm so that it will complete".
Victor, 2017-08-17
If someone where to start going through TAOCP now, do you suggest going through Concrete Mathematics before it?
xerxenka, 2017-09-02
The official pre-reqs are listed in the Preface, which is basically some exposure to undergrad Calculus (Knuth learned from Thomas' Calculus 2nd edition), and experience writing a program or two. Concrete Math (CMATH) is Chapter 1 Basic Concepts rewritten with additional insight/material and more exercises. As you go through the volumes, Knuth will refer to CMATH chapters you may want to look at so you'll likely end up reading a lot of anyway such as I did when trying to understand generating functions but you may not need to.

To satisfy the second requirement, search for "15-213 schedule CMU" to get the class page for Intro to Computer Systems which has free recorded lectures on youtube that cover bit level representation like Two's Complement, X86-64 assembly language, stack frames, cache memories (Knuth goes into the principle of locality in later volumes) other program optimization, and basic architecture so you can understand his MIX arch. Like Josh explains don't expect to do this at any sort of fast pace, I started ~6 years ago and I just this year got to the latest published fascicle as Knuth sends you off on many tangents each chapter with his additional book/paper recommendations.
Farrukh, 2019-04-07
Thanks Joshua.

This post is incredibly motivating.
Josh, 2019-04-08
Stick with it - the books are great to read!
steviekong, 2019-08-03
Thanks for the insight. Just like you I've seen a lot of online forums referring to this book as a reference book which made me apprehensive about reading it.

This post really encouraged me and I'm going to dive head first into this book.
Josh, 2019-08-06
You won't regret it!
Joe Zbiciak, 2020-10-15
My reading process for TOACP has been "progressive scan." Each new volume, when I encountered it, I read through but didn't really drill down into. Some concepts are easier for me to wrap my head around, and others are either just over my head, or I'm not sufficiently motivated to drill into.

But, by getting all of the pages in front of my eyeballs, I still learn a lot. I would never have learned about Benford's Law, for example, without reading the historical anecdote about IBM's hex float, and so on.

Another example: The section on sorting networks covered an incredible amount of detail that I simply wasn't prepared to absorb. But, when I needed to implement a rather expensive 2-D median filter on a statically-scheduled vector machine, I was able to come back to the text, drill down on the portions that I really needed, add my own twists, and turn in a 25x speedup on a critical algorithm.
I've also made use of graph algorithms, random number generators, numerical algorithms, etc. based on "I'm sure I saw that in TAOCP; let's go find it again."

As for the exercises: There's a significant amount of material covered _only_ in exercises. If Knuth were to remove the exercises, he'd have to migrate a lot of material - perhaps more than half - into the main text in order to retain the value. (I mention this, as Knuth mentioned in one of his oral histories on developing TAOCP, that idea was floated as a way to bring the overall size of the work down.)

I have worked some of the exercises, but I haven't invested the time to truly work them rigorously.

So I guess that puts me in a middle camp: Read the whole thing, but not for 100% detail the first time.

I started re-reading TAOCP earlier this spring, although I stalled out about halfway through mainly due to life getting in the way. I actually started with Volume 4B Fascicles 5 and 6 first, and then went back to Volume 1. I'm currently partway into Volume 2, in the heart of random number generator testing. I plan to resume before too much longer.
I definitely spent more time working through the mathematical preliminaries than I did when I first purchased the books. Knuth uses the binomial operator in ways I'd never imagined. Also, I now sorta get generating functions, but I admit I do not have command of the concept. Before, it was 20,000ft over my head.

Maybe on my next re-read, I'll dig into the exercises further. I need to clear some other hobby project time demands off my calendar first, though.
Porrapat, 2021-08-02
Hi Joshua, I'm from Thailand. I dedicated to read this too. Your post is the top of some inspirational for me to study on these books that I have followed around last year ago. And this is time I think I should write something back. Anyway, there are so many things to talk that I would like to invite you to join these Facebook group. I'm mostly on there.

Meetup : The Art of Computer Programming Study Group
Youtube : TAOCP - The Art Of Computer Programming Reading Group
Facebook : The Art Of Computer Programming

Your comment don't allow me to link that but I think it's easy for you all to Google.

Welcome to you all and I will study most other of your post too.

Many Thanks
Joshua Davies, 2021-08-09
I will definitely have to check those out! I feel like I could benefit from going back and re-reading them all again myself at this point.
My Book

I'm the author of the book "Implementing SSL/TLS Using Cryptography and PKI". Like the title says, this is a from-the-ground-up examination of the SSL protocol that provides security, integrity and privacy to most application-level internet protocols, most notably HTTP. I include the source code to a complete working SSL implementation, including the most popular cryptographic algorithms (DES, 3DES, RC4, AES, RSA, DSA, Diffie-Hellman, HMAC, MD5, SHA-1, SHA-256, and ECC), and show how they all fit together to provide transport-layer security.

My Picture

Joshua Davies

Past Posts