In Part II of our discussion about programming, we talked about when some of the more popular programming languages came into existence, and how people used them. In this part of the discussion, we’ll talk a little about the environment the language C sprang from.
This series of articles will swing back to embedded programming on the Jetson! I think it is useful to understand the parallel to the way other systems have evolved over time. Unfortunately I just don’t have enough time to make the articles shorter.
Even More Background
You may have noticed the lack of any mention of the Windows operating system’s underpinnings in the last article. Up front: the original Windows 1.0 was built using C and assembler. Its predecessor on the IBM PC was the command-line MS-DOS, written strictly in assembler. Because of the very large user base of MS-DOS, Windows has always had a much harder time of things. Basically everything has to be backwards compatible, without the luxury of known hardware configurations. Everyone is familiar with the ubiquitous ‘Blue Screen of Death’, which was the ‘Twitter Fail Whale’ of its generation(s).
OK, the major desktop and network machines use C or variations thereof for low-level work. That is a compelling reason for using C to program embedded processors, so we’re done. Not so fast …
Here’s the original paper, “The UNIX Time-Sharing System” by Dennis M. Ritchie and Ken Thompson of Bell Laboratories, from when Unix was introduced to the world. The interesting part is the machine that Unix was developed on, the DEC PDP-11.
The PDP-11 was an innovative machine. The particular model in the paper, the PDP-11/40, cost about $40K USD at the time (1974). The PDP-11 was a 16-bit processor with 144KB of memory and a 40 MB hard drive. In case you don’t recognize the term 144KB, it means 144 kilobytes, kilo meaning thousand (1024, in memory terms). So 144 × 1024 bytes, or roughly 0.14 megabytes. Of that physical 144KB, the Unix operating system took up about 42KB, leaving a generous 100KB or so for application and user programming.
If you are a ‘modern’ day programmer, it is hard to wrap your head around those numbers. An entire operating system in 42KB? Today a $5 Raspberry Pi Zero has a 32-bit processor and 512MB of onboard RAM. As you might imagine, this meant that memory was an extremely precious resource, and because the processors were relatively slow, execution speed was a major concern. This is also your first clue as to the mindset of programmers of the day, and how some of those programming ideas have persisted over the last 40-odd years.
In those days most computer printers used continuous-feed paper with perforations at page breaks. This made it possible to print out what were called ‘computer listings’ or ‘program listings’ on one long sheet of paper. It was common to see people spread their program listings on their desks or down a hallway floor to read, mark up, and rework their programs. They would frequently act as if they were ‘computers’ themselves, stepping through the execution of critical pieces of code and jumping from page to page, imagining how the program executed. This is reminiscent of the way Napoleon used maps laid out on the floor to visualize battles before going to war, getting a feel from the ‘virtual reality’ technology of the day.
Remember that the Unix OS executable was only 42KB, which means the source code for the entire system was probably in the 100K-line range spread across 4400 files. My guess would be < 10K lines of code for the kernel. One could easily print the entire kernel in a listing and learn it. A much different environment from today, where the Linux kernel is around 15 million lines of code. It’s not quite an apples-to-apples comparison, as driver support in modern Linux is around 8 million LOC and another 2 million LOC is architecture support. The core Linux kernel itself is probably around 200K LOC. Still, you would need quite a few trees and an awfully long hallway to print it all out.
At the time of the Unix paper, there were 75 users of Unix. There are slightly more now. The main point here is that the developers of Unix had a different level of familiarity with the system than people can have now. At the time, they were dead serious when they referred people to the source code as the real documentation on how the system works. The system was small and simple enough that one person could understand it by reading the source. Of course, such advice travels through the ages and eventually becomes part of the culture. People to this day will tell others to look “through the source code” to understand the OS.
Eventually, about 20 years in, enough people had developed a difficult relationship with Unix that they responded with a friendly book, “The UNIX-HATERS Handbook”. Money quote:
Our grievance is not just against Unix itself, but against the cult of Unix zealots who defend and nurture it. They take the heat, disease, and pestilence as givens, and, as ancient shamans did, display their wounds, some self-inflicted, as proof of their power and wizardry. We aim, through bluntness and humor, to show them that they pray to a tin god, and that science, not religion, is the path to useful and friendly technology.
Some of the criticisms from the book have been addressed in the following two decades. The book was written before the Internet became popular, so it was possible to have an actual polite discourse about the subject. But at the end of the day, most users don’t really get a say in the matter.
For example, I have no idea how this web page is being served, by what kind of machine, or with what kind of software. All I know is that I upload this amazing content to a service, and the service delivers it on demand. I can control parts of the interaction, obviously (such as how I produce content), but I don’t really control the OS in the data center, or the set-top box, the phone, or the machine that you’re viewing it on. Back to programming …
One way to look at C is as a portable assembler. The language itself is pretty simple, and it is easy to imagine structs and such mapping directly to hardware. It is also very assembler-like in that there is no safety net: no checks on memory allocation and deallocation, no range checking on structure or array loads and stores, and no protection against illegal memory access from invalid pointer arithmetic.
It’s also easy to be seduced by the romantic idea of a handful of really smart people writing a bug-free, beautiful, and efficient operating system in C which runs in 42KB. You can even imagine that scaling to some degree and still have warm fuzzies. Then you start reading stats from the Linux Foundation:
Regular 2-3 month releases deliver stable updates to Linux users, each with significant new features, added device support, and improved performance. The rate of change in the kernel is high and increasing, with over 10,000 patches going into each recent kernel release. Each of these releases contains the work of over 1,400 developers representing over 200 corporations.
Since 2005, some 11,800 individual developers from nearly 1,200 different companies have contributed to the kernel. The Linux kernel, thus, has become a common resource developed on a massive scale by companies which are fierce competitors in other areas.
“The rate of Linux development is unmatched,” the foundation said in an announcement accompanying the report. “In fact, Linux kernel 3.15 was the busiest development cycle in the kernel’s history. This rate of change continues to increase, as does the number of developers and companies involved in the process. The average number of changes accepted into the kernel per hour is 7.71, which translates to 185 changes every day and nearly 1,300 per week. The average days of development per release decreased from 70 days to 66 days.”
They’re very proud. This also leads to a “push forward” culture in which, whenever a bug or issue is encountered, the first question asked is “Do you have the latest updates?” What does “latest” mean in a context where roughly 8 changes are made every hour? And after you “update”, how do you know other issues weren’t introduced in unrelated areas?
Remember in Part 1 of this series where I stated that sometimes Cluster Fucks are created when the technology you’re building on invites disaster?
Take a quick glance at the OpenSSL vulnerabilities list and search for terms like ‘overflow’, ‘underflow’, and ‘heap corruption’. SSL is kinda important for everyone; it would be really swell to be able to rely on it without worrying about hackers attacking you. The programming language and environment have not served it well.
In part, the ‘deficiencies’ of C are magnified because there is a new programming paradigm in which thousands of people contribute to software projects. The whole idea of thousands of contributors with disparate programming backgrounds is a new phenomenon, which suggests that programming languages for such use need a bit more safety than something like C provides.