Wednesday, June 29, 2005

How Programming Has Changed

I was just reading a book on programming in Perl. It seems like an odd choice for the author. If you're going to teach a programming language these days, it's probably Java, or if you're more daring, Scheme, or some functional language. To pick a scripting language that has difficult semantics, well, it makes you wonder if the author has tried teaching it in a classroom situation. It's funny how people write programming books geared towards beginners, but may never have taught beginners to see the kinds of mistakes they make.

I've taught beginners, and there's one fact you realize quickly. Not everyone is cut out for programming. More than perhaps anything else, writing a program means dealing with constant failure. It doesn't take much for a program to fail.

Errors tend to fall into two categories. The first is the "dumb error". This is the kind of error you run into when you know what's wrong, but due to a lapse in concentration, you do the wrong thing. For example, recently I was taking a rather silly quiz online. I was supposed to give the correct spelling of tattoo. I picked (out of 4 choices) the spelling: tatoo. I know how to spell tattoo, but in my haste, I picked the wrong one.

There are errors similar to that in programming. You write = instead of ==. You think a function returns a boolean, when it returns an int. You forget string operations in Java don't mutate the string, and so forth.
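To make the last of those concrete, here is a minimal Java sketch of the string mistake. The class and method names (`StringMistake`, `shoutWrong`, `shoutRight`) are made up for illustration; only `String.toUpperCase()` is a real library call.

```java
class StringMistake {
    // The "dumb error": toUpperCase() returns a new String,
    // and this method throws that result away.
    static String shoutWrong(String s) {
        s.toUpperCase(); // has no effect on s, because Strings are immutable
        return s;
    }

    // The fix: return the new String that toUpperCase() produces.
    static String shoutRight(String s) {
        return s.toUpperCase();
    }

    public static void main(String[] args) {
        System.out.println(shoutWrong("tattoo")); // prints "tattoo"
        System.out.println(shoutRight("tattoo")); // prints "TATTOO"
    }
}
```

The compiler accepts both versions, which is why this kind of slip is so easy to make even when you know better.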

The other kind of error is due to a lack of understanding. This error can fall in two subcategories. You can either not understand some feature in a language (or misunderstand it), or you can not understand your solution. I suppose there's even a third subcategory, where you know what you want to do, but you're not sure how to express it in the language.

Again, it helps to explain with an analogy. Suppose you're cooking. You are using some appliance, like a food processor. Maybe a food processor can't blend the way a blender does, but you think it can. You wonder why the food processor can't pulverize properly. This is analogous to not understanding a feature of the language (you don't understand how your appliances work).

On the other hand, you might have a recipe, and you are asked to saute, but you don't know what it means to saute. You understand how to use the appliances fine. What you don't understand is the cooking terminology.

You can draw analogies from cooking to programming. Cooking involves tools and techniques. Tools include knives, whisks, blenders, mixers, etc. Those tools are similar to the programming language you use. Then, you have recipes and techniques within recipes. Recipes use ingredients prepared by using tools. Similarly, programs are written in programming languages.

Anyway, I was reading this intro programming book on Perl, and I began to realize that the way it teaches programming is not that different from the way it was taught in the 80s. That means, the ideas are 20 years old or more. Programming, being the young discipline that it is, and being technology-driven, has evolved in 20 years. It's not that ideas from the 1980s are bad or even wrong, just inadequate.

To give another analogy: for many years, tennis teachers in the US taught a classic style of tennis that was popular in the 1950s. Players were told to grip racquets using an Eastern grip. They were taught to step into the shots, and be sideways to the net. They were taught to hit their shots flat or with a slight amount of topspin. Well, the game hasn't been played that way for a while. Grips became Western. Wrist shots were used more liberally. Heavy topspin was the name of the game.

A tennis teacher who kept teaching the same thing year after year would find himself or herself teaching techniques that were out of date. Clearly, those techniques weren't bad, but they didn't expose the student to new ways of playing that might make them better.

When people started teaching programming, they used flowcharts. There were other disciplines that used flowcharts, disciplines in engineering, for instance. Since programming was seen as engineering, it made some sense to use flowcharts. Then, it became popular to talk about top-down design. Even today, I'm sure you can find a programming teacher that emphasizes top-down design.

Top-down design basically says: take a problem and work out a high-level solution. Then, break that down into smaller and smaller pieces, until the pieces are so small that they're easy to program. Again, an analogy. The problem: to host a dinner for ten of your best friends. The solution: cook a dinner for ten people.

You must then think of a solution with more details. For example, you might decide that the entire dinner is all appetizers. Or you want to make one big dish. Or you might want to be fancy and have soup and appetizers, followed by a main course, followed by a dessert. Do you want paper plates or fine china? Do you want music? Do you want candles? Who does the cooking?

As you can see, the solution depends on what you want. In fact, the questions I'm asking are part of a new idea in teaching programming (it's not that new, but it's pretty new). In the past, people thought of writing self-contained programs. You have a small problem you want to solve. For example, you might want to have several customers opening a bank account, and you want to manage that (give them interest, etc).

In the past, programming teachers made this huge assumption that many weren't aware of. Programmers should write complete programs. That means, programmers should write every line of code. Furthermore, they assumed beginning programmers couldn't write many lines of programs, so they came up with trivial programs to write, that contained maybe 100 lines at most.

To give yet another analogy, suppose you are teaching someone music composition. You might think students aren't patient, so they should only write songs that are, say, 100 notes or so. And they should only use one instrument. And they should only write music in one key. And they should only use 4/4 as their beat. In other words, you simplify a great deal. In the end, the student writes this really simple piece of music, and has simply no idea how to write for multiple instruments, how to deal with different keys, and different time signatures, and so forth.

Has the student really learned how to compose music? Not really. Sure, they have to start somewhere, but it's easy to teach them the bare basics, and ignore the complexities of really writing good music.

The biggest change in programming in the 1990s was twofold. First, there was the switch to object-oriented programming. This really threw programming teachers for a loop (pun intended). Programming teachers were used to teaching functions as the basis of programming. If you think in functions, you separate the data (which gets passed in as parameters) from the manipulation of the data. However, the data gets "exposed". Its underlying structure is obvious to the programmer, and therefore isn't easily abstracted.

For example, you might have a chessboard. You use a 2D int array to represent the board. Now, every function that manipulates the chessboard works with a 2D int array. What if you wanted to change the int array to some kind of tree structure? The biggest problem is that you don't think of the chessboard as a "board" but as something intrinsic to a programming language: the array.

Object-oriented programming languages like Java allow you to create objects that represent the chessboard. You can have the chessboard object tell you what piece is at a certain location, or have it move a piece and then check whether that move is legal. You can have it determine checks, checkmates, castling, and so forth. In other words, you begin to treat the chessboard more like a chessboard.

Object-oriented programming begins to combine data with functions to manipulate the data. In particular, each object has a limited set of functions that can manipulate the object. Users work with the object through these functions. This provides safeguards. Users only think of the object in terms of what they can do with the object. The underlying data can be hidden away, so that it can be changed if need be.
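Here is a rough Java sketch of that idea, not a real chess program. Everything in it is an assumption made for illustration: the class name `ChessBoard`, the integer piece codes, and the method names. The `move` method only checks that the squares are on the board and that there is a piece to move; real chess legality is left out.

```java
class ChessBoard {
    // Hypothetical piece codes; a fuller program would probably use an enum.
    static final int EMPTY = 0;
    static final int WHITE_ROOK = 1;

    // The representation is private. Callers never see the 2D array,
    // so it could later be replaced by some other structure.
    private final int[][] squares = new int[8][8];

    private static boolean onBoard(int row, int col) {
        return row >= 0 && row < 8 && col >= 0 && col < 8;
    }

    // Puts a piece on a square; rejects squares that are off the board.
    void place(int piece, int row, int col) {
        if (!onBoard(row, col)) {
            throw new IllegalArgumentException("square is off the board");
        }
        squares[row][col] = piece;
    }

    // Tells you what piece is at a location (EMPTY if none).
    int pieceAt(int row, int col) {
        if (!onBoard(row, col)) {
            throw new IllegalArgumentException("square is off the board");
        }
        return squares[row][col];
    }

    // Moves a piece and reports whether the move was accepted.
    // Simplification: only bounds, an occupied source square, and a
    // different destination square are checked.
    boolean move(int fromRow, int fromCol, int toRow, int toCol) {
        if (!onBoard(fromRow, fromCol) || !onBoard(toRow, toCol)) {
            return false;
        }
        if (fromRow == toRow && fromCol == toCol) {
            return false;
        }
        if (squares[fromRow][fromCol] == EMPTY) {
            return false;
        }
        squares[toRow][toCol] = squares[fromRow][fromCol];
        squares[fromRow][fromCol] = EMPTY;
        return true;
    }

    public static void main(String[] args) {
        ChessBoard board = new ChessBoard();
        board.place(WHITE_ROOK, 0, 0);
        System.out.println(board.move(0, 0, 0, 5)); // prints "true"
        System.out.println(board.pieceAt(0, 5));    // prints "1"
        System.out.println(board.move(3, 3, 4, 4)); // prints "false"
    }
}
```

Code that uses the board only ever calls `place`, `pieceAt`, and `move`, so swapping the private array for something else would not touch any of it.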

To teach OO programming, you need to start thinking in terms of objects. What objects do you want? What should these objects do? This was a big change from the way people were used to thinking. When Pascal programmers switched to Java, many of them simply tried writing objects that behaved like functions. They just couldn't think in objects (at least, not without a lot of work).

But OO programming still makes one big assumption. You write your programs in one language.

This is what's changing about programming. Java was supposed to be cool because not only did you get a garbage collected language, you also got this tremendously huge library. Programmers were no longer going to write much of their own code from complete scratch. They were going to use objects others had written, and since the libraries are standard, they could expect other programmers to have the same library.

But people are realizing that one language may not be enough. For example, you may want to use a database, but you may not necessarily want to use a Java database. Or you may need to install a webserver (at least you can use Java based servers, such as Tomcat). Even if you do use one language, people are using canned tools. For example, suppose you want to play music or you want to display charts. You use other people's fully developed programs and combine them with yours.

This is good and bad. In programming, it's great to use other people's solutions. The only problem is that other people's solutions aren't static. They keep changing too. Most people who suggest you use other people's solutions have you install their software. This can be problematic over time.

In fact, this trend of using other software to solve your problems has a huge drawback. Within 5 years, the solution is likely to fail. It works now, but it assumes that software doesn't change, or that the people who distribute it don't change it. But because software is sold, it does change, all the time. So, in 5 years, you're lucky if you can find the software. If it breaks, you don't know why. When you piece together software that isn't standard (for example, not using Java libraries), you're running the risk it won't work in the long term.

The advantage is that you can create sophisticated solutions that work now.

The real challenge with using libraries and other software is learning they exist, and making them part of your programming lexicon. Suppose you want to use a database. Which one do you pick? What are the criteria? If I master one database and its installation and quirks, am I really prepared to learn a second one? Better solutions might exist, but now I have to learn how they work! And this isn't enlightening learning. Everyone wants to do things their own way, rather than develop a standard.

I fear that as powerful as software has become, the commercialization of software is killing software. Once upon a time, people wrote utilities for Unix. Some, like make, were not very good. There were obvious things to fix in make. Yet, it persisted and persisted (though several flavors of make arose too). More or less the make used in Unix now is about the same as it was 20 years ago. Can the same be said of some database? Or Java? Many of those didn't exist 20 years ago.

The increasing evolution of software is, I argue, a bad thing, because people trying to run the software must now scavenge for the pieces, and hope they all fit together. Even when you say you're using Java, you need to specify the platform, and the version. People used to say, with Java, write once, run anywhere. Well, that's not strictly true. Java keeps coming out with tweaks to the language. Some of that is good, as it fixes bugs. Some of it is just new, and may create new bugs.

In the end, as programming changes, we don't know how best to teach it anymore. People go out and see that the skills they came out with aren't enough to cope with the programming world as they see it now. Worse still, smart people, teachers and professors, are often displaced from the working programmer's world, and therefore, are slow to respond to changes, since they teach, rather than learn the new skills of the workplace.

The workplace now demands use of version control, continuous builds, a build process, good IDEs, and so on. Each of these technologies requires many hours to master, and is likely to change in 5 years' time. How will the industry attract new people when these changes aren't always for the better (after all, it keeps changing!)? I suspect, at some point, people are going to laugh at the way we do programming now. It produces solutions, but solutions that are the kludgiest of patches. They are elegant in some fashion, but not good enough to stand the test of time.

Programming teachers want to believe that what they teach doesn't have to change. Programming is a technology, and uses technology, so it's always changing. This is not calligraphy or art, where the basis seems to stay the same, for at least decades, if not centuries at a time.
