
Thursday, March 06, 2008

Scruffy vs. Neat

When I used to work in AI as a grad student, there were two rough camps: neat AI and scruffy AI. Neat AI was grounded in mathematics and posited that you built sound mathematical models from which you could derive results. Scruffy AI was about trying a bunch of things and seeing what worked. Scruffy AI didn't care whether there was mathematical formalism. It looked at stuff that was interesting, and tried to make progress.

I believe these terms can be applied to intelligence. When someone of the academic variety says "that guy is really smart", they usually mean smart in a "neat" way: he understands a lot of hard math and can draw high-level conclusions. Neat people often prefer the world arranged in a neat way.

Although computer science arose from mathematics, and many early practitioners of computer science were mathematicians, the field is fundamentally scruffy, and that's because computer science, at its core, is about solving problems through the use of programming languages.

I know many people, including the esteemed Donald Knuth, see the basis of computer science as algorithms. Well, maybe that's true of computer science, but it's not true of the software industry. Algorithms are often phrased in the hermetically sealed world of math. People work on graph algorithms because the abstraction strips away many of the details an algorithmist doesn't care about. Practical programmers worry about a myriad of real details.

They care about version control, about Windows APIs, about the various image and sound standards, about IDEs, about bandwidth. You have to both know a lot of grungy details and learn how to figure out new grungy details.

A neat computer type wonders why there needs to be more than one language. A scruffy type either accepts the fact that many people use many languages, each with its strengths and weaknesses, and learns to deal with it, or even embraces the idea that there are many such languages, and finds them all interesting.

Of course, these two categories are not mutually exclusive. One can like the scruffy aspects (or at least be proficient in them) and also be good at the neater aspects, too.

And then there are those who aren't particularly good at either, even those who have a piece of paper to their name that professes their aptitude in computer science.

Practically speaking, it helps to be scruffy rather than neat, because the world of computing always changes, and in arbitrary ways, and you have to learn to live with that fact. You can't say "Let there be Lisp and no other language forever and ever, so I can concentrate on writing programs instead of learning syntax."

Maybe the field will mature more, but it seems like decades away. In those decades ahead, we'll look back at today, much as we look back at the coders from the 60s, and wonder how they could deal with so little abstraction, and how much time was spent coding in what seems to be an awful way.

I also believe that we may experience at least one significant change to the way we view programming, even more significant than object-oriented programming, though perhaps not reaching the level of "programming" using neural networks.

So even if the neat smarties get more credit for being smart, scruffy types are smart in their own way, and in a way that makes them more likely to be successful in the industry.

Friday, January 25, 2008

Dumbing Down?

Joel Spolsky has been complaining about the dumbing-down of computer science education. He's not the only one. Suddenly, out of the woodwork, you are getting folks who are agreeing with this. Is this problem specific to computer science? Do all the other disciplines have great teachers, and computer science awful ones?

There are a plethora of problems with computer science education, and I'll hit some of them myself, but the solutions are, frankly, very hard. Some of the issues are institutional, mired in the way academia views computer science education. Some of it is merely the mission of the university, seeking to educate as many students as possible, and the resulting mediocrity that's sure to come from that. Some of it is due to the incredible faddishness of the industry that pulls everyone in a million directions, and declares that their one obscure area of expertise is what every student should learn (a recent article proclaimed everyone should learn compilers--pretty soon, you hear everyone should learn algorithms or type theory or AI or network security or linguistics).

Let's begin with computer science education itself, and why it's causing problems. Perhaps the first problem is that programming has gotten quite complex. Object-oriented programming seemed particularly cool when it caught fire in industry in the late 80s, and universities struggled to keep up.

Object-oriented programming is tough compared to the simplicity of C or Pascal (and C isn't that easy either). But we continue to teach programming as if it were still C or Pascal, because academia doesn't want to admit that programming got difficult, and that two courses aren't enough.

Indeed, traditional CS had two courses for programming: CS1 and CS2. CS1 was learning the basics, which, honestly, meant control flow (if statements, loops), arrays, and functions. No classes, and maybe a touch of pointers. CS2 was data structures: stacks, heaps, trees, and pointers. That was all you needed.

CS courses then often taught assembly, which has since disappeared (and to be honest, if it returned "as is", it would not give as much insight into low-level programming as some would imagine).

One could easily argue that object-oriented programming requires at least a third course to fully comprehend. A third semester alone would be useful for a horrid language like C++, where templates, virtual functions, and memory management make reading and debugging a nightmare.

And we haven't even talked about threads!
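
To see why threads deserve a course of their own, here's a minimal sketch in Python (a contrived toy, not anyone's curriculum) of the classic lost-update race: two threads increment a shared counter, and updates silently vanish because read-modify-write isn't atomic.

    import threading

    counter = 0
    lock = threading.Lock()

    def increment(use_lock):
        global counter
        for _ in range(100000):
            if use_lock:
                with lock:
                    counter += 1  # safe: only one thread updates at a time
            else:
                counter += 1  # read, add, store: a thread switch can lose updates

    threads = [threading.Thread(target=increment, args=(False,)) for _ in range(2)]
    for t in threads: t.start()
    for t in threads: t.join()
    print(counter)  # expected 200000; without the lock, often less

Depending on the interpreter, you may need more iterations or more threads to see a lost update, which is exactly what makes these bugs so miserable to teach and to debug.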

The other part academia hasn't particularly cared for is that software engineering has become a discipline. True, many a department now has software engineering professors, but academic software engineering is a strange beast, often divorced a bit from the reality of real software engineering, and further lacking respect from the more mathematical disciplines that have been around longer.

Indeed, many other computer science research areas look at programming as a mere tool, useful as a means to an end, and not an end in itself.

Software developers have to deal with a lot of issues these days. Let's hit a few of the basics. At the very least, you now need to know version control. There used to be RCS and SCCS, which both sucked. Now there's CVS and Subversion, and now a whole spate of distributed version control systems. It takes a while just to understand how version control works, especially the nastiness of branching and merging.

Academics who haven't dealt with version control (fortunately, fewer and fewer) find this subject painful. What does this have to do with programming? And, in a very real sense, they are right. It has very little to do with programming, and everything to do with software development. And because it's tool-based, and because we still haven't fully gotten it right, people are going to come up with one system after another, and as soon as you master CVS, you waste time trying to learn git and other bits of arcana just to get by.

But then, there's the new trend (and it is a trend now) towards agile programming. That means unit testing. That means test-driven development. That means behavior-driven development. And, oh, the plethora of tools. Programming has become so convenient to the masses that the best of them produce wonderful tools. And now, people have to pay attention to their existence! If you were doing Rails development, you might play with RSpec or autotest, tools that are only about a year old, and you'd have to keep your ear pretty close to the ground to keep up. That's tough when an academic wants to do research rather than keep track of the tabloids.
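
To make the jargon concrete, here's roughly what a unit test looks like, sketched with Python's standard unittest module (slugify is a made-up example function; RSpec plays the analogous role in the Rails world). Test-driven development just means you write the test before the function exists.

    import unittest

    def slugify(title):
        # the code under test: turn "Hello, World!" into "hello-world"
        words = "".join(c if c.isalnum() else " " for c in title).split()
        return "-".join(w.lower() for w in words)

    class SlugifyTest(unittest.TestCase):
        def test_basic(self):
            self.assertEqual(slugify("Hello, World!"), "hello-world")

        def test_empty(self):
            self.assertEqual(slugify(""), "")

    if __name__ == "__main__":
        unittest.main()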

Software development has led to people using terms like "requirements gathering" and "test document" and "test plan". Documentation has gotten big, even if people routinely do a bad job of it.

Let's briefly talk testing. This used to be something a programmer does. Indeed, it's still something a programmer does. But now, there are separate folks who handle it, so much so that it's been given the name quality assurance, and there's a whole spate of terminology and tools surrounding testing! And the mentality of testing is quite different from coding. What was considered something a conscientious programmer would do has now become its own discipline, almost worthy of a major.

Speaking of tools, what about usability? The web did a marvelous job exposing the need to write usable software. The average person doesn't understand software very well, and can quickly leave one webpage for another. A webpage has to be visually appealing and easy to use, preferably both. Once upon a time, people figured the only people using programs were other programmers. Thus, beauty, comprehensibility, and all those things people now care about were a complete afterthought.

Did we mention how faddish the industry is? Right now, dynamic languages like Python and Ruby have caught everyone's fancy. And while those languages make great strides towards wide acceptability, people are already looking for the next great panacea of a language. Whispers of Erlang, Haskell, O'Caml, Scala abound. Even if we stick to Python and Ruby, both have enough magic in them that you can write a lot of non-obvious code.
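
To show what "magic" means here, consider a sketch in Python (Record is a toy class, not any real library): __getattr__ runs only when normal attribute lookup fails, so a class can invent methods on the fly, which is roughly how Rails-style frameworks conjure up accessors nobody ever wrote down.

    class Record:
        def __init__(self, **fields):
            self._fields = fields

        def __getattr__(self, name):
            # called only when normal lookup fails: fabricate get_xxx accessors
            if name.startswith("get_") and name[4:] in self._fields:
                return lambda: self._fields[name[4:]]
            raise AttributeError(name)

    r = Record(title="Dumbing Down?", year=2008)
    print(r.get_title())  # prints "Dumbing Down?" -- no get_title was ever defined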

And, one of Java's downsides is that it's so verbose that people need a good IDE to write code in the language. You simply didn't need a decent IDE for languages like Pascal. Eclipse itself is so complex, you need hundreds of pages to scratch the surface of what it can do. It's integrated with tools to test, use version control, refactor(!). Things no one much cared about 20 years ago, so that people could focus on, you know, programming (I know--it's all programming, isn't it?). Thus, a good programmer now has to master a complex IDE, and one that's not likely to be around 20 years from now.

Once upon a time, most programs didn't play well with each other. But now, people extend languages all the time, and write tons of libraries for Python and Ruby. You have to worry about what libraries exist, and how to use them. People now routinely link in other people's code. A good programmer has to locate all sorts of software, evaluate it, and decide whether to use it or not. In the good old days, you'd simply write the code yourself (badly) or make do with a bad solution (your own).

Oh, what about open source? Want to explain the gazillion variations of open source licensing and what it means to the average programmer?

It's a big world out there. Want to explain internationalization, and how it affects your code? Is your code ready for the world market?

How about handling all those timezones and dates? That also falls under internationalization. As does, of course, Unicode (which is not just one encoding, but a family of encodings).
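
Here's a small sketch of why timezones alone deserve respect, using Python's standard zoneinfo module (the dates are arbitrary; the point is the daylight-saving jump):

    from datetime import datetime
    from zoneinfo import ZoneInfo

    ny = ZoneInfo("America/New_York")
    tokyo = ZoneInfo("Asia/Tokyo")

    # The same instant seen from two offices:
    before = datetime(2007, 3, 11, 1, 30, tzinfo=ny)
    print(before.isoformat())                    # 2007-03-11T01:30:00-05:00
    print(before.astimezone(tokyo).isoformat())  # 2007-03-11T15:30:00+09:00

    # Later that same morning, New York has jumped to daylight time:
    after = datetime(2007, 3, 11, 3, 30, tzinfo=ny)
    print(after.isoformat())                     # 2007-03-11T03:30:00-04:00

Naive "add 14 hours for Tokyo" arithmetic breaks the moment daylight saving time, or a country's politicians, change the rules.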

How about databases? You can't talk about all those web frameworks without databases. And web frameworks? And XML? These are now part of the day-to-day toolset a programmer needs to know.

And that's outside of all the usual stuff academics generally care about, like algorithms, compilers, computation theory, AI, bio-computing, numerical analysis, and so forth, most of which, the average developer knows little about.

All of these topics could fill courses and courses and courses that a typical computer science department doesn't even want to tackle. Why? Because five years from now, another new trend will sweep in, and people will have to learn again. And will those changes be an improvement? More than likely, not enough to offset the headaches of learning them.

Now, here's what I'd love all the critics to do. Teach an intro course. Decide that everyone is an idiot, and tell it to their faces. Then, be told that you still have to get them to learn something, and feel what it's like, what it's really like, to have to teach people who don't want to learn. If it will make the visualization easier, imagine it's your own child, refusing to learn, wondering why it's so hard and why there's so much crap, rather than that superstar you just hired who can't get enough of this new stuff, and can take anything you throw at him or her, and turn out magic.

Spolsky complains about the dumbing down of the curriculum, but it's only because Java doesn't do it for him. He knows that to get the speed he wants, he has to deal with languages that will give it to him. Even he's not crazy enough to believe that the speed of coding in assembly would offset the productivity losses of coding in something that horrid. Don't you think that if Java ran ten times faster than C++, he'd be happier to give up all the crap associated with C++? But because he needs stuff that runs better, runs faster, he realizes that his coders have to know these grungy details.

Compare math, where irrelevant details are left out and people learn deep concepts, to computer science, where dealing with complexity has led us to the fads of the day, as good as we have now but likely to be replaced with something new, and to more and more and more code out there that we have to deal with.

Now, let's take a step back. Breathe.

We can teach as much of this as we want, but learning isn't simply a bunch of concepts that you teach. It's a worldview. When you are given a problem, what do you do? Suppose someone tells you to port a device driver. Do you even know what a device driver is? Or what it means to port? And yet, some people can take something that vague, and get code to work, and someone else will say that they were never taught that in college, and how are they supposed to deal with this?

And the fact of the matter is that, as much as the industry complains, unless they are prepared to head into academia (itself very territorial, and having its own ideas about what students need), academia can basically ignore what is being said. First, academia is so distributed that most professors couldn't even tell you what courses are required for their own students to graduate. They barely care about their own class, and don't even think about how their class fits into the overall plan. To get them to work together and make such changes, especially changes that are likely to come every five years, goes against their conviction that knowledge shouldn't be a total fad.

And it's contrary to the mission of universities, which is to graduate students. Most software pundits would have 90% of computer science majors jettisoned, despite the fact that mediocre programmers are often needed to do a lot of work. They would have the other courses jettisoned, because there's no time to worry about all those humanities and such. If every major took that attitude, most students would not even be in college. Since most universities are in the business of graduating students, each major has to worry about how to get students who don't understand pointers very well to do well enough to get out.

Imagine it's your job to educate all the students who want to be computer science majors. The mediocre ones and the brilliant ones. Then, your view of what they should learn changes, when you realize that it's hard to even get the basic programming down, beyond all this other crap you have to learn to be decent in the field.

Thursday, November 08, 2007

Array Array!

Much in the tradition of repeated titles such as Europa Europa, Jamon Jamon, and Corinna Corinna, there's Array Array!.

Except this is not a film, but a play on words. "Array" is roughly "hey", as in "hey you", in Hindi. And in computer-ese, arrays are the simplest data structure beyond the lowly variable. Once you teach a beginning programmer variables and if-then, it's time for loops and arrays. And, in this day and age, the next step is objects.

We say familiarity breeds contempt, but it also breeds love. People gravitate to the stuff they find easiest. Since arrays come so early on, people use arrays, even when, in principle, they've learned objects.
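
The habit looks something like this (a toy sketch in Python; the names and ages are just for fun):

    # The array habit: parallel lists that must be kept in sync by hand.
    names = ["Bo", "Luke", "Daisy"]
    ages = [24, 25, 22]
    print(names[1], ages[1])  # hope nobody ever sorts just one of these

    # What the objects lecture was for: one record per thing.
    class Person:
        def __init__(self, name, age):
            self.name = name
            self.age = age

    people = [Person("Bo", 24), Person("Luke", 25), Person("Daisy", 22)]
    print(people[1].name, people[1].age)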

To give a more drastic example, I heard of a course where students were asked to design the classes for a project themselves. Up until that point, they had always been told which classes and methods to implement. They didn't have to think it up themselves.

When asked to do this project, students fell into one of two categories: those who used one super class to do everything, and those who treated each class as if it were a variable(!). Thus, if the program needed ten variables, they created ten classes.

Indeed, you also find that, despite knowing better, beginners prefer to write super functions hundreds of lines long rather than break them up into smaller functions. The thought of having to pass variables to functions, even of breaking a problem down into functions, drives them nuts. They have no idea how to do it. Indeed, given their druthers, they might prefer to code in FORTRAN and forget about modular design, despite all attempts to teach them otherwise, as the sketch below suggests.
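
Here's the same job written both ways (a contrived Python example, not anyone's homework): first as one do-everything function, then decomposed, with values passed in and returned instead of shared everywhere.

    # The beginner version: one function does everything.
    def report(path):
        total, count = 0, 0
        for line in open(path):
            if line.strip():
                total += float(line)
                count += 1
        print("average:", total / count)  # ignores the empty-file case

    # Decomposed: each piece takes what it needs and returns a value.
    def read_numbers(path):
        return [float(line) for line in open(path) if line.strip()]

    def average(numbers):
        return sum(numbers) / len(numbers)

    def report2(path):
        print("average:", average(read_numbers(path)))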

I have an idea how to solve this.

Teach design.

And teach design separately from coding it up. There's perhaps many an entry-level coder who goes to a computer job, finds execs who've never programmed seriously a day in their lives making decisions on how this and that should be done, and says "I could make those decisions too!" (They probably can't, but it seems easier than coding.)

Design is something like making managerial decisions. If you don't have to code the design up, you might create a more sensible design. Then, the design can be critiqued on its own, and eventually, this might lead to better designs, which means coders will care about design rather than care about getting the code to work with the least hassle possible.

Of course, after the novice coders create a design, then they should implement it, to see if they regress into arrays and huge functions. If they don't, then the idea has merit.

Wednesday, October 03, 2007

No Country For Old Programmers

I was eating lunch out today, as I normally do, and oddly enough, the following question came up: "Why are there no old programmers?"

This was based on the following observation, which was, admittedly, rather limited. At our company, we have maybe 6-7 programmers around 30, perhaps 3-4 above 30 but within their early 40s, probably another 5-6 aged 25 or under. In any case, there aren't any programmers (well, maybe one) that are 45 or above. It's a relatively young company, and I'm on the older end.

Why is that? Why don't we find programmers above 45 or so?

It's not that they don't exist. I think one thing about "good" companies is that they do hire young, because, while the older guys do have some experience, everyone now has to continuously learn, and that's a new expectation in computing that didn't exist before.

Think about this. If you look at programming in the early 80s, it didn't require you to understand databases (mostly), version control, reading incomplete APIs on the web, stuff like RSS or CSS or various protocols, or sophisticated IDEs, and then to learn new stuff a few years later.

Indeed, many jobs require you to do roughly the same thing over and over again, using the same skills over and over. For a programmer, at the time, that would mean learning a single language, learning the techniques of the day, learning to debug in a certain way. Then, after all that time spent getting good at it, they need to learn something new again.

To give an analogy, think about MS Word. They just came out with Word 2007, which is supposed to be an overhaul of the system. All the time you spent learning how to do stuff in Word may have to be partially relearned. At the very least, you have the benefit that it's MS releasing the product, and they won't do something completely different. But it is different enough. You do need to do a fair bit of work to adapt skills you didn't think had to be adapted!

This happens all the time to programmers, and worse. There are more resources for programmers, just as there are more resources for everyone--from the Web! You now have to find libraries and documentation, and have to know where to find them. Scouring the Web has become a necessary skill. Before, you had a limited number of choices; if what to do wasn't obvious from the software you had, then that was it. With the Web, there's always some search that might yield results, which is both boon and bane, good and bad.

Older programmers learned to program, but in a certain language, in a certain way. They didn't expect that they had to keep up with programming like some people keep up with Hollywood gossip.

The generation of programmers that learned during the 90s and beyond are starting to realize that the skill of continuous learning is needed. To give an example, recently, someone in our company found a nifty tool and suggested that people give it a try.

So what causes one person to think "Yeah, that's cool, I will try it out", and another to think "Well, I'm not required to use it, and I don't see a need for it, so why bother?" The modern programmer needs to look for new tools, try them out, and see if they can be used. It's like the person searching for that perfect diet, except where that search is typically futile, finding good tools often is not.

But why do people not learn new tools? First, they have to find it. Then, they have to install it. That may not be trivial. And imagine it takes a day of tinkering to get right. At what point is someone likely to say "Forget it. Unless I'm required to do it, I'm not going to waste my time, not even 20 minutes, trying to get this to work."

Then, you have to use it and get some value out of it. This may force you to work in a way you're not used to. Think of version control. Version control offers a ton of benefits, but there's work involved. You have to think about what commands are available, whether to branch or not, and perhaps how to fix things up when they break. Imagine what happens when you don't use version control. You don't have to deal with any of this.

And the downside? If you lose stuff, you lose stuff. And some people are content with that, because it means less to worry about right now.

And installation? Not all software installs easily. Some require hunting for libraries and such. Think about every computer science class you've taken. Most of them prepare all the software you need. That way, you don't have to deal with those headaches yourself. The good news is, over time, installation has become a lot easier. In the old days, you were left to figure out all sorts of things on your own, including, if you were on UNIX, what kind of UNIX you were on.

If you have the attitude that you want to use a new tool, even if it takes time to master, then you're likely to become better, to improve because you are willing to waste ten hours to save ten seconds. You're willing to force yourself to think in a new way because it offers benefits to you.

I'll give you another example. Since I went through academia, I became aware of TeX and LaTeX: the typesetting system created by Knuth, and the layer Leslie Lamport built on top of it (both since modified by a bunch of other folks). Knuth is not only a computer scientist, he's an aesthete. He cares about beauty.

He spent ten years of his life trying to preserve high-quality typography. He cared that in good texts, "ff" is replaced by a ligature, two f's that overlap. That if you have the word "Vast", the "a" tucks underneath the "V", which doesn't happen with most word processors, because each character sits in a bounding box, and the boxes don't overlap.

I know, for example, there is a difference between left double quotes and right double quotes. Indeed, one of our data sources uses a TeX-style representation for left double quotes (two backticks in a row is considered a left double quote). I recognize this because I've used TeX.

But the average computer programmer graduating from college is not likely to have seen this. Indeed, they may have only learned about ASCII, heard about Unicode, and not realized that Unicode is not merely a 16-bit extension of ASCII (though it did begin life as a 16-bit character set), and that this increase in characters allows a lot more punctuation.
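
Here's a sketch of both points in Python (the conversion rule is the TeX convention mentioned above; the function itself is hypothetical):

    # TeX convention: `` and '' mark left and right double quotes.
    def tex_quotes_to_unicode(s):
        return s.replace("``", "\u201c").replace("''", "\u201d")

    print(tex_quotes_to_unicode("``Vast''"))  # “Vast”

    # Curly quotes are real characters, well outside ASCII's 0-127 range:
    print(hex(ord("\u201c")))      # 0x201c

    # And Unicode now runs past 16 bits entirely:
    print(hex(ord("\U0001d49e")))  # 0x1d49e, MATHEMATICAL SCRIPT CAPITAL C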

Thus, the average computer science graduate might not have to worry about fonts, but someone out there does or they have to worry about some standard or they have to worry about Unicode. This often means spending a good deal of time learning about stuff they didn't teach you.

So this fella, who's actually pretty young, had never heard of this, didn't do much research on the topic--didn't even know there was more to be researched (even after I pointed out an article to him)--and probably finds all this rather tedious. Why study this when there's more interesting, more straightforward stuff to learn? Read about the world in a book, and it reads like a story. Let someone digest it for you so you can regurgitate it back.

To give you another analogy: at the end of every year, there's some sophisticated formula for deciding who needs to win and lose to get into the NFL (American football) playoffs. Most announcers have no idea how this is done, because they simply don't sit down and learn it. They figure they were bad at math, so there's no reason to learn it, and they stop.

Or how the salary cap works. That's usually beyond most people, but once you study it and figure it out, it's not that hard.

So I posited that, unless something completely different comes along, the programmers of today might actually become old programmers. They're willing to spend the time figuring this out or figuring that out, even stuff that seems a complete waste of time to learn. It's this desire to figure new stuff out that will let today's programmers age gracefully.

It's not to say that every programmer will survive. After all, there's still a fair bit of programming that requires, say, debugging, and some people simply don't like debugging code, especially code they didn't write.

But these skills were skills that programmers of the past, even very bright people with Ph.D.'s, didn't have. In math, you learn a notation and a system of proof, and that's it. They don't change the notation on you just because it's trendy to do so. That means if you sent a mathematician forward through time, they would have some chance of following a proof today, because the language stayed the same. But someone who wrote FORTRAN would find today's C++ code nearly indecipherable. The ideas involved, the sophistication used--and this is just day-to-day, everyday programming, not genius code--would be hard to figure out.

We're still in the early stages of computer programming's development, maybe comparable to Babe Ruth's era of baseball. There's likely to be a lot more thought given to the craft of programming. In the meanwhile, we're stuck in the quagmire of today, where learning to program is still very hard, and learning to program well requires a philosophy, requires us to care how the code is written.

The question is whether we want to grow old doing it.

Tuesday, July 31, 2007

Fresh Eyes

Sometimes, when you stare at code too long, the assumptions you made, the decisions you made, which made sense at the time, are hard to challenge. Rather than get into too much depth, here's an analogy.

At one point, people sat in front of what would now be termed a computer. This computer was huge, and there was only one of them, so a very limited number of people could use it.

Once someone had the bright idea (or more likely, several someones did), people could sit in front of a keyboard, possibly in a remote room, and essentially send their keystrokes to a remote machine. That machine could handle several users at once by giving each user a tiny slice of time.

Users were said to use "dumb terminals", which basically served as a kind of telephone to the real computer. A key was pressed, the character sent across a wire to the computer, which would interpret it, then send back an echo of the key pressed, plus results.

Soon, people figured everyone could have their own computer: the personal computer. Then came the Web, which you can think of as recreating the remote computer, with the browser as the local dumb terminal.

Often, there is the desire to put smarts near the person, on the "computer" they sit in front of; this alternates with the desire to put the computing remotely, having the local "computer" merely relay messages to a more powerful machine elsewhere, shared by many people.

Such was the case today. Was it better to do a big computation now, so that future results could take pieces of it? Or was it better to do smaller computations, whose sum total might take longer than the big computation, but each of which returns quicker? Tradeoffs cut different ways depending on which way you go, and an approach that was dismissed a while ago may require revisiting if things have changed.

You never know til you try, right?

That's the thing with programming. There's so many ways to do things, so many tradeoffs you can make.

Remember, a bad solution in the past might become a good solution in the future.

Wednesday, July 25, 2007

A Matter of Taste

Sometimes I think, in fact, I feel I know, that the programming languages we have now don't quite do what we want them to do. Ideas don't transmit themselves easily through code. Beyond a certain length, the ideas behind what is going on are replaced by a morass of code.

Code is a kind of expression, perhaps even a poem: lyrical, beautiful. No, it's far too purposeful to reach that. At what point does the mundaneness of purpose win out?

Ask yourself the following questions. Does it bother you if a class has more than 20 methods? 30? Do you say to yourself "Boy, that's ugly", or do you not even blink twice. The class is what the class is. If I need to add something, I add it. If that is what the solution requires, I won't try to make it "nice".

How about method length? Does it bother you that a method is 400 lines long? That it can't fit comfortably in two screens? Or three? Or ten? Or do you simply keep writing, because breaking it up into functions would be hard? Passing parameters is "hard". Do you prefer to leave your thoughts uncluttered by having all variables accessible everywhere?

Does it bother you that your class does two things? Three things? Ten things? That you tie this object to that object to the next object? After all, that's what the problem seems to require. Why not do it that way?

In many ways, writing a program is like writing a story with lots of constraints. You want a boy and a girl. OK, maybe you make them related. Oh, now you need two more boys, and two more girls. They'll all be related. This will be like the Dukes of Hazzard, where you had five cousins and an uncle, and not a married couple in sight, with the exception of Boss Hogg and Lulu.

Think about it. Bo Duke. Was he dating anyone? Nope. Luke? Nope. Uncle Jesse? Nope. Any of the police staff? Daisy Duke? Relationships are complex, especially since most episodes were devoted to cars leaping skywards on fake ramps, and general shenanigans.

Some code makes as much sense as the Duke clan.

Does cut-n-paste code bother you? How often do you find yourself preferring to cut-n-paste rather than consolidate the duplication into a function?

Ultimately, most of us coders, myself included, don't read enough code, don't read enough commentary about code, and so we code blind, doing what we think makes sense, whether it makes sense or not. We learn through trial and error, repeating errors thousands of others have made. It's as if people taking a creative writing course had never read anyone else's work.

I don't buy the idea that avoiding the study of others will make you somehow more creative. For every person who can break the trend and produce something truly original, a thousand churn out utter tripe, and would be far better served by studying others and mimicking the best they can. History shows that this works far better than trying to be completely original.

And in programming, there's no need, most of the time, to be truly original.

All it requires, sometimes, is a matter of taste.

Tuesday, July 17, 2007

Code Critique

Many years ago, I went to college with a friend who was/is Jewish. I suppose, like many people, we have the need to classify folks: African American, tall, short, fat, ugly, German, and so forth.

In this case, the religion is worth pointing out because he would, on occasion, read a book written in Hebrew. It wasn't just a religious text, but it was also a commentary on that religious text, by various Talmudic scholars over the centuries.

Recently, I was reading this, as well as the followup commentary at the bottom of the page.

Here's what I find interesting. One of the reviewers noted that critiquing code worked well for seasoned developers, and not so well for those students coming out of college.

I understand that the "real-world" is supposed to teach you something. I also understand that college doesn't always teach you real-world stuff. It's theory and all.

But computer science is a little weird. As a discipline, it is still figuring out how to do what most people in it do, which is program. Oddly enough, many of the older academics seem to think of programming as a throwaway thing, a means to an end, and believe other topics to be of more fundamental interest.

It's hard to compare this to, say, learning English. You could say learning a natural language is more fundamental than learning a programming language. The two things most people need to master are a spoken language and mathematics (and reasoning).

Many academics feel that math and English are good enough, and don't particularly care for the programming aspects.

So imagine what they must think when the beliefs of how one should program are changing over time. The ideas that people espoused in the 80s aren't the same as the ones they mention now. For example, the Rails guys (most notably Dave Thomas) like the acronym "DRY" which means "don't repeat yourself". When people were teaching Pascal in the 80s, they probably didn't think of this, and would wonder why it's worth teaching at all.

My point (finally!) is this. If code critique is important to get hired, then why isn't it important enough to teach? Why aren't there books on the topic of critiquing code? Understandably, code critique is a matter of opinion, and people disagree. That's fine. I think folks need to understand why people disagree, so they can begin to form their own opinions.

I think as the critique grows, we begin to understand the limitations of a programming language, why we do things a certain way to overcome those limitations.

If you think about it, programming languages evolve far faster than human languages. Human languages give a great deal of power to the speaker, but demands much of the listener to interpret shades of meaning. Even though programming languages are far more precise, they still don't seem to capture what we mean to say.

So I challenge those who say we need to learn to critique to start critiquing code and give their opinions. Let a thousand flowers bloom (I know, that's terribly Maoist, especially since he backed off on that opinion).

Sunday, June 10, 2007

Interpreters vs. Compilers

When I was first learning computer science, I used to wonder about the distinction between a compiler and an interpreter. Honestly, it helps to write one or the other to know the difference.

I think it's easier to explain a compiler. A compiler takes a (mostly) human-readable program written in some language and (typically) translates it to machine code. A CPU, at its lowest level, executes machine instructions. Typically, people compile code because running native machine code is faster than interpreting it.

Here's an analogy. Suppose you want to write an instruction manual. You have it written in English, but you intend for it to be used by Japanese folks. So you have the stuff translated to Japanese.

What's an interpreter? An interpreter is a program that understands a program you've written, and runs it on the fly. Let me give you an analogy, although this is a programming analogy.

Suppose you want to write a "robot language". You can tell the robot to move forward, turn left, turn right, etc. You can either convert this robot language directly to machine code, or you can write a program that incrementally processes it. When the user wants the robot to move forward, you call some code to move it forward.
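
Here's a minimal sketch of that second route in Python (the robot language is the toy one just described): each command is executed the moment it's read, with no translation step in between.

    # A tiny interpreter for the toy robot language.
    def run(program):
        x, y, heading = 0, 0, 0  # heading in degrees; 0 means north

        for command in program.split():
            if command == "forward":
                if heading == 0:     y += 1
                elif heading == 90:  x += 1
                elif heading == 180: y -= 1
                else:                x -= 1
            elif command == "left":
                heading = (heading - 90) % 360
            elif command == "right":
                heading = (heading + 90) % 360
            else:
                raise ValueError("unknown command: " + command)
            print(command, "->", (x, y), "facing", heading)

    run("forward forward right forward left forward")

A compiler for the same language would instead translate the whole program into machine-level instructions up front, and hand you the result to run later.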

Typically, interpreters are used in an "environment" where the user can make decisions on the fly, issuing commands to the interpreter as well as running code. The code may not be converted directly to machine code and optimized, but is instead run through high-level programming language code.

Wow, this is really hard to explain. Hmm, the closest analogy I can think of is that a compiler is like having a series of detailed explanations completely worked out and sent to someone; they can't ask you for more help afterwards. An interpreter is more like having a person answer your questions as needed. In other words, the interpreter is a program that serves as a kind of middleman, able to handle commands as you type them in, while a compiler expects all of its input to exist right away, and makes a complete translation to lower-level code.

There's a software engineering analogy. A compiler is closer to the "big design up front" while an interpreter is more like agile design, where the customer can make changes along the way.

Fundamentally, they share stuff in common (basic parsing and such), but the goal of an interpreter is interactivity between the user and the interpreter, whereas a compiled program is meant to just run (it can have interactivity too, but not with the developer).

Ugh, I wonder what's a better way to explain the difference.

Wednesday, November 15, 2006

Is Joel Spolsky Right About Hiring?

Joel Spolsky is opinionated about hiring great programmers. That's fine. He recognizes that great programmers produce at a rate that's gaudy compared to the average programmer. There are certainly people smart enough to be great programmers who lack the will to do it. Thus, in principle, a Ph.D. in computer science is bright enough to figure out most of the complexities of programs, but it may frustrate them to no end to actually do it.

But Joel has the luxury to hire as he wants. As I've pointed out numerous times, his blog allows him to get some fame, at least, in enough circles that he can get talented programmers to apply. Needless to say, without his blog, he'd find his talent pool far, far thinner.

There's still an additional issue. Joel is cautious about how much business he takes on, and indeed, the worst thing that could happen to Joel is a project that requires two dozen programmers to accomplish. He'd have to reject such projects, because he needs time to find the programmers he wants.

Indeed, even places as vaunted as Google or Microsoft who try to hire the best, may end up having to "lower" their standards to get enough people to do what needs to be done. Having said that, this lowering of standards often leaves them with people that are still far above their competitors.

Joel does get one thing right, and it's somewhat sad, but true. He assumes that great programmers are, well, born, not made. More properly, there are some things you can teach to improve the skill of a programmer, but that takes effort most companies are unwilling to expend. They'd prefer that programmers have already reached the stage where anything they need to learn, they can learn on their own.

And someone like Joel wants programmers who show good judgment, meaning they don't get distracted by coding or research that doesn't pertain to what needs to be done, and yet are careful enough not to write quick and dirty code that forgets to deal with strange cases.

The point is that, given his company, he has the luxury to refuse people, because he never needs to hire, say, 20 people in three months. He simply couldn't find that many good people, because as well known as Fog Creek Software is, it's still a small, elite community that's heard of them. They lack the day-to-day recognition that Microsoft and Google have, and that's even among people who program for a living.

Indeed, Joel could almost say "I wouldn't hire someone who had never read my blog or heard about the company", because it means that person isn't savvy about the technical world around them. These are the same folks that are likely not to have heard about Ajax, Web 2.0, Reddit, Digg, and so forth.

And of course, I haven't even bothered to talk about the fact that if you take his philosophy to the extreme, then any talk he presents to colleges (and he has been doing that lately) could start with "Most of you, I would reject in a heartbeat. You're not good enough to work for my company, and frankly, most of you shouldn't even be working at all, because if I would reject you, so should everyone else".

That's, of course, far too mean to say, because it might scare off the good people too, who usually aren't so narcissistic as to believe they are the best (a few believe that, but they can be a pain to work with, especially if they are right--which makes it a reason to avoid hiring them as well).

This strategy is basically like professional sports. Most people playing college sports will never make a living as professionals; they have to seek alternatives. To be fair, those who make it to the pros are handsomely compensated. And to be fair, if you're above a certain level of programming competency (a bar that might be set very high), then a company can have as many programmers of that skill as it could want.

What I mean is this. Competitive sports directly pit athlete against athlete, and as athletes get better, what used to be good enough is no longer good enough, because they have to compete against better players.

This is less the case with programmers. If you're smart enough to understand algorithms, graphics, math, etc. and you're able to master new material and software quickly, and you're savvy enough to decide whether you should take path X or Y, then you're probably good enough to be a top-notch programmer. And anyone who can pass this threshold should be good enough. Theoretically, that can be as many people as you'd want.

Practically, it's hard to get programmers to this level, because it takes a certain attitude and intelligence to get there.

To sum up (which I find I need to do often), hiring as Joel suggests is not often feasible unless you can control how much work you take on. This is exactly why Joel refuses to do consultingware. He simply lacks the manpower to do that, and it would force him to hire people just to get stuff done; people he'd ordinarily reject, he'd have to give a second look. And unlike the mostly fuss-free management he does now (or his seconds-in-command do), he'd really need someone to budget, manage, cajole, and teach these folks so they could get stuff done.