Saturday, June 10, 2006

Is Math Necessary for Computer Science?

Computer science is a weird major. Most people who major in computer science believe computer science is about programming. Indeed, the first few courses are spent programming, which often amounts to at least two years of programming. When you ask a second-year student about grad school, they imagine that graduate students, having had years of computer science, must write huge programs, like hundreds of thousands of lines of code. They'd be surprised to learn that some grad students hardly program at all.

In fact, many professors would argue that programming is not what computer science is at all. Indeed, programming is merely a tool to solve problems, and it's the problems that can be solved by computers that they care about.

Computer science research divides into many categories, which are fairly broad. Among the topics: databases, networks, theory (cryptography, algorithms, theory of computation), programming languages, human-computer interaction, computer science education, software engineering, artificial intelligence, information retrieval, graphics (including scientific visualization), vision (could be grouped with AI), systems, numerical analysis.

Even programming languages is not the study of how to program, but of what features are most interesting in programming. Areas of research include various paradigms of programming (functional programming, logic programming), compiler research, and compiler optimization. It's not so much about how to make better programmers.

Software engineering researchers could study that, although they are sometimes interested in the process by which real software engineers work, and want to characterize it.

Since computer science originated in mathematics (indeed, the idea of a computer far predates the computer), there's a sense that computer scientists should have training in math. Usually, a few courses in calculus are required. It's not that calculus is used so often, because it's not. Computer scientists often use discrete math (logic, set theory, etc.) rather than continuous math.

However, the calculus requirement stems from its prevalence in all of engineering, physics, and chemistry, and since the sciences and engineering still make up a large share of a university's technical majors, there's still a strong need to teach calculus, and thus lots of resources to do it.

Still, computer scientists note that there are kids who program whose math skills are rather weak, just as many chess players aren't that good at math, though somehow being good at math has some correlation with the ability to learn and master chess.

Some universities and colleges, especially those that aren't academically strong, have decided that it's not so important for computer science majors to have strong math skills, as they fall into the trap of believing that knowing how to program doesn't require much math.

And while it's true that math isn't essential, it does teach a person who gets good at it the ability to abstract, and that's important in computer science as in any topic. For example, to know how efficient an operation is, you need to know that something happens N times: not 10 times, not 100 times, but some abstract quantity N.
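To make the N idea a little more concrete, here's a minimal Python sketch. The function name `count_comparisons` is made up for illustration. It does a plain linear search and counts how many comparisons it makes. When the value isn't there, the count comes out to however long the list is, which is the "some abstract quantity N" way of thinking.

```python
def count_comparisons(items, target):
    """Linear search that reports how many comparisons it made."""
    comparisons = 0
    for item in items:
        comparisons += 1
        if item == target:
            break  # found it, stop looking
    return comparisons


# Searching for a value that is absent forces a comparison with every
# element, so the worst case costs N comparisons for a list of length N.
for n in (10, 100, 1000):
    print(n, count_comparisons(list(range(n)), -1))
```

The point isn't the three numbers that get printed. It's that you can say "N comparisons in the worst case" without ever running the code.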

This level of abstraction is often daunting for folks who aren't good at math. And by good at math, I mean real math: not adding, subtracting, or solving algebraic equations, but being able to understand proofs and to manage simple proofs yourself. There are people who claim to be good at math, but what they're probably saying is that they are good at computation.

In essence, the relation between doing computations in math and being able to prove is somewhat analogous to the relation between being able to program and being able to think about programming and apply it to real-world situations.

Now, there are people who are plenty good at math who don't make good programmers. That's because programming has gotten increasingly difficult, and requires odd skills that would make mathematicians puke if they heard about them. There are magic numbers and incantations and special rules in programming that make no sense except that's the way they work. In math, with a few axioms and mathematical maturity, a person can reason out a lot of math.

Einstein was able to figure out the mysteries of physics by working through his thought experiments and using his formidable math skills to come to certain conclusions.

However, programmers have to deal with programming languages, and the rules that someone or some committee decided about language behavior. Now, these designers may have had many good reasons for why they did things a certain way, but that way was not inevitable.

But, even beyond the nitpicky aspects of the language, there's now the entire environment that surrounds it. You have to learn an editor, or an IDE, plus a debugger. You have to learn how to use tools like memory leak tools. You have to learn how to install the latest version of the compiler or editor. You may have to work with a third-party library that's neither well written nor well documented. You may have to learn how good programmers do certain things, but without the obvious benefit of a book. A book may tell you how a language works, but not why Bjarne codes this way or that.

And, then there's version control, continuous builds, agile development, the waterfall method, learning which off-the-shelf tools you should use, learning what the community at large is doing, deciding whether you should learn Ruby on Rails or C# or not. Is strong typing good or not? Where do I look on the web for solutions for problems similar to mine?

These skills don't require any math, but instead, require the desire to keep up with an ever changing field, and that's important too.

But, to get back to math. Without a firm understanding of math, can you deal with abstraction in programming? Here's an exercise to try. You have a character array, filled with words separated by white space, commas, periods, and the like. Write a function that returns an array of Strings that contains all the words. For example, "Come here, Watson, I need you!" should produce an array that contains "Come", "here", "Watson", "I", "need", "you".

How hard is that? If you're good at programming, it's easy, though tedious (no strtok or split, though knowing those is useful too). But to beginning programmers, the thinking required to do that can be incredibly hard. They have no idea how to wrap their minds around this problem. They don't even know how to start.
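For the curious, here is one way the exercise might go, sketched in Python rather than a language with real character arrays and Strings. The function name `extract_words` is my own, and I'm assuming a "word" is just a run of letters, so anything else (spaces, commas, periods, exclamation marks, even apostrophes and digits) ends a word. It's a sketch under those assumptions, not the one right answer.

```python
def extract_words(chars):
    """Return the words in a sequence of characters, without using split()."""
    words = []
    current = []  # letters of the word being built so far
    for ch in chars:
        if ch.isalpha():  # assumption: a word is a run of letters
            current.append(ch)
        elif current:  # a non-letter ends the current word, if there is one
            words.append("".join(current))
            current = []
    if current:  # the input may end in the middle of a word
        words.append("".join(current))
    return words


print(extract_words(list("Come here, Watson, I need you!")))
# ['Come', 'here', 'Watson', 'I', 'need', 'you']
```

The two things beginners tend to miss are both in there: runs of separators (a comma followed by a space) must not produce empty words, and the last word has to be flushed if the input ends without a separator.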

Even when you explain how to do this, there are those that just don't see how you even reason in this way. It just seems amazingly difficult. And this stuff is easy. Imagine you wanted to show them how a compiler works. Or how to ensure that certain expressions type-check correctly. Or how garbage collection works. Or how to implement certain algorithms.

Some of the most fundamental stuff in programming requires algorithmic knowledge, and my friend, that requires math. It's true, some kinds of programming require no math background at all. You can be, I imagine, a decent to great system administrator by mastering all sorts of details on UNIX commands and writing Perl scripts. The skill set needed by a good sysadmin does not necessarily overlap with that of a good programmer, which does not necessarily overlap with that of a good theoretician.

I think the problem ultimately boils down to this. Programming seems easy. There aren't any difficult constructs (well, if you stay in a C-like language), or at least, it doesn't seem to require deep math. And yet, it is hard. And tedious. And error-prone. Even the ability to find bugs is a skill that programmers have, but theoreticians find totally tedious.

At the very heart of mathematics is proof. You can write extremely rigorous proofs assuming nothing but axioms. But we'd get very little math done if mathematicians had to prove everything in this laborious way. They tend to say that certain statements are true because someone proved them before, so they don't belabor it. They want you, the human, to figure out the gist of what they say, and getting pedantic about details that aren't illuminating is not that important.

Programming, by contrast, is that tedious. Every little step has to be spelled out in a formal language, and that language isn't even constant. Someone comes up with another language, and your "proofs" are no longer in vogue. There are better ways to do it, and so you must learn to think in this new language, and learn how the best do it, and try to be productive there as well.

As diverse as math is, everyone pretty much agrees on how a proof should look, and thus, everyone has a solid commonality between them. This is not so in computer science, where human-computer interaction researchers often use as much psychology as computer programming, and where theoreticians may laugh at them and at database people for doing what they mockingly refer to as "engineering", which is beneath them: not permanent, not delving into the truths of the universe.

My short answer is yes, you need math to be a good computer scientist. Computer science is more than just programming. It's reasoning and problem solving, and math has provided us the best way we know to tackle reasoning and problem solving in a rigorous fashion.

1 comment:

Anonymous said...

I believe what you said is true. I also want to stress that it is not calculus that we use a lot in computer science; it is more discrete math and probability.