Wednesday, September 5, 2007

Infinite Methods

What follows is 1,899 words. The original, found here, was 2,645.

Even if my later posts are windy, I'm not going to do this again, but I hope it was of interest to the people who asked for it.

~~~~~

The nine kinds of words are nouns, pronouns, adjectives, verbs, adverbs, prepositions, conjunctions, interjections and articles. My sons didn't know this; nor had I when I was their age.

I dropped out of high school after the tenth grade; the summer I was sixteen I was homeless and sleeping in a park, which put a crimp in further education.

Recently my 8-year-old, Richard, wanted to know if I'd played basketball in college. We play together regularly, and otherwise we don't have much in common outside action movies and being guys. Richard's older brother Bram lives and eats Pokemon, and Richard's almost as bad. My Pokemon knowledge extends to "Ash," "Pikachu," and "training," because you train Pokemon. Aside from this, their lengthy discourses on the subject are in Mandarin.

So we talk basketball or movies, which is more than my Dad and I had in common. (To be fair to us, my Dad didn't watch movies much, and football bores me; but I watched University of Miami football games so I could suffer and gloat with him. At the 2-minute mark of Lakers games I'd know Dad was tuning in so he could call and exclaim over Magic's or Kobe's brilliance, despite being even more indifferent to the Lakers than I was to Miami. "I love you" can be said lots of ways.)

"Did you play basketball in college?" -- I sidestepped. "No, honey, I never played organized basketball." My older kids know I didn't go to college (my daughters even know why) -- but my kids go to good schools, are doing extremely well in school, and once the habit of good school performance is set, we can talk about why I didn't do well in school.

One of the reasons, though, is that I think analytically and was frequently bored in school because the material wasn't presented in a unifying structure. This analytic tendency has been useful to me as a programmer: decades of hammering away at my craft have separated out what's critical to the process of building scalable, maintainable systems, from what's not.

For example, I used to be a big Hungarian notation guy. Naming conventions are necessary but otherwise largely irrelevant, so long as they're not downright stupid. I've known this for years but still felt that a Hungarian notation-based naming system (iThing for integer data, sThing for string data, and so on) was really the best approach.

For about a year now I've written in, and gotten comfortable with, a non-Hungarian naming standard. And then I returned to a client where my code, three to eight years old, is in production. Working with this code ... I was downright annoyed at how unintuitive the naming convention was. Obviously a Hungarian notation-based system is less intuitive than the approach I've been using the last year ....

I'm never having a naming convention argument again so long as I live. The part of my brain that cares about such things is stupid and fickle.

~~~~~

So what is essential? Despite being bright I did badly in school. I was in my teens before I could diagram a sentence. An early story came back from George Scithers, bless him, then editing Isaac Asimov's Science Fiction Magazine, with the suggestion of a book on grammar. I read that book and discovered there were only nine kinds of words. That's it! That's grammar! Or at least the hard core of it ...

If any teacher had ever told me there were only nine kinds of words in the English language, I'd have learned them.

I recall one class in which I got a rare 'A' -- a ten-week Geometry class, summer before the tenth grade. The teacher didn't want me: I'd done badly in his Algebra class. But ten weeks was the right speed. It went fast enough to keep my attention, was the sort of material I'm wired for, and across the years I went to school it is my one really outstanding memory of hitting a subject I liked, engaging with the material, and having the class move fast enough. That teacher took me into his tenth grade trigonometry class with high expectations. Bad year -- we had the PSATs that year and I got the second highest score at that school, a Catholic boys' school with some very smart kids. I'd skated through the ninth grade without any teachers noticing me; that damned test brought me to their attention and I was miserable the whole tenth grade.

But the person most disappointed in me was my math teacher, because he knew what I was capable of first hand -- so halfway through the year he let me study at my own pace, and the second half of that class was better than the first. I was well into a different textbook by year's end.

Aside from a few courses on computers, astronomy, and writing, I've never been back to school. But I've kept learning. I've read over a thousand non-fiction works, learned a variety of useful business and life skills, at my own pace and when I felt like it. And what's come to me through the School of Dan, which I never got straight in real school, is that in all material there are core concepts, peripheral concepts, and chrome. Looking back, most of the schools I went to taught chrome.

What does core look like? In both writing and programming I've come to believe that it boils down to conciseness. I recall, very early in life, reading a book called "Philosophy and Cybernetics." This exposed me, though I didn't realize it at the time, to this idea: entia non sunt multiplicanda praeter necessitatem.

~~~~~

In the business world I work in, good database design does not consist of doing more with less: it consists of doing less. Storing less data. Creating less structure. Writing less code.

This is not the way business people think about databases (to the degree they do think about databases). Business people tend to prefer large to small, more tables to fewer, more data to less. The problem is that data may or may not be meaningful. The following strings contain equal amounts of data:

'00000000000'

'I love you.'

Each string contains eleven characters but the second string contains more information. Plainly, data is useless and information is useful: and the more concisely information can be characterized, the more useful it is.
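One rough way to put a number on that distinction is sketched below. It assumes character-level Shannon entropy as a stand-in for "information," which is a crude proxy chosen for illustration and not a measure the post itself uses; the function name `char_entropy` is hypothetical.

```python
import math
from collections import Counter

DATA_ONLY = "00000000000"
INFORMATION = "I love you."


def char_entropy(text: str) -> float:
    """Shannon entropy in bits per character, based on character frequencies.

    A string made of one repeated character scores 0.0: once you have seen
    the first character, the rest tell you nothing new.
    """
    counts = Counter(text)
    total = len(text)
    # log2(total / n) is the same as -log2(n / total), written this way so a
    # single-symbol string yields 0.0 rather than -0.0.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


if __name__ == "__main__":
    for s in (DATA_ONLY, INFORMATION):
        print(repr(s), len(s), "chars,", round(char_entropy(s), 3), "bits/char")
```

Both strings occupy the same eleven characters of storage, but only the second needs more than zero bits per character to describe. Entropy says nothing about meaning, so treat this as an analogy for the data-versus-information point, not a proof of it.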

I approach both writing and programming from the same perspective: do less. Omit words, to quote a smart guy. A sign with the words "Minimize structure - Minimize code" has hung over my desk at several companies.

I've been interviewing DBAs for twenty years. There's a question I ask all prospects, which in twenty years only a few people have ever answered correctly. It's this:

What, in almost all cases, is the difference between a query that performs badly, and one that performs well?

I've interviewed some very bright people over the years, and received interesting answers to this question. Good indexes, I've been told: covering indexes, clustered indexes, high cardinality indexes. Good statistics. A proper execution plan. Proper use of temp tables, or derived tables, or table variables. Proper joins. Correct normalization. Wise denormalization.

None of these answers are necessarily wrong, but they miss the point. Queries run on a computer, a thing in the real world. With rare exceptions they run against magnetic media: and magnetic media is slow. Off a good RAID array at this time, for sequential file transfers, you might pull bursts of 300 megabytes per second. Database queries by nature access media more randomly; 100 megabytes per second throughput is a superb real-world result.

For context, modern high-speed RAM has throughput to the CPU of over 10 gigabytes per second – about two orders of magnitude faster.
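The arithmetic behind "two orders of magnitude" can be checked in a few lines. This sketch uses only the two throughput figures quoted above and assumes 1 gigabyte = 1,000 megabytes; the constant and function names are made up for the example.

```python
DISK_MB_PER_S = 100      # the "superb real-world" database figure quoted above
RAM_MB_PER_S = 10_000    # 10 gigabytes per second, assuming 1 GB = 1,000 MB


def seconds_to_read(megabytes: float, mb_per_second: float) -> float:
    """Time to move a given amount of data at a steady throughput."""
    return megabytes / mb_per_second


# How many times faster RAM is than disk at these rates.
SPEED_RATIO = RAM_MB_PER_S / DISK_MB_PER_S

if __name__ == "__main__":
    print("RAM is", SPEED_RATIO, "times faster")
    print("1 GB from disk:", seconds_to_read(1000, DISK_MB_PER_S), "seconds")
    print("1 GB from RAM: ", seconds_to_read(1000, RAM_MB_PER_S), "seconds")
```

At those rates a gigabyte takes ten seconds from disk and a tenth of a second from RAM, a factor of one hundred.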

The difference between queries that perform well and badly is, almost always, that the query that performs well executes with fewer reads. So the concept that's not peripheral or chrome is this: databases perform well in direct proportion to the degree that they retrieve the correct answer with the fewest reads.
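To make "fewer reads" concrete, here is a toy sketch, not a real database: a hypothetical `CountingTable` class that counts every row it touches, with one lookup that scans and one that uses a prebuilt index. It assumes each row access costs one read and ignores the reads a real index would need for its own pages.

```python
class CountingTable:
    """Toy in-memory table that counts how many rows each lookup touches."""

    def __init__(self, rows):
        self.rows = list(rows)  # each row is an (id, name) tuple
        # Hypothetical index: maps id -> position of the row in self.rows.
        self.index = {row[0]: pos for pos, row in enumerate(self.rows)}
        self.reads = 0

    def _read(self, pos):
        """Fetch one row and record the read."""
        self.reads += 1
        return self.rows[pos]

    def find_by_scan(self, wanted_id):
        """Read rows in order until the wanted id turns up."""
        for pos in range(len(self.rows)):
            row = self._read(pos)
            if row[0] == wanted_id:
                return row
        return None

    def find_by_index(self, wanted_id):
        """Jump straight to the row through the index."""
        pos = self.index.get(wanted_id)
        return None if pos is None else self._read(pos)


def reads_for(table, lookup, wanted_id):
    """Run one lookup and report how many rows it read."""
    table.reads = 0
    lookup(wanted_id)
    return table.reads


if __name__ == "__main__":
    table = CountingTable((i, f"name{i}") for i in range(1000))
    print("scan: ", reads_for(table, table.find_by_scan, 999), "reads")
    print("index:", reads_for(table, table.find_by_index, 999), "reads")
```

Both lookups return the same row; the only thing that differs is how many rows were touched to get there, which is the whole point of the interview question.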

This question will be on the test.

To broaden out from computers, our goal is the correct result with the least effort. Now ... how you get to that goal is peripheral. There's more than one right way to perform most tasks ... but there are an infinite number of ways to perform a task incorrectly. (Moran's Principle of Infinite Methods -- "Infinite Methods" is the title of one of the many, many books I'll probably never write.) The first pass in learning any skill is to get out of the Infinite Methods. Once out you're an amateur: you produce functional work, though the work is likely not elegant or scalable or easy to maintain -- the criteria vary by field. But the work produces results that match your stated goal.

At some point you're a professional. (I'll define the word for you, ignoring connotations from various fields: a professional gets paid.) By now, one hopes, you know several ways to solve a given problem outside the Infinite Methods ... and so your job grows more complex. If you're honest you'll admit you don't always know which approach is best for a given problem: you haven't solved Problem X often enough to know all the options. (Some people never do solve Problem X more than one way, and they never get past the status of journeyman.) So you flex; curiosity is useful here. Try X, try Y, try Z. You have deadlines and that's life in a capitalist society -- so stay late and try the alternate approach. Noodle away at it over the weekend and before bedtime. What's the core of my problem? What's the simplest way to solve it? What approach takes the fewest steps, requires me to build and maintain the fewest objects?

This is one of the places where programming and writing fiction part ways: you don't maintain a production environment in writing. Once a piece is done it works or doesn't, and with some exceptions you're not going to revisit it. This is unfortunate: re-writing an old piece many years later is a good learning experience in both arenas.

If Stephen King and JK Rowling had to come back years later and rewrite their novels, they'd write shorter the first time.

~~~~~

Minimize structure. Minimize code. It's a reminder to me to never build something I don't need or that's similar to something I've already built. When in doubt, extend and reuse the similar entity. When in doubt ... don't.

Despite popular misconception, Occam's Razor doesn't say Pick the simpler solution, all else being equal; it says entities should not be multiplied needlessly. Which, studied, takes you to reductionism and parsimony. I've written statistical software; if I hadn't been exposed to the idea of parsimony already I'd have written useless statistical software. In business (as opposed to research, or so I'm told) statistical software works best to the degree you can identify the core data required to make successful predictions, and then quit before you get yourself into trouble … which is parsimony.

What is parsimony? Less is more. Minimize structure, minimize code. And save some thoughts for later.

3 comments:

Steve Perry said...

I dunno, Dan. I thought there was a nice three-hundred-and-fifty page book buried in The Stand, which was pretty long.

Later, King did revise the novel for a new edition.

Made it longer ...

But then, he can sell his laundry list, and nobody wants to risk irritating the goose that lays the golden eggs ...

Sean Fagan said...

The Stand was an excellent apocalyptic novel, and a mediocre (at best) GvsE novel.

If you stop the original edition right in the middle, you've still got that great apocalyptic novel. :)

Dan Moran said...

I think I posted a while back about my idea for a "Phantom Edit" of King's Dark Tower series -- 7 volumes, and really would have made a world-class trilogy.

I read both the original and expanded versions of The Stand. The original's better and a slimmed-down version of that would have been better yet, as you both observe ...

Rowling's 7th Potter novel is going to make a much better movie than book -- the third of the book spent wandering around having dumb arguments will be trimmed to 10 minutes in the movie.