The 9 kinds of words are nouns, pronouns, adjectives, verbs, adverbs, prepositions, conjunctions, interjections and articles. I was talking to my sons about this fact a few days back ... they didn't know it. Neither did I at their age.
I dropped out of high school after the 10th grade, mostly because I was bored to tears with it. The summer I was 16 I ended up homeless and sleeping in a park, which put a crimp in further education --
The other day my 8 year old, Richard, wanted to know if I’d played basketball in college -- he has a serious basketball jones and we’re sending him to a basketball camp next summer. He and I play together regularly, and we don’t have much in common otherwise -- outside bad action movies and being guys, you know. Richard’s older brother Bram lives and eats Pokemon, and Richard’s almost as bad. My Pokemon knowledge extends to “Ash,” “Pikachu,” and “training,” because you train Pokemon. Aside from that, their lengthy discourses on the subject might as well be in Mandarin.
So we talk basketball or movies when we socialize -- which is more than my Dad and I had in common, so bonus there. (To be fair to me and my Dad -- he didn’t watch movies much, and football bores me to death; but I watched University of Miami football games just so that I could suffer and gloat with him. And at the 2 minute mark of most Lakers games, I’d know he was tuning in so he could call up and exclaim over Magic or Kobe’s brilliance, despite being more indifferent to the Lakers than even I was to Miami. “I love you” can be said in lots of ways.)
But the “did you play basketball in college” -- I sidestepped. “No, honey, I never played organized basketball.” My older kids know I didn’t go to college (my daughters even know why) -- but my kids go to good schools, are doing extremely well in school, and once the habit of good school performance is set, we can have different sorts of conversations about why I didn’t do well in school --
One of the reasons is that I think analytically and was almost uniformly bored in school because the material wasn’t presented in any unifying structure. This analytic tendency has been hugely useful to me as a programmer -- decades of hammering away at my craft have separated out what’s critical to the process of building scalable, maintainable websites, from what’s not.
I used to be a big Hungarian notation guy -- the field has pretty thoroughly moved away from that, so I pared down to a very minimalist Hungarian notation (sWord for strings, dWord for dates, nWord for numeric data including money). Even that I finally abandoned -- there didn’t use to be editing environments that let you inspect a variable’s properties, at least not for SQL Server, which is 90% of my development time these days. (The other 10% is also usually SQL: MySQL, a little Oracle -- very occasionally some VB.) But for a few years now there have been environments where, if I hovered the mouse over a variable, I could discover its type if I didn’t remember it -- more mature environments have done this forever, of course, but I still almost exclusively write T-SQL in Microsoft’s Query Analyzer -- which doesn’t. However, the new SQL Server 2008 does do this ...
I’d already abandoned my last vestige of Hungarian notation a while back. Over a year ago now I joined a startup that had code written in four or five different naming/formatting standards, all conflicting. I settled on a naming convention I didn’t like, mostly because it was the convention most frequently in use at that company, and as we’ve refactored we’ve cleaned up, until two-thirds (up from maybe a fifth) of the codebase now uses this naming and formatting convention.
The short version of this is, for a table: tbl_noun_relationship_to_other_nouns. So, for example, a table that stored addresses would be tbl_address; a table that stored companies would be tbl_company; and a table that stored the many-to-many relationship between companies and addresses would be tbl_address_company_map.
Code is usp_noun_verb ... usp for “user stored procedure,” to distinguish it from Microsoft’s stored procs, which are sp_whatever. A procedure that retrieves company data would be usp_company_select.
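A minimal T-SQL sketch of the convention -- the tables, columns, and procedure here are invented for illustration, not lifted from any real schema:

```sql
-- Hypothetical tables following the tbl_noun convention
CREATE TABLE tbl_company (
    company_id   INT IDENTITY(1,1) PRIMARY KEY,
    company_name VARCHAR(100) NOT NULL
);

CREATE TABLE tbl_address (
    address_id INT IDENTITY(1,1) PRIMARY KEY,
    street     VARCHAR(200) NOT NULL
);

-- Many-to-many relationship: tbl_noun_relationship_to_other_nouns
CREATE TABLE tbl_address_company_map (
    address_id INT NOT NULL REFERENCES tbl_address (address_id),
    company_id INT NOT NULL REFERENCES tbl_company (company_id),
    PRIMARY KEY (address_id, company_id)
);
GO

-- Code is usp_noun_verb; "usp" keeps it clear of Microsoft's sp_ prefix
CREATE PROCEDURE usp_company_select
    @company_id INT
AS
    SELECT company_id, company_name
    FROM tbl_company
    WHERE company_id = @company_id;
GO
```

The point isn’t these particular names -- it’s that every object’s name tells you what kind of thing it is and what it touches, with no decoding required.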
Simple enough, though I imagine I lost more than half my readers already. But what I said above, about what’s essential and what’s not? Naming conventions are necessary: but it’s mostly irrelevant what they are as long as they’re not downright stupid. I’ve known this for years but still felt that my way was the right way.
So for about a year now I’ve coded in this new naming convention -- not one I’ve used before. Got comfortable with it. So ... recently I went back to do some consulting work for an old client. And there’s a lot of my old code floating around in production over there.
As I say, I adopted the lowercase-with-underscores, no-Hungarian naming convention reluctantly, because it was as close to a common convention as existed at my new company -- and when I started working again with code I wrote between 3 and 8 years ago, I was downright annoyed at how unintuitive my old naming convention was. Obviously the correct way to do it is non-Hungarian lower case with underscores ...
I’m never having a naming convention argument again as long as I live. The part of my brain that cares about such things is stupid and fickle.
So what is essential? I write well and was reading at 4 -- and was a terrible frustration to my teachers. I don’t think I ever got an ‘A’ in English in my whole life. (Even the numerically challenged could count the ‘A’s I did get without taking off their shoes.) I had a teacher in junior high who had other kids’ parents angry at her because she kept giving the class harder and harder spelling tests -- I wouldn’t study for them and I never got a word wrong. Drove her batty.
But I was in my teens before I got to where I could diagram a sentence -- some of my first stories came back from George Scithers at Asimov’s, bless him, and he suggested a book -- I forget the title now, but I sat down and read it. And discovered there were only 9 kinds of words. That’s it! That’s grammar! (OK ... it’s not grammar. But it’s the hard core of it.) If any teacher had ever told me there were only 9 kinds of words in the English language, I think I’d have learned them.
I got a few As in my life, but I only specifically recall one -- summer school, a 10 week fast-moving Geometry class in between the 9th and 10th grades. The math teacher didn’t want me -- I’d done badly in his Algebra class the previous year. But 10 weeks to cover the entire book was exactly the right speed -- it went fast enough to keep my attention, was exactly the sort of material that I’m wired for, and across all the years I went to school, it’s my one really outstanding memory of hitting subject matter I liked, being engaged with the material, and having the class move fast enough. That teacher then took me into his trigonometry class in the 10th grade, with high expectations. Bad year -- we had the PSATs that year and I got the 2nd highest score at that school, a private Catholic boys’ school with a lot of really smart kids -- I’d skated through the 9th grade without any teachers noticing me. That damned test brought me to their attention and I was thoroughly miserable the whole tenth grade.
But the person most disappointed in me was my math teacher, because he knew what I was capable of first hand -- so about halfway through the year he let me study at my own pace, and the second half of that class was better than the first. I was well into a different textbook by the time we got done, though I still didn’t bring the overall grade up to an A -- missed too many tests if I recall.
Aside from a couple computer courses, an astronomy course, and 2-3 writing courses at a community college, I’ve never been back to school. But I’ve kept learning -- I’ve read well over a thousand non-fiction books, learned a variety of useful business and life skills; at my own pace and when I felt like it. And what’s come to me through the School of Dan, which I never quite got straight in real school, is that in all material there are core concepts, peripheral concepts, and chrome. Most of the schools I went to as a kid taught chrome, looking back at it.
What does core look like? In both writing and programming I’ve come to believe that it boils down to conciseness. I recall, very early in life, reading a book called “Philosophy and Cybernetics.” This exposed me, though I didn’t realize it at the time, to this idea: entia non sunt multiplicanda praeter necessitatem -- entities should not be multiplied beyond necessity.
In the business world I live in, good database design does not consist of doing more with less: it consists of doing less. Storing less data. Creating less structure. Writing less code.
This is not the way business people think about databases (to the degree they think about databases at all). They tend to believe that large is better than small, that more tables are better than fewer tables, that more data is better than less data. The problem with this is that data may or may not be of value. The following strings contain equal amounts of data:
‘xq7#kp!zv2m’
‘I love you.’
Each string contains eleven characters’ worth of data, but the second string contains more actual information. So we come to a simple enough precept: data is meaningless, but information is valuable. The more concisely information can be stored and transmitted, the more effective and useful it is.
Both writing and programming I approach from the same perspective: do less. Omit needless words, to quote a smart guy. Minimize structure. Minimize code. (“Minimize structure. Minimize code.” was a sign I used to have hanging over my desk at various companies.)
I’ve been interviewing DBAs for twenty years. And there’s a question I ask all DBA candidates, which in twenty years only a few people have ever answered correctly. It’s this:
What, in almost all cases, is the difference between a query that performs badly, and one that performs well?
I’ve interviewed some very bright people over the years, and the variety of answers I’ve gotten to this question has been interesting. Good indexes, I’ve been told: covering indexes, clustered indexes, high cardinality indexes. Good statistics, I’ve been told. A proper execution plan. Proper use of temp tables, or derived tables, or table variables. Proper joins. Correct normalization. Wise denormalization.
None of these answers are wrong, necessarily, but they miss the point. Database queries run on a computer -- a thing that exists in the real world. And, with very rare exceptions, they run against data stored on some form of magnetic media. And magnetic media is slow. Off a good RAID array at the time of this writing, you might be able to pull 300 megabytes per second in sustained bursts -- bulk transfers of large files. Database queries, inherently more dependent upon random access, will be slower. 100 megabytes per second of throughput, with real-world equipment, is a superb result.
To put this in context, modern high-speed RAM has throughput to the CPU of over 10 gigabytes per second -- about two orders of magnitude faster.
The difference between a query that performs badly, and a query that performs well, in almost all cases: the query that performs well executes with fewer reads. So the core concept in this particular case, the part that’s not peripheral or chrome, is that databases perform well in direct proportion to the degree that they retrieve the correct answer with the fewest reads.
This question will be on the test.
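To make the reads point concrete, here’s a hedged T-SQL illustration -- the table, index, and read counts are all invented, but SET STATISTICS IO ON is the standard way to see logical reads per query in Query Analyzer or Management Studio:

```sql
-- Report logical reads for each statement in the Messages pane
SET STATISTICS IO ON;

-- Bad: with no useful index, this scans every page of tbl_order
-- (hypothetical table) to total one customer's orders
SELECT SUM(order_total)
FROM tbl_order
WHERE customer_id = 42;
-- Messages: "Table 'tbl_order'. ... logical reads 48000 ..." (illustrative)

-- A covering index (SQL Server 2005+) carries everything the query needs
CREATE INDEX ix_order_customer
    ON tbl_order (customer_id)
    INCLUDE (order_total);

-- Good: same answer, a handful of reads against the narrow index
SELECT SUM(order_total)
FROM tbl_order
WHERE customer_id = 42;
-- Messages: "Table 'tbl_order'. ... logical reads 3 ..." (illustrative)

SET STATISTICS IO OFF;
```

Two identical result sets; the only difference that matters is how many pages had to come off the disk to produce them.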
Now ... how you get to that goal is peripheral. There’s more than one right way to perform most tasks ... but there are an infinite number of ways to perform a task incorrectly. (Moran’s Principle of Infinite Methods -- “Infinite Methods” is the title of one of the many, many books I’ll probably never write.) So the first pass in learning any skill is to get out of the Infinite Methods. Once you’ve done that, you’re an amateur: you can do work that functions, more or less, though it may not be quick or elegant or scalable or easy to maintain, or whatever -- but it produces a result that matches your stated goal. That’s an amateur.
At some point on the path of acquiring a particular skill you’re a professional. Most likely you know a few different ways to solve any problem outside the Infinite Methods. And now your job starts to get more complex again -- you have a toolkit and it’s bigger than it used to be. If you’re honest with yourself, you don’t always know which approach is best for a given problem, because you haven’t solved Problem X enough times to have a clear sense of all the different ways to do it. (Some people never do solve Problem X in more than one way -- makes the job easier, but they never get past the status of journeyman.) So you flex -- curiosity is a good trait here. Try X, try Y, try Z. You have business deadlines to meet, that’s life in a capitalist society -- so stay late and try the alternate approach. Noodle away at it over the weekend. Think about it before bedtime. What’s the core of my problem? What’s the simplest way to solve it? What approach takes the fewest steps, requires me to build and maintain the fewest objects?
This, just for the record, is where programming and writing mostly part ways -- you don’t have to maintain a production environment in writing. Once a piece is done it either works or doesn’t, and with very rare exceptions you’re not going to tune it up again later. In a way this is unfortunate: re-writing an old piece many years later is a huge learning experience, in both text and code.
If Stephen King and JK Rowling had to come back years later and rewrite their novels, they'd learn to write shorter the first time around.
Minimize structure. Minimize code. It’s a reminder to me to never build something I don’t need, and to never build something that’s similar to something I’ve already built. When in doubt, extend and reuse the similar entity. When in doubt ... don’t.
Occam’s Razor doesn’t, despite popular misconception, say “Pick the simpler solution, all else being equal.” What it really says is: entities should not be multiplied needlessly. Which, if you study the idea, takes you to reductionism, to parsimony -- I’ve written statistical software; if I hadn’t been exposed to the idea of parsimony ahead of time, I’d have written useless statistical software. Statistical software (in particular for business-oriented process automation, which is essentially what I’ve worked in my whole adult life) works best to the degree you can identify the core discrete data points required to make a prediction, and thereafter quit before you get yourself into trouble. (I’m told that it’s different in actual research; I wouldn’t know, but it sounds reasonable.)
What’s parsimony? Less is more. Minimize structure, minimize code. And save some thoughts for later.
Sometimes these posts end up longer than I intend.
I’m working on a very short database book -- “The Elements of Speed.” In concept it’s a direct lift of The Elements of Style, though obviously on a rather different subject. In very short (non-Microsoft-specific) form it covers my thoughts on how to build simple structures that perform well and are easy to maintain. (Did you know there are only two things in the universe? Matter/energy and time. Things, and time happening to them. More on that in the book.)
I’ve been enjoying working on “Speed” -- I get to write and code at the same time. How can I say “x” most succinctly? With the fewest words? Constant revision is the key; you can boil down most ideas if you have time -- this post should probably be half the length it is. But I’m delivering an app later today -- it’s short and elegant, but you know -- I’m getting paid for that.