Hein (fub) wrote,

  • Mood:

Always normalise your database tables!

When you design a database, you should always normalise your tables. I myself stop at 4NF, and this habit has served me well in all these years.

People might whine that normalised tables are 'harder to casually analyse'. That argument is nonsense. Sticky notes and printouts are for 'casual analysis' -- a database is for the cold, hard facts.
People might whine that joins are hard to do. These people need to teach themselves SQL in a hurry, and stop complaining. Lack of skill is no argument to do things in improper ways.

Yes, you will get more tables, and you will have to do some extra joins. But the advantages far outweigh the disadvantages:
  • Your domain analysis will, of course, be a little harder and take a little longer, but this is a good thing. It forces you to think the domain over, to check and double check with the domain experts -- and this ensures that you don't get any nasty surprises later on because you overlooked something.
  • You don't duplicate data. This means that it is much easier to keep your data consistent. I think this is actually the most important advantage.
  • Your data is stored 'atomically'. That is, a single object/concept occupies a row in a single table. This itself has it's own set of advantages:
    • It is easier to create new relations between objects later on;
    • You can change a single table independently of the rest of the tables, as long as you don't change any columns that are used in relations.
Modularity is a Good Thing, and this also applies to databases.

The main argument against normalisation is performance. Joins are expensive (in terms of CPU cycles), but not overly so. If you run a website or application where database performance becomes an issue, there are a few things you can do to speed things up:
  • Look at where you can add indexes (hint: the fields on which you join are prime candidates for that);
  • Consider defining views or stored procedures for multi-join statements. These procedures and views are stored in 'compiled' form in the database, meaning that the execution plan is already in place by the time you do your query;
  • If you need some data that can only be generated by an iterative process (which SQL is ill-suited for), then consider creating a table to hold this (non-normalised) data. A scheduled process could fill this table with data gained from the normalised tables. Of course, your application should never write data in this table: the data in the normalised tables is The Truth and therefore leading!
Normalising the tables in your database makes things so much easier. Trust me -- it always pays off in the long run!
Tags: life manual, technology

Recent Posts from This Journal

  • Mock Chicken

    The Judge Dredd comics are an interesting artefact of the 1980s: it’s what British comic writers thought how the US urban landscape would…

  • Friday Five: Food

    Food is an important subject of this blog, so I couldn’t pass up a food-themed Friday Five. 1) What is your favourite food? Okonomiyaki, no…

  • Baking

    I have been making chocolates with flexible silicone molds, because those are easy to get the chocolate out of once it’s been set. I did try…

  • Post a new comment


    Anonymous comments are disabled in this journal

    default userpic

    Your reply will be screened

    Your IP address will be recorded