Learning from a failed deployment

This morning a deployment failed catastrophically. One of the scripts for upgrading the database caused several objects to be dropped unexpectedly.

We restored the database from backup, corrected the script, and repeated the deployment, which was successful. We then held a retrospective to learn what went wrong and how to avoid it in future.

We found that the database scripts generated by SSDT included statements to drop user objects. In this case, a script dropped the user with the db_owner role, which is the user that runs the deployment. This meant that subsequent statements could not be executed, and objects that had been dropped could not be created again.

The lapse in our process that allowed this to happen was our excessive confidence in the generated scripts. Nobody had verified that they did not contain any destructive statements.

The error had not happened on our development databases because our Windows accounts had sa role, and the security context allowed the scripts to execute even if the db_owner user was deleted. The lesson here was to stage-deploy under the same conditions in the development environment as in the production environment—a sensible approach that we ignored for convenience.

To avoid the error, we are changing our process to include a visual inspection of the database scripts before they are executed. We are also adding a canary database, which is a copy of the production database, on which the scripts can be tested as a final check.
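A visual inspection can be backed up by a crude automated check. The sketch below is only an illustration: the class and method names are hypothetical, and it assumes that any line starting with DROP deserves a human look. It is a line-based heuristic, not a T-SQL parser, so it will miss statements that are split across lines or built dynamically.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper: flags lines of a deployment script that start with DROP.
class DestructiveStatementCheck {

    // Assumption: a leading DROP keyword followed by an object type is suspicious.
    private static final Pattern DROP =
            Pattern.compile("^\\s*DROP\\s+\\w+", Pattern.CASE_INSENSITIVE);

    // Returns "lineNumber: statement" for every suspicious line in the script.
    static List<String> findDrops(String script) {
        List<String> hits = new ArrayList<>();
        String[] lines = script.split("\\R"); // split on any line break
        for (int i = 0; i < lines.length; i++) {
            if (DROP.matcher(lines[i]).find()) {
                hits.add((i + 1) + ": " + lines[i].trim());
            }
        }
        return hits;
    }
}
```

A deployment could be stopped whenever findDrops returns a non-empty list, leaving the decision to proceed to whoever reviews the flagged lines.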

Model-Based Testing

Robert Binder’s Testing Object-Oriented Systems book sits permanently on my desk. At over 1500 pages long, it is almost a never-ending read, but from time to time I pause to read a few choice chapters.

Binder also wrote about the compliance testing of Microsoft’s court-ordered publication of its Windows client-server protocols in 2012. An interesting fact from the article is that, instead of testing software against documentation, Microsoft had to do the reverse, because the code was already published and had to remain the gold standard. Under scrutiny and tight deadlines, they managed to check that 60,000 pages of documentation matched the protocols exactly, all thanks to model-based testing (MBT).

Test fixtures

I try to avoid the Arrange-Act-Assert (AAA) pattern for unit tests. I find that with multiple test methods depending on the same starting conditions, the ‘arrange’ code becomes repetitive, which makes tests tedious to write and difficult to maintain.

My preferred approach is to set one test fixture per test class, the test fixture being common for its test methods. In woodwork a fixture keeps a piece in place whilst it is being worked on; similarly, a test fixture keeps an object in a fixed state as the tests are executed.

Most test frameworks allow a method in a test class to be run before each test method is executed. In JUnit 4, the annotation @Before designates this method (JUnit 5 renames it @BeforeEach); in MSTest, the attribute [TestInitialize] has the same effect. This method can be used to configure a test fixture as required for the tests in the class.
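To make the idea concrete without pulling in a test library, here is a minimal sketch in plain Java. The class, its methods and the hand-rolled runAll method are all hypothetical; in JUnit 4, setUp() would carry @Before, each test method would carry @Test, and the framework would do what runAll does here.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical test class with one fixture shared by all of its test methods.
class StackFixtureTest {

    private Deque<String> stack; // the fixture

    // Would be annotated @Before in JUnit 4: rebuilds the fixture before each test.
    void setUp() {
        stack = new ArrayDeque<>();
        stack.push("first");
        stack.push("second");
    }

    // Each test starts from the same known state and contains no 'arrange' code.
    boolean popReturnsMostRecentItem() {
        return "second".equals(stack.pop());
    }

    boolean sizeReflectsFixtureContents() {
        return stack.size() == 2;
    }

    // Stand-in for the framework: calls setUp() before every test method.
    static boolean runAll() {
        StackFixtureTest test = new StackFixtureTest();
        test.setUp();
        boolean first = test.popReturnsMostRecentItem();
        test.setUp(); // fresh fixture, unaffected by the pop in the previous test
        boolean second = test.sizeReflectsFixtureContents();
        return first && second;
    }
}
```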

 

The unit in unit-testing

Interpreted literally, unit testing means testing each class individually. Complete isolation is, however, difficult given that a class typically interacts with other classes. Therefore, for this definition of unit testing to hold, collaborating classes must be replaced with fakes in a test. But developers confront two main problems when using these mock objects.

First, tests become tightly coupled to mocks because the latter must be made to act precisely like the actual classes that they replace. Often, this setup becomes so complex that writing tests takes more time than writing the actual classes.

Second, the class under test loses encapsulation, because it must provide ‘anchors’ for the mocks to interact with in order to fake the desired behaviour.

Still, mocks remain useful in many cases, and developers can minimise their drawbacks with a few approaches.

One way is to carefully consider the goal of each test. Is it necessary to test interactions in order to verify a given class? Could its correctness be checked differently, for example by validating its state instead of its interactions?

If interaction tests are necessary, developers must at least ensure that they are sensible. Mocks are fake, yet many developers inadvertently verify them in their tests. Therefore, developers must guard against this mistake.

A better way is to broaden the interpretation of ‘unit’ to a cohesive set of classes, whether it consists of one independent class or many collaborating classes. This definition grants developers the freedom to test several classes together, thus eliminating the need for mocks.
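As a rough illustration of this broader 'unit', the sketch below tests two collaborating classes together and checks the resulting state, so no mock is needed. Cart, LineItem and CartCheck are hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical collaborator: a priced line in a cart.
class LineItem {
    private final int unitPriceInPence;
    private final int quantity;

    LineItem(int unitPriceInPence, int quantity) {
        this.unitPriceInPence = unitPriceInPence;
        this.quantity = quantity;
    }

    int total() {
        return unitPriceInPence * quantity;
    }
}

// Hypothetical class under test: it delegates part of its work to LineItem.
class Cart {
    private final List<LineItem> items = new ArrayList<>();

    void add(LineItem item) {
        items.add(item);
    }

    int total() {
        int sum = 0;
        for (LineItem item : items) {
            sum += item.total();
        }
        return sum;
    }
}

// The 'unit' is Cart together with LineItem; the check is on state, not on calls.
class CartCheck {
    static boolean totalIsSumOfLineItems() {
        Cart cart = new Cart();
        cart.add(new LineItem(250, 2));
        cart.add(new LineItem(100, 1));
        return cart.total() == 600;
    }
}
```

The test does not care how Cart obtains each line total, so LineItem can be refactored freely as long as the observable total stays correct.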

 

Learning BASE64 encoding

The purpose of BASE64 is to communicate binary data as text, using only characters that exist on most computer platforms. These safe characters form the BASE64 alphabet and are the letters A to Z and a to z, the numerals 0 to 9, and the characters / and +.

Other ways of representing bytes as text exist. For example, bytes can be converted to hexadecimal strings made up of the characters 0 to 9 and A to F. But this conversion produces two hexadecimal characters for each byte of input, so the output becomes twice the size of the input.
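For comparison, a minimal hexadecimal encoder could look like the sketch below (HexDemo and its methods are hypothetical names). It shows the doubling: the three bytes of "any" become the six characters 616E79.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper that writes every byte as two hexadecimal characters.
class HexDemo {

    static String toHex(byte[] bytes) {
        StringBuilder hex = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            // mask with 0xFF so that negative bytes are not sign-extended
            hex.append(String.format("%02X", b & 0xFF));
        }
        return hex.toString();
    }

    // Three input bytes ('a', 'n', 'y') yield six output characters.
    static String demo() {
        return toHex("any".getBytes(StandardCharsets.US_ASCII));
    }
}
```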

Each of the 64 characters of the BASE64 alphabet is associated with an integer value. For example, the character A is represented by 0, the character Z by 25, and the character / by 63. To cover the range 0 to 63, a BASE64 word must be six bits long (2^6 = 64). A byte, however, has eight bits, so bytes cannot be converted to BASE64 characters one at a time; the input must be taken in groups of bytes, padded with extra bytes where necessary, until the number of bits is divisible by six.

The smallest number of bytes (or 8-bit words) that can be re-arranged in a way that the number of bits is a multiple of six is three (3×8 bits = 24 bits, 24/6 = 4). In other words, the input of BASE64 encoding must be processed in groups of 24 bits. So for every three bytes (24 bits) of input, four bytes (32 bits) of output are generated, giving an inflation factor of 4:3—which is still better than the 2:1 ratio from hexadecimal encoding.

Input data that cannot be split exactly into groups of 24 bits must be padded. For example, an input that is one byte (8 bits) long must be padded with two zero-value bytes (8 bits + 2×8 bits = 24 bits, 24 / 24 = 1); an input that is 11 bytes (88 bits) long must be padded with one zero-value byte (88 bits + 8 bits = 96 bits, 96 / 24 = 4); and so on. In short, input data must be padded to reach a size in bytes that is divisible by three.
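The arithmetic above can be captured in two small functions. This is a sketch with hypothetical names (Base64Sizes, paddingBytes, encodedLength), not part of any library.

```java
// Hypothetical helper for the size arithmetic of BASE64 encoding.
class Base64Sizes {

    // Number of zero-value bytes needed to reach a multiple of three bytes.
    static int paddingBytes(int inputLength) {
        return (3 - inputLength % 3) % 3;
    }

    // Every group of three (padded) bytes yields four output characters.
    static int encodedLength(int inputLength) {
        return (inputLength + paddingBytes(inputLength)) / 3 * 4;
    }
}
```

For the examples in the text, paddingBytes(1) is 2 and paddingBytes(11) is 1; an input of 19 bytes needs two padding bytes and encodes to 28 characters.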

With the theory out of the way, here is how BASE64 is implemented in Java, using the example ‘any carnal pleasure’.

First, convert the string to an array of bytes.

byte[] bytes = "any carnal pleasure".getBytes();

This results in an array of 19 bytes.

Next, pad the array with two zero-value bytes to make its size divisible by three.

byte[] padded = Arrays.copyOf(bytes, 21);

Next, combine each group of three bytes (24 bits) into a single integer value.

Next, break each integer result (24 bits) into four integer values, each six bits long (4 × 6 bits), using bit-shifting.

Next, append the BASE64 character represented by each 6-bit integer result to a StringBuilder instance.

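// declarations the loop relies on: the alphabet, the group size and the output buffer
final String BASE64_CHARS = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
final int byteGroupSize = 3;
StringBuilder result = new StringBuilder();

for (int byteIndex = 0; byteIndex < padded.length; byteIndex += byteGroupSize) {

    // read the value of the 24-bit word starting at the current index;
    // each byte is masked with 0xFF because Java bytes are signed
    int wordOf24Bits = ((padded[byteIndex] & 0xFF) << 16)
         | ((padded[byteIndex + 1] & 0xFF) << 8)
         | (padded[byteIndex + 2] & 0xFF);

    // split the 24-bit word into four 6-bit values
    int wordOf6Bits1 = (wordOf24Bits >> 18) & 63;
    int wordOf6Bits2 = (wordOf24Bits >> 12) & 63;
    int wordOf6Bits3 = (wordOf24Bits >>  6) & 63;
    int wordOf6Bits4 = (wordOf24Bits      ) & 63;

    // look up the BASE64 character for each 6-bit value
    result.append(BASE64_CHARS.charAt(wordOf6Bits1));
    result.append(BASE64_CHARS.charAt(wordOf6Bits2));
    result.append(BASE64_CHARS.charAt(wordOf6Bits3));
    result.append(BASE64_CHARS.charAt(wordOf6Bits4));
}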

This yields the BASE64 string ‘YW55IGNhcm5hbCBwbGVhc3VyZQAA’.

Finally, replace the padding characters (“AA” in this example, resulting from the two zero-value bytes) with the same number of “=” characters. The “=” is used in the BASE64 decoding process (which is not covered in this post) to determine the amount of padding that has been applied.

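// number of zero-value bytes that were appended to the input
int paddingSize = padded.length - bytes.length;

for (int i = result.length(); i > result.length() - paddingSize; i--) {
    result.setCharAt(i - 1, '=');
}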

This gives the final result ‘YW55IGNhcm5hbCBwbGVhc3VyZQ==’.
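The fragments above can be assembled into one method. The sketch below is my own arrangement of those steps, with a hypothetical class name (SimpleBase64) and the padding size computed for any input length. Each byte is masked with 0xFF, because Java bytes are signed and values above 127 would otherwise corrupt the 24-bit word.

```java
import java.util.Arrays;

// Hypothetical class that assembles the steps of this post into one method.
class SimpleBase64 {

    private static final String BASE64_CHARS =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    static String encode(byte[] bytes) {
        // pad with zero-value bytes so that the length is divisible by three
        int paddingSize = (3 - bytes.length % 3) % 3;
        byte[] padded = Arrays.copyOf(bytes, bytes.length + paddingSize);

        StringBuilder result = new StringBuilder();
        for (int i = 0; i < padded.length; i += 3) {
            // combine three bytes into one 24-bit word
            int word = ((padded[i] & 0xFF) << 16)
                    | ((padded[i + 1] & 0xFF) << 8)
                    | (padded[i + 2] & 0xFF);

            // emit one character per 6-bit slice, most significant first
            result.append(BASE64_CHARS.charAt((word >> 18) & 63));
            result.append(BASE64_CHARS.charAt((word >> 12) & 63));
            result.append(BASE64_CHARS.charAt((word >> 6) & 63));
            result.append(BASE64_CHARS.charAt(word & 63));
        }

        // overwrite the characters produced by the padding bytes with '='
        for (int i = result.length() - paddingSize; i < result.length(); i++) {
            result.setCharAt(i, '=');
        }
        return result.toString();
    }
}
```

On Java 8 or later, the output can be cross-checked against java.util.Base64.getEncoder().encodeToString(bytes), which should produce the same string for the same input.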

There are at least two classes in the standard Java libraries that provide BASE64 functions. One is undocumented and is, therefore, subject to change; the other is included in the mail library, which is confusing to reference in a project that does not use mail. (Since Java 8, the documented java.util.Base64 class is also available.) If you learn how to write your own implementation of BASE64, you can avoid these dependencies and, more importantly, implement it in any language.

How we use SQL Server Data Tools

This post describes our process for developing databases with SQL Server Data Tools (SSDT) in Visual Studio.

For it to work, the conventions below must be respected.

  • Use the live database as the gold standard for schema objects (and data).
  • Deploy only database projects that have been built successfully.
  • Deploy to a database that matches the schema of the live database.

At the beginning of a development iteration

  1. Restore a copy of the live database onto the development computer.
  2. Synchronise database project schema objects with the schema objects in the restored database.
  3. Remove pre-deployment and post-deployment scripts from the database project.
  4. Update the database project version number.
  5. Build the database project.
  6. If the build fails, fix the errors and rebuild.
  7. If the build completes, check in the changes.

During a development iteration

  1. Make changes to script files in the database project.
  2. If the changes might result in data loss, write pre-deployment and post-deployment scripts to migrate the data.
  3. Build the database project.
  4. If the build fails, fix the errors and rebuild.
  5. If the build succeeds, publish the changes onto the database on the development computer and test.

Interim releases to the test environment

  1. Restore a copy of the live database from backup.
  2. Build the database project.
  3. Publish the database project onto the test server.

Deployment to the live environment

  1. Back up the live database.
  2. Build the database project.
  3. Publish the database project onto the live server.

 

Three golden rules to tackle complexity

Tim Newing, the IT director of Camelot, shares three golden rules to manage complexity in IT projects.

  • Think of a collection of simple solutions instead of one complex project.
  • Manage outside the ‘business as usual’. In other words, set up a different structure so that the project team is not distracted by the normal business.
  • Give people a good reason to complete the project. This is not the same as motivating them to make the project a success; instead the objective is to convince them to finish the project when it is time to do so in order to avoid feature creep.

Three weeks with a MacBook Pro laptop

I have been using my MacBook Pro for the past three weeks. It is a great laptop, perfectly suited for development work, and it delivers incredible performance thanks to its Intel Core Duo CPU and the high-end ATI X1600 graphics chip.

After using an iBook G3 for the past four years, I am blown away by how fast things get done with this new laptop. What used to take minutes (for example, restarting Tomcat 5.5 under NetBeans 5.0) now completes in a matter of seconds. The boost in performance has put the fun back in programming, as I now spend more time writing and testing code than waiting for the computer to complete an operation. Geert Bevin reports that his newish Acer Ferrari laptop is shamed by the performance of the MacBook Pro—the MacBook Pro compiles programs about 30% more quickly.

In the past, developers had to switch to Mac OS X in order to use a Mac; now, with an Intel CPU driving the MacBook Pro and the possibility of running Windows on it, they no longer have an excuse not to switch. With the MacBook Pro dual-booting Mac OS X and Windows, they have the best of both worlds: the availability of Windows software on one side, and the build quality of Apple hardware and the robustness of Mac OS X on the other.

If you are looking for a new laptop, you should seriously consider the MacBook Pro.

How to identify and fix an anaemic domain model

In most CRUD applications, classes consist of many accessor methods and few behaviour methods. It is often difficult for developers to recognise this as a symptom of an anaemic model.

Anaemic classes are characterised by having only methods for reading and for writing attributes. In addition to negating the benefits of object-oriented programming, such methods are cumbersome boilerplate code.

TDD helps us to identify anaemic domain models. According to TDD rules, usage code must be written before the implementation. By writing tests first, we anticipate how our classes will be used and can thus write only methods that are necessary. For example, consider an Account class with the following responsibilities.

  • represent a user account
  • hold information about a user (i.e. username, password, email address, status)
  • authenticate the user

One unit test for user authentication could be as follows.

assertTrue(account.getPassword().equals("testpassword"));

Here, we see that the responsibility for validating the password is actually fulfilled by the test. Recognising that this is not right, we decide to move it to the Account class.

Therefore, we change our unit test to the following.

assertTrue(account.hasPassword("testpassword"));

Now we see that the responsibility is rightly held by the Account class.
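A minimal Account class that satisfies this test might look like the sketch below. The constructor and fields are assumptions made for the example, and the password is compared in plain text only to keep the sketch short; a real system would compare hashes.

```java
// Hypothetical Account class: the password check lives with the data it needs.
class Account {

    private final String username;
    private final String password; // assumption: plain text, for illustration only

    Account(String username, String password) {
        this.username = username;
        this.password = password;
    }

    // The behaviour that the first version of the test had to implement itself.
    boolean hasPassword(String candidate) {
        return password.equals(candidate);
    }
}
```

Note that the class no longer needs a getPassword() method at all, so the password never leaves the object.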

We create these simple tests to visualise how our classes will be used and how they must be refactored to not be anaemic. By repeating this process, we can gradually transform anaemic classes into classes with valuable behaviour.

In my experience, the following rules help to identify refactoring opportunities.

  • If calling code reads an attribute value from an object, tests it, then calls another method on the same object, the behaviour must be moved to the object.
  • Getters and setters must not be implemented mindlessly; instead, they must be written only when they appear in unit tests.
  • Daisy-chain calls to methods of the same object indicate that the object is anaemic and needs more behaviour methods.
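The first rule is the easiest to show in code. In the sketch below (Subscription and its methods are hypothetical), the 'read, test, then call' sequence that used to sit in calling code is moved into the object itself.

```java
// Hypothetical class illustrating the first rule.
class Subscription {

    private boolean active = true;
    private int renewals;

    // Before the refactoring, calling code did the check itself:
    //     if (subscription.isActive()) { subscription.renew(); }
    // After the refactoring, the object guards its own state.
    boolean renewIfActive() {
        if (!active) {
            return false;
        }
        renewals++;
        return true;
    }

    void cancel() {
        active = false;
    }

    // Kept only because a test needs to observe the outcome.
    int renewals() {
        return renewals;
    }
}
```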

How to switch off a laptop screen under Linux

In this post, I describe how to write a shell script that switches off a laptop screen. The instructions are tested with Ubuntu Linux 5.10 (Breezy) on a Dell Latitude C810.

First, set the governor for the CPU frequency with the following command.

echo powersave > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor

This command can be run automatically at run-level 2 by putting it in /etc/rc2.d/S30freq-scaling.

Out of the box Breezy is configured to manage power, but we need additional configuration for some laptops.

First, on the C810 the kernel needs the acpi_irq_balance option in order to report certain ACPI events (for example, closing the lid). We fix this by adding the kernel option acpi_irq_balance in GRUB’s menu.lst.

title           Ubuntu, kernel 2.6.12-9-686 
root            (hd0,0)
kernel          /boot/vmlinuz-2.6.12-9-686 root=/dev/hda1 acpi_irq_balance ro quiet splash resume=/dev/hda5
initrd          /boot/initrd.img-2.6.12-9-686
savedefault
boot

Second, it seems that the command xset dpms force off does not switch off the screen. Instead, we must use the command vbetool to do this. Although it is reported to cause unexpected behaviour, it works with the C810.

After configuring the kernel as described above, we can write the script /etc/acpi/screen.sh to turn the laptop screen on and off.

#!/bin/sh

case "$1" in
        on)
                /usr/sbin/vbetool dpms on
                ;;
        off)
                /usr/sbin/vbetool dpms off
                ;;
        *)
                N=/etc/acpi/screen.sh
                echo "Usage: $N {on|off}"
                ;;

esac

The script takes either on or off as its argument, as in the example below.

/etc/acpi/screen.sh off