Test fixtures

I know many people swear by the Arrange-Act-Assert (AAA) pattern for writing unit tests, but I have never liked it. I think it creates unnecessary repetition when the same arrangement is required in several test methods.

I prefer to have the unit under test initialised to the required state before each test method is run. In automated testing, this baseline state has a name: test fixture. Like its real-world counterpart, the fixture holds the unit in place so that it can be tested in different ways without its state and context changing between method executions.

So, instead of using the AAA pattern where the arrangement is done in each method, I have one test class in which the fixture is set up in one place and several methods act on the fixture and assert the outcomes in different ways. I name the class according to the context (for example, NewCustomerTests and ExistingCustomerTests) and the methods according to what they test (for example, NewCustomerTests.PropertiesAreSet and ExistingCustomerTests.CanBeDisabled).

Most test frameworks have a feature to mark a method for automatic execution before each test method call. For example, in JUnit such a method is annotated with @Before; in MSTest, the attribute [TestInitialize] identifies such a method.
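As a framework-free sketch of the idea, the setUp method below re-creates the baseline state before each test method, the way a method annotated with JUnit's @Before (or marked with MSTest's [TestInitialize]) would be run automatically. The Customer class and its methods are hypothetical, invented for illustration.

```java
class NewCustomerTests {
    private Customer customer;

    // The fixture: re-creates the baseline state before each test method,
    // as a JUnit @Before or MSTest [TestInitialize] method would.
    void setUp() {
        customer = new Customer("Jane Doe");
    }

    void propertiesAreSet() {
        assert "Jane Doe".equals(customer.getName());
    }

    void canBeRenamed() {
        customer.rename("Janet Doe");
        assert "Janet Doe".equals(customer.getName());
    }

    public static void main(String[] args) {
        NewCustomerTests tests = new NewCustomerTests();
        tests.setUp(); tests.propertiesAreSet(); // fresh fixture for each test
        tests.setUp(); tests.canBeRenamed();
        System.out.println("ok");
    }
}

// Hypothetical unit under test, included only to make the sketch complete.
class Customer {
    private String name;
    Customer(String name) { this.name = name; }
    String getName() { return name; }
    void rename(String name) { this.name = name; }
}
```

With a real framework, the manual calls in main disappear: the runner invokes the setUp method before each test method for you.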

Organising test code like this helps reduce mindless mechanical tests and repetition, which are two of the main obstacles to TDD adoption.

TDD is not dead after all

Today, on Facebook, Kent Beck posted a rebuke to David Heinemeier Hansson’s “TDD is dead. Long live testing”. At the same time, Hansson also published another blog entry that looks like an attempt to appease the people who took offence to his earlier claim. His latest post bears some similarities to what I wrote about the consequences of abusing mocks in “The unit in unit-testing”. I can only ascribe that to coincidence. And, for the record, I see the value in TDD.

The unit in unit-testing

Interpreting the “unit” in unit testing to mean a class is wrong. This, together with an unfortunate misunderstanding of Kent Beck’s advice to “run unit tests in isolation”, still drives developers to test their classes in extreme isolation, with consequences that are then unfairly blamed on test-driven development (TDD) itself.

When we exaggerate decoupling in our tests, we rely on mocks and stubs to stand in for collaborators that are needed by the classes being tested. As the size of our test suite grows, so do the number and usage of mocks and stubs. But, there are two major problems with mocks and stubs. First, when we use them in tests, we reveal the implementation details of the classes that depend on them. This is because we write expectations and verifications based on a very detailed knowledge of how the mocks and stubs are used. Second, unlike real objects that change organically in response to changes in their collaborations, mocks and stubs are resistant to change due to the very specific ways that they are set up for test cases.

The first issue is usually a compromise that developers accept in exchange for testability. Also, many consider it to be harmless because encapsulation is not lost and usage of the classes is unaffected.

On the other hand, the second issue is what hurts developers most and drives many to abandon TDD. The scale of the problem manifests itself once the test suite has reached a considerable size. At that point, any slight change in the nature of the collaboration between classes requires changes to numerous tests that use mocks and stubs to fake the collaboration, with countless expectations and verifications suddenly needing revision. Since the task is usually tedious, and mocks and stubs do not have actual value in production, there is little motivation to spend resources on maintaining the tests. Eventually, the practice of TDD disappears altogether.

Ideally, mocks and stubs would only be used for testing integration between components instead of collaboration between classes, but if their use cannot be avoided, their effect can at least be minimised. The most obvious way to do this is to lower the number of expectations that are set up in the test suite. By grouping tests according to the mocks and stubs that they use and the expectations that they need, fewer test cases have to change when collaborations change.

There are other ways to make TDD more efficient, such as good discipline in the design of test cases (which I will write about at some point), but one improvement that can be applied immediately is to reduce the use of mocks and stubs. And the way to do that is to embrace real object collaboration within your tests.
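A minimal sketch of what “real object collaboration” means in a test. The Order and TaxCalculator classes are illustrative names, not from any real code base: the test exercises the real collaborator, so no expectations about *how* Order calls TaxCalculator are baked into the test, and a change in that collaboration does not break it.

```java
// Illustrative collaborator: a real object, not a mock.
class TaxCalculator {
    double taxOn(double amount) {
        return amount * 0.20; // flat 20% rate, purely for the sketch
    }
}

// Illustrative unit under test, collaborating with TaxCalculator.
class Order {
    private final TaxCalculator calculator;
    private final double net;

    Order(TaxCalculator calculator, double net) {
        this.calculator = calculator;
        this.net = net;
    }

    double total() {
        return net + calculator.taxOn(net);
    }
}

class RealCollaboratorTest {
    public static void main(String[] args) {
        // The test only asserts the observable outcome; how Order uses
        // TaxCalculator internally is free to change.
        Order order = new Order(new TaxCalculator(), 100.0);
        System.out.println(order.total()); // prints 120.0
    }
}
```

Contrast this with a mock-based version, where the test would also have to state which methods of TaxCalculator are called, how many times, and with what arguments.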

Learning BASE64 encoding

I used to search the web whenever I needed to do BASE64 encoding in my code, but when I had to do it again today, I thought it would be beneficial in the long run to learn the algorithm. It turned out to be not too difficult.

The point of BASE64 is to communicate binary data as text, using only characters that are likely to exist on most computer platforms. These safe characters are known as the BASE64 alphabet and are the letters A to Z and a to z, the numerals 0 to 9, and the characters / and +. There are other ways to represent bytes as text; for example, by converting them to hexadecimal strings made up of the characters 0 to 9 and A to F. But, doing so means that for every byte in the original data, two hexadecimal characters are required, which doubles the size of the data.

The BASE64 alphabet consists of 64 characters, each one associated with an integer value. For example, the character A corresponds to the value 0, the character Z to 25, and the character / to 63. This means that to cover the range of integers from 0 to 63, the BASE64 word size must be six bits. As a consequence of this, during BASE64 encoding the original data must be laid out and padded to make its size in bits divisible by six.

The smallest number of bytes (or 8-bit words) that can be re-arranged in groups of 6-bit words is three (3 × 8 bits = 24 bits, which is divisible by six). This means that data must be batched in triplets of bytes, and each triplet must be converted into four 6-bit words. The BASE64 character matching the value of each 6-bit word is then output as an 8-bit ASCII character. So, for every three bytes of data, four bytes of output are generated, giving an inflation factor of 4:3 (which is a better compromise than the 2:1 ratio from hexadecimal encoding).

Data that cannot be split exactly into groups of three bytes must be padded to make them so. For example, data that are one byte long must be padded with two zero-value bytes, and data that are 11 bytes long must be padded with one zero-value byte. In other words, data must be padded to reach a size that is divisible by three.

With the theory out of the way, here is how BASE64 can be implemented in Java, using the example string “any carnal pleasure”.

First, encode the string as a series of bytes.
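A sketch of this step (class and method names are mine, for illustration):

```java
import java.nio.charset.StandardCharsets;

class Base64Demo {
    // Encode the example string as a series of ASCII bytes.
    static byte[] toBytes(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] input = toBytes("any carnal pleasure");
        System.out.println(input.length); // prints 19
    }
}
```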

This results in an array of 19 bytes.

Next, pad the array with two zero-value bytes to make its size divisible by three.
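One way to do the padding (again, names are illustrative); Arrays.copyOf conveniently zero-fills the new slots:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Base64Pad {
    // Pad the byte array with zero-value bytes until its length
    // is divisible by three.
    static byte[] pad(byte[] data) {
        int remainder = data.length % 3;
        if (remainder == 0) {
            return data;
        }
        return Arrays.copyOf(data, data.length + (3 - remainder));
    }

    public static void main(String[] args) {
        byte[] input = "any carnal pleasure".getBytes(StandardCharsets.US_ASCII);
        System.out.println(pad(input).length); // prints 21
    }
}
```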

Then, convert each triplet of bytes into four 6-bit words and calculate the value of each. (Use bit shift operators.) Append the BASE64 character represented by each 6-bit value to a StringBuilder instance.
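A sketch of the core conversion. Each triplet of bytes is split into four 6-bit values with shifts and masks, and each value is used as an index into the alphabet:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Base64Core {
    static final String ALPHABET =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    // Convert each triplet of bytes into four 6-bit words and append
    // the BASE64 character matching each 6-bit value.
    static String encodePadded(byte[] padded) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < padded.length; i += 3) {
            int b0 = padded[i] & 0xFF;
            int b1 = padded[i + 1] & 0xFF;
            int b2 = padded[i + 2] & 0xFF;
            sb.append(ALPHABET.charAt(b0 >> 2));                        // top 6 bits of byte 0
            sb.append(ALPHABET.charAt(((b0 & 0x03) << 4) | (b1 >> 4))); // 2 bits of byte 0 + 4 of byte 1
            sb.append(ALPHABET.charAt(((b1 & 0x0F) << 2) | (b2 >> 6))); // 4 bits of byte 1 + 2 of byte 2
            sb.append(ALPHABET.charAt(b2 & 0x3F));                      // bottom 6 bits of byte 2
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] input = "any carnal pleasure".getBytes(StandardCharsets.US_ASCII);
        byte[] padded = Arrays.copyOf(input, 21); // two zero-value padding bytes
        System.out.println(encodePadded(padded)); // prints YW55IGNhcm5hbCBwbGVhc3VyZQAA
    }
}
```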

This yields the BASE64 string “YW55IGNhcm5hbCBwbGVhc3VyZQAA”.

Finally, replace the padding characters (“AA” in this example resulting from the two zero-value bytes) with as many “=” characters. The “=” is used in the BASE64 decoding process (which is not covered in this post) to determine the amount of padding that has been applied.
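One way to replace the trailing padding characters (names illustrative):

```java
class Base64Finish {
    // Replace the characters produced by the zero-value padding bytes
    // with as many '=' characters.
    static String finish(String encoded, int paddingBytes) {
        StringBuilder sb = new StringBuilder(encoded);
        for (int i = 0; i < paddingBytes; i++) {
            sb.setCharAt(sb.length() - 1 - i, '=');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(finish("YW55IGNhcm5hbCBwbGVhc3VyZQAA", 2));
        // prints YW55IGNhcm5hbCBwbGVhc3VyZQ==
    }
}
```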

This gives the final result “YW55IGNhcm5hbCBwbGVhc3VyZQ==”.

Now, I know that there are at least two classes in the standard Java libraries that provide BASE64 operations. One of those is undocumented and subject to change, and the other is meant to be used by the mail library, which could cause confusion (or would simply be bad form) if either is referenced in code that does not otherwise depend on the libraries where these classes reside. By writing my own implementation, I can avoid these unnecessary dependencies, and most importantly, I can do BASE64 in any language that does not have a built-in function for it.

How we use SQL Server Data Tools

This post describes the process that we use to develop databases with SQL Server Data Tools (SSDT) in Visual Studio.


For this process to work, the conventions below must be followed.

  • Use the live database as the gold standard for schema objects (and data).
  • Deploy only database projects that have been built successfully.
  • Deploy to a database that matches the schema of the live database.

At the beginning of a development iteration

  1. Restore a copy of the live database onto the development computer.
  2. Synchronise database project schema objects with the schema objects in the restored database.
  3. Remove pre-deployment and post-deployment scripts from the database project.
  4. Update the database project version number.
  5. Build the database project.
  6. If the build fails, fix the errors and rebuild.
  7. If the build completes, check in the changes.

During a development iteration

  1. Make changes to script files in the database project.
  2. If the changes might result in data loss, write pre-deployment and post-deployment scripts to migrate the data.
  3. Build the database project.
  4. If the build fails, fix the errors and rebuild.
  5. If the build succeeds, publish the changes onto the database on the development computer and test.

Interim releases to the test environment

  1. Restore a copy of the live database from backup.
  2. Build the database project.
  3. Publish the database project onto the test server.

Deployment to the live environment

  1. Back up the live database.
  2. Build the database project.
  3. Publish the database project onto the live server.