SO #57225262—Inputting list of passwords in a class

Stack Overflow question ‘Inputting list of passwords in a class’ is more interesting as an object-oriented design (OOD) exercise than as a debugging exercise, but it also provides an opportunity to practise some Python programming.

The requirements identified in the problem are:

  • represent a set of credentials comprised of a username and a password;
  • represent a user account made up of an identifier and one active set of credentials and zero or more inactive credentials;
  • store user accounts that can be retrieved subsequently by username;
  • validate a given set of credentials, using the stored user accounts.

Additionally, a user account has the following constraints:

  • It must be uniquely identifiable by the username of the active set of credentials.
  • A set of credentials must be used only once for a given user account.

The design model consists of multiple classes, each addressing a single responsibility as follows.

Class           Responsibility
Credentials     represent a set of credentials
UserAccount     represent a user account
AccountStore    store user accounts, enforcing the uniqueness constraint
Authentication  validate supplied credentials

The program code implements the model very faithfully and uses the same class names, thus making it easy to reason about. It has been tested with Python 2.7.15rc1 and has been checked for PEP8 compliance.

NOTES

For brevity, the solution presented here does not:

  • implement best practices used for security in real-world scenarios, such as salting and hashing;
  • enforce uniqueness of user account identifiers;
  • consider performance factors.

In lieu of conventional unit tests, a main routine exercises use-cases to ensure that the program works properly.

class Credentials:

    def __init__(self, username, password):
        self.username = username
        self.password = password

    def has_username(self, username):
        return self.username == username

    def matches(self, credentials):
        return self.username == credentials.username and \
            self.password == credentials.password

class UserAccount:

    def __init__(self, user_id):
        self.user_id = user_id
        self.active_credentials = None
        self.past_credentials = []

    def add(self, credentials):
        self._check_uniqueness(credentials)
        if self.active_credentials is None:
            self.active_credentials = credentials
        else:
            self.past_credentials.append(self.active_credentials)
            self.active_credentials = credentials

    def has_username(self, username):
        return self.active_credentials.has_username(username)

    def has_same_username(self, user_account):
        return self.has_username(user_account.active_credentials.username)

    def has_credentials(self, credentials):
        return self.active_credentials is not None and \
            self.active_credentials.matches(credentials)

    def _check_uniqueness(self, credentials):
        if self.has_credentials(credentials):
            raise Exception('These credentials are currently in use.')
        for c in self.past_credentials:
            if c.matches(credentials):
                raise Exception(
                        'These credentials have been used in the past.')

class AccountStore:

    def __init__(self):
        self.user_accounts = []

    def add(self, user_account):
        self._check_uniqueness(user_account)
        self.user_accounts.append(user_account)

    def find_by_username(self, username):
        for ua in self.user_accounts:
            if ua.has_username(username):
                return ua
        return None

    def _check_uniqueness(self, user_account):
        for ua in self.user_accounts:
            if ua.has_same_username(user_account):
                raise Exception(
                        'An account with the same username is already active.')

class Authentication:

    def __init__(self, account_store):
        self.account_store = account_store

    def validate(self, credentials):
        user_account = self.account_store.find_by_username(
                credentials.username)
        if user_account is None:
            return False
        return user_account.has_credentials(credentials)

if __name__ == '__main__':
    credentials = Credentials('user1', 'password1')
    user_account = UserAccount(101)
    user_account.add(credentials)

    account_store = AccountStore()
    account_store.add(user_account)

    user_account1 = account_store.find_by_username('user1')
    print 'user_account1', user_account1

    user_account2 = account_store.find_by_username('user2')
    print 'user_account2', user_account2

    authentication = Authentication(account_store)
    print 'Expecting True...', authentication.validate(
            Credentials('user1', 'password1'))
    print 'Expecting False...', authentication.validate(
            Credentials('user2', 'password1'))
    print 'Expecting False...', authentication.validate(
            Credentials('user1', 'password2'))

    user_account.add(Credentials('user1', 'password2'))
    print 'Expecting True...', authentication.validate(
            Credentials('user1', 'password2'))
    print 'Expecting False...', authentication.validate(
            Credentials('user1', 'password1'))

    try:
        user_account.add(Credentials('user1', 'password1'))
    except Exception:
        print 'Expecting exception... Pass'

    try:
        user_account.add(Credentials('user2', 'password1'))
        print 'Not expecting exception... Pass'
        print 'Expecting True...', authentication.validate(
                Credentials('user2', 'password1'))
    except Exception:
        print 'Not expecting exception... Fail'

    try:
        user_account1 = UserAccount(102)
        user_account1.add(Credentials('user1', 'whatever'))
        account_store.add(user_account1)
        print 'Expecting True...', authentication.validate(
                Credentials('user1', 'whatever'))
    except Exception:
        print 'Not expecting exception... Fail'

    try:
        user_account2 = UserAccount(103)
        user_account2.add(Credentials('user1', 'whatever'))
        account_store.add(user_account2)
        print 'Expecting exception... Fail'
    except Exception:
        print 'Expecting exception... Pass'


The output of the program is:

EY@LENNY:~/Source/junk/python/pwman$ python all.py
user_account1 <__main__.UserAccount instance at 0x7faa36f11170>
user_account2 None
Expecting True... True
Expecting False... False
Expecting False... False
Expecting True... True
Expecting False... False
Expecting exception... Pass
Not expecting exception... Pass
Expecting True... True
Expecting True... True
Expecting exception... Pass
EY@LENNY:~/Source/junk/python/pwman$
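
The notes above mention that salting and hashing are left out for brevity. As a rough idea of what that could look like, here is a minimal sketch in Python 3 (not the Python 2 used in the listing). The class name HashedCredentials, the iteration count, and the changed signature of matches() are my own assumptions for illustration; they are not part of the solution above.

```python
import hashlib
import hmac
import os


class HashedCredentials:
    """Hypothetical variant of Credentials that never stores the password."""

    ITERATIONS = 100000  # assumed work factor; tune it for real use

    def __init__(self, username, password, salt=None):
        self.username = username
        # A fresh random salt per credential set defeats precomputed tables.
        self.salt = os.urandom(16) if salt is None else salt
        self.password_hash = self._hash(password)

    def _hash(self, password):
        return hashlib.pbkdf2_hmac(
            'sha256', password.encode('utf-8'), self.salt, self.ITERATIONS)

    def matches(self, username, password):
        # compare_digest avoids leaking information through timing.
        return self.username == username and hmac.compare_digest(
            self.password_hash, self._hash(password))
```

Because only the salt and the derived hash are kept, the uniqueness check against past credentials would have to re-hash a candidate password with each stored salt instead of comparing passwords directly.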

Undefined behaviours in C

Undefined behaviours in the C language confuse many beginners. As an occasional C programmer I am also baffled when I encounter them—as happened with this code that I wrote in an Arduino sketch.

static void
get_input(String prompt, void* const input, void (*parse_func)(void* const), int (*validate_func)(const void* const))
{
  while (!validate_func((const void* const)input)) {
    Serial.println(prompt);
    while (Serial.available() == 0);
    parse_func(input);
  }
}

void
loop()
{
  get_input("Enter the number of blinks: ",
        &(led->blinks),
        *parse_int,
        *validate_positive_int);
}

get_input() is a generic function. It takes a parameter String prompt, a pointer void* const input to the value being read, and two function pointer parameters (*parse_func)(void* const) and (*validate_func)(const void* const). These function pointers take parameters of type void* (pointer to void), which can be converted to a pointer to any other object type.

At runtime I passed the function validate_positive_int() as argument for parameter (*validate_func)(const void* const). It casts its argument to a pointer to const int, dereferences it, and tests whether the value is greater than zero.

static int
validate_positive_int(const void* const val)
{
  return *(const int* const)val > 0;
}

As I was debugging the sketch, I modified the function as follows:

static int
validate_positive_int(const void* const val)
{
  *(int*)val = 1234; // <--- modification 
  return *(const int* const)val > 0;
}

Although I assigned a new value through a parameter of type const void* const, that is, a constant pointer to constant data, the code compiled successfully, and the program executed without any error.

But when I tried to change the value of the parameter by casting it to the same type as declared for the parameter, the compiler reported an error—as it should.

static int
validate_positive_int(const void* const val)
{
  *(const int* const)val = 1234; // <--- modification 
  return *(const int* const)val > 0;
}

test.c: In function ‘validate_positive_int’:
test.c:15:28: error: assignment of read-only location ‘*(const int *)val’
     *(const int* const)val = 1234;

This was puzzling because I expected the code to not compile in both cases.

Going through the C documentation, I found that undefined behaviour arises when an object defined as const is modified through a pointer to a non-const type (here, int*). The cast that removes the qualifier is itself legal, which is why the compiler accepts the first version.

I was reluctant to accept this explanation because I am used to compilers for safer languages that enforce parameter declarations strictly; in a similar case they would treat the value of a const parameter as immutable. I eventually had to accept that C is different: a parameter declaration alone is not enough to cause a compilation error when the function treats an argument in a way that contradicts the declaration.

When a programmer declares a parameter for a function, they ask future users of that function to call it with arguments of the declared type. However, they do not guarantee that their function will not treat the argument however they want. In the example above, although the declaration says that the function takes a pointer to const data, the function can still modify the argument by casting the qualifier away.

Programming Arduino in Atmel Studio

As a fun way to improve my C, I started programming the Arduino using Atmel Studio instead of the friendlier Arduino IDE.

Below is my very first program. It blinks a red LED and a white LED according to a preset pattern.

#define F_CPU 16000000UL

#include <avr/io.h>
#include <util/delay.h>

struct Led
{
    int pin;
    int blinks;
    int on_duration;
    int off_duration;
};

void delay_ms(int ms)
{
    while (ms-- > 0) {
        _delay_ms(1);
    }
}

void blink(const struct Led* const led)
{
    for (int i = 0; i < led->blinks; i++) {
        PORTB |= led->pin;
        delay_ms(led->on_duration);
        PORTB &= ~led->pin;
        delay_ms(led->off_duration);
    }
}

int main(void)
{
    const struct Led white_led = { 1<<PB6, 10, 100, 500 };
    const struct Led red_led = { 1<<PB7, 10, 500, 1000 };

    DDRB |= white_led.pin;
    DDRB |= red_led.pin;
    while (1) {
        blink(&white_led);
        blink(&red_led);
    }
}

With C in Atmel Studio and the AVR-LIBC library, IO ports are manipulated by changing bit patterns in the registers associated with the relevant ports. This requires a good understanding of bitwise operations in C despite the availability of macros to simplify the task.

For example, to set pin 12 of the Arduino to output mode, bit 6 of register DDRB must be set to 1. To do so requires an | (OR) operation with a bit pattern operand where bit 6 is set to 1 and the rest set to 0, so that the states of the other bits in the register are not disturbed.

Using the macros DDRB and PB6 defined in AVR-LIBC, this is written as: DDRB |= 1 << PB6.

If you are new to C and are unfamiliar with macros, you might wonder about that statement. Besides, DDRB and PB6 are not referenced anywhere else in my program, so how does this line of code work?

DDRB is a macro that expands into C code to dereference the address of the register associated with setting the mode for pin 12, and PB6 is just a symbolic constant for the value 6. In the statement above, by shifting the value 1 left by 6 positions, we create a new value which is then applied to the bit pattern stored at the dereferenced address with an | operation to turn bit 6 of the register to 1. In this case, this sets pin 12 to output mode.

In a nutshell, the sequence of operations is as follows.

Step 1:

1 << 6 = 01000000

Step 2:

Assuming the register DDRB is initially 00000001:

00000001 | 01000000 = 01000001

In my C program, the result of step 1 is assigned to struct field Led.pin and is used as the second operand for the operation in step 2.
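
The arithmetic in the two steps is easy to check on a desktop machine. The sketch below uses Python 3 purely as a calculator; the variable ddrb is a plain integer standing in for the register, so nothing here touches real hardware.

```python
PB6 = 6  # bit position, same value as the AVR-LIBC constant

mask = 1 << PB6                 # step 1: 0b01000000
ddrb = 0b00000001               # assumed initial register contents
ddrb |= mask                    # step 2: set bit 6, keep the other bits

print(format(mask, '08b'))      # 01000000
print(format(ddrb, '08b'))      # 01000001

# Clearing the bit again, as blink() does with PORTB &= ~led->pin.
# The & 0xFF keeps the result to 8 bits because Python integers are unbounded.
ddrb &= ~mask & 0xFF
print(format(ddrb, '08b'))      # 00000001
```

The last operation is the mirror image of the first: AND with the inverted mask clears bit 6 and leaves every other bit as it was.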

It took about an hour to refresh my knowledge of bitwise operations, but the real challenge was interpreting the Arduino schematic and the information in datasheets, especially to find the right registers to manipulate.

Hacking SSL support into smtpop.dll

We use smtpop.dll in one application to retrieve email from a POP3 mailbox. Today we had to connect to a mailbox over SSL, which smtpop.dll does not support.

Our code to retrieve email is abstracted behind a façade class, so I expected to simply substitute a new library for smtpop.dll and to call new methods. However, the tight coupling of the façade to the interface of smtpop.dll meant that we needed the replacement to also expose the exact same interface.

After trying several things, I resigned myself to creating a new class from the decompiled code of smtpop.dll. Fortunately, only two methods, Open and Quit, had to be changed.

namespace ClassLibrary4
{
    using System.IO;
    using System.Net.Security;
    using System.Net.Sockets;

    using SmtPop;

    public class Pop3ClientWithSsl : POP3Client
    {
        #region Fields

        private SslStream sslStream;

        #endregion

        #region Constructors and Destructors

        public Pop3ClientWithSsl()
        {
            this.UseSsl = true;
        }

        #endregion

        #region Public Properties

        public bool UseSsl { get; set; }

        #endregion

        #region Public Methods and Operators

        public new int Open(string hostname, int port, string username, string password)
        {
            if (this.UseSsl)
            {
                return this.OpenWithSsl(hostname, port, username, password);
            }

            return base.Open(hostname, port, username, password);
        }

        public new string Quit()
        {
            try
            {
                return base.Quit();
            }
            finally
            {
                this.m_streamReader.Close();
                this.m_streamWriter.Close();
                if (this.UseSsl)
                {
                    this.sslStream.Close();
                }
            }
        }

        #endregion

        #region Methods

        private int OpenWithSsl(string hostname, int port, string username, string password)
        {
            this.m_host = hostname;
            this.m_port = port;
            this.m_user = username;
            this.m_tcpClient = new TcpClient(hostname, port);

            this.m_netStream = this.m_tcpClient.GetStream();
            this.sslStream = new SslStream(this.m_netStream, false);
            this.sslStream.AuthenticateAsClient(hostname);

            this.m_streamReader = new StreamReader(this.sslStream);
            this.m_streamWriter = new StreamWriter(this.sslStream) { AutoFlush = true };

            string welcome = this.m_streamReader.ReadLine();
            if (welcome != null && welcome.StartsWith("+OK"))
            {
                return this.SendLogin(username, password);
            }

            this.m_error = welcome;
            return -1;
        }

        #endregion
    }
}

The methods of class POP3Client were not virtual, but some of the class members were in protected scope and were accessible in the new class. I redeclared the Open and Quit methods with the new modifier, which hides the base methods instead of overriding them. Calls made through a POP3Client reference are therefore not polymorphic, which forced us to replace calls to POP3Client with calls to Pop3ClientWithSsl everywhere in the code.

Java Server Faces rage!

From http://thoughtworks.fileburst.com/assets/technology-radar-jan-2014-en.pdf:

We continue to see teams run into trouble using JSF– JavaServer Faces — and are recommending you avoid this technology.

Teams seem to choose JSF because it is a J2EE standard without really evaluating whether the programming model suits them. We think JSF is flawed because it tries to abstract away HTML, CSS and HTTP, exactly the reverse of what modern web frameworks do. JSF, like ASP.NET webforms, attempts to create statefulness on top of the stateless protocol HTTP and ends up causing a whole host of problems involving shared server-side state. We are aware of the improvements in JSF 2.0, but think the model is fundamentally broken.
We recommend teams use simple frameworks and embrace and understand web technologies including HTTP, HTML and CSS.

This quote describes exactly how I feel about Java Server Faces.

From my old posts, you can see that I have been a strong supporter of JSF. I continued to believe in its potential even when it was new and lacked features.

JSF 2.0 was supposed to address developers’ complaints about the shortcomings of earlier versions, such as lack of support for bookmarkable URLs and the absence of view parameters. But it has many of its own flaws, and various implementations still do not follow the underlying JavaEE standards. The slow progress and the persistent incompatibility make developing with JSF rather frustrating.

Learning from a failed deployment

This morning a deployment failed catastrophically. One of the scripts for upgrading the database caused several objects to be dropped unexpectedly.

We restored the database from backup, corrected the script, and repeated the deployment, which was successful. We now had to do a retrospective to learn what went wrong and how to avoid it in future.

We found that the database scripts generated by SSDT included statements to drop user objects. In this case, a script deleted the user with the db_owner role, which is used for deployment. This meant that subsequent statements could not be executed, and objects that had been dropped could not be created again.

The lapse in our process that allowed this to happen was our overconfidence in the generated scripts. Nobody had verified that they did not contain any destructive statements.

The error had not happened on our development databases because our Windows accounts had the sa role, and the security context allowed the scripts to execute even if the db_owner user was deleted. The lesson here was to stage deployments under the same conditions in the development environment as in the production environment, a sensible approach that we had ignored for convenience.

To avoid the error, we are changing our process to include a visual inspection of the database scripts before they are executed. We are also adding a canary database, which is a copy of the production database, on which the scripts can be tested as a final check.
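
A visual inspection can be backed up by a simple automated check. The following Python 3 sketch flags lines of a script that begin with a destructive keyword. The function name, the keyword list, and the sample script are all hypothetical, and a line-based check like this would miss statements split across lines or built dynamically, so it supplements the inspection and does not replace it.

```python
import re

# Statements treated as destructive; this list is an assumption, not exhaustive.
DESTRUCTIVE = re.compile(r'^\s*(DROP|TRUNCATE)\b', re.IGNORECASE)


def find_destructive_statements(script_text):
    """Return (line_number, line) pairs that start with a flagged keyword."""
    return [
        (number, line.strip())
        for number, line in enumerate(script_text.splitlines(), start=1)
        if DESTRUCTIVE.match(line)
    ]


# Made-up script content, for illustration only.
sample = """\
ALTER TABLE dbo.Orders ADD Notes NVARCHAR(200) NULL;
DROP USER [deploy_user];
-- DROP TABLE dbo.Legacy;
"""
for number, line in find_destructive_statements(sample):
    print(number, line)
```

Only the second line of the sample is reported, because the commented-out statement does not start with a flagged keyword.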

Model-Based Testing

Robert Binder’s Testing Object-Oriented Systems book sits permanently on my desk. At over 1500 pages, it is almost a never-ending read, but from time to time I pause to read a few choice chapters.

Binder also wrote about the compliance testing of Microsoft’s court-ordered publication of its Windows client-server protocols in 2012. An interesting fact from the article is that instead of having to test software against documentation, Microsoft had to do the reverse because the code was already published and had to remain the gold standard. Under scrutiny and tight deadlines, they managed to check that 60,000 pages of documentation matched the protocols exactly, all thanks to model-based testing (MBT).

Test fixtures

I try to avoid the Arrange-Act-Assert (AAA) pattern for unit tests. I find that with multiple test methods depending on the same starting conditions, the ‘arrange’ code becomes repetitive, which makes tests tedious to write and difficult to maintain.

My preferred approach is to set one test fixture per test class, the test fixture being common for its test methods. In woodwork a fixture keeps a piece in place whilst it is being worked on; similarly, a test fixture keeps an object in a fixed state as the tests are executed.

Most test frameworks allow a method in a test class to be run before each test method is executed. In JUnit, the annotation @Before designates this method; in MSTest, the attribute [TestInitialize] has the same effect. This method can be used to configure a test fixture as required for the tests in the class.
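
Python's unittest module offers the same hook through the setUp method. The sketch below shows one fixture shared by the test methods of a class; ShoppingCart and the test names are hypothetical and exist only to make the example self-contained.

```python
import unittest


class ShoppingCart:
    """Hypothetical class under test."""

    def __init__(self):
        self.items = []

    def add(self, name, price):
        self.items.append((name, price))

    def total(self):
        return sum(price for _, price in self.items)


class CartWithTwoItemsTest(unittest.TestCase):

    def setUp(self):
        # Runs before every test method, so each test starts from the same state.
        self.cart = ShoppingCart()
        self.cart.add('pen', 2)
        self.cart.add('book', 10)

    def test_total(self):
        self.assertEqual(self.cart.total(), 12)

    def test_adding_item_increases_total(self):
        self.cart.add('bag', 30)
        self.assertEqual(self.cart.total(), 42)
```

The tests can be run with python -m unittest. Naming the test class after the fixture state (a cart with two items) keeps the starting conditions visible without repeating them in each method.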

 

The unit in unit-testing

Interpreted literally, unit testing means testing each class individually. Complete isolation is, however, difficult given that a class typically interacts with other classes. Therefore, for this definition of unit testing to hold, collaborating classes must be replaced with fakes in a test. But developers confront two main problems when using these mock objects.

First, tests become tightly coupled to mocks because the latter must be made to act precisely like the actual classes that they replace. Often, this setup becomes so complex that writing tests takes more time than writing actual classes.

Second, the class under test loses encapsulation, because it must provide ‘anchors’ for the mocks to interact with in order to fake the desired behaviour.

Still, mocks remain useful in many cases, and a few approaches can minimise their drawbacks.

One way is to carefully consider the goal of each test. For example, is it necessary to test interactions in order to verify a given class? Could its correctness be checked differently, say by validating its state instead of its interactions?

If interaction tests are necessary, developers must at least ensure that they are sensible. Mocks are fake, yet many developers inadvertently end up verifying the mocks themselves instead of the class under test, and must guard against this mistake.

A better way is to broaden the interpretation of ‘unit’ to a cohesive set of classes, whether it consists of one independent class or many collaborating classes. This definition grants developers the freedom to test several classes together, thus eliminating the need for mocks.
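
To make the broader interpretation concrete, here is a minimal Python 3 sketch that tests two collaborating classes together and asserts on the resulting state, with no mocks involved. Order and PriceList are hypothetical classes invented for this example.

```python
import unittest


class PriceList:
    """Hypothetical collaborator: a real object, not a mock."""

    def __init__(self, prices):
        self._prices = dict(prices)

    def price_of(self, item):
        return self._prices[item]


class Order:
    """Hypothetical class that depends on PriceList."""

    def __init__(self, price_list):
        self._price_list = price_list
        self.lines = []

    def add(self, item, quantity):
        self.lines.append((item, quantity))

    def total(self):
        return sum(self._price_list.price_of(item) * quantity
                   for item, quantity in self.lines)


class OrderAndPriceListTest(unittest.TestCase):
    """Treats Order and PriceList together as one unit."""

    def test_total_reflects_added_lines(self):
        order = Order(PriceList({'pen': 2, 'book': 10}))
        order.add('pen', 3)
        order.add('book', 1)
        # Assert on the resulting state, not on which methods were called.
        self.assertEqual(order.total(), 16)
```

The test does not care how Order obtains prices, so the internals of both classes can change without breaking it, as long as the total stays correct.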

 

Learning BASE64 encoding

The purpose of BASE64 is to communicate binary data as text, using only characters that exist on most computer platforms. These safe characters form the BASE64 alphabet and are the letters A to Z and a to z, the numerals 0 to 9, and the characters / and +.

Other ways of representing bytes as text exist. For example, bytes can be converted to hexadecimal strings made up of the characters 0 to 9 and A to F. But this conversion results in two hexadecimal characters for each byte of input, so the output becomes twice the size of the input.

Each of the 64 characters of the BASE64 alphabet is associated with an integer value. For example, the character A is represented by 0, the character Z by 25, and the character / by 63. To have the range of 0 to 63, a BASE64 word must be six bits (2^6=64). Therefore, to convert a byte into a BASE64 character, it must be padded with extra bytes until the number of bits is divisible by six.

The smallest number of bytes (or 8-bit words) that can be re-arranged in a way that the number of bits is a multiple of six is three (3×8 bits = 24 bits, 24/6 = 4). In other words, the input of BASE64 encoding must be processed in groups of 24 bits. So for every three bytes (24 bits) of input, four bytes (32 bits) of output are generated, giving an inflation factor of 4:3—which is still better than the 2:1 ratio from hexadecimal encoding.

Input data that cannot be split exactly in groups of 24 bits must be padded to make them so. For example, an input that is one byte (8 bits) long must be padded with two zero-value bytes (8 bits + (2×8 bits) = 24 bits, 24 / 24 = 1); an input that is 11 bytes (88 bits) long must be padded with one zero-value byte (88 bits + 8 bits = 96 bits, 96 / 24 = 4); and so on. In short, input data must be padded to reach a size in bytes that is divisible by three.
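
The padding rule and the resulting output size can be written as two small formulas. This Python 3 sketch is only a check of the arithmetic above; the function names are my own.

```python
def padding_bytes(input_length):
    """Zero-value bytes needed to reach a multiple of three."""
    return (3 - input_length % 3) % 3


def encoded_length(input_length):
    """Characters produced, four for every group of three padded bytes."""
    return (input_length + padding_bytes(input_length)) // 3 * 4


for n in (1, 11, 19):
    print(n, padding_bytes(n), encoded_length(n))
```

For inputs of 1, 11 and 19 bytes this gives 2, 1 and 2 padding bytes, and 4, 16 and 28 output characters respectively.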

With the theory out of the way, here is how BASE64 is implemented in Java, using the example ‘any carnal pleasure’.

First, convert the string to an array of bytes.

byte[] bytes = "any carnal pleasure".getBytes();

This results in an array of 19 bytes.

Next, pad the array with two zero-value bytes to make its size divisible by three.

byte[] padded = Arrays.copyOf(bytes, 21);

Next, combine each group of three bytes (24 bits) into an integer value.

Next, break each integer result (24 bits) into four integer values, each six bits long (4 × 6 bits), using bit-shifting.

Next, append the BASE64 character represented by each 6-bit integer result to a StringBuilder instance.

// declarations that the loop relies on
final String BASE64_CHARS =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
final int byteGroupSize = 3;
StringBuilder result = new StringBuilder();

for (int byteIndex = 0; byteIndex < padded.length; byteIndex += byteGroupSize) {

    // read the value of the 24-bit word starting at the current index;
    // the & 0xFF mask stops Java's signed bytes from sign-extending
    int wordOf24Bits = ((padded[byteIndex] & 0xFF) << 16)
         | ((padded[byteIndex + 1] & 0xFF) << 8)
         | (padded[byteIndex + 2] & 0xFF);

    // split the 24-bit word into four 6-bit values
    int wordOf6Bits1 = (wordOf24Bits >> 18) & 63;
    int wordOf6Bits2 = (wordOf24Bits >> 12) & 63;
    int wordOf6Bits3 = (wordOf24Bits >>  6) & 63;
    int wordOf6Bits4 = (wordOf24Bits      ) & 63;

    result.append(BASE64_CHARS.charAt(wordOf6Bits1));
    result.append(BASE64_CHARS.charAt(wordOf6Bits2));
    result.append(BASE64_CHARS.charAt(wordOf6Bits3));
    result.append(BASE64_CHARS.charAt(wordOf6Bits4));
}

This yields the BASE64 string ‘YW55IGNhcm5hbCBwbGVhc3VyZQAA’.

Finally, replace the padding characters (“AA” in this example resulting from the two zero-value bytes) with as many “=” characters. The “=” is used in the BASE64 decoding process (which is not covered in this post) to determine the amount of padding that has been applied.

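// number of zero-value bytes that were added to the input
int paddingSize = padded.length - bytes.length;

for (int i = result.length(); i > result.length() - paddingSize; i--) {
    result.setCharAt(i - 1, '=');
}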

This gives the final result ‘YW55IGNhcm5hbCBwbGVhc3VyZQ==’.
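
As a cross-check, the same steps can be sketched in Python 3 and compared against the standard library's base64 module. This is a compact restatement of the algorithm described in this post, not a translation of the Java listing line by line; the function name encode is my own.

```python
import base64

BASE64_CHARS = (
    'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
    'abcdefghijklmnopqrstuvwxyz'
    '0123456789+/'
)


def encode(data):
    padding_size = (3 - len(data) % 3) % 3
    padded = data + b'\x00' * padding_size

    chars = []
    for i in range(0, len(padded), 3):
        # Combine three bytes into one 24-bit word.
        word = (padded[i] << 16) | (padded[i + 1] << 8) | padded[i + 2]
        # Split the word into four 6-bit values and look each one up.
        for shift in (18, 12, 6, 0):
            chars.append(BASE64_CHARS[(word >> shift) & 63])

    # Replace the characters produced by the padding bytes with '='.
    if padding_size:
        chars[-padding_size:] = '=' * padding_size
    return ''.join(chars)


text = b'any carnal pleasure'
print(encode(text))
assert encode(text) == base64.b64encode(text).decode('ascii')
```

The final assertion passes only if the hand-written encoder agrees with the library, which is a cheap way to validate an implementation written for learning purposes.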

There are at least two classes in the standard Java libraries that provide BASE64 functions. One is undocumented and is, therefore, subject to change; the other is included in the mail library, which can be confusing if referenced in a project that does not use mail. If you learn how to write your own implementation of BASE64, you can avoid these dependencies and, more importantly, implement it in any language.