Friday, April 04, 2014

goto fail refactored

i wrote the lion's share of this a while ago but wasn't sure i wanted to publish yet another post about GOTO here since this isn't a programming blog. my mind was made up yesterday when i read this post by Steven J Vaughan-Nichols where he quotes a number of technology personalities essentially giving bullshit excuses for why GOTO is OK to use. it's no wonder 2 separate crypto libraries (both making prodigious use of GOTO) suffered embarrassing and dangerous defects recently when programming thought leaders perpetuate myths about structured programming.

i'm providing this as an object lesson in how to avoid the use of GOTO, especially in security-related code where a higher standard of quality is sorely needed. i'll be using Apple's Goto Fail bug as the example. here is the complete function where the fail was found, with the bug intact:

static OSStatus SSLVerifySignedServerKeyExchange(SSLContext *ctx, bool isRsa, SSLBuffer signedParams, uint8_t *signature, UInt16 signatureLen)
{
    OSStatus        err;
    SSLBuffer       hashOut, hashCtx, clientRandom, serverRandom;
    uint8_t         hashes[SSL_SHA1_DIGEST_LEN + SSL_MD5_DIGEST_LEN];
    SSLBuffer       signedHashes;
    uint8_t         *dataToSign;
    size_t          dataToSignLen;

    signedHashes.data = 0;
    hashCtx.data = 0;

    clientRandom.data = ctx->clientRandom;
    clientRandom.length = SSL_CLIENT_SRVR_RAND_SIZE;
    serverRandom.data = ctx->serverRandom;
    serverRandom.length = SSL_CLIENT_SRVR_RAND_SIZE;


    if(isRsa) {
        /* skip this if signing with DSA */
        dataToSign = hashes;
        dataToSignLen = SSL_SHA1_DIGEST_LEN + SSL_MD5_DIGEST_LEN;
        hashOut.data = hashes;
        hashOut.length = SSL_MD5_DIGEST_LEN;
        
        if ((err = ReadyHash(&SSLHashMD5, &hashCtx)) != 0)
            goto fail;
        if ((err = SSLHashMD5.update(&hashCtx, &clientRandom)) != 0)
            goto fail;
        if ((err = SSLHashMD5.update(&hashCtx, &serverRandom)) != 0)
            goto fail;
        if ((err = SSLHashMD5.update(&hashCtx, &signedParams)) != 0)
            goto fail;
        if ((err = SSLHashMD5.final(&hashCtx, &hashOut)) != 0)
            goto fail;
    }
    else {
        /* DSA, ECDSA - just use the SHA1 hash */
        dataToSign = &hashes[SSL_MD5_DIGEST_LEN];
        dataToSignLen = SSL_SHA1_DIGEST_LEN;
    }

    hashOut.data = hashes + SSL_MD5_DIGEST_LEN;
    hashOut.length = SSL_SHA1_DIGEST_LEN;
    if ((err = SSLFreeBuffer(&hashCtx)) != 0)
        goto fail;

    if ((err = ReadyHash(&SSLHashSHA1, &hashCtx)) != 0)
        goto fail;
    if ((err = SSLHashSHA1.update(&hashCtx, &clientRandom)) != 0)
        goto fail;
    if ((err = SSLHashSHA1.update(&hashCtx, &serverRandom)) != 0)
        goto fail;
    if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0)
        goto fail;
        goto fail;
    if ((err = SSLHashSHA1.final(&hashCtx, &hashOut)) != 0)
        goto fail;

    err = sslRawVerify(ctx,
                       ctx->peerPubKey,
                       dataToSign,                /* plaintext */
                       dataToSignLen,            /* plaintext length */
                       signature,
                       signatureLen);
    if(err) {
        sslErrorLog("SSLDecodeSignedServerKeyExchange: sslRawVerify "
                    "returned %d\n", (int)err);
        goto fail;
    }

fail:
    SSLFreeBuffer(&signedHashes);
    SSLFreeBuffer(&hashCtx);
    return err;



one of the things you might notice is that all roads lead to "fail:", meaning "fail:" isn't really just for failures, it's for clean-up.

another thing you might notice is that the final "goto fail;" doesn't actually bypass any code - it's completely redundant and if it weren't there the next thing to execute would still be the code after the "fail:" label.

the first thing we're going to try is the most obvious approach to refactoring this function, to get rid of GOTO by making proper use of IF.

static OSStatus SSLVerifySignedServerKeyExchange(SSLContext *ctx, bool isRsa, SSLBuffer signedParams, uint8_t *signature, UInt16 signatureLen)
{
    OSStatus        err;
    SSLBuffer       hashOut, hashCtx, clientRandom, serverRandom;
    uint8_t         hashes[SSL_SHA1_DIGEST_LEN + SSL_MD5_DIGEST_LEN];
    SSLBuffer       signedHashes;
    uint8_t         *dataToSign;
    size_t          dataToSignLen;

    signedHashes.data = 0;
    hashCtx.data = 0;

    clientRandom.data = ctx->clientRandom;
    clientRandom.length = SSL_CLIENT_SRVR_RAND_SIZE;
    serverRandom.data = ctx->serverRandom;
    serverRandom.length = SSL_CLIENT_SRVR_RAND_SIZE;


    if(isRsa) {
        /* skip this if signing with DSA */
        dataToSign = hashes;
        dataToSignLen = SSL_SHA1_DIGEST_LEN + SSL_MD5_DIGEST_LEN;
        hashOut.data = hashes;
        hashOut.length = SSL_MD5_DIGEST_LEN;
        
        if ((err = ReadyHash(&SSLHashMD5, &hashCtx)) == 0) {    
            if ((err = SSLHashMD5.update(&hashCtx, &clientRandom)) == 0) {    
                if ((err = SSLHashMD5.update(&hashCtx, &serverRandom)) == 0) {    
                    if ((err = SSLHashMD5.update(&hashCtx, &signedParams)) == 0) {    
                        if ((err = SSLHashMD5.final(&hashCtx, &hashOut)) == 0) {    
                            hashOut.data = hashes + SSL_MD5_DIGEST_LEN;
                            hashOut.length = SSL_SHA1_DIGEST_LEN;
                            if ((err = SSLFreeBuffer(&hashCtx)) == 0) {    
                                if ((err = ReadyHash(&SSLHashSHA1, &hashCtx)) == 0) {    
                                    if ((err = SSLHashSHA1.update(&hashCtx, &clientRandom)) == 0) {    
                                        if ((err = SSLHashSHA1.update(&hashCtx, &serverRandom)) == 0) {    
                                            if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) == 0) {    
                                                if ((err = SSLHashSHA1.final(&hashCtx, &hashOut)) == 0)    {    
                                                    err = sslRawVerify(ctx,
                                                                       ctx->peerPubKey,
                                                                       dataToSign,                /* plaintext */
                                                                       dataToSignLen,            /* plaintext length */
                                                                       signature,
                                                                       signatureLen);
                                                    if(err) {
                                                        sslErrorLog("SSLDecodeSignedServerKeyExchange: sslRawVerify "
                                                                    "returned %d\n", (int)err);
                                                    }
                                                }
                                            }
                                        }
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
    else {
        /* DSA, ECDSA - just use the SHA1 hash */
        dataToSign = &hashes[SSL_MD5_DIGEST_LEN];
        dataToSignLen = SSL_SHA1_DIGEST_LEN;
        hashOut.data = hashes + SSL_MD5_DIGEST_LEN;
        hashOut.length = SSL_SHA1_DIGEST_LEN;
        if ((err = SSLFreeBuffer(&hashCtx)) == 0) {    
            if ((err = ReadyHash(&SSLHashSHA1, &hashCtx)) == 0) {    
                if ((err = SSLHashSHA1.update(&hashCtx, &clientRandom)) == 0) {    
                    if ((err = SSLHashSHA1.update(&hashCtx, &serverRandom)) == 0) {    
                        if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) == 0) {    
                            if ((err = SSLHashSHA1.final(&hashCtx, &hashOut)) == 0) {    
                                err = sslRawVerify(ctx,
                                                   ctx->peerPubKey,
                                                   dataToSign,                /* plaintext */
                                                   dataToSignLen,            /* plaintext length */
                                                   signature,
                                                   signatureLen);
                                if(err) {
                                    sslErrorLog("SSLDecodeSignedServerKeyExchange: sslRawVerify "
                                                "returned %d\n", (int)err);
                                }
                            }
                        }
                    }
                }
            }
        }
    }

    SSLFreeBuffer(&signedHashes);
    SSLFreeBuffer(&hashCtx);
    return err;

}

as you can see this version of the function is quite a bit longer as well as being deeply nested. this is the kind of code that actually makes programmers think the use of GOTO isn't as bad as their teachers told them it was, because that deep nesting makes the function seem more complex and more difficult to read. on top of which there is a considerable amount of duplicated code. neither of these things are appealing to programmers because they make reading and maintaining the code more work.

however, this is the most simple-minded and unimaginative way to refactor the original function. if we were to also tackle that complex pattern used in virtually all of the IF statements at the same time as getting rid of the GOTOs, we would instead get something like this:

static OSStatus SSLVerifySignedServerKeyExchange(SSLContext *ctx, bool isRsa, SSLBuffer signedParams, uint8_t *signature, UInt16 signatureLen)
{
    OSStatus        err;
    SSLBuffer       hashOut, hashCtx, clientRandom, serverRandom;
    uint8_t         hashes[SSL_SHA1_DIGEST_LEN + SSL_MD5_DIGEST_LEN];
    SSLBuffer       signedHashes;
    uint8_t         *dataToSign;
    size_t          dataToSignLen;

    signedHashes.data = 0;
    hashCtx.data = 0;
    err = 0;
    
    clientRandom.data = ctx->clientRandom;
    clientRandom.length = SSL_CLIENT_SRVR_RAND_SIZE;
    serverRandom.data = ctx->serverRandom;
    serverRandom.length = SSL_CLIENT_SRVR_RAND_SIZE;


    if(isRsa) {
        /* skip this if signing with DSA */
        dataToSign = hashes;
        dataToSignLen = SSL_SHA1_DIGEST_LEN + SSL_MD5_DIGEST_LEN;
        hashOut.data = hashes;
        hashOut.length = SSL_MD5_DIGEST_LEN;
        
        err = ReadyHash(&SSLHashMD5, &hashCtx);
        if (err == 0)
            err = SSLHashMD5.update(&hashCtx, &clientRandom);
        if (err == 0)
            err = SSLHashMD5.update(&hashCtx, &serverRandom);
        if (err == 0)
            err = SSLHashMD5.update(&hashCtx, &signedParams);
        if (err == 0)
            err = SSLHashMD5.final(&hashCtx, &hashOut);
    }
    else {
        /* DSA, ECDSA - just use the SHA1 hash */
        dataToSign = &hashes[SSL_MD5_DIGEST_LEN];
        dataToSignLen = SSL_SHA1_DIGEST_LEN;
    }

    if(err == 0) {
        hashOut.data = hashes + SSL_MD5_DIGEST_LEN;
        hashOut.length = SSL_SHA1_DIGEST_LEN;
        err = SSLFreeBuffer(&hashCtx);
    }
    if (err == 0)
        err = ReadyHash(&SSLHashSHA1, &hashCtx);
    if (err == 0)
        err = SSLHashSHA1.update(&hashCtx, &clientRandom);
    if (err == 0)
        err = SSLHashSHA1.update(&hashCtx, &serverRandom);
    if (err == 0)
        err = SSLHashSHA1.update(&hashCtx, &signedParams);
    if (err == 0)
        err = SSLHashSHA1.final(&hashCtx, &hashOut);
    if (err == 0) {
        err = sslRawVerify(ctx,
                       ctx->peerPubKey,
                       dataToSign,                /* plaintext */
                       dataToSignLen,            /* plaintext length */
                       signature,
                       signatureLen);
        if(err) {
            sslErrorLog("SSLDecodeSignedServerKeyExchange: sslRawVerify "
                        "returned %d\n", (int)err);
        }
    }
    
    SSLFreeBuffer(&signedHashes);
    SSLFreeBuffer(&hashCtx);
    return err;



not only does this follow almost exactly the same format as the original function (thereby retaining it's readability), it makes the condition checking simpler and easier to read, and it has virtually the same number of lines of code as the original.

combining assignment and equivalence tests into a single line IF statement was clearly intended to reduce the overall size of the source code but it failed, and in the process it made the code more complex and difficult to read. the combined assignment/condition checking and GOTO statements were complementary to each other. they supported each other and jointly contributed to the complexity of the original function.

this third version of the function, by contrast, has neither complex expressions nor the potential for complex control flow. the only real complaint one might make is that after an error occurs in one of the many steps in the method, the computer still needs to perform the "if(err == 0)" check numerous times. however that is only true if the compiler can't optimize that code, and checking the same variable against the same constant value over and over again seems like the kind of pattern a compiler's optimization routines might be able to detect and do something about.

complexity is the worse enemy of security, sloppiness begets complexity, and GOTO is a crutch for sloppy, undisciplined programmers - it is part of that sloppiness and contributes to that complexity, even when it's supposedly used the right way. what i did above isn't rocket science or magic. the same basic technique can be used in any case where GOTO is used to jump forward in the code (if you're using it to jump backward then god help you). the excuses people trot out for the continued use of GOTO not only make them sound like dumb-asses, it leads to lesser programmers trying to follow their lead and doing much worse at it. it is never used as sparingly as the gurus think it should be, and even their own examples occasionally contain redundant invocations of it, thoughtlessly applied.

if you actually work at adhering to structured programming rather than abandoning it the moment the going gets tough, you will eventually learn ways to make it just as easy as unstructured programming, you'll be a better programmer for having done so, and your programs will be less complex, easier to validate, and ultimately more secure.

2 comments:

Vess said...

There is a theorem that if a language has some basic flow control constructs (if/then/else, while/do, begin/end) then any algorithm can be programmed in that language without using goto. The proof of this theorem is constructive - i.e., it show how to take an arbitrary flowchart and re-arrange it so that it uses only the above constructs and no gotos.

But that's just theory. In practice, if the algorithm is designed badly, converting it to a form that doesn't use goto can result in a program which is much more inefficient (has repeated blocks of code) and difficult to read (I kid you not).

One should use the structural approach when designing algorithms in the first place - not convert them artificially into a goto-free form after the fact.

kurt wismer said...

i know that converting to a GOTO-less form can have bad results. i included an example, but i guess it wasn't the worst example i could have done. i could have replaced each GOTO with a copy of the clean-up code and a RETURN statement if i wanted to maximize the amount of duplicated code rather than the amount of nesting.

i absolutely agree that it's better start with structural approaches in the first place.

what i'm concerned about is that many people aren't even trying because the wide acceptance of GOTO gives people a way to avoid actually becoming proficient at structured programming, and that lack of proficiency in turn is leading to myths about structured programming that give people an incentive to accept and continue the bad habit of unstructured programming.

and i'm concerned that that is negatively impacting security because many of the people in charge of writing important security-related code are amongst those who aren't trying. i've looked at 4 different crypto libraries and they're all filled with GOTO.