Final Paper

CSCI 5828:

Foundations of Software Engineering

Prof. Alex Wolf

Patrick Kellogg

#472-94-1930

May 5, 2001

Introduction: Edsger Dijkstra's “Go To Statement Considered Harmful”

When Edsger Dijkstra wrote a letter to the editors of the Communications of the ACM in their March 1968 issue, I don't he suspected the firestorm his ideas would produce. The next several journals of the CACM held several replies and protests, especially from Donald Knuth, one of the more vocal supporters of the go to statement, who followed up in 1974 with his rebuttal, “Structured Programming With Go To Statements”.

A quick web search for “X Considered Harmful” brought up dozens of parodies and takeoffs of Dijkstra's letter, including:

Recursive Make Considered Harmful	AOL Considered Harmful
Csh Programming Considered Harmful	Human Task Switches Considered Harmful
Use of Eval Considered Harmful	Incentive Pay Considered Harmful
TCP Extensions Considered Harmful	Fragmentation Considered Harmful
Strict Guidelines Considered Harmful	Unlocking GDHandles Considered Harmful
“Reply-To” Munging Considered Harmful	Unix Interface Considered Harmful
“Character Set” Considered Harmful	Syslog Considered Harmful
Interface Pointers Considered Harmful	Firewall Systems Considered Harmful
Address Munging Considered Harmful	Code Protection Considered Harmful
W* Effect Considered Harmful	PGP MIME Internet Draft Considered Harmful
Weblogging Considered Harmful	YUV and Luminance Considered Harmful
XSL Considered Harmful	Network Layer Routing Considered Harmful
<FONT FACE> Considered Harmful	Use of Agents Considered Harmful
Spam Considered Harmful	Synonyms Considered Harmful
Spam Control Considered Harmful	Tuples Considered Harmful
Formatting Objects Considered Harmful	Shared Libraries Considered Harmful
Primitive Types Considered Harmful	HTTP 1.0 Logs Considered Harmful
Web RPCs Considered Harmful	DejaNews Considered Harmful
Layered Multiplexing Considered Harmful	Software Test Tools Considered Harmful

The ACM now has a policy that it will no longer print an article taking so assertive a position against a coding practice. Years afterwards, Ruben Frank aired a contrary view in another letter called, inevitably, “‘Goto Considered Harmful’ Considered Harmful”, and of course, there was the ultimate recursion: “Goto Considered Harmful Considered Harmful Considered Harmful Considered Harmful…”

Dijkstra's Reasoning

In his letter, Dijkstra makes two strong points against the use of go to. First of all, he argues that the use of that command makes code confusing and difficult to read. “It is too much an invitation to make a mess of one's program”, he asserts (Dijkstra, 1968). Unbridled use of go to creates “spaghetti code” where the flow of executed lines jumps backward and forward in the code. This criticism is not merely an opinion of programming style, but can seriously affect the number of bugs in the code and how hard it would be do fix those bugs.

Dijkstra's 1968 letter comments, “we should do… our utmost to shorten the conceptual gap between the static program and the dynamic process, to make the correspondence between the program (spread out in test space) and the process (spread out in time) as trivial as possible.” (Dijkstra, 1968) He goes further to claim that the quality of a programmer was inversely proportional to the number of go to statements that they use. While this is an interesting metric for evaluating the quality of a person, Dijkstra is surely not serious; there is more to programming than avoiding the use of “explicit control transfers”. (Marshall, p.2) Instead, his statement is more like the “flame wars” of Internet chat groups— Dijkstra wanted to state a strong position, and perhaps enjoyed insulting other programmers to make his point.

There are better reasons to avoid the use of go to. Steve McConnell notes, “Code containing gotos are hard to format. Indentation should be used to show logical structure, and gotos have an effect on logical structure.” (McConnell, p. 348) Also, go to statements defeat compiler optimization, by disturbing the expected flow of control. McConnell suggests a list of “dos and don'ts” for the command:

Use goto to emulate structured control constructs in languages that don't support them directly. When you do, emulate them directly. Don't abuse the extra flexibility the goto gives you
Don't use goto when an equivalent structured construct is available
Measure the performance of any goto used to improve efficiency
Limit yourself to one goto label per routine unless you're emulating structured constructs
Limit yourself to gotos that go forward, not backward, unless you're emulating structured constructs
Make sure all goto labels are used
Makes sure a goto doesn't create unreachable code
If you're a manager, adopt the perspective that a battle over a single goto isn't worth the loss of the war

(McConnell, p. 358)

The Effect of Go To on Proofs of Correctness

However, the more serious criticism is that use of the go to statement makes proofs of correctness impossible for that program. Dijkstra imagines that the code could be stopped at any point, and the “progress of the process could be characterized by a single textual index.” (Dijkstra, 1968) This textual index could simply be a marker that points “between” two statements (A � B) within a certain routine. We may also need to specify the state of internal variables, such as the value of a loop counter, which would tell us how many times we have executed the loop. Dijkstra calls this a “dynamic index” that may not be explicitly tracked by a variable in the code, since it is “out of the programmer's control.” However, this is extremely unlike a Petri net, where all values must need to be known to determine the state of the system.

Dijkstra tries to imagine how we could enlarge the textual index to also track code with go to statements. He imagines recording a normalized clock, so we would know how many actions were performed since the start of the code's execution. However, he discards this idea, because it would be too hard to work backwards to figure out that the 173^rd line of code executed was actually the 13^th iteration of a given loop. Alternatively, instead of a simple counter, we could keep a complete list of all the statements executed so far, and use that to figure out where we have been. However, searching through a “back trace” like this is incredibly tedious, and does nothing to help prove the reliability of the code; it simply proves that the current stream or execution given the current input was or wasn't correct.

Exceptions To the Rule

Almost all forms of assembly language have the idea of a jump or branch instruction like JMP. In his letter, Dijkstra makes a side comment that the command “should be abolished from all ‘higher level’ programming languages (i.e. everything except, perhaps, plain machine code)” (Dijkstra, 1968). Dijkstra doesn't elaborate on how JMP instructions could be eliminated from assembly code, and perhaps he doesn't think that it would be possible. Or, maybe the implication is that programmers wouldn't perform proofs of correctness of such low-level languages.

On a lighter note, perhaps Dijkstra would approve of the humorous INTERCAL language that includes the command “COME FROM”

(1) PLEASE

(2) DO COME FROM (1)

Anytime the program executes the statement at label 1, it will immediately jump to the statement at label 2 without executing any of the intervening statements. (Garrett) This control structure, while making it difficult to read or understand the code from start to finish, still allows a programmer to create a proof of correctness backwards using “upward induction”. In fact, the PLEASE-DO COME FROM pair would be safer than a standard go to statement, since a compiler could theoretically “straighten” out the code into a unique linear sequence of commands while every command #1 is always followed by command #2 as long as the pair of commands is unique. That is, as long as there are no COME FROM statements that also come from #1, the INTERCAL language is fit for a proof of correctness.

At the same time, go to is often one of the first commands that students are taught when learning about programming computers. Whether in pseudo-code or in a beginning language like BASIC, I believe go to gives a quick way for students to understand control flow in a program. For example, when a program is shown for the first time, the teacher might say, “The first line at the top is executed first, and then the next line and so on,” just like reading a book. Go to is often the first control structure students learn, because it has the intuitive feel of “jumping” ahead when reading a story. Though I have no hard data on this computer science education, I would bet that given a one-hour lesson to elementary school children, they would answer questions about go to statements and labels better than they would answer simple questions about “for-each” or “if-then-else” control structures. Even if go to is bad to use in actual programs, it is easy to understand.

The Effects of Dijkstra's Letter

Dijkstra's letter caused an enormous impact over the next few years after its publication. Throughout the seventies, language designers shunned the use of go to. There isn't a single major language invented after 1968 that includes that statement. The only popular languages today that uses go to are BASIC, which was originally designed in 1964 as a language for teaching programming, or FORTRAN, which dates back to 1954, long before Dijkstra's 1968 letter. Other forms of pseudo-code might often include go to, but presumably they are used to convey an idea such as algorithm design, and those who use that pseudo-code aren't concerned with proofs of correctness at the time. Lindsay Marshall notes that java has goto as one of its reserved words, but doesn't actually make use of it. (Marshall, p. 2) Surprisingly, ADA, one of the most carefully-constructed languages ever created, has an implementation for go to in the famous requirements document created in June 1987, and support for the statement continues through the ADA 95 standard.

Of course, there were other computer researchers that were concerned about proving programs correct. In fact, Dijkstra cites Niklaus Wirth and C.A.R. Hoare with the inspiration of his letter, since their comment on case notation notes that, “like the conditional, it mirrors the dynamic structure of a program more clearly that go to statements and switches.” (Dijkstra, 1966) In the same year, Guiseppe Jacopini proved that all go to statements in any given program can be replaced with alternative control structures, making the command superfluous. However, one could argue that it was Dijkstra's single letter that ushered in the age of structured programming.

Structured Programming

Edsger Dijkstra himself coined the term “structured programming” to describe the ideas expressed in his 1968 letter and other papers. The invention of structured programming promoted the idea of “encapsulation”. Since program control could not be explicitly passed off to another section of code, individual routines became compartmentalized. Programmers were encouraged to think of subroutines as reusable components, which reduced the intellectual complexity of the program. Dijkstra noted, “A study of program structure has revealed that programs can differ tremendously in their intellectual manageability. A number of rules have been discovered, violations of which will either seriously impair or totally destroy the intellectual manageability of the program.... I now suggest that we confine ourselves to the design and implementation of intellectually manageable programs.” (Dijkstra, 1976). Also in the same book, he writes, “Hierarchical systems seem to have a property that something considered as an undivided entity on one level is considered as a composite object on the next lowest level of greater detail: as a result, the natural grain of space or time that is applicable at each level decreases by an order of magnitude when we shift our attention from one level to the next lower one. We understand walls in terms of bricks, bricks in terms of crystals, crystals in terms of molecules, etc.” (Ibid)

Computer scientist Tom Harbron provides an alternative definition, “The Fundamental Principle of Structured Programming is that at all times and under all circumstances, the programmer must keep the program within his intellectual grasp. The well-known methods for achieving this can be briefly summarized as follows: 1) top-down design and construction, 2) limited control structures, and 3) limited scope of data structures.” The tight controls of structured programming help prevent errors by enforcing a compartmentalized architecture where each component can be thoroughly tested and debugged by itself. Not only does this help the human comprehend large complex computer systems, but also it mimics the way the brain actually thinks (for example, see psychological learning theory of “chunking”).

Functional Programming

In fact, since go to is effectively an assignment to the program counter, many computer scientists have argued that any assignment is considered harmful, which is the basis of functional programming. Such functional languages as Haskell and Miranda are considered “purely functional” since all of their computations are performed by function application. That is, everything in the language is treated like a function, and functions can be performed on functions ad infinitum.

Scheme and Standard ML relax this condition a little bit, by allowing side effects that last even after the computation is performed. Since the functions are evaluated according to internal rules, there is no concept of the idea of a go to statement that would redirect processing to another section of code. Tasks are evaluated either sequentially or concurrently in a strict order, and the user is not allowed to change the sequence of operations.

Applications To Modern Language Features

Today's computing languages have a lot of tools that didn't exist when Dijkstra wrote his infamous letter: classes, multi-threaded programs (and multiple processors), event-driven programs, and other object-oriented and client-server design issues. How well does Dijkstra's criticism of go to hold up in the modern computer age?

Classes

If an instance of a class has a global scope, all member variables of that class are also global, whether they are private or not. This is not a criticism, but merely an observation that every instance of a class contributes to the “state” of the program. Computer science students are often warned against using global variables, since they can make a program hard to test and debug. Every global variable multiplies the number of possible states of the system exponentially. For example, say that there are two global Boolean variables in a program. Then, the system can be test exhaustively with only four test cases: 00, 01, 10, and 11. Adding a third Boolean value changes the number of possible states to eight. If the three variables are integers, or worse yet, floats, there could be a gigantic number of sets where only a handful produced the error. Clearly, exhaustive testing is impossible, even for a small number of global variables.

Inheritance makes things worse by “hiding” the state of the base class. Even if variables are “protected”, they are public to any classes inherited from the base class. So, member functions of the base class can set otherwise “private” values that will increase the number of possible state of the program. The only way to improve the situation is to make sure the class contains all the functions that could possibly interact with those states. This method is explained later in this paper as setting up a guarded “monitor” for the system.

Event-driven Programming

All of Microsoft's recent languages are event-driven: Visual Basic, Visual C++, and even their scripting languages such as code for their Access database. Windows has a unified messaging system built into it, so applications has send specific messages to each other, or even send a “broadcast message” that can be received by any application. Then, each application can decide how they want to respond. That way, the operating system can broadcast a “shutdown” message and have each message decide how to exit gracefully (or not at all).

However, with this model, we need to greatly expand Dijkstra's “textual index” which tracks the progress of the code. To correctly follow the control flow of the program, not does all code need to be examined, but the exact timeline of input data would need to be stored. For example, we would need to know which messages were sent first, by which client, and to which server, if not a set of several servers or indeed, all of them. Only then could we decide where the code starts and which agent begins processing. This quickly becomes a difficult problem, with thousands of messages sent internally and externally, and possibly simultaneously.

So, we would also need to know the strict timing of the code, to guess which messages are processed first. This is equivalent to manually analyzing timing reports, a difficult chore, and not easy to do. We can still perform other analysis tasks on event-driven programs, but trying to trace their execution from start to finish with actually running the program is next to impossible. Other analysis tools should be used than simple induction.

For example, Michal Young's team at the University of California has an approach called CATS that stands for a “Concurrency Analysis Tool Suite”. They recommend looking at the code in a number of different ways, starting from the specification, and going through the development process into the testing phase.� They perform “reachability analysis” to make sure that there are no states where the system can become deadlocked or that can never be reached. Other tools analyze task interaction and sequencing constraints, while still others check the underlying model and perform some formal verification. However, on the whole, they have rejected trying to perform a proof of correctness for large-scale concurrent programs.

Concurrent Programming

When several programs are running concurrently, it makes it difficult to create a proof of correctness. If they are all trying to use a single shared resource, they could all trample over each other's changes. This can be especially difficult if they are all trying to write to the same database or read from a single input device. These “concurrent” programs (also called “multiprograms”) have been studied for over twenty years, and are starting to become useful with the advent of cheaper multiprocessor systems and new numerical methods involving several programs operating on the same data set, such as the “Seti@Home” project.

Dijkstra developed the “THE” multiprogramming system in the in the early 60's, which was the first formalization of the “semaphore” abstract data type. He has two special operations, P and V (Dutch for "to try to decrease" and "to release", respectively). The V operation is used to signal the occurrence of an event, while the “P” operation is used to delay a process until a specific event has occurred. Thus, they can be used to implement “mutual exclusion”, or to signal the occurrence of such events as interrupts or busy states.

However, semaphores by them selves are not enough. In a complex system, what should be protected with a semaphore? Adding a locking mechanism to every single global variable would result in the programs running sequentially and not concurrently, as each of them waited for a large number of semaphores to be released before proceeding exclusively. Clearly, a better form of analysis was needed to detect when a set of statements needed to have exclusive access to a resource.

The Guarded Command Language

In 1975, Dijkstra created his “Guarded Command Language” (GCL), which is more of a formal set of logical statements than a “language” as usually defined in computer science. Like a grammar, it allows statements to be constructed and proven, even if they are stochastic.

The code of an “if” statement in GCL occurs within a pair of “if-fi” statements:

         if <boolean expression1> -> <statement-list1>

         [] <boolean expression2> -> <statement-list2>

         [] ...

         [] <boolean expressionN> -> <statement-listN>

fi

The square brackets separate the statements from each other. All of the boolean expressions are evaluated at the same time.
Then, one of true ones is selected non-deterministically (at random), and its associated statement-list is executed. The Guarded
“do-od” is similar, except it repeats until all expressions are false. (Dijkstra, 1976, p. 35) With this construct, Dijkstra was
able to prove the execution and termination of any multiprogram.

The semantics of the GCL allowed the creation of “Conditional Critical Regions” (CCRs), which are entire blocks of code that are executed exclusively. This would be equivalent to the “statement-lists” of a guarded “if-fi”. From there, it was a simple step to create the idea of “monitors” that encapsulate several functions and make sure that they do not interrupt each other. Interaction with the monitor happens only through special monitor procedures. As such, the correctness of a monitor can be proven no matter what other code is in a program. However, not all code in a program can be placed into the abstract representation of a monitor, nor would we want to. The tool should be used to simplify the code, and creating too many arbitrary CCRs could make things worse.

The Gries-Owicki Theorem

The “Gries-Owicki” theorem expands Dijkstra's ideas for multiprograms that can be executed with more than one processor. However, the theorem can also be applied to multithreaded programs running on a single processor. Gries describes the idea of “interference freeness” in the introduction to “On a Method of Multiprogramming”:

“Consider two processes P and Q, which communicate using shared variables. If execution of P doesn't interfere with the proof of correctness of Q, and if execution of Q doesn't interfere with the proof of correctness of P, the P and Q can be executed properly together — their correctness proofs are consistent with their concurrent execution. P and Q are said to be interference free.”

(Feijen/Gasteren, p. vii)

In the simplest case, if the processes have no shared variables, the problem goes away. Likewise, if they are operating on two entirely different parts of the program (i.e. their proofs of correctness do not interfere), they can be run simultaneously with no problem. When, the proofs intermingle, there might be problem, and we need to look further into the preconditions and assertions of their proofs.

Feijen states the Owicki-Gries theory in a different way, by commenting on when assertions can be valid in a multiprogram:

“that it is established by the program in which it occurs, the so-called local correctness of the assertion,

and that it is maintained by the atomic statements of the other programs, the so-called global correctness of the assertion.”

(Feijen, p. 120)

This is a nice definition, since it suggests that the local code should be examined first, to make sure that it is locally consistent. Then, we can examine how it interrelates with other programs. The “core” of the theory is stated as, “The annotation of a multiprogram is correct (if and only if) each assertion is established by the component in which it occurs and it is maintained by all atomic statements of all other components.” (Feijen/Gasteren, p. 28)

However, in some cases, the precondition is too strict, and the proof can't be completed. So, there is a postulate that “correct annotation remains correct by weakening an assertion or weakening the postcondition.” (Feijen/Gasteren, p. 38) This is not a problem when developing programs, compared to trying to adjust the precondition for some existing code. Thus, it can be easy to “paint yourself into a corner” by writing a routine that will be difficult to prove a posteriori. Feijen notes that a good proof is always “within reach, provided one adheres to a design discipline in which one doesn't commit oneself too easily to premature decisions.” (Feijen, p. 120)

Criticisms of Dijkstra's Letter

I have several problems with the use of proofs of correctness for modern programming languages. First, a caveat: as a student, I feel that I haven't read enough of the literature to ensure that these qualms haven't been fully covered an explained in other papers. However, in my former career as a software engineer, I found that something is lacking in the modern process of software development. Too many programs are written that have bugs, crash, or don't work as expected. Certainly, some kind of tool is needed to help ensure that programs work, but I'm not sure it's proofs of correctness.

Criticism #1: Correctness vs. Optimality

My first criticism is that creating a proof of correctness for a program says nothing about its speed. A routine could be flawless, yet useless if it runs too slow or if there is another algorithm that should be used instead.

David Gries responded to this criticism:

Why do you complain that a correctness proof doesn't give you information about efficiency? That's not its intent!

There are two aspects to a program: correctness and efficiency. Use one tool (proofs) to deal with one of them and another tool (analysis of programs) to deal with the other.

Professor Gries is absolutely correct here; it is not fair to fault the method that it doesn't perform a task it wasn't intended to do. However, since we are writing a proof, it would be wonderful if the logic also proved that the algorithm was not only a valid solution, but also that it is the best possible solution.

However, proving the optimality of an algorithm is often impossible to do. Gregory Chaitin suggests that the complexity of a program is related to the smallest set of commands that will produce identical output for identical input. In other words, we could create a duplicate program filled with useless loops and null statements, but it wouldn't increase the total complexity of the algorithm. The unfortunate side effect is that researchers currently think that it is impossible to find the lower bound for any algorithm. In other words, we may find a faster algorithm for the traveling salesman problem in the future, or we might not. Perhaps it's a good thing that proofs of correctness avoid this issue of optimization.

Criticism #2: Correctness Proofs are Hard

First of all, proofs of correctness are hard to find. In his article “A Simple Program Whose Proof Isn't”, Don Knuth spends five pages to create a proof for a simple program: a routine to convert a decimal fraction into the nearest representable binary fraction. (“Beauty Is Our Business”, Feijen, pp. 237-241). He came across this problem when he was writing the T_EX program. Knuth comments, “it was a very short and simple piece of code, yet I could see no easy way to demonstrate its correctness by conventional methods”. (Ibid, p. 233)

The problem can be described in one paragraph. If the decimal fraction is in the form:

where k ≥ 1 and each digit d_j is an integer in the range 0 ≤ d_j < 10 for 1 ≤ j < k, then, we want the nearest integer multiple of 2^-16. Or in other words, we want to round the quantity

to the nearest integer n. The smallest possible value of n is 0, which occurs when the input is less than 0.00000762939453125. The largest value of n is 65535, which occurs when the input is greater than or equal to 0.99999237060546875

Here is Knuth's program find a decimal fraction given the integer n:


         

        
          j:=0; s:=10 * n+5; t:=10;
        

        
          repeat if
            t > 65536 then s:=s+32768-(t div 2);
        

        
                 j:=j+1; d[j]:= s
            div 65536;
        

        
          s:=10 * (s mod 65536); t:=10 * t;
        

        
          until s ≤ t;
        

        
          k:=j;

In this algorithm, t holds the decimal place we are currently examining. It goes from 10 to 100 to 1000, and so on. Eventually, the value of t becomes too large, so the second line of the code is supposed to handle the overflow by rescaling s. The variable s holds the current remainder after division, in the same way that standard decimal longhand division stores the remainder of the dividend mod the divisor.

We could test the algorithm by trying every possible value of n from 0 to 65536 and see if the decimal fraction is correct. However, Dijkstra once said that “testing can reveal the presence of errors but not their absence”. We would like an algorithm that would be correct on machines with a smaller precision than Knuth's 2^-16, so we should be able to construct a proof that his routine was correct. He imagines that if that all the digits d₁�d_j have been calculated already, then the next set of digits d_j+1�d_k would form the full decimal number 0.d_i�d_jd_j+1�d_k. Since we have j=0 initially, then the last set of digits d_j+1�d_k has to fit within the range:

�or, in other variables,

He then notes that the decimal digits have to fit inside [d/10..(d+1)/10), and the intersection of the two equations yields a new range [a��b�):

�which reduces to:

Finally, substituting in the� values of s and t as:

10a = 2^-16(s-t) and 10b = 2^-16s

gives the initial starting values of s = 10n + 5 and t = 10.

Then, with each iteration, we wasn't to change s and t so:

s := 10(s - 2^-16d) and t := 10t

Plugging in these values into the algorithm verifies that it is correct. Prof. Gries goes further by creating a proof of correctness that constructs Knut's algorithm line-by-line, instead of merely verifying an existing algorithm. (Feijen, pp. 141-148) The different here is subtle, but extraordinary. He states, “…it is difficult to prove a program correct after it has been written. It is far easier to develop the proof and program hand in hand — with the proof ideas leading the way”. (Feijen/Gasteren, p. viii) Gries is more formal, and proceeds methodically for each logical statement:


         

        

        

        
          R0: "(j : 0 �
            j < #d : 0 �
            d.j <
            10)           
            Every element d_i is a valid digit
        

        
          S.d: �(j : 0 �
            j < #d :
            d.j/10^j+1)       
            Where S.d stands for the value of the fraction
        

        
          R1: n � 1/2 �
            2¹⁶ * S.d < n +
            1/2         The fraction S.d
            approximates n/2¹⁶ close enough
        

        
          R2 : "(D : seq : RO^d_D �
            R1^d_D : #d �
            #D)   The sequence d is as short as possible
        

        
          R0 and R2 form an invariant, but R1 has to be modified when n !=
            0:
        

        
          P1: 2¹⁶ * S.d < n + � (i.e. 0 < n + 1/2 -
            2¹⁶)

So, now the complete invariant is R0 � R1 � R2


        
           
        

        
          P3: n + 1/2 < 2¹⁶ * (S.d +
            10^-#d)         
            The next-highest fraction is too large
        

        
          P4: s = 10^#d+1 * (n + 1/2 - 2¹⁶ *
            S.d)           
            The loop should terminate when R1 holds
        

        
          P5: t =
            10^#d+1                           
            Rewriting R1 to avoid recalculation of 10^#d+1

Gries rewrites the components of the invariant to make them simpler, and adds a new one:


        
           
        

        
          R0: "(j : 0 �
            j < #d : 0 �
            d.j < 10)
        

        
          P1: 0 < s
        

        
          R2: there is no solution of length less than #d digits
        

        
          P3: s < 10*2¹⁶
        

        
          P4: s = 10^#d+1 * (n + 1/2 - 2¹⁶ * S.d)
        

        
          P5: t = 10^#d+1
        

        
          P6: s is an odd multiple of 2^#d

So, the algorithm would look something like:


        
           
        

        
          d := e;
        

        
          s := 10*n+5;
        

        
          t := e;
        

        
          {invariant P: R0 � P1 � R2 � P3 � P4 � P5 � P6}
        

        
          do
            s > t --> extend d while maintaining P
        

        
          od

This is expanded into:


        
           
        

        
          d := e;
        

        
          s := 10*n+5;
        

        
          t := e;
        

        
          {invariant P: R0 � P1 � R2 � P3 � P4 � P5 � P6}
        

        
          do
            s > t --> d := d^(s/2¹⁶), s := 10*(s
            mod 2¹⁶), t := t*10
        

        
          od

Gries then proves that the invariant holds during the body of the loop. The proofs can be found on pages 145 and 146 of “Beauty Is Our Business” (Feijen). The final form of the algorithm not only proves correctness, but also proves that the largest solution that is the closest:


        
           
        

        
          d := e;
        

        
          s := 10*n+5;
        

        
          t := e;
        

        
          {invariant P: R0 � P1 � R2 � P3 � P4 � P5 � P6}
        

        
          do
            s > t --> d := d^(s/2¹⁶), s := 10*(s
            mod 2¹⁶), t := t*10
        

        
          od
        

        
          if
            t/10 > s/5 + 2¹⁶ --> d.(#d-1) :=
            d.(#d-1)-1
        

        
          []t/10 �
            s/5 + 2¹⁶ --> skip
        

        
          od

This proof is quite a lot, and takes up eight pages of Feijen's text. However, it shows the power of proofs of convergence: they can prove correctness as the code is built. At every step, the programmer is assured that there are no flaws, and that all the invariants hold.

Now, back to my criticism: could a beginning programmer perform the task above? After all, Knuth's algorithm is quite simple, and nowhere near as complex as some problems given in beginning algorithms classes. My implementation in Visual C++ took only thirty lines of code, which included a rudimentary user interface and comment in the code, and it ran almost instantaneously. So, why are proofs of correctness so hard?

I think it has to do with the fact that any mathematical proofs are hard. If I was writing Knuth's algorithm, I would probably write:

float d := n/65536;

The algorithm takes one line of code, and is reasonably close to the correct answer:

N	Using Knuth's Algorithm	n/65536	Difference	Error %
16384	0.25000762939453125	0.25	0.00000762939453125	0.00305166468%
32768	0.50000762939453125	0.5	0.00000762939453125	0.00152855623%
49152	0.75000762939453125	0.75	0.00000762939453125	0.00101724226%
65535	0.99999237060546875	0.999985	0.00000737060546875	0.00073706617%

Don Knuth famously offered the first person to find a bug in his T_EX program one cent. Then, the next bug was worth two cents for the next finder, and so on through an exponential progress. In the preface to his book on the program in 1985, he offered $20.48 if anybody found the twelfth bug. Today, Knuth has scaled back a bit, and offers $2.56 to anybody that finds an error in one of his books. I have to agree that his code and writing are quite free of bugs, even at that reduced price.

However, keep in mind that T_EX is a simple text processing language, without a complex user interface or modern features that today's business users. Though popular in the scientific and academic worlds, it is missing any of the WYSIWYG features of even the most rudimentary word processors today. Granted, it is aimed a more specific use than desktop publishing, but recent attempts at putting a graphical front end onto the program have been clumsy and bug-ridden.

My code above only consists of one line of code, and is readily understandable by any programmer. Additional code could be used to chop off the floating-point number to any desired precision. For a program that lines up text, this value should be “close enough” for any use. Naturally, scientific programming and numerical analysis needs the high precision of Knuth's algorithm, but his example doesn't hold water: text layout is not rocket science, although the proofs of its correctness approaches that level of difficulty.

Criticism #3: Proofs of Correctness Don't help the Debugging Process

My second criticism is a bit harder to describe. Imagine that another programmer hands you a sections of code that is not working. However, he has come up with a proof of correctness for the code, so it should be working. The question is, “Where would you start looking for the problem: the proof itself or the code?”

In this case, the existing proof of correctness wouldn't be very useful to find the bugs. First of all, the proof would need to be verified by hand. The first pass might find any errors where the proof deviated from the actual code, but a second pass would be needed to check the proof itself. Then, the code might need to be examined again, to catch tiny errors that could be hidden by side effects of the code that are not foreseen by the preliminary analysis. Most programmers would rather start over from scratch, devising their own algorithm and the deriving a proof for the new code. Prof. Gries responds to this criticism:

Your second complaint with correctness proofs doesn't make sense to me. So someone gave you a program with bugs in it, even though the person put proof-of-correctness assertions in it. Why do you blame it on the correctness proof documentation, why not simply blame it on the person who wrote the program? How difficult would it be if the person had NOT put any correctness proof documentation in it. Wouldn't it be HARDER to find out what's wrong? At least the correctness-proof documentation gives you some indication of what the programmer intended.

I would disagree that the existing proof might lead the hapless bug-finder “down the garden path”. I've read some mathematical proofs that seem absolutely correct, but there was one incorrect step that invalidates the entire proof. My favorite example is from a high school math class:

1) x = 1

2) x� = x

3) x� - 1 = x -1

4) (x - 1)(x + 1) = x - 1 (by factorizing)

5) x + 1 = 1 (dividing through by (x � 1)

6) 2 = 1 (by substitution)

The error occurs on line 5 where the proof is applying a division by zero: (x � 1) is zero since line 1 sets x = 1. As any math student knows, when 0 divides a finite number, the result is infinity, and further operations may be indeterminate. This example, while rather humorous, shows that trying to validate a proof can be quite difficult, especially when the student proceeds line-by-line without a grasp of the “bigger picture” of the proof.

Proofs of correctness work forward: prove the algorithm is correct, and then check that the implementation matches the algorithm. On the other hand, most modern testing works backwards: the bug-fixer checks to make sure that the code matches the algorithms first, or at very least, that it matches the programmer's comments. Then, in a worst-case scenario, if the code still doesn't work, the bug-fixer would verify the algorithm. I think proofs of verification get “in the way” of testing because they try to impose a top-down methodology towards testing, when most programmers like to debug code from the bottom-up.

For example, if there's a memory protection error is a section of code, most programmers wouldn't start by analyzing the proof of correctness. Instead, they would run the code in a debug mode, and find the exact line that is giving the error. If, let's say, it is an out-of-bounds problem writing to an array, the programmer would examine the array variable, how it is used in the program, and look at it's current size and which array element is being written to. Only after the simple problems are checked like typographical errors, would the bug-fixer examine the entire algorithm and see how the array fits into the larger scheme of things.

This criticism is actually wrapped up into a larger issue: it is very difficult to eliminate bugs in modern code. The use of object-oriented techniques such as encapsulation, data hiding, strong type checking, and data abstraction help a lot by reducing the program's complexity. However, in many companies, tracking down bugs seems to be performed by junior programmers, and is considered a menial job like customer support or documentation. Perhaps it's more enjoyable to create bugs than to fix them. I'm being cynical, but I've never seen formalized algorithm verification performed by a testing team. Then again, while working on his book “The Art of Computer Programming”, Knuth famously wrote “Beware of bugs in the above code; I have only proved it correct, not tried it.” (Knuth, memo to Peter van Emde Boas)

Criticism #4: Proofs of Correctness Are Not Done in “The Real World”

I asked several friends in the software industry if they had ever performed a proof of correctness while writing code at their jobs. None of them said they did, although one remembered, “I think we used a loop invariant once”. These friends of mine aren't computer science slouches, either; one programmer worked for Cray and Sun Microsystems, while another writes code for NASA, definitely a place where program correctness is important. None of them had ever been asked to prove their code was fully functional, although all of them had performed such proofs during their undergraduate or graduate careers.

Criticism #5: Turing's Unprovability

My final criticism comes from a programmer friend of mine who asks, “Is [proving a program is correct] even possible beyond (relatively) trivial problems? Especially, with � extending Gödel's proposition so not only are there special cases in any system that are logically unprovable, but there are entire areas of math that are both irreducible and unprovable.” He is referring to, of course, Kurt Gödel's famous proof that any formal logic system will have statements that can neither be proved true or false. However, this doesn't affect proofs of logic very much, except that there may be program where it is undecidable whether or not they have a proof of correctness. This is similar to Alan Turing's Halting Problem, where it is undecidable whether of not a given program would ever terminate.

For fun, I create a program with a valid proof of correctness without any proof of termination:


         

        
          //Find the mystery number Q
        

        
          // precondition: Q is between 0 and infinity
        

        
          for
            (i = 0; i < infinity; i++) {
        

        
                 // invariant: Q is
            between i and infinity
        

        
                 if i = Q
            then return
        

        
          }
        

        
          // postcondition: i equals Q

This code may never halt, unless given an infinite amount of time. As a quick proof, for any value of I, it could be shown that Q = i+1, and the loop needs another iteration to reach Q. Of course, that iteration may need another iteration, and so on, but there is a proof of convergence since i moves closer to Q every time. Of course, if you're dealing with code like this, you've got other problems than finding a proof of correctness.

Miscellaneous Other Criticisms From the Literature

In his article “A Little Exercise in Deriving Multiprograms”, W.H.J. Feijen answers some other criticisms of the Owicki-Gries theory that he has heard, and responds to them:

(i) isn't that theory too simple to deal with something as complicated as parallel programs?

What else can we say than that the most effective weapon against lurking mathematical complexity is a simple formalism to begin with

(ii) and isn't it the case that theory address partial correctness only, thus completely ignoring the important issues of deadlock and individual starvation?

[We can] derive a whole spectrum of solutions to the programming problem, with at one end of the spectrum a solution that displays a wealth of potential parallelism and doesn't suffer from the danger of starvation, and with at the other end the spectrum a solution that displays deadlock and, hence, no parallelism at all.

(iii) and hasn't that theory been designed just for a posteriori verification of multiprograms?

We have to bear in mind that the theory of Owicki and Gries emerged in a period when computing scientists had just begun to explore the possibility of deriving � sequential � programs� it is understandable why the theory of Owicki and Gries has received the stigma of having been designed just for a posteriori verification: it has always been used for that.

(Feijen, pp. 119-120)

David Gries' Challenge

Prof. Gries suggested that I think about this simple problem:


         

        
          Given is a two-dimensional array b[0..m,0..n]. Each row is in
            ascending order.
        

        
          Each column is in ascending order. You know that a value x lies in
            it:
        

        
           
        

        
              precondition:� x in b[0..m,0..n]
        

        
           
        

        
          Store integers in variables i and j to truthify the postcondition
            R:
        

        
           
        

        
             R: x = b[0..m,0..n]

Attempt #1: Row-Major Order Search

The easiest solution to this problem is a simple row-major order search. Pseudo-code for this routine would look like:


         

        
          for
            i:= 0 to m
        

        
          for
            j:= 0 to n
        

        
                       
            if (b[i,j] = x) then
        

        
                              
            return
        

        
                       
            endif
        

        
                 endfor
        

        
          end for

I have a soft spot for code like this. It's easy to understand, and simple to debug. Even a beginning programmer could quickly examine the bounds of the program and verify that all elements in the array are examined. If this routine was in a program where the array was small and speed was not a major concern, I wouldn't mind including this code in a final release. At a basic level, Edsger Dijkstra might agree.

In other words, the program should look like it will behave at run time. The code above is a search through an array element-by-element, and the code looks and feels like an extended search. In R. Buckminister Fuller's eternal phrase: “form follows function”. As an added plus, this code is also easy to create a proof of correctness:


         

        
          Given the precondition that x is promised to be in the rectangular
            array b[0..m,0..n], its true position b[i,j] has the invariant
            that:
        

        
           
        

        
          0 ≤ i ≤ m and 0 ≤ j ≤ n
        

        
           
        

        
          For any value of i, the array elements b[i,0], b[i,1], ...b[i,n]
            are examined by the inner loop.
        

        
          Then, the outer loop states that all values of i are examined from
            0 to m.
        

        
          So, the entire array b[0..m,0..n] is searched for the value x.

This proof only succeeds because every value in the array is searched. This is similar to playing solitaire with every possible combination of 52 cards, or testing Knuth's decimal conversion routine with all values of n from 0 to 65535. It's a valid proof, and it works even if the values of n and m change, but it's definitely not optimal.

Attempt #2: Using Prior Knowledge

Since we know that each row and column is in ascending order, we can speed up the routine a little bit:


         

        
          for
            i:= 0 to m
        

        
          for
            j:= 0 to n
        

        
                       
            if (b[i,j] = x) then
        

        
                              
            return
        

        
                       
            else if (b[i,j] > x)
        

        
                              
            exitfor
        

        
          endif
        

        
                 endfor
        

        
          end for

This routine takes two additional lines of code, but it stops when the current index is larger than the search value. It still runs in a worst-case of O(n*m) time, but has a shorter “average” time bound. Since x will probably be located around the average of n and m (i.e. at b[n/2,m/2]), only the upper-left “triangle” of the array will have to be searched, or O((n*m)/2). Since this equates to O(n*m), we haven't really bought ourselves anything. Also, the proof of correctness is almost the same:

To the “completeness” proof of #1, add the following lines:

For any b[p,q] > x, b[p+1,q] > x and b[p,q] > x. In fact, b[r,q] > x if r>=p and s>=q. So, there is no need to keep searching those values, and the for loop can terminate for that row (or column).

Attempt #3: Recursion

Or, the problem could be solved with one recursive function:


         

        
          function
            FindArray(i, j)
        

        
          if (b[i,j] = x) then
        

        
                 return
            (i,j);
        

        
                 else if ((i > n) ||
            (j > m))
        

        
                       
            return 0;
        

        
                 else 
        

        
          if
            (i < n) then
        

        
                              
            return FindArray(i+1,j);
        

        
                       
              endif
        

        
          if
            (j < m) then
        

        
                       
            return FindArray(i,j+1);
        

        
                
                   endif
        

        
                 endif
        

        
          wend

Calling the function initially with FindArray(0,0) will start the recursive calls until the desired element x is found in the array. Note that the recursion is “tail-recursive”, which can always be rewritten using a loop, which would result in code that looks like solution #2. Note that our time bound is still O(n*m) for the worst case. Recursion hasn't speed up the program— in fact the overhead needed for pushing calls on the stack might slow down the execution a little bit. However, the code might look cleaner in some circumstances, or might be more appropriate to run on certain architectures that support recursive calls or parallel operations.

Attempt #4: Diagonalization Divide-and-Conquer

I finally figured out this “trick”. It works by using the fact that if� (b[i,j] < x), then the solution is not in the upper-left quadrant “above and to the left” of b[i,j]. Likewise, if (b[i,j] > x), we can rule out the entire bottom-right quadrant “below and to the right” of b[i,j]. If we ever find a diagonal pair b[i,j] and b[i+1,j+1],� we have ruled out two large quadrants that will make up half of the section we are searching:


         

        
          function StartSearch(a, n, b, m)
        

        
          i:=a;
        

        
          j:=b;
        

        
          while (1)
        

        
                 if (b[i,j] = x)
            then
        

        
                 return
            (i,j);
        

        
          else if
            ((i == n) && (j == m)) then
        

        
                       
              return 0;
        

        
          else if
            (b[i,j] > x)
        

        
                
            if (StartSearch(i,n,b,j-1) != 0) then
        

        
          return StartSearch(i,n,b,j-1);
        

        
                       
              else
        

        
          return StartSearch(a,i-1,j,m);
        

        
                       
              endif
        

        
                 endif
        

        
                 if (i < n)
            then
        

        
                       
            i:=i+1;
        

        
                 endif
        

        
                 if (j < m)
            then
        

        
                       
            j:=j+1;
        

        
                 endif
        

        
          wend

I think this algorithm could be simplified a lot, especially in the case where the search is looking along an extreme edge: say the right-most column or the bottom-most row. Also, I am being clumsy by calling StartSearch twice— once to make sure the return value is non-zero, and another time to find the answer. Of course, I would save the value in a variable to avoid calling the function again. Note that this routine could also run backwards, starting at b[n,m] and moving to the upper-left diagonal element. In either case, the algorithm runs a lot faster than attempts #1 through #3.

The performance of this algorithm is a little tricky to compute. It uses a divide-and-conquer strategy, and in a worst-case scenario, it will spend a lot of time walking along diagonals of quadrants where x doesn' exist. In my example, the worst-case is when the solution is in the extreme lower-left square of a symmetrical array where (n == m), and the values of the “secondary” diagonal that runs from the lower-left corner to the upper-right all have values bordering x. So, the algorithm needs to check half of the “primary” diagonal that runs from the upper-left corner to the lower-right corner, which takes O(n/2) (or O(m/2), since n equals m). Then, the upper-right quadrant will be searched, taking O(n/4). However, there are two of these (one in each quadrant), as well as four O(n/8) diagonals. Overall, the series is:

The value of k is logn, since the diagonals are searched in powers of 2: first at 1/2, then at 1/4, then at 1/8 and so on. So, the worst-case time for the algorithm is O(nlogn), equivalent to a search algorithm such as “quicksort” or “mergesort”. This is a huge performance increase, especially for large arrays. Even if we needed to write additional code to get the array in a ascending row and column order, there would be a payoff if we needed to perform more that n searches in the array to find various values of x. Arranging the array in the first place would take a worst-case time of O(nm log nm) (Cormen). Then again, if we were searching for values of x repeatedly, perhaps a two-dimension array might not be the best structure to store the values of b, and we should switch to a hash table or other form of tree that help facilitate searching.

Proof of convergence:

With every iteration of the while loop, i and/or j is incremented by one. Since they m and n are finite and either i or j is incremented by an integer value (i.e. 1), the loop shows signs of convergence.

Proof of stopping:

There's a return statement when the value of x is found in the array. Also, the precondition states that x is in the array b[0..m,0..n]

Proof of success:

This one's difficult... the algorithm starts in the upper-left corner of the array. Every iteration of the loop moves the current index [i,j] one space either to the “right” or “down”. In any case, at least one column or row will be eliminated from the search for every iteration.

The problem states that every row and column is in ascending order. That is:

b[p,q] < b[p+1,q] and

b[p,q] < b[p,q+1] for any p and q

(I'm assuming we can ignore duplicate values in the array, although the proof should work for that case, too)

If b[p,q] < x, then the solution is not in the upper-left quadrant of b[p,q]. Likewise, if b[p,q] > x, the solution is not in the lower-right quadrant of b[p,q].

Attempt #5: Gries' Solution


         

        
          Finally, David Gries provided me with the correct solution:
        

         

        
          Write the precondition Q as
        

        
             Q: x is in b[0..m, 0..n].
        

        
           
        

        
          Write the postcondition R as
        

        
             R: x =� b[i,j].
        

        
            
          
        

        
          We need a loop or recursion to solve this, and we decide on a loop.
            The loop invariant is a generalization of both Q and R, since it is
            true at both places. A generalization is easier to find if Q and R
            are written in similar fashion, if they have the same shape. And we
            can rewrite R so that it has the shape of Q:
        

        
           
        

        
             R: x is in b[i..i, j..j]
        

        
            
          
        

        
          Now, Q says that x lies in a big rectangle; R says that x lies in a
            small rectangle. Maybe the invariant is that x lies in a rectangle.
            To get the invariant, we replace to occurrences of variables in R by
            fresh variables:
        

        
           
        

        
             P:� x is in b[i..h, k..j]
        

        
            
          
        

        
          In the body of the loop, we can see four simple ways to make
            progress toward termination: i:= i+1; h:= h-1; k:= k+1; and j:= j-1.
            We could think of things like i:= (i+h)/2, but let's do simple
            things first. This leads directly to the algorithm:
        

        
           
        

        
          i := 0;
        

        
          h := m;
        

        
          k := 0;
        

        
          j := n;
        

        
           
        

        
          while
            (x != b[i,j]) do
        

        
          if
            b[i,j] < x then
        

        
          i:= i+1;
        

        
          else
        

        
          j:= j-1;
        

        
                 endif
        

        
          wend
        

        
                            
          
        

        
          That's it! We can see that we really don't need variables h and k,
            and we can rewrite the invariant and the algorithm, using Dijsktra's
            notation, as:
        

        
           
        

        
          i := 0;
        

        
          j := n;
        

        
          // invariant: x is in b[i..m, 0..j]
        

        
          do
            b[i,j] < x --> i:= i+1
        

        
          []
            b[i,j] > x --> j:= j-1
        

        
          od
        

        
           
          
        

        
          In the worst case, the time is proportional to O(m+n), the sum of
            the number of rows and the number of columns.
        

        
           
        

        
          One can't do much better than this. For example, if you were told
            in addition
        

        
          that x lies on the second diagonal b[0,j], b[1,j-2], b[2,j-1], ...,
            that whole second diagonal would have to be searched in the worst
            case because it is not ordered.

The algorithm works by reducing the searchable square one row of column at a time. Since there are only n columns and m rows, the greatest time the algorithm can take is O(n+m). The worst-case occurs when x is in the upper-right-hand corner of the array. This algorithm is simple and complete, although without comments, most beginning programmers would understand what it is doing. So, I would recommend for any proof of correctness that stating the invariant is not enough: the programmer should also add a quick explanation about the routine describing what it is supposed to do.

Results

As I expected, the easiest code (attempt #1) was the simplest to prove correct. I would have no problem believing a proof of correctness for that algorithm. The faster the algorithms ran, the more complex the code became. Likewise, the proofs of correctness because very difficult to produce, and I'm still not convinced I am proving attempt #4 completely. This is exactly the opposite of what we would hope; ideally, the harder the code looks, the more a proof would help us understand the code. When the proof of correctness is as opaque as a tricky algorithm, there may not be very much use for it.

After one month of working on Prof. Gries' problem, and after consulting three textbooks on algorithms, I have no small measure of embarrassment that I was unable to find the optimal solution to the array search. I received straight “A's” in three algorithm and theory classes, so I like to think I am well suited for this kind of task. However, my solution is much slower than the optimal one. This failure seems to agree with my earlier criticisms: although Prof. Gries masterfully created his algorithm from first principles, it was difficult to recreate that effort. Nor, I would think, would most programmers start coding by having the proof of correctness decide what the algorithm looks like line-by-line.

Summary

To satisfy my four criticisms earlier, proofs of correctness need to gain new attributes:

1. Proofs of correctness must become easier to discover

I don't have any idea how to make this wish come true. Perhaps advances in artificial intelligence will create computers that can construct proofs for us. Then again, if computers become that smart, they would probably perform all the programming by themselves, and they wouldn't need to provide proofs of correctness in a human-readable format.

2. Proofs must be integrated with the source code

David Rosenblum's APP tool is a great step forward for this, since it expands in-line assertions into actual C code that is compiled with the rest of the source code. Since is uses a simple preprocessor, it doesn't get in the way of the normal software development process or disturb the existing code.

Ideally, as the source code changes, the proof of correctness would change with it. I would love to have my code highlighted in red as I write lines of code that would cause logical problems, the way many modern text editors highlight syntax and compiling errors in the code.

3. Proofs of correctness must be adopted by industry

This condition might be the hardest of the three to become realized. The software industry is notoriously slow to adopt theoretical research in software engineering. Even though there have been numerous papers on program testing, for example, there are thousands of business today that do not perform formal testing of any kind. In my experience, even test plans and code reviews are in short supply.

However, the lack of popularity of correctness proofs hints that something else is wrong: perhaps there is not a large enough advantage to writing them compared to the effort they demand. For example, a program running on the Windows 3.1 platform could be absolutely correct, but could exhibit strange errors caused by the underlying operating system. Why spend time verifying your code when it could be sabotaged at any moment by undiscoverable and unfixable errors? Of course, Microsoft should be using proofs of correctness also to ensure that their operating systems are bug-free, too.

However, modern computers are very complex systems, and the weakest link in the chain causes the low expectations most people have for bug free computing. Once one program, servelet, or agent, is behaving incorrectly, all the other programs could be affected. Possibly, one bad apple could destroy everybody else's proof of correctness, if they assume things like protected memory.

Conclusion

Leslie Lamport notes that, “Finding the proper annotation to prove a property of a concurrent program is a difficult art; anything that makes the task weaker should be welcome.” (Feijen/Gasteren, p. 38) If anything, learning about proofs of correctness has made me realize what an art is really is. I had expected the assertions to “fall out” of the code automatically. Instead, the proofs of correctness stand up against any mathematical proof for sheer rigor and difficulty.

It's easy to introduce bugs into code and difficult to remove them. It is also easy to make a mistake in a proof of correctness. However, the proof has built-in coherency, so it will “tell you” when it is wrong. Machine code doesn't do that, unless the compiler manages to catch a simple syntax or linking error. So, proofs of correctness are really the only tools we have to verify that an algorithm is correct. Like the Scientific Method, it may have some flaws, but it's the only method that works. Either you prove that your code is correct... or you don't. There are no useful half-measures.

Thanks goes to David Gries, who provided immeasurable help in writing this paper. His numerous and timely emails were unexpected and very helpful. Thanks also to Susan Owicki, who provided many excellent resources and papers for me to read at the start of this project. I would like to also credit William Waite at the University of Colorado/Boulder for the inspiration for this paper, and his patience in answering my combative questions during his guest lecture. And finally, thanks to Alexander Wolf, for teaching CSCI 5828: Foundations of Software Engineering. I couldn't have graduated without his help— literally, since I was three credits short of my degree and he let me add the class after the fourth week of lectures.

References

Anderson, Robert B. “Proving Programs Correct”, John Wiley and Sons, New York, 1979

Andrews, Gregory R. “Concurrent Programming: Principles and Practice”, Benjamin/Cummings Publishing Company, Redwood City, California, 1991

Burns, Alan, and Geoff Davies “Concurrent Programming”, Addison-Wesley, Wokingham, England, 1993

Cormen, Thomas H., Charles E. Leiserson, and Ronald L. Rivest “Introduction To Algorithms”, MIT Press, Cambridge, Massachusetts, 1990

Dijkstra, Edsger W. “A Discipline of Programming”, Prentice-Hall Series in Automatic Computation, Prentice-Hall Inc. New Jersey, 1976

Dijkstra, Edsger W. “Go To Statement Considered Harmful”, Communications of the ACM, Vol. 11, No. 3, March 1968

Dijkstra, Edsger W. “The Structure of ‘THE’-Multiprogramming System”, Communications of the ACM, Vol. 11, No. 5, May 1968

Feijen, W.H.J., A.J.M. van Gasteren, D. Gries, and J. Misra (Editors) “Beauty Is Our Business”, Springer-Verlag, New York, 1990

Feijen, W.H.J., and A.J.M. van Gasteren (Editors) “On A Method of Multi-Programming”, Springer-Verlag, New York, 1999

Fox, Geoffrey C., (et al) “Solving Problems On Concurrent Processors: Volume 1 — General Techniques and Regular Problems”, Prentice Hall, New Jersey, 1988

Frank, Ruben “ ‘GOTO Considered Harmful‘ Considered Harmful”, Communications of the ACM, Vol. 30, No. 3, March 1987

Garrett, D. Alexander "Abstraction and Modularity in INTERCAL", submission for his Spring 1997 “CS-538: Theory and Design of Programming Languages” class. A copy of the paper can be found at: http://www.tuxedo.org/~esr/intercal/paper.html

Hinchey, Michael G., and Stephen A. Jarvis “Concurrent Systems: Formal Development in CSP”, McGraw-Hill, London, 1995

Knuth, Donald E. “The Art of Computer Programming: Volume 1 — Fundamental Algorithms”, Addison-Wesley, Reading, Massachusetts, 1967

Knuth, Donald E. “The Art of Computer Programming: Volume 2 — Seminumerical Algorithms”, Addison-Wesley, Reading, Massachusetts, 1969

Knuth, Donald E. “The Art of Computer Programming: Volume 3— Sorting and Searching”, Addison-Wesley, Reading, Massachusetts, 1973

Knuth, Donald E. "Structured Programming with go to Statements," Computing Surveys, Vol. 6, No. 4, December 1974

Marshall, Lindsay, and James Webber “Gotos Considered Harmful and Other Programmers' Taboos”, Department of Computing Science, University of Newcastle upon Tyne, Newcastle, United Kingdom

McConnell, Steve C. "Code Complete: A Practical Handbook of Software Construction", Microsoft Press, 1993

Petzold, Charles "Programming Windows: The Definitive Guide to the Win32 API" Microsoft Press, Redmond, Washington December 1998

Rosenblum, D.S. “A Practical Approach to Programming with Assertions”, IEEE Transactions on Software Engineering, Vol. 21, No. 1, January 1995

Schneider, Fred B. “On Concurrent Programming”, Springer, New York, 1997

Sedgewick, Robert “Algorithms”, Addison-Wesley, Reading, Massachusetts, 1988

Whiddett “Concurrent Programming for Software Engineers”, Ellis Horwood Limited/ John Wiley and Sons, New York, 1987

Young, Michal, Richard N. Taylor, Kari Forester, Debra Brodbeck “Integrated Concurrency Analysis in a Software Development Environment” ICS Department, University of California, Irvine, California, 1989