Structured programming
Programming paradigms |
---|
|
Structured programming is a programming paradigm aimed on improving the clarity, quality, and development time of a computer program by making extensive use of subroutines, block structures and for and while loops—in contrast to using simple tests and jumps such as the goto statement which could lead to "spaghetti code" which is both difficult to follow and to maintain.
It emerged in the 1960s—particularly from Böhm & Jacopini (1966) and a famous letter, Go To Statement Considered Harmful, from Edsger Dijkstra in 1968.[1]— and was bolstered theoretically by the structured program theorem, and practically by the emergence of languages such as ALGOL with suitably rich control structures.
Contents
Low-level structured programming
At a low level, structured programs are often composed of simple, hierarchical program flow structures. These are sequence, selection, and repetition:
- "Sequence" refers to an ordered execution of statements.
- In "selection" one of a number of statements is executed depending on the state of the program. This is usually expressed with keywords such as
if..then..else..endif
,switch
, orcase
. In some languages keywords cannot be written verbatim, but must be stropped. - In "repetition" a statement is executed until the program reaches a certain state, or operations have been applied to every element of a collection. This is usually expressed with keywords such as
while
,repeat
,for
ordo..until
. Often it is recommended that each loop should only have one entry point (and in the original structural programming, also only one exit point, and a few languages enforce this).
A language is described as block-structured when it has a syntax for enclosing structures between bracketed keywords, such as an if-statement bracketed by if..fi
as in ALGOL 68, or a code section bracketed by BEGIN..END
, as in PL/I - or the curly braces {...}
of C and many later languages.
Structured programming languages
It is possible to do structured programming in any programming language, though it is preferable to use something like a procedural programming language. Some of the languages initially used for structured programming languages include: ALGOL, Pascal, PL/I and Ada – but most new procedural programming languages since that time have included features to encourage structured programming, and sometimes deliberately left out features – notably GOTO – in an effort to make unstructured programming more difficult.
History
Theoretical foundation
The structured program theorem provides the theoretical basis of structured programming. It states that three ways of combining programs—sequencing, selection, and iteration—are sufficient to express any computable function. This observation did not originate with the structured programming movement; these structures are sufficient to describe the instruction cycle of a central processing unit, as well as the operation of a Turing machine. Therefore a processor is always executing a "structured program" in this sense, even if the instructions it reads from memory are not part of a structured program. However, authors usually credit the result to a 1966 paper by Böhm and Jacopini, possibly because Dijkstra cited this paper himself. The structured program theorem does not address how to write and analyze a usefully structured program. These issues were addressed during the late 1960s and early 1970s, with major contributions by Dijkstra, Robert W. Floyd, Tony Hoare, Ole-Johan Dahl, and David Gries.
Debate
P. J. Plauger, an early adopter of structured programming, described his reaction to the structured program theorem:
- Us converts waved this interesting bit of news under the noses of the unreconstructed assembly-language programmers who kept trotting forth twisty bits of logic and saying, 'I betcha can't structure this.' Neither the proof by Böhm and Jacopini nor our repeated successes at writing structured code brought them around one day sooner than they were ready to convince themselves.[2]
Donald Knuth accepted the principle that programs must be written with provability in mind, but he disagreed (and still disagrees[citation needed]) with abolishing the GOTO statement. In his 1974 paper, "Structured Programming with Goto Statements",[3] he gave examples where he believed that a direct jump leads to clearer and more efficient code without sacrificing provability. Knuth proposed a looser structural constraint: It should be possible to draw a program's flow chart with all forward branches on the left, all backward branches on the right, and no branches crossing each other. Many of those knowledgeable in compilers and graph theory have advocated allowing only reducible flow graphs[when defined as?].[who?]
Structured programming theorists gained a major ally in the 1970s after IBM researcher Harlan Mills applied his interpretation of structured programming theory to the development of an indexing system for the New York Times research file. The project was a great engineering success, and managers at other companies cited it in support of adopting structured programming, although Dijkstra criticized the ways that Mills's interpretation differed from the published work.
As late as 1987 it was still possible to raise the question of structured programming in a computer science journal. Frank Rubin did so in that year with a letter, "'GOTO considered harmful' considered harmful." Numerous objections followed, including a response from Dijkstra that sharply criticized both Rubin and the concessions other writers made when responding to him.
Outcome
By the end of the 20th century nearly all computer scientists were convinced that it is useful to learn and apply the concepts of structured programming. High-level programming languages that originally lacked programming structures, such as FORTRAN, COBOL, and BASIC, now have them.
Common deviations
While goto has now largely been replaced by the structured constructs of selection (if/then/else) and repetition (while and for), few languages are purely structured. The most common deviation, found in many languages, is the use of a return statement for early exit from a subroutine. This results in multiple exit points, instead of the single exit point required by structured programming. There are other constructions to handle cases that are awkward in purely structured programming.
Early exit
The most common deviation from structured programming is early exit from a function or loop. At the level of functions, this is a return
statement. At the level of loops, this is a break
statement (terminate the loop) or continue
statement (terminate the current iteration, proceed with next iteration). In structured programming, these can be replicated by adding additional branches or test, but for returns from nested code this can add significant complexity. C is an early and prominent example of these constructs. Some newer languages also have "labeled breaks", which allow breaking out of more than just the innermost loop. Exceptions also allow early exit, but have further consequences, and thus are treated below.
Multiple exits can arise for a variety of reasons, most often either that the subroutine has no more work to do (if returning a value, it has completed the calculation), or has encountered "exceptional" circumstances that prevent it from continuing, hence needing exception handling.
The most common problem in early exit is that cleanup or final statements are not executed – for example, allocated memory is not unallocated, or open files are not closed, causing memory leaks or resource leaks. These must be done at each return site, which is brittle and can easily result in bugs. For instance, in later development, a return statement could be overlooked by a developer, and an action which should be performed at the end of a subroutine (e.g., a trace statement) might not be performed in all cases. Languages without a return statement, such as standard Pascal don't have this problem.
Most modern languages provide language-level support to prevent such leaks;[4] see detailed discussion at resource management. Most commonly this is done via unwind protection, which ensures that certain code is guaranteed to be run when execution exits a block; this is a structured alternative to having a cleanup block and a goto
. This is most often known as try...finally,
and considered a part of exception handling. Various techniques exist to encapsulate resource management. An alternative approach, found primarily in C++, is Resource Acquisition Is Initialization, which uses normal stack unwinding (variable deallocation) at function exit to call destructors on local variables to deallocate resources.
Exception handling
A typical example of a simple procedure would be reading data from a file and processing it:
// open file while (/* reading not finished */) { // read some data; if (/* error */) { // stop the subprogram and inform rest of the program about the error; } // process read data; } // finish the subprogram;
The "stop and inform" may be achieved by throwing an exception, returning from the procedure, a labelled loop break, or even a goto.
The procedure above breaks the rules of Dijkstra's structured programming because the while loop can terminate in a way that is not specified in the while condition (the "reading not finished"), and because it has two exit points (the end of the procedure and the "stop" on error). Re-coding this to use a structured while loop and a single point of exit is trivial:
// open file; while (/* no error */) and (/* reading not finished */) { // read some data; if (/* no error */) { // process read data; } } if (/* error */) { // inform rest of the program about the error; } else { // finish the subprogram; }
This removes the unstructured termination of the while loop and gives the procedure a single exit point without any change to the overall function, though at the cost of additional checks and branches. In the case of exceptions in nested code, structured code can require significantly more checks and branches than unstructured code. For this reason many languages provide exceptions, which are similar to return or break statements in providing multiple exits from a function or block, but which continue to unwind the stack (of function calls or nested blocks) until they are caught.
Multiple entry
More rarely, subprograms allow multiple entry. This is most commonly only re-entry into a coroutine (or generator/semicoroutine), where a subprogram yields control (and possibly a value), but can then be resumed where it left off. There are a number of common uses of such programming, notably for streams (particularly input/output), state machines, and concurrency. From a code execution point of view, yielding from a coroutine is closer to structured programming than returning from a subroutine, as the subprogram has not actually terminated, and will continue when called again – it is not an early exit. However, coroutines mean that multiple subprograms have execution state – rather than a single call stack of subroutines – and thus introduce a different form of complexity.
It is very rare for subprograms to allow entry to an arbitrary position in the subprogram, as in this case the program state (such as variable values) is uninitialized or ambiguous, and this is very similar to a goto.
State machines
Some programs, particularly parsers and communications protocols, have a number of states that follow each other in a way that is not easily reduced to the basic structures, and some programmers (including Knuth[citation needed]) implement the state-changes with a jump to the new state. However, it is possible to structure these systems by making each state-change a separate subprogram and using a variable to indicate the active state (see trampoline). Alternatively, these can be implemented via coroutines, which dispense with the trampoline.
See also
- Control flow (more detail of high-level control structures)
- Minimal evaluation
- Object-oriented programming
- Nassi–Shneiderman diagram
- Programming paradigms
- Structured exception handling
- Structure chart
- Switch statement, a case of multiple GOTOs
References
- ^ Dijkstra 1968, "The unbridled use of the go to statement has as an immediate consequence that it becomes terribly hard to find a meaningful set of coordinates in which to describe the process progress. ... The go to statement as it stands is just too primitive, it is too much an invitation to make a mess of one's program."
- ^ Plauger, P. J. (February 12, 1993). Programming on Purpose, Essays on Software Design (1 ed.). Prentice-Hall. p. 25. ISBN 978-0-13-721374-0.
- ^ Donald Knuth - Structured programming with go to statements
- ^ Elder, Jackson & Liblit 2008.
- Edsger Dijkstra, Notes on Structured Programming, pg. 6
- Böhm, C.; Jacopini, G. (May 1966). "Flow diagrams, Turing machines and languages with only two formation rules". Communications of the ACM 9 (5): 366–371. doi:10.1145/355592.365646.
- Dijkstra, Edsger W. (March 1968). "Letters to the editor: Go to statement considered harmful". Communications of the ACM 11 (3): 147–148. doi:10.1145/362929.362947.
- Michael A. Jackson, Principles of Program Design, Academic Press, London, 1975.
- O.-J. Dahl, E. W. Dijkstra, C. A. R. Hoare Structured Programming, Academic Press, London, 1972 ISBN 0-12-200550-3
- this volume includes an expanded version of the Notes on Structured Programming, above, including an extended example of using the structured approach to develop a backtracking algorithm to solve the 8 Queens problem.
- a pdf version is in the ACM Classic Books Series
- Note that the third chapter of this book, by Dahl, describes an approach which is easily recognized as Object Oriented Programming. It can be seen as another way to "usefully structure" a program to aid in showing that it is correct.
- Elder, Matt; Jackson, Steve; Liblit, Ben (October 2008). Code Sandwiches (Technical report). University of Wisconsin–Madison. 1647, abstract
External links
- BPStruct - A tool to structure concurrent systems (programs, process models)
- Algobuild: Free editor for structured flow charts and pseudo code (EN - IT)
- J. Darlinton, M. Ghanem, H. W. To (1993), "Structured Parallel Programming", In Programming Models for Massively Parallel Computers. IEEE Computer Society Press. 1993