Next: Program Structure and Layout Up: Professional Programmer's Guide to Previous: Basic Fortran Concepts

Fortran in Practice

This section describes the steps required to turn a Fortran program from a piece of text into executable form. The main operation is that of translating the original Fortran source code into the appropriate machine code. On a typical Fortran system this is carried out in two separate stages. This section explains how this works in more detail.

These descriptions differ from those in the rest of the book in two ways. Firstly, it is not essential to understand how a Fortran system works in order to use it, just as you do not have to know how an internal combustion engine works in order to drive a car. But, in both cases, those who have some basic understanding of the way in which the machine works find it easier to get the best results. This is especially true when things start to go wrong - and most people find that things go wrong all too easily when they start to use a new programming language.

Secondly, the contents of this section are much more system-dependent than all the others in the book. The Fortran Standard only specifies what a Fortran program should do when it is executed; it has nothing directly to say about the translation process. In practice, however, nearly all Fortran systems work in much the same way, so there should not be too many differences between the "typical" system described here and the one that you are actually using. Regrettably the underlying similarities are sometimes obscured by differences in the terminology that different manufacturers use.

The Fortran System

The two main ways of translating a program into machine code are to use an interpreter or a compiler.

An interpreter is a program which stays in control all the while the program is running. It translates the source code into machine code one line at a time and then executes that line immediately. It then goes on to translate the next, and so on. If an error occurs it is usually possible to correct the mistake and continue running the program from the point at which it left off. This can speed up program development considerably. The main snag is that all non-trivial programs involve forms of repetition, such as loops or procedure calls. In all these cases the same lines of source code are translated into machine code over and over again. Some interpreters are clever enough to avoid doing all the work again but the overhead cannot be eliminated entirely.

The compiler works in an entirely different way. It is an independent program which translates the entire source code into machine code at once. The machine code is usually saved on a file, often called an executable image, which can then be run whenever it is needed. Because each statement is only translated once, but can be executed as many times as you like, the time taken by the translation process is less important. Many systems provide what is called an optimising compiler which takes even more trouble and generates highly efficient machine code; optimised code will try to make the best possible use of fast internal registers and the compiler will analyse the source program in blocks rather than one line at a time. As a result, compiled programs usually run an order of magnitude faster than interpreted ones. The main disadvantage is that if the program fails in any way, it is necessary to edit the source code and recompile the whole thing before starting again from the beginning. The error messages from a compiled program may also be less informative than those from an interpreter because the original symbolic names and line numbers may not be retained by the compiler.

Interpreters, being more "user-friendly", are especially suitable for highly interactive use and for running small programs produced by beginners. Thus languages like APL, Basic, and Logo are usually handled by an interpreter. Fortran, on the other hand, is often used for jobs which consume significant amounts of computer time: in some applications, such as weather forecasting, the results would simply be of no use if they were produced more slowly. The speed advantage of compilers is therefore of great importance and in practice almost all Fortran systems use a compiler to carry out the translation.

Separate Compilation

The principal disadvantage of a compiler is the necessity of re-compiling the whole program after making any alteration to it, no matter how small. Fortran has partly overcome this limitation by allowing program units to be compiled separately; these compiled units or modules are linked together afterwards into an executable program.

A Fortran compiler turns the source code into what is usually called object code: this contains the appropriate machine-code instructions but with relative memory addresses rather than absolute ones. All the program units can be compiled together, or each one can be compiled separately. Either way a set of object modules is produced, one from each program unit. The second stage, which joins all the object modules together, is usually known as linking, but other terms such as loading, link-editing, and task-building are also in use. The job of the linker is to collect up all these object modules, allocate absolute addresses to each one, and produce a complete executable program, also called an executable image.

The advantage of this two-stage system is that if changes are made to just one program unit then only that one has to be re-compiled. It is, of course, necessary to re-link the whole program. The operations which the linker performs are relatively simple so that linkers ought to be fast. Unfortunately this is not always so, and on some systems it can take longer to link a small program than to compile it.
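As a rough illustration of the two-stage scheme, the sketch below shows a program made of two program units. The unit names MAIN and TWICE, and the file names mentioned in the comments, are invented for this example. Both units are shown together here so that the example is complete, but each could equally be kept in its own source file and compiled to its own object module.

```fortran
C     Main program unit. It could live in its own source file
C     (say main.f, a hypothetical name).
      PROGRAM MAIN
      REAL X, Y
      X = 3.0
C     The compiler only records that an external procedure called
C     TWICE is needed; the linker resolves the reference later.
      CALL TWICE(X, Y)
      PRINT *, 'Y = ', Y
      END

C     Subroutine program unit. It could be kept in a second file
C     (say twice.f, also hypothetical) and re-compiled on its own.
      SUBROUTINE TWICE(A, B)
      REAL A, B
      B = 2.0 * A
      END
```

If only TWICE were altered, only its object module would need to be regenerated before re-linking; the object module for MAIN could be reused unchanged.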

Creating the Source Code

The first step after writing a program is to enter it into the computer as one or more text files: these files are known as the source code. Fortran systems do not usually come with an editor of their own: the source files can be generated using any convenient text editor or word processor.

Many text editors have options which ease the drudgery of entering Fortran statements. On some you can define a single key-stroke to skip directly to the start of the statement field at column 7 (but if the source files are to conform to the standard this should work by inserting a suitable number of spaces and not a tab character). An even more useful feature is a warning when you cross the right-margin of the statement field at column 72. Most text editors make it easy to delete and insert whole words, where a word is anything delimited by spaces. It helps with later editing, therefore, to put spaces between items in Fortran statements. This also makes the program more readable.
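The column positions mentioned above can be seen in a short sketch. The program name LAYOUT and the variable names are invented for illustration; the first line is simply a comment used as a column ruler that ends at column 72.

```fortran
C23456789012345678901234567890123456789012345678901234567890123456789012
C     Columns 1 to 5 hold an optional statement label, column 6 marks
C     a continuation line, and the statement field is columns 7 to 72.
      PROGRAM LAYOUT
      INTEGER I, ISUM
      ISUM = 0
      DO 10 I = 1, 10
         ISUM = ISUM + I
   10 CONTINUE
C     The + in column 6 of the next-but-one line continues the PRINT.
      PRINT *, 'Sum of the integers 1 to 10 is ',
     +     ISUM
      END
```

Every statement starts after six leading spaces rather than a tab, and spaces separate the items within each statement, which is the style recommended above. ISUM ends up holding 55.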

Most programs will consist of several program units: these may go on separate files, all on one file, or any combination. On most systems it is not necessary for the main program unit to come first. When first keying in the program it may seem simpler to put the whole program on one file, but during program development it is usually more convenient to have each program unit on a separate file so that they can be edited and compiled independently. It minimises confusion if each source file has the same name as the (first) program unit that it contains.

`INCLUDE` Statements

Many systems provide a pseudo-statement called INCLUDE (or sometimes INSERT) which inserts the entire contents of a separate text file into the source code in place of the INCLUDE statement. This feature can be particularly useful when the same set of statements, usually specification statements, has to be used in several different program units. Such is often the case when defining a set of constants using PARAMETER statements, or when declaring common blocks with a set of COMMON statements. INCLUDE statements reduce the key-punching effort and the risk of error. Although non-standard, INCLUDE statements do not seriously compromise portability because they merely manipulate the source files and do not alter the source code which the compiler translates.
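The effect of INCLUDE can be sketched as follows. Since the exact syntax varies from system to system, the form shown in the comment is an assumption, and the file name consts.inc and the program name CIRCLE are hypothetical. To keep the example in a single file, the lines that would come from the separate file are written out in place, which is what the compiler would see after the insertion.

```fortran
C     Suppose a file called consts.inc (a hypothetical name) holds
C     the two specification statements marked below. Each program
C     unit needing them might then contain just the one line
C           INCLUDE 'consts.inc'
C     (syntax assumed; check the manual for your own system).
      PROGRAM CIRCLE
C     ---- start of text that INCLUDE would insert ----
      REAL PI
      PARAMETER (PI = 3.14159265)
C     ---- end of inserted text ----
      REAL R, AREA
      R = 2.0
      AREA = PI * R**2
      PRINT *, 'Area = ', AREA
      END
```

Because the definition of PI lives in one place, a correction to it reaches every program unit that includes the file the next time each is compiled.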

Compiling

The main function of a Fortran compiler is to read a set of source files and write the corresponding set of object modules to the object file.

Most compilers have a number of switches or options which can be set to control how the compiler works and what additional output it produces. Some of the more useful ones, found on many systems, are described below.

Almost all compilers can produce a listing file: a text file containing a copy of the source code, with the lines numbered, and with error messages and other useful information attached. A list of all the symbolic names and labels used in the program unit is often provided: this should be checked for unexpected entries as they may be the result of spelling mistakes.
An even more useful addition to the listing is a cross-reference table: this lists every place that each symbolic name has been used. Good compilers indicate which names have only been used once as these often indicate a programming mistake.
Another widely available option is the detection of syntax which does not conform to the Fortran Standard: this helps to ensure program portability.
Often it is possible to choose the optimisation level. During program development a low level of optimisation should be selected if this makes the compiler run faster; it may also improve the error detection. Highly optimised machine code may execute faster but if the source code lines are rearranged error messages may be less helpful.
Many systems allow additional code to be included which checks for errors at run-time. Errors such as over-running the bounds of an array or a character string, or arithmetic overflow can usually be trapped. Such errors are not uncommon, so this assistance is very valuable. Some programming manuals suggest that these options should only be selected during program development and switched off thereafter in the interests of speed. This is rather like wearing seat-belts in the car only while you are learning to drive and ignoring them as soon as you are allowed out on the motorway. Run-time checks do not usually reduce the execution speed noticeably.
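To see why the list of symbolic names and the cross-reference table are worth reading, consider the following sketch, which contains a deliberate spelling mistake. The program name TYPO and the variable names are invented for illustration.

```fortran
      PROGRAM TYPO
      REAL TOTAL
      INTEGER I
      TOTAL = 0.0
      DO 10 I = 1, 5
C        Deliberate mistake: TOTL should have been TOTAL. Implicit
C        typing quietly creates a new REAL variable called TOTL, so
C        this is legal Fortran and compiles without a syntax error.
         TOTL = TOTAL + REAL(I)
   10 CONTINUE
C     TOTAL was never updated inside the loop, so it is still zero.
      PRINT *, 'TOTAL = ', TOTAL
      END
```

The program compiles, links, and runs, but prints zero instead of the intended sum. In a symbol list the unexpected name TOTL would stand out, and a cross-reference table would show that it appears in only one place.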

Linking

At its simplest, the linker just takes the set of object modules produced by the compiler and links them all together into an executable image. One of these modules must correspond to the main program unit, the other modules will correspond to procedures and to block data subprogram units.

It often happens that a number of different programs require some of the same computations to be carried out. If these calculations can be turned into procedures and linked into each program it can save a great deal of programming effort, especially in the long run. This "building block" approach is particularly beneficial for large programs. Many organisations gradually build up collections of procedures which become an important software resource. Procedures collected in this way tend to be fairly reliable and free from bugs, if only because they have been extensively tested and de-bugged in earlier applications.

Object Libraries

It obviously saves on compilation time if these commonly-used procedures can be kept in compiled form as object modules. Almost all operating systems allow a collection of object modules to be stored in an object library (sometimes known as a pre-compiled or relocatable-code library). This is a file containing a collection of object modules together with an index which allows them to be extracted easily. Object libraries are not only more efficient but also easier to use as there is only one file-name to specify to the linker. The linker can then work out for itself which modules are needed to satisfy the various CALL statements and function references encountered in the preceding object modules. Object libraries also simplify the management of a procedure collection and may reduce the amount of disc space needed. There are usually simple ways of listing the contents of an object library, deleting modules from it, and replacing modules with new versions.

All Fortran systems come with a system library which contains the object modules for various intrinsic functions such as SIN, COS, and SQRT. This is automatically scanned by the linker and does not have to be specified explicitly.

Software is often available commercially in the form of procedure libraries containing modules which may be linked into any Fortran program. Those commonly used cover fields such as statistics, signal processing, graphics, and numerical analysis.

Linker Options

The order of the object modules supplied to the linker does not usually matter although some systems require the main program to be specified first. The order in which the library files are searched may be important, however, so that some care has to be exercised when several different libraries are in use at the same time.

The principal output of the linker is a single file usually called the executable image. Most linkers can also produce a storage map showing the location of the various modules in memory. Sometimes other information is provided such as symbol tables which may be useful in debugging the program.

Program Development

The program development process consists of a number of stages, some of which may have to be repeated several times until the end product is correct:

Designing the program and writing the source-code text.
Keying in the text to produce a set of Fortran source files.
Compiling the source code to produce a set of object modules.
Linking the object modules and any object libraries into a complete executable image.
Running the executable program on some test data and checking the results.

The main parts of the process are shown in the diagram below.

Figure 1: Compiling and Linking

Handling Errors

Things can go wrong at almost every stage of the program development process for a variety of reasons, most of them the fault of the programmer. Naturally the Fortran system cannot possibly detect all the mistakes that it is possible for human programmers to make. Errors in the syntax of Fortran statements can usually be detected by the compiler, which will issue error messages indicating what is wrong and, if possible, where.

Other mistakes will only come to light at the linking stage. If, for example, you misspell the name of a subroutine or function the compiler will not be able to detect this as it only works on one program unit at a time, but the linker will say something like "unsatisfied external reference". This sort of message will sometimes appear if you misspell the name of an array since array and function references can have the same form.

Most errors that occur at run-time are the result of programmer error, or at least failure to anticipate some failure mode. Even things like division by zero or attempting to access an array element which is beyond its declared bounds can be prevented by sufficiently careful programming.

There is, however, a second category of run-time error which no amount of forethought can avoid: these nearly all involve the input/output system. Examples include trying to open a file which no longer exists, or finding corrupted data on an input file. For this reason most input/output errors can be trapped, using the IOSTAT= or ERR= keywords in any I/O statement. There is no way of trapping run-time errors in any other types of statement in Standard Fortran.
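A minimal sketch of trapping such an error with IOSTAT= is given below. The program name IOERR, the unit number 10, and the file name nosuch.dat are arbitrary choices for this example, and the file is assumed not to exist.

```fortran
      PROGRAM IOERR
      INTEGER IOS
C     'nosuch.dat' is a hypothetical file name, assumed not to exist.
C     STATUS='OLD' requires the file to be present already, so the
C     OPEN fails, but IOSTAT= lets the program carry on regardless.
      OPEN (UNIT=10, FILE='nosuch.dat', STATUS='OLD', IOSTAT=IOS)
      IF (IOS .NE. 0) THEN
C        A non-zero value signals an error; the actual number is
C        system-dependent.
         PRINT *, 'Cannot open file, IOSTAT = ', IOS
      ELSE
         PRINT *, 'File opened'
         CLOSE (UNIT=10)
      END IF
      END
```

Without the IOSTAT= (or ERR=) specifier, a failed OPEN would normally terminate the program with a system error message.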

But, just because a program compiles, links, and runs without apparent error, it is not safe to assume that all bugs have been eliminated. There are some types of mistake which will simply give you the wrong answer. The only way to become confident that a program is correct is to give it some test data, preferably for a case where the results can be calculated independently. When a program is too elaborate for its results to be predictable it should be split into sections which can be checked separately.
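As one possible way of checking a section of a program separately, the sketch below tests a small function against a result that can be calculated independently. The function ISUMN and the program CHECK are invented for this example; the independent check uses the well-known formula N(N+1)/2 for the sum of the first N integers.

```fortran
      PROGRAM CHECK
      INTEGER ISUMN, N, IGOT, IWANT
      N = 100
C     Result from the procedure under test.
      IGOT = ISUMN(N)
C     The same quantity obtained independently from a formula.
      IWANT = N * (N + 1) / 2
      IF (IGOT .EQ. IWANT) THEN
         PRINT *, 'Test passed'
      ELSE
         PRINT *, 'Test FAILED: got ', IGOT, ' expected ', IWANT
      END IF
      END

C     Procedure under test: sums the integers 1 to N with a loop.
      INTEGER FUNCTION ISUMN(N)
      INTEGER N, I
      ISUMN = 0
      DO 10 I = 1, N
         ISUMN = ISUMN + I
   10 CONTINUE
      END
```

A small test program of this kind can be kept alongside the procedure and re-run whenever the procedure is changed.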


Clive Page
Tue Feb 27 11:14:41 GMT 2001