
Algorithms 3: Functional integrity

Tuesday, 7 July 2015, 14:15


Edited by Martin Thomas Humby, Monday, 27 July 2015, 13:03

Algorithms 2 implemented a very simple algorithm as a program. Despite its simplicity it still relied on two built-in functions to get user input and display output. In all but the most trivial of programs, modularity in the form of functions, objects, classes, components, modules and packages is used to reduce the unmanageable complexity that would otherwise arise.

Regardless of complexity, an algorithm will often have applicability extending beyond a particular program and it is generally useful to separate this functionality from program specific code by factoring it off to a separate function. This article takes a look at functions and some other basics, both in general and as implemented by Python.

You may already be acquainted with functions to a greater or lesser extent. A function is a relation that maps each element from a set of inputs to exactly one element in a set of outputs. The relation is defined by its input to output mappings: the same input always produces the same output. Two or more inputs can produce the same output, or every input can map to a different output. The relation's inner workings, when they extend beyond simple mapping, need not be known. They do not form part of its specification and quite possibly there may be several different algorithms that implement the same relation.
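As a small illustration of several algorithms implementing one relation, the two functions below both map a non-negative integer n to the sum 1 + 2 + ... + n. The names sum_to_loop and sum_to_formula are made up for this sketch and do not come from the earlier articles.

```python
def sum_to_loop(n):
    """Add up 1..n one term at a time."""
    total = 0
    for k in range(1, n + 1):
        total += k
    return total


def sum_to_formula(n):
    """Use the closed form n(n + 1) / 2 instead of a loop."""
    return n * (n + 1) // 2


# Same input, same output, whichever implementation is chosen.
print(sum_to_loop(10), sum_to_formula(10))
```

Client code that only relies on the mapping cannot tell which of the two it is calling.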

Exactly one output is crucial for a function. Functions often use other functions, and we cannot make use of a function that returns one output for a particular set of inputs, two for another and fails to deliver at all for a third. The meaning of this single output restriction will be discussed in greater detail in the next article.

The parts of a function

An application program, application or app. (called this to distinguish application software from system software - the operating system for example) can generally be identified by its performing some task or tasks for users or another application program. It is a top level unit of modularity providing an interface to its users that exposes functionality and hides inner workings, its implementation. Functions exist at the other end of the scale but still present an interface hiding details of the function's implementation.

Consider a function that accepts two numbers, total and number, together with a string representing an operator. It applies number to total using the operator corresponding to the string and returns the result:

A function interface | implementation
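The figure itself is not reproduced here. A minimal sketch of what such a function might look like follows, assuming the elif chain and the local variable result that the text refers to. The particular set of operators handled is an assumption made for illustration.

```python
def new_total(total, operator, number):   # interface: name and parameter list
    # implementation: everything below is hidden from client code
    result = None                         # local variable holding the outcome
    if operator == '+':
        result = total + number
    elif operator == '-':
        result = total - number
    elif operator == '*':                 # assumed: further operators follow the same pattern
        result = total * number
    elif operator == '/':
        result = total / number
    return result
```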

The function name, new_total, and its parameter list are visible to client code that uses the function. The function's parameters may also be referred to as its formal parameters or, loosely, as its arguments. The function body, the function's implementation, is hidden from client code.

Inside the function body its parameters become local variables. Such variables only exist while code in the function executes and are then gone forever. We say the scope of these variables is the function. Assigning a new value to a local variable has no effect on anything outside the function even when an external variable has the same name as the local.
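A short sketch of this behaviour, using a hypothetical function double() whose parameter deliberately shares its name with an external variable:

```python
number = 7                      # external (module-level) variable


def double(number):             # the parameter is a separate, local variable
    number = number * 2         # rebinding the local leaves the external alone
    return number


doubled = double(number)
print(doubled)                  # 14
print(number)                   # 7
```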

The local variable result declared within the function is even more local than its parameters: there is no way the value of result can be changed externally. However, when the function is called by client code the values of the arguments to the call are passed to the function initializing its parameters to the same values as the arguments. Below old_total, operator and number are passed to new_total() and the return assigned to updated:

old_total = 101
operator = '+'
number = 7
updated = new_total(old_total, operator, number)

The arguments (old_total, operator, number) are sometimes referred to as the actual parameters of the call. The result of the function is returned to the caller where it can be assigned to a variable, used in an expression or simply discarded.

Placing the function in an expression makes assignment to an intermediate variable unnecessary:

cumulative = old_total + new_total(old_total, operator, number)

But if we write

new_total(old_total, operator, number)

the function is called but its return value is not used. Doing this is pointless in this case: new_total() has no effect other than returning a value. Other functions may have useful side effects. The built-in function print(), for example, sends output to the screen.

The scope of variables and functions

Python requires declaration before use. The scope of a variable or function is therefore from the point where it is declared or defined up to the end of the file it is declared in. When another file is imported, variables and functions in that file will only be visible after the import statement. There are two import options. An entire file or files can be imported:

import math, array

but when anything from the import is referred to, it must be prefixed with the file (module) name:

print(math.pi)

Alternatively, specific items from a file can be imported and optionally given an alias:

from math import sqrt, pi
from math import sqrt as root, pi as pye
print(sqrt(2))
print(root(3))
print(pi)
print(pye)

When items are imported this way, the file name prefix is not part of an item's name in the current file. All three options can be used together.
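For instance, the following sketch mixes the three forms in one file and checks that each name refers to the same item:

```python
import math                     # whole module: names need the math. prefix
from math import sqrt           # specific item, no prefix needed
from math import pi as pye      # specific item under an alias

print(math.sqrt(2) == sqrt(2))  # True: both names refer to the same function
print(math.pi == pye)           # True: the alias is just another name for pi
```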

When a file is imported, any executable code it contains is run to provide for initialization of variables in the import. However, execution can be made conditional by putting code, and any import statements, within an if statement:

if __name__ == '__main__':
    inc_count()
    import math
    print(math.pi)

With the __name__ condition shown, code placed within the if for testing or debugging will only be run when the file itself is run, not when it is imported.

Within the body of a function all items whose scope it falls within are visible. Any external function can be called and there is read only access to external variables. Optionally a function can be given additional write access to a variable by declaring it global within the body:

count = 0
def inc_count():
    global count
    count += 1  # read this as increment count by 1

Possibly useful for file initialization, maintaining a counter for debugging or performance evaluation but generally to be avoided, I think.
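To see why the declaration matters, here is a small sketch with two hypothetical functions: one only reads the external variable, the other tries to update it without declaring it global.

```python
count = 0


def show_count():
    return count                # reading an external variable needs no declaration


def broken_inc():
    count += 1                  # no global declaration: count is treated as a local


try:
    broken_inc()
except UnboundLocalError as error:
    print('cannot write without global:', error)

print(show_count())             # 0, the external variable is untouched
```

Because broken_inc() assigns to count, Python treats count as local to that function, and the attempt to read it before it has a value fails.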

Function return values

A return statement in a function stops its further execution and the value from any variable or expression that follows the return is passed back to the caller. There is of course a variation to this no further execution rule. When the return is inside a try - finally block, code in the finally part will be executed come what may. This option can be left aside for the present.
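For the curious, a minimal sketch of the try - finally case. The function safe_divide() and the list log are invented for this example; log only records what ran.

```python
log = []                        # hypothetical list used only to record what ran


def safe_divide(a, b):
    try:
        return a / b            # the return value is computed here...
    finally:
        log.append('finally ran')   # ...but this still runs before the caller resumes


print(safe_divide(6, 3))        # 2.0
print(log)                      # ['finally ran']
```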

There can be as many return statements in a function as required, each return representing a separate exit point. Using this style new_total() can be implemented as:

def new_total(total, operator, number):
    if operator == '+':
        return total + number
    if operator == '-':
        return total - number
    # etc.
    return None

When operator is not a '+' execution drops through to the next test of operator, and so on. The elif: (else if) can be replaced with a simple if: there is no need to step over tests that follow when a return takes care of a success.

The return None statement is in fact redundant. When a function does not contain a return statement or return has nothing to return, None is returned by default. You can see None being returned by a built-in function that only produces a side effect and has nothing to return by executing print(print()).

However, when None is returned deliberately, perhaps to show that none of the conditions in a function was True, the return needs to be explicit for clarity as to what is intended. By returning None we can write

result = new_total(old_total, operator, number)

if result is not None:
    ... # some actions
else:
    ... # some other actions

Because None is treated as False in a Boolean context in Python, we could write if result: ... but doing that is probably going to be wrong. A zero value, integer or floating-point, is also treated as False, so the else actions will be selected for a zero result as well as for None. If that is the requirement then a remark in the code to explain this is essential, but that will mean just as much typing as putting both conditions into if not (result is None or result == 0):.
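The difference between the two tests can be seen in the sketch below. It repeats the two-operator new_total() from above and adds a hypothetical helper, describe(), that applies both tests to the same result.

```python
def new_total(total, operator, number):
    if operator == '+':
        return total + number
    if operator == '-':
        return total - number
    return None                 # unknown operator


def describe(result):
    """Hypothetical helper contrasting the truth test with the None test."""
    by_truth = 'ok' if result else 'rejected'
    by_none = 'ok' if result is not None else 'rejected'
    return by_truth, by_none


print(describe(new_total(5, '-', 5)))   # a legitimate zero: ('rejected', 'ok')
print(describe(new_total(5, '?', 5)))   # no match: ('rejected', 'rejected')
```

Only the explicit comparison with None accepts the legitimate zero result.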

Using single or multiple exit points

The opinion that a function should only have a single exit point is sometimes seen. This idea probably dates back a very long time, to supporters of old style Pascal. At that time Pascal was a language from academia that attempted to promote structured programming by enforcing a single exit at the end of a function, as opposed to C, a language designed to cater for a specific range of needs in real world programming.

As far as I can see a single exit has nothing to do with structured programming. Execution in a function with multiple exit points follows the form of a tree. In an equivalent with a single exit point execution tends to the form of a graph with nodes linked back to the trunk, an altogether more complex structure. Introducing variables only required to hold a return value until the end of the function introduces more clutter and room for error.

The only advantage of a single exit was simplification of resource deallocation by putting it at a single point, a non-requirement when try - finally and garbage collection are available. Go for simplicity and clarity over opinion with no logical basis and use single or multiple exit points as indicated to get these requirements, maybe?

Mutable and immutable

All the types we have looked at so far, integers, real numbers and strings, are strict values. Such values are immutable: they cannot be modified. Changing the value of the number 2 to make it 3 so all twos in a program become threes is an obvious absurdity. It should not be and is not allowed. We can change the value held by a variable by assigning a new value but the value itself is inviolate.

This is not the case with all types. Some types are mutable and allow their content to be changed. When an instance of one of these types is assigned to a variable a reference to the instance is assigned. Assignment from that variable to another variable copies the reference and both variables now refer to the same instance.

Not surprisingly modifications to an instance from one variable are seen from all the variables that hold a reference to it. Similarly, when a reference to a mutable type is passed as an argument to a function, modifications to it as a local variable are reflected to all other references. Some locals are more local than others.
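A minimal sketch of two variables sharing one instance, using the built-in list type (the variable names are illustrative only):

```python
first = [1, 2, 3]
second = first                  # copies the reference, not the list
second.append(4)                # modify the instance through one variable...
print(first)                    # [1, 2, 3, 4]  ...and the change is seen from the other
print(first is second)          # True: one instance, two names

independent = list(first)       # a new list holding the same elements
independent.append(5)
print(first)                    # [1, 2, 3, 4]: the original is unaffected
```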

The built-in type list is mutable. It supplies a sequence of elements of any type and elements can be modified by indexing into the list and assigning new values. In common with all similar Python types indexing is zero based: the first element is at index [0]. For example:

def increment_elements(numbers_list):
    index = 0
    for number in numbers_list:
        numbers_list[index] = number + 1
        index += 1

a_list = [1, 2, 3, 4]
increment_elements(a_list)
print(a_list)

output: [2, 3, 4, 5]

As can be seen, modifications in the function increment_elements are seen from the variable a_list. Each iteration of the for loop assigns successive elements of the list to number. The loop can be made slightly neater by using the built-in function enumerate that returns both the element and its index:

def increment_elements(numbers_list):
    for index, number in enumerate(numbers_list):
        numbers_list[index] = number + 1

On each iteration enumerate() supplies a single value, a tuple holding the index and the element, and so complies with the single return rule. The tuple is unpacked transparently to the variables index and number.

A tuple is similar to a list - elements have a specific and usually a meaningful order, but a tuple is immutable once created.
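A brief sketch of both points, unpacking and immutability, using illustrative names:

```python
pair = (0, 'a')
index, letter = pair            # unpacking: index is 0, letter is 'a'

try:
    pair[0] = 1                 # item assignment is not supported by tuples
except TypeError as error:
    print('tuples are immutable:', error)

# The tuples supplied by enumerate(), collected into a list for display.
print(list(enumerate(['a', 'b'])))   # [(0, 'a'), (1, 'b')]
```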

Summary

Here we have looked at the usage and structure of functions and the scope of entities within a program including local and other variables. Algorithms 4 looks at functions with an indeterminate number of parameters. Lazy evaluation and closures will be considered in Algorithms 5.

[Algorithms 3: Functional integrity (c) Martin Humby 2015]