OU blog

Personal Blogs

ExLibris

Algorithms 1: Algorithms and data structures

Visible to anyone in the world
Edited by Martin Thomas Humby, Monday, 3 Aug 2015, 11:40

According to Niklaus Wirth, 'Algorithms plus Data Structures equals Programs' (Wirth 1976). Efficient and well constructed programs are an ongoing interest of mine so doing a series of blogs on this subject will provide an archive and an aid mémoire for what I have found out so far.

The easy to learn programming language Python looks to be favourite for implementing the algorithms and will be the main language used. Python has fewer gotchas and 'must do it this ways' than any other I have come across. The fewer the language features to get in the way the more scope there will be for the algorithms, or that is what I am hoping. To keep things simple Windows will be the host operating system unless there is a good reason otherwise.

The only prior knowledge assumed will be a general idea of the basic flow of control structures found in all mainstream programming languages: if something do this, else do that, and repeating actions while a particular condition applies. Other assumed prerequisites will be some acquaintance with the mathematical entities integer, real number, set and sequence and an understanding that in computing terms a string is a sequence of characters 'abc def', not something that holds up your tracksuit trousers.


Getting started

I am not among those who believe that typing code into a text editor and running it with the interpreter from the command line makes you a better person. This means the first requirement is for an Integrated Development Environment (IDE) to manage and debug the Python code as it is developed.

Jetbrains PyCharm is an easy to use IDE providing good functionality straight out of the box, no plugins required, and is the best bet to make a start with I think. Initially CPython will be used. This Python version incorporates the latest language revisions as they are agreed. It provides multithreading but does not support multiprocessing - only one thread can run at any one time. To investigate parallel algorithms another version will be needed. This may mean using a different IDE, but for the present let's keep it simple. Best to download and save the required installers before running them just in case something goes wrong.

The IDE requires Java version 6+ so go to http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html or where ever. This Java Development Kit (JDK) download includes the runtime environment (JRE) needed to run Java programs. Initially I will be using the 32 bit x86 version under Windows 7. Run this installer accepting default settings when offered.

Next go to https://www.python.org/downloads/windows/ and click on the latest stable release of Python 3 (at the top of the page).

Currently the link wil take you to https://www.python.org/downloads/release/python-343/ Select the installation for your machine (near the bottom of the page) Windows x86-64 MSI installer for a64 bit Windows. I ran Windows x86 MSI installer under 32 bit Windows 7. Generally accept defaults for installation but in the second dialogue that appears, 'Customize Python 3.4.3',Scroll down the list of options and turn on Add python.exe to path. A detailed installation guide can be found at http://www.howtogeek.com/197947/how-to-install-python-on-windows/

Having installed Java and Python we are ready to install the IDE. Download PyCharm Community Edition from https://www.jetbrains.com/pycharm/download/ and run the installer. Accept defaults but on the second screen check the box to associate the IDE with .py Python files. When installation finishes check the box to run PyCharm. Accept defaults and eventually a dialogue opens. Click Create New Project and an oversized approximation of the New Project dialogue we will be using from now on appears. 

Initial Project

The point to note is that the text in the top edit box includes the project title at the end. Change this text to your version of C:\Users\Martin\PycharmProjects\ALGORITHMS\Greeting, maybe?

Create the project and after its initial 'Indexing', taking PyCharm quite some time, you should see something like that shown below. The tab bars at the bottom and side can be hidden or shown by clicking the icon that looks a bit like a grey monitor in the bottom left corner and the tabs themselves dragged into position at the ends of the bars or between bars according to taste.

Add a file

If the project tree is not visible it can be shown by clicking the Project tab or double clicking on the project folder in the tool bar labelled Greeting. Right click on Greeting in the project tree (not the Greeting on the tool bar) to get the context menu and select New > Python File as shown. Name this file 'hello' and it opens in the editor window.


A greeting algorithm

As a starting point a simple requirement might be to develop an algorithm that will read in the names of any users who happen to be around and output a greeting that includes the user's name. If no name is input when the return ('Enter') key is pressed the program that implements the algorithm terminates.

A first stab at doing this might be the following lines of Python code:

print("Hello, g'day, its great to see you\n")

while
name != '': # while name is not equal to an empty string
  
name = input('Please input your name >')
   
print('Hello', name)
print('
Welcome to algorithms and data structures in Python.\n')

print
('\nBye')

Readers who are already acquainted with Python can skip this section, otherwise a few features are worth noting:

print() is a built in function so there is no need to write it or import it from somewhere else (in fact it is imported by default from a package 'builtins'), The print() function simply outputs whatever arguments are passed to it, as many as you like. By default the results of these arguments are separated by spaces in output and print() puts the cursor on a new line afterwards but this behaviour is easily modified. input('user prompt ') is another built-in function to read user input.

When looking at Python code posted on the internet much of this is in older versions of Python. Prior to version 3 print was an operator - print 'something'. This code will get an error under 3 but is easily fixed by inserting brackets - print('something').

The # character indicates the start of a remark or comment: explanatory text not executed when the program runs.

"Hello, g'day and great to see you!" is a string literal. Such strings can be enclosed in double quotes " ", in which case the string can include single quotes as in "g'day". Alternatively the start and end of the string can be indicated by single quotes and it can contain " characters.  '\n' included in a string represents a new-line character over and above anything print()may add to output.

In Python each line of code generally comprises a program statement. To start a new statement simply put it on a new line. Terminators such as ; are not required but a semicolon can be used to separate multiple statements on the same line if you must. Very long statements can go on multiple lines joined by a back-slash \ but this is seldom required and is not always compulsory.

Compound statements, like while are terminated with a colon :The body of the statement is defined by indentation, no curly brackets or begin ... end required. Thus indentation is critical, it is the only indication of the statements forming part of a compound block and separates those statements from any statements that follow, not part of the block. In 'hello.py' above the compound while includes the three indented lines of code.print('Bye') is not part of the while and is not indented. Python compound statements are:

    while <condition>:
for <a_variable> {,<a_variable>} in <interval | sequence | set>:

if <condition>:
elif <condition>: # else-if
else:

try:
except [<exception class>[as <variable>]{,<exception class>[as <variable>]}] :
finally:

with <an object> [as <variable>]:

def <a_function>:
class <a_class>:

In these definitions < > brackets enclose an item description and are not Python syntax. Likewise | indicates alternatives, [ ] brackets enclose an optional inclusion and { } braces enclose an option repeated as many times as may be required. Above, the last two cases are function and class definitions respectively, rather than executable compound statements as such but the indentation rules still apply.

Using the IDE the Tab key inserts an indentation at the cursor, the delete back-arrow key deletes the entire indentation and Shift + Tab removes an indentation from the line the cursor is on. If one or more lines are highlighted by dragging the cursor all the selected lines are relocated with Tab or Shift + Tab, handy when major code restructuring is required. Try highlighting the three lines of code after the while and shifting them around, for example.

Add the Python code shown above to hello.py.


Running the program

The next step is to run the Greeting program and see what happens. Any Python file, also referred to as a script, can be run as it stands. No need for a main() function or a program file. The Python interpreter simply starts at the top of the page and works down executing any executable statements it finds on the way, looping and/or branching as directed by the code. If any other files are referenced in import directives these files are accessed when the import is encountered. Imports can go anywhere in a file but the default import of 'builtins' can be thought of as being right at the top of the page.

There are many ways of running a script from the IDE. A straightforward way is to right click on a file that is open in the editor or on the file-name in the project tree and select Run 'hello' from the context menu (next to the green triangle). Do this and an error message appears

Error message

The variable name is not defined. Reading down from the top 'name' has not been encountered before the while loop entry condition is found. As far as the interpreter is concerned this variable simply does not exist. Double click on the file path underlined in blue and the IDE will put the cursor at the start of the line where the error was found. If the required file is not open in the editor it is opened.

To overcome this problem something needs to be assigned to name before the while. Anything will get an entry to the loop apart from the empty string ''.. As can be seen from the assignment to name from the input() function, the assignment operator in Python is  = . We could write

    name = 42

This assignment can be read as 'Let name equal 42' or 'Name takes the value 42'. Some people like 'Name becomes 42' but this is too mushy for my taste: name doesn't become 42, it is still the variable name before and after the assignment.

Python makes no objection to the same variable holding an integer in one place, a string in another or any other type somewhere else but whether it is sensible to use this facility is another question. A general solution for the current requirement is to use the special object None. A variable holding None is never equal to any other variable except another one holding None.

Before the while type in the line

    name = None

Run 'hello.py' again and the 'Please input your name >' prompt appears. To input names click in the Run pane to relocate the cursor, type in the name and hit return. Do not worry when the cursor appears before the prompt, typing anything puts it in the right place. The program runs successfully and exits when return is pressed without typing a name. However, its exit behaviour is not really what is required.

If it is possible to mess up something this simple perhaps there is more to this algorithms thing that might at first be thought but further discussion will have to wait for another time.


Reference
    Wirth 1976, 'Wirth, N, Algorithms + Data Structures = Programs', Prentice Hall

[Algorithms 1: Algorithms and data structures      (c) Martin Humby 2015]


Permalink
Share post