OU blog

Personal Blogs

ExLibris

Algorithms 12: Basic OOP – a List type

Visible to anyone in the world
Edited by Martin Thomas Humby, Friday, 4 Sep 2015, 13:39

Because they are stored in a single contiguous block of memory arrays are not extendable. Memory is allocated to an exact fit, nothing more. This characteristic was reflected by the maximum size of a Stack. Now the requirement is for a List type. As we are coding the List ourselves we can give it any functionality we like including unlimited increase in size and the ability to add items not just at the end but to insert an item anywhere along a List's length.

Similarly, the requirements for List can dictate that when an item is removed at any index the list will close up to fill the gap. Additionally, List can be given a method to add a guard value at the end of the list for use when the List is searched .

Ignoring these new requirements List has several characteristics in common with Stack and one way to make a start on coding List is to copy Stack's code, paste it into a new file or the same file and rename Stack to List. OOP provides an alternative that brings some advantages: inheritance. Using inheritance all that needs to be done to get Stacks fields and functionality into List is to include Stack in the class declaration:

from my_OOP_stack import stack as Stack
class List(Stack):
    pass

Readers will note I have imported stack as Stack. The Python convention for class names is to use CamelCase but when it started out stack was not a class as such, simply the next best thing to a record. A short breakdown of Python naming conventions can be found here. The keyword pass allows a class or function to be defined with an empty body.

If you would like to follow the incremental development of List, open the List Investigations project from the download in PyCharm. Create a new file list.py and copy the above code into it. There is a copy of my_OOP_stack_test.py in the project with the import statement changed to: from list import List as stack. Without any doubt the test code in this file views a List as a stack. Run the test script and the output is identical to that got in Algorithms 11 when it was run with a genuine stack.

This is not surprising. Currently List has no functionality or fields other than those it has inherited from stack. Even so, it demonstrates three facts: List has inherited stack's interface - if it hadn't calling stack's methods would get an error. Similarly it has inherited stack functionality - if it hadn't output would differ or an error would result (despite the test script being far from comprehensive please take my word for it). Together these two points comprise what is sometimes known as class inheritance (as opposed to pure interface inheritance, to be dealt with in a later article).

The third fact is that List can be viewed as a stackList is a subclass, a subtype, a child class or a descendant of stackStack is a superclass, parent or ancestor of List. The ability to view a subclass as a superclass is known as subtype polymorphism.

In some languages a view may become a reality. C++ features class instances by reference and by value. When a by-value instance is assigned to a variable of the same class all its fields are copied getting a new instance. If it is assigned to a superclass variable any additional fields it has are truncated off getting an instance of the superclass.

In languages that use static typing, assigning a reference to a subclass to a variable declared to be a reference a superclass is allowed. In Python assigning to a reference viewed as a superclass has the same effect.

Assignment super to subclass is not allowed under static typing: the subclass may have declared additional methods that do not exist in the superclass and in turn these methods may access fields that the superclass does not possess. In Python you can do what you like but errors are likely if this rule is violated.

Redefining push( )

Because push() adds items to Stack's array it has the capability of causing an overflow. The answer is simple: when the array is full allocate a bigger one and copy the items from the old array to the new one. 'But doesn't that take a long time to do?' you may wonder.

Well yes, doing this for a medium to large array coded to show how it works in Python is quite slow. Python features list slicing getting an equivalent to C's memcpy() but currently that will not be used. When coded in C or Delphi Pascal or assembly language for example, copying is much faster. Over time CPUs have been optimized to do various tasks and copying blocks of memory is one of them.

Another point, we are not copying actual items only references to them, considerably smaller than a record with even a few fields. Faced with a really big sequence other techniques would have to be considered. Adding a new array linked to the old one is one possibility but for small to medium arrays copying is fine and most likely more efficient.

Copying arrays

Copying will also be needed to modify the list when an item is removed or inserted so, to avoid code duplication, functions to do the copying can be coded as utility functions in 'array_types' called by List methods. To prevent overwriting of items to be copied with those already copied two copying functions are needed:

Copy left - right

The copy_left() function is simple. It is passed the source array, the destination array, the source index and the destination index together with the number of items to be copied. Starting at the lowest index in both arrays it iterates over the given section of the source copying items to the destination. If both the source and destination array are the same the element at the destination index is overwritten deleting that element.

copy_right() duplicates the element at a source index in the same array after moving other elements right to make room. The original element can be overwritten with an insertion. To avoid overwriting elements copying must start on the right and work leftwards. Coding it is slightly trickier, mostly because of the non-intuitive syntax of the range() generator particularly when counting down.

A generator comprises a special code construct, generally within a function. Each call gets the next value in a sequence of values.  The function call can be put in a for loop to obtain successive values one per iteration. Writing generators will be detailed at a later time.

Counting up with range() we get range(start_value, last_value + 1, increment). When there is a single argument, range(n)for example,  n is the number of integers generated including zero. A start_value is optional, default zero, but required if there are other arguments in addition to n.  The increment defaults to 1 when not included.

Counting down with range(), this time we get range(start_value, last_value - 1, increment) with start_value greater than or equal to last_value. The increment is of course negative.

The code for my copy_right() is:

def copy_right(source, s_index, dest, d_index, length):
    sx = s_index + length - 1
    dx = d_index + length - 1
    for sx in range(sx, s_index - 1, -1):
        dest[dx] = source[sx]
        dx -= 1  # decrement destination index by 1

This function is in 'array_types.py' in the download with a small test script. Readers following incremental development are invited to code a similar copy_left() function in array_types, otherwise the code is in 'array_types.py' in the 'solutions' package.

To complete the redefinition of push() an instance variable capacity or some such will be needed to hold the current size of the array. The revised version of __init__ shown below calls the superclass __init__ to set the array's initial size. Another option can be included to provide Python style typecasting of any sequence to the List type.


Type casts

In languages with static typing, compiled to native code, a typecast often means 'view the object at this variable not as the type of the variable but as some other type.' Generally there is a restriction that both types should have the same footprint in memory and perhaps a few more, but when an object is by reference all references have the same size.

(native code is machine code executed directly by the Central Processing Unit (CPU) and native to the target platform for an application. A platform is the environment an application runs in supplied by the combination of the CPU, the operating system and sometimes a Graphical User Interface (GUI))

In Python what may appear to be a typecast is in fact a call to a constructor. __init__ is called and initializes the new object based on the object passed in. For List we might have  a_list = List('abcd'). Adding code that includes this option to List.__init__ gets:

from array_types import copy_left, copy_right
from collections.abc import Iterable

DEFAULT_CAPACITY = 2    # use a low value   testing

class List(Stack):
    def __init__(self, data):
        if isinstance(data, Iterable) and len(data) > 0:
            data_len = len(data)
            self.capacity = data_len * 3 // 2
            super().__init__(self.capacity)
            copy_left(data, 0, self.array, 0, data_len)
            self.SP = data_len
        else:
            if data == None or data < DEFAULT_CAPACITY:
self.capacity = DEFAULT_CAPACITY
           else:
self.capacity = data
super().__init__(self.capacity) self.max_size = sys.maxsize # set inherited max_size to a very large number

Iterable is the common ancestor of all classes that can be iterated over in a for loop. It is the base class for these types. As yet List is not iterable so an instance cannot be 'cast' to a List to get a new copy. The guard element, for use in a linear search of the List, will be discussed in a later article.

The method that checks whether overflow of the array is imminent will be shared by Put() and an insert_at(index) method. It can be coded as a method _check_capacity():

def _check_capacity(self):
    if self.SP < self.capacity - 1: # leave a spare element for a guard
        return
    self.capacity = self.capacity * 3 // 2
    new_array = array_(self.capacity)
    copy_left(self.array, 0, new_array, 0, self.SP)
    self.array = new_array

The new array and the existing are separate so either of the copy functions could be used: use the simpler copy_left(). The method name starts with an underscore to indicate that it is a helper method belonging to List and not for public use by client code. I will endeavour to keep to this rule.  push(item) calls the method which will make space for the addition when required:

def push(self, item):
    self._check_capacity()
    super().push(item)

Here super().push(item) calls Stack's push to do the addition. If I was coding this for real, the original push is trivial and I probably would put its code inline rather than calling it but here the full possibilities of inheritance are used.

The methods insert_at(index) and delete_at(index) are left for interested readers to code with the proviso that the methods are included in the 'solutions' package version. Inherited pop() requires no modification unless we want to copy a reduced number of items to a smaller array, perhaps later, so the final task is to write a method to insert the guard value.


Adding a guard

From the provision in check_capacity() there will always be sufficient room to accommodate a guard at the end of the array. A guard is not a permanent addition so, unlike insertions and deletions, adjusting the inherited stack pointer self.SP is not required:

    def add_guard(self, item):
    self.array[self.SP] = item

Summary

In this section we have seen how an array can be made extensible by copying and visited two more OOP basics : class inheritance and polymorphism.

Two different requirements for copying elements left or right within the same array were identified. The single memcpy() function or its equivalent seen in many languages computes which of the alternatives to use when source and destination overlap. An array_cpy() function is provided in the solutions package. 

[Algorithms 12: Basic OOP - a List type (c) Martin Humby 2015]

Permalink Add your comment
Share post