Edited by Martin Thomas Humby, Friday, 4 Sept 2015, 13:39
Because they are stored in a single
contiguous block of memory arrays are not extendable. Memory is allocated to an
exact fit, nothing more. This characteristic was reflected by the maximum size of
a Stack. Now the requirement is for a List type. As we are coding the List
ourselves we can give it any functionality we like including unlimited increase
in size and the ability to add items not just at the end but to insert an item anywhere
along a List's length.
Similarly, the requirements for List can
dictate that when an item is removed at any index the list will close up to
fill the gap. Additionally, List can be given a method to add a guard value at
the end of the list for use when the List is searched .
Ignoring these new requirements List has
several characteristics in common with Stack and one way to make a start on
coding List is to copy Stack's code, paste it into a new file or the same file
and rename Stack to List. OOP provides an alternative that brings some
advantages: inheritance. Using inheritance all that needs to be
done to get Stacks fields and functionality into List is to include Stack in
the class declaration:
from my_OOP_stack import stack as Stack
class List(Stack):
pass
Readers
will note I have imported stack as Stack. The Python convention for class names is to use CamelCase but when
it started out stack was not a class as
such, simply the next best thing to a record. A short breakdown of Python
naming conventions can be found here. The keyword pass allows a class or
function to be defined with an empty body.
If you would
like to follow the incremental development of List, open the List
Investigations project from the download in PyCharm. Create a new file list.py
and copy the above code into it. There is a copy of my_OOP_stack_test.py in the
project with the import statement changed to:from list import List as stack. Without any doubt the test code in this file views a List as a stack. Run the test script
and the output is identical to that got in Algorithms 11 when it was run with a
genuine stack.
This is not surprising. Currently List
has no functionality or fields other than those it has inherited from stack. Even so, it demonstrates three facts: List has inherited stack's
interface - if it hadn't calling stack's methods would get an error. Similarly
it has inherited stack functionality - if it hadn't output would differ or an
error would result (despite the test script being far from comprehensive please
take my word for it). Together these two points comprise what is sometimes
known as class inheritance (as opposed to pure interface
inheritance, to be dealt with in a later article).
The third fact is that List can be viewed as a stack. List is a subclass,
a subtype, a child class or a descendant of stack. Stack is a superclass, parent or ancestor of List. The ability to view a subclass as a superclass is known as subtype
polymorphism.
In some languages a view may become a reality. C++
features class instances by reference and by value. When a by-value
instance is assigned to a variable of the same class all its fields
are copied getting a new instance. If it is assigned to a superclass
variable any additional fields it has are truncated off getting an
instance of the superclass.
In languages that use static typing, assigning a reference to a subclass to a variable declared to be a reference a
superclass is allowed. In Python assigning to a reference viewed as a
superclass has the same effect.
Assignment super to subclass is not
allowed under static typing: the subclass may have declared additional methods
that do not exist in the superclass and in turn these methods may access fields
that the superclass does not possess. In Python you can do what you like but
errors are likely if this rule is violated.
Redefining push( )
Because push() adds items to Stack's array it has the capability of causing an
overflow. The answer is simple: when the array is full allocate a bigger one
and copy the items from the old array to the new one. 'But doesn't that take a long time to do?' you may wonder.
Well yes, doing this for a medium to
large array coded to show how it works in Python is quite slow. Python features
list slicing getting an equivalent to C's memcpy() but currently that will not be used. When coded in C or Delphi
Pascal or assembly language for example, copying is much faster. Over time CPUs
have been optimized to do various tasks and copying blocks of memory is one of
them.
Another point, we are not copying actual
items only references to them, considerably smaller than a record with even a
few fields. Faced with a really big sequence other techniques would have to be
considered. Adding a new array linked to the old one is one possibility but for
small to medium arrays copying is fine and most likely more efficient.
Copying arrays
Copying will also be needed to modify the list when an item is removed or inserted so, to avoid code duplication, functions to do the copying can be coded as utility functions in 'array_types' called by List methods. To prevent overwriting of items to be copied with those already copied two copying functions are needed:
The copy_left() function is simple. It is passed the source array, the destination
array, the source index and the destination index together with the number of
items to be copied. Starting at the lowest index in both arrays it iterates
over the given section of the source copying items to the destination. If both
the source and destination array are the same the element at the destination
index is overwritten deleting that element.
copy_right() duplicates
the element at a source index in the same array after moving other elements
right to make room. The original element can be overwritten with an insertion.
To avoid overwriting elements copying must start on the right and work
leftwards. Coding it is slightly trickier, mostly because of the non-intuitive
syntax of the range() generator particularly
when counting down.
A generator comprises a
special code construct, generally within a function. Each call gets the next
value in a sequence of values. The
function call can be put in a for loop to obtain
successive values one per iteration. Writing generators will be detailed at a
later time.
Counting up with
range() we get range(start_value,
last_value + 1, increment). When there is a single
argument, range(n)for example, n is the
number of integers generated including zero. A start_value is optional, default zero, but required if there are other
arguments in addition to n. The increment defaults to 1 when not
included.
Counting down with range(), this time we get range(start_value, last_value - 1,
increment) with start_value greater than or equal to last_value.
The increment is of course negative.
The code for my copy_right() is:
def copy_right(source, s_index, dest, d_index, length):
sx = s_index + length - 1
dx = d_index + length - 1
for sx in range(sx, s_index - 1, -1):
dest[dx] = source[sx]
dx -= 1 # decrement destination index by 1
This function
is in 'array_types.py' in the download with a small test script. Readers following
incremental development are invited to code a similar copy_left() function in array_types, otherwise the code is in 'array_types.py'
in the 'solutions' package.
To complete
the redefinition of push() an instance variable capacity or some such will be needed to hold the current size of the array.
The revised version of __init__ shown below calls
the superclass __init__ to set the
array's initial size. Another option can be included to provide Python style
typecasting of any sequence to the List type.
Type casts
In languages with
static typing, compiled to native code, a typecast often means 'view the object
at this variable not as the type of the variable but as some other type.'
Generally there is a restriction that both types should have the same footprint
in memory and perhaps a few more, but when an object is by reference all
references have the same size.
(native
code is machine code executed
directly by the Central Processing Unit (CPU) and native to the target platform
for an application. A platform is the environment an application
runs in supplied by the combination of the CPU, the operating system and
sometimes a Graphical User Interface (GUI))
In Python what
may appear to be a typecast is in fact a call to a constructor. __init__ is called and initializes the new object based on the object passed
in. For List we might have a_list
= List('abcd'). Adding code that includes this option
to List.__init__ gets:
from array_types import copy_left, copy_right
from collections.abc import Iterable
DEFAULT_CAPACITY = 2 # use a low value testing
class List(Stack):
def __init__(self, data):
if isinstance(data, Iterable) and len(data) > 0:
data_len = len(data)
self.capacity = data_len * 3 // 2
super().__init__(self.capacity)
copy_left(data, 0, self.array, 0, data_len)
self.SP = data_len
else:
if data == None or data < DEFAULT_CAPACITY: self.capacity = DEFAULT_CAPACITY else: self.capacity = data super().__init__(self.capacity)
self.max_size = sys.maxsize # set inherited max_size to a very large number
Iterable is the common
ancestor of all classes that can be iterated over in a for loop. It is the base class for these types. As yet List is not iterable so an instance cannot be 'cast' to a List to get a new copy. The guard element, for use in a linear search of
the List, will be discussed in a later article.
The method that checks whether overflow
of the array is imminent will be shared by Put() and an insert_at(index) method.
It can be coded as a method _check_capacity():
def _check_capacity(self):
if self.SP < self.capacity - 1: # leave a spare element for a guard
return
self.capacity = self.capacity * 3 // 2
new_array = array_(self.capacity)
copy_left(self.array, 0, new_array, 0, self.SP)
self.array = new_array
The new array and the existing are
separate so either of the copy functions could be used: use the simpler copy_left(). The method name starts with an underscore to indicate that it is a
helper method belonging to List and not for public
use by client code. I will endeavour to keep to this rule. push(item)
calls the method which will make space for the addition when required:
Here super().push(item) calls Stack's push to do the
addition. If I was coding this for real, the original push is trivial and I
probably would put its code inline rather than calling it but here the full
possibilities of inheritance are used.
The methods insert_at(index) and delete_at(index) are left
for interested readers to code with the proviso that the methods are included
in the 'solutions' package version. Inherited pop() requires no modification unless we want to copy a reduced number of
items to a smaller array, perhaps later, so the final task is to write a method
to insert the guard value.
Adding a guard
From the
provision in check_capacity() there will always be sufficient room to accommodate a guard at the
end of the array. A guard is not a permanent addition so, unlike insertions and
deletions, adjusting the inherited stack pointer self.SP is not required:
def add_guard(self, item):
self.array[self.SP] = item
Summary
In this section we have seen how an array can
be made extensible by copying and visited two more OOP basics : class
inheritance and polymorphism.
Two different requirements for
copying elements left or right within the same array were identified. The single memcpy()
function or its equivalent seen in many languages computes which of the
alternatives to use when source and destination overlap. An array_cpy()
function is provided in the solutions package.
[Algorithms 12: Basic OOP - a List type (c) Martin Humby 2015]
Algorithms 12: Basic OOP – a List type
Because they are stored in a single contiguous block of memory arrays are not extendable. Memory is allocated to an exact fit, nothing more. This characteristic was reflected by the maximum size of a Stack. Now the requirement is for a List type. As we are coding the List ourselves we can give it any functionality we like including unlimited increase in size and the ability to add items not just at the end but to insert an item anywhere along a List's length.
Similarly, the requirements for List can dictate that when an item is removed at any index the list will close up to fill the gap. Additionally, List can be given a method to add a guard value at the end of the list for use when the List is searched .
Ignoring these new requirements List has several characteristics in common with Stack and one way to make a start on coding List is to copy Stack's code, paste it into a new file or the same file and rename Stack to List. OOP provides an alternative that brings some advantages: inheritance. Using inheritance all that needs to be done to get Stacks fields and functionality into List is to include Stack in the class declaration:
from my_OOP_stack import stack as Stack class List(Stack): pass
Readers will note I have imported stack as Stack. The Python convention for class names is to use CamelCase but when it started out stack was not a class as such, simply the next best thing to a record. A short breakdown of Python naming conventions can be found here. The keyword pass allows a class or function to be defined with an empty body.
If you would like to follow the incremental development of List, open the List Investigations project from the download in PyCharm. Create a new file list.py and copy the above code into it. There is a copy of my_OOP_stack_test.py in the project with the import statement changed to: from list import List as stack. Without any doubt the test code in this file views a List as a stack. Run the test script and the output is identical to that got in Algorithms 11 when it was run with a genuine stack.
This is not surprising. Currently List has no functionality or fields other than those it has inherited from stack. Even so, it demonstrates three facts: List has inherited stack's interface - if it hadn't calling stack's methods would get an error. Similarly it has inherited stack functionality - if it hadn't output would differ or an error would result (despite the test script being far from comprehensive please take my word for it). Together these two points comprise what is sometimes known as class inheritance (as opposed to pure interface inheritance, to be dealt with in a later article).
The third fact is that List can be viewed as a stack. List is a subclass, a subtype, a child class or a descendant of stack. Stack is a superclass, parent or ancestor of List. The ability to view a subclass as a superclass is known as subtype polymorphism.
In some languages a view may become a reality. C++ features class instances by reference and by value. When a by-value instance is assigned to a variable of the same class all its fields are copied getting a new instance. If it is assigned to a superclass variable any additional fields it has are truncated off getting an instance of the superclass.
In languages that use static typing, assigning a reference to a subclass to a variable declared to be a reference a superclass is allowed. In Python assigning to a reference viewed as a superclass has the same effect.
Assignment super to subclass is not allowed under static typing: the subclass may have declared additional methods that do not exist in the superclass and in turn these methods may access fields that the superclass does not possess. In Python you can do what you like but errors are likely if this rule is violated.
Redefining push( )
Because push() adds items to Stack's array it has the capability of causing an overflow. The answer is simple: when the array is full allocate a bigger one and copy the items from the old array to the new one. 'But doesn't that take a long time to do?' you may wonder.
Well yes, doing this for a medium to large array coded to show how it works in Python is quite slow. Python features list slicing getting an equivalent to C's memcpy() but currently that will not be used. When coded in C or Delphi Pascal or assembly language for example, copying is much faster. Over time CPUs have been optimized to do various tasks and copying blocks of memory is one of them.
Another point, we are not copying actual items only references to them, considerably smaller than a record with even a few fields. Faced with a really big sequence other techniques would have to be considered. Adding a new array linked to the old one is one possibility but for small to medium arrays copying is fine and most likely more efficient.
Copying arrays
Copying will also be needed to modify the list when an item is removed or inserted so, to avoid code duplication, functions to do the copying can be coded as utility functions in 'array_types' called by List methods. To prevent overwriting of items to be copied with those already copied two copying functions are needed:The copy_left() function is simple. It is passed the source array, the destination array, the source index and the destination index together with the number of items to be copied. Starting at the lowest index in both arrays it iterates over the given section of the source copying items to the destination. If both the source and destination array are the same the element at the destination index is overwritten deleting that element.
copy_right() duplicates the element at a source index in the same array after moving other elements right to make room. The original element can be overwritten with an insertion. To avoid overwriting elements copying must start on the right and work leftwards. Coding it is slightly trickier, mostly because of the non-intuitive syntax of the range() generator particularly when counting down.
A generator comprises a special code construct, generally within a function. Each call gets the next value in a sequence of values. The function call can be put in a for loop to obtain successive values one per iteration. Writing generators will be detailed at a later time.
Counting up with range() we get range(start_value, last_value + 1, increment). When there is a single argument, range(n)for example, n is the number of integers generated including zero. A start_value is optional, default zero, but required if there are other arguments in addition to n. The increment defaults to 1 when not included.
Counting down with range(), this time we get range(start_value, last_value - 1, increment) with start_value greater than or equal to last_value. The increment is of course negative.
The code for my copy_right() is:
def copy_right(source, s_index, dest, d_index, length): sx = s_index + length - 1 dx = d_index + length - 1 for sx in range(sx, s_index - 1, -1): dest[dx] = source[sx] dx -= 1 # decrement destination index by 1
This function is in 'array_types.py' in the download with a small test script. Readers following incremental development are invited to code a similar copy_left() function in array_types, otherwise the code is in 'array_types.py' in the 'solutions' package.
To complete the redefinition of push() an instance variable capacity or some such will be needed to hold the current size of the array. The revised version of __init__ shown below calls the superclass __init__ to set the array's initial size. Another option can be included to provide Python style typecasting of any sequence to the List type.
Type casts
In languages with static typing, compiled to native code, a typecast often means 'view the object at this variable not as the type of the variable but as some other type.' Generally there is a restriction that both types should have the same footprint in memory and perhaps a few more, but when an object is by reference all references have the same size.
(native code is machine code executed directly by the Central Processing Unit (CPU) and native to the target platform for an application. A platform is the environment an application runs in supplied by the combination of the CPU, the operating system and sometimes a Graphical User Interface (GUI))
In Python what may appear to be a typecast is in fact a call to a constructor. __init__ is called and initializes the new object based on the object passed in. For List we might have a_list = List('abcd'). Adding code that includes this option to List.__init__ gets:
Iterable is the common ancestor of all classes that can be iterated over in a for loop. It is the base class for these types. As yet List is not iterable so an instance cannot be 'cast' to a List to get a new copy. The guard element, for use in a linear search of the List, will be discussed in a later article.
The method that checks whether overflow of the array is imminent will be shared by Put() and an insert_at(index) method. It can be coded as a method _check_capacity():
def _check_capacity(self): if self.SP < self.capacity - 1: # leave a spare element for a guard return self.capacity = self.capacity * 3 // 2 new_array = array_(self.capacity) copy_left(self.array, 0, new_array, 0, self.SP) self.array = new_array
The new array and the existing are separate so either of the copy functions could be used: use the simpler copy_left(). The method name starts with an underscore to indicate that it is a helper method belonging to List and not for public use by client code. I will endeavour to keep to this rule. push(item) calls the method which will make space for the addition when required:
def push(self, item): self._check_capacity() super().push(item)
Here super().push(item) calls Stack's push to do the addition. If I was coding this for real, the original push is trivial and I probably would put its code inline rather than calling it but here the full possibilities of inheritance are used.
The methods insert_at(index) and delete_at(index) are left for interested readers to code with the proviso that the methods are included in the 'solutions' package version. Inherited pop() requires no modification unless we want to copy a reduced number of items to a smaller array, perhaps later, so the final task is to write a method to insert the guard value.
Adding a guard
From the provision in check_capacity() there will always be sufficient room to accommodate a guard at the end of the array. A guard is not a permanent addition so, unlike insertions and deletions, adjusting the inherited stack pointer self.SP is not required:
self.array[self.SP] = item
Summary
In this section we have seen how an array can be made extensible by copying and visited two more OOP basics : class inheritance and polymorphism.
Two different requirements for copying elements left or right within the same array were identified. The single memcpy() function or its equivalent seen in many languages computes which of the alternatives to use when source and destination overlap. An array_cpy() function is provided in the solutions package.
[Algorithms 12: Basic OOP - a List type (c) Martin Humby 2015]