Edited by Martin Thomas Humby, Wednesday, 1 Apr 2015, 14:16
Seems to me that the workings of an AVL self balancing binary search tree are easier to understand if all functionality is either in the tree or in the nodes, one or the other.
Node functionality fits well when using recursion to navigate the tree but has the disadvantage that for an empty tree the root is null. Thus we see code in the tree itself with most method bodies prefixed with
if root != None:
This implementation uses dynamic dispatch to overcome this problem. It features three classes of 'node': EmptyTree, a RootNode and Node. The behaviour of methods such as put(), to add a new key-data pair to the tree, and remove(key) vary according to the class of node they are called against.
If put(key, data) is called against an EmptyTree at the root, it creates a new RootNode using the key-data and assigns it to the root. When the tree becomes empty because all the nodes have been deleted the RootNode assigns an EmptyTree.
The power of Python allows the AVLTree class to be implemented without writing numerous methods that call the root. A single method binds the current root and its public methods to appropriately named instance variables:
class AVLTree(): def __init__(self): self.root = EmptyTree(self) self.setBindings()
Everything else is hidden and access to the root is completely transparent. A method call is anAVLTree.put(key, data) for example, where anAVLTree is an instance of the class. Binding and rebinding only takes place when the tree's state changes from empty to not-empty or vice versa, infrequent events for most applications.
The code is in the zip. If you would like to have a look extract all files to the same folder. A few points: if something: is used rather than if something != None: or if something != []: etc. I also make use of the fact that int(aBoolean) gets 1 if aBoolean is True, 0 otherwise, thus
def put(self, key, data = None, parent = None): # return 1 if the height of the subtree from this node has increased, 0 otherwise .......... balance = self.balance if key < self.key: if self.leftChild: self.balance += self.leftChild.put(key, data, self) ..........etc. return int(self.balance != balance and self.balance != 0)
And then of course there is += equivalent to Delphi's inc(a, byB) that may be unfamiliar to those who feel traditional Pascal represents a good style ........
A complete explanation of adjusting balance after a rotation can be found at (Brad Appleton 1997) but note that Brad uses right_height – left_height to determine balance while, by following maths' anticlockwise tradition and Wikipedia, I have ended up with the opposite.
One interesting point was that although in-order iteration while deleting nodes in the loop does not work (some nodes are not found) by doing this the tree is not corrupted. However, because I have used single linking, the tree can be cleared by assigning an empty tree to the root and in CPython all its nodes should be deallocated without waiting for garbage collection.
Update, stress test added: put 100,000 records, check they are in the tree, remove them all.
I was quite impressed by the time it takes to check the 100,000. Almost no time at all!
An AVL Tree in Python
Seems to me that the workings of an AVL self balancing binary search tree are easier to understand if all functionality is either in the tree or in the nodes, one or the other.
Node functionality fits well when using recursion to navigate the tree but has the disadvantage that for an empty tree the root is null. Thus we see code in the tree itself with most method bodies prefixed with
This implementation uses dynamic dispatch to overcome this problem. It features three classes of 'node': EmptyTree, a RootNode and Node. The behaviour of methods such as put(), to add a new key-data pair to the tree, and remove(key) vary according to the class of node they are called against.
If put(key, data) is called against an EmptyTree at the root, it creates a new RootNode using the key-data and assigns it to the root. When the tree becomes empty because all the nodes have been deleted the RootNode assigns an EmptyTree.
The power of Python allows the AVLTree class to be implemented without writing numerous methods that call the root. A single method binds the current root and its public methods to appropriately named instance variables:
Everything else is hidden and access to the root is completely transparent. A method call is anAVLTree.put(key, data) for example, where anAVLTree is an instance of the class. Binding and rebinding only takes place when the tree's state changes from empty to not-empty or vice versa, infrequent events for most applications.
The code is in the zip. If you would like to have a look extract all files to the same folder. A few points: if something: is used rather than if something != None: or if something != []: etc. I also make use of the fact that int(aBoolean) gets 1 if aBoolean is True, 0 otherwise, thus
And then of course there is += equivalent to Delphi's inc(a, byB) that may be unfamiliar to those who feel traditional Pascal represents a good style ........
A complete explanation of adjusting balance after a rotation can be found at (Brad Appleton 1997) but note that Brad uses right_height – left_height to determine balance while, by following maths' anticlockwise tradition and Wikipedia, I have ended up with the opposite.
One interesting point was that although in-order iteration while deleting nodes in the loop does not work (some nodes are not found) by doing this the tree is not corrupted. However, because I have used single linking, the tree can be cleared by assigning an empty tree to the root and in CPython all its nodes should be deallocated without waiting for garbage collection.
Update, stress test added: put 100,000 records, check they are in the tree, remove them all.
I was quite impressed by the time it takes to check the 100,000. Almost no time at all!