Easy graph algorithms
Edited by Martin Thomas Humby, Tuesday, 7 Apr 2015, 23:03
A Python implementation of a generic graph structure makes various algorithms from M269 Unit 5 easy to understand.
A graph comprises nodes (aka vertices) representing entities, and edges, conceptualized as lines joining the nodes and representing the relationships between them. In real-world applications the graph can model a situation involving relationships between concrete entities: towns and the roads between them, connection points on a PCB or microchip, buildings requiring provision of a cabled or piped service, the sides and base of a folded metal part, super-pixels in an image used for object recognition, and so on.
A graph can be directed, meaning its edges can be traversed in only one direction, or not directed, in which case two-way traversal is permitted. Edge traversal may involve a cost, and edges can have a weight representing this cost. A single edge joins only two nodes, and where there is no relationship between two nodes no edge joins them.
In this implementation nodes are represented by a base class Vertex:
    class Vertex():
        def __init__(self, name, ID):
            self.ID = ID              # integer identifier mapped to/from name
            self.name = name          # any or multiple types relating to the application
            self.edges = StableSet()  # addition order of vertices in the set is maintained
            self.distance = 0         # basic attribute e.g. distance from nearest town
Edges are represented by an Edge class:
    class Edge():
        def __init__(self, toVertex, weight=0):
            self.toVertex = toVertex  # the vertex at the far end of this edge
            self.weight = weight      # the weight, cost or distance incurred by traversing this edge
By default a Graph object is directed: edges run in a single direction only, towards toVertex. A graph can also be constructed as not-directed. When edges are added to a not-directed graph the vertices they reference are added, as for a directed graph, but edges running in the reverse direction are also added to the set of edges of each vertex referenced.
When a vertex is added to a graph any type can be used as its 'name', the identifier apparent to users. Each name is kept unique, as is the integer identifier it maps to.
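As a rough illustration of the two paragraphs above, here is a minimal sketch of how a not-directed graph can be built from paired directed edges, with names mapped to integer IDs. SketchGraph, addVertex and the plain list used in place of StableSet are assumptions made for this example, not the classes in the zip; only the addEdges call shape is taken from the post.

```python
class Edge:
    def __init__(self, toVertex, weight=0):
        self.toVertex = toVertex
        self.weight = weight


class Vertex:
    def __init__(self, name, ID):
        self.ID = ID
        self.name = name
        self.edges = []  # plain list standing in for StableSet (assumption)


class SketchGraph:
    """Hypothetical stand-in for the post's Graph class."""

    def __init__(self, directed=True):
        self.directed = directed
        self.vertices = {}  # name -> Vertex, so each name stays unique

    def addVertex(self, name):
        # A new name gets the next free integer ID; an existing name is reused.
        if name not in self.vertices:
            self.vertices[name] = Vertex(name, len(self.vertices))
        return self.vertices[name]

    def addEdges(self, fromName, *targets):
        start = self.addVertex(fromName)
        for toName, weight in targets:
            end = self.addVertex(toName)
            start.edges.append(Edge(end, weight))
            if not self.directed:
                # Not-directed: also store the reverse edge on the far vertex.
                end.edges.append(Edge(start, weight))


g = SketchGraph(directed=False)
g.addEdges('w', ('x', 2), ('z', 4))
print([(e.toVertex.name, e.weight) for e in g.vertices['x'].edges])
```

The sketch makes no attempt to reject duplicate edges, which a set-based edge container would presumably handle.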
A sensible way to construct the graph from Unit 5 Figure 5.20 is
    g.addEdges('w', ('x',2), ('z',4))
    g.addEdges('x', ('v',6), ('y',3))
    etc...........

then

    g.addEdges('w', ('u',2))

This gets much the same graph, although algorithms such as depth-first search show different output because edges in the StableSet are in a different sequence, so vertices are found in a different order.
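The effect of edge order on a depth-first traversal can be seen in a small sketch. The two adjacency dictionaries below are made up for the example (they are not Figure 5.20) and hold the same edges, differing only in the order in which the edges from 'w' were added. depth_first is a hypothetical helper, not the function in the zip.

```python
def depth_first(adj, start):
    """Return vertices in the order an iterative depth-first search visits them."""
    visited = []
    stack = [start]
    while stack:
        node = stack.pop()
        if node in visited:
            continue
        visited.append(node)
        # Push neighbours reversed so the first-added edge is explored first.
        stack.extend(reversed(adj[node]))
    return visited


# Same edges, but 'u' was attached to 'w' first in one case and last in the other.
first = {'w': ['u', 'x', 'z'], 'u': ['w'], 'x': ['w'], 'z': ['w']}
second = {'w': ['x', 'z', 'u'], 'u': ['w'], 'x': ['w'], 'z': ['w']}

print(depth_first(first, 'w'))   # ['w', 'u', 'x', 'z']
print(depth_first(second, 'w'))  # ['w', 'x', 'z', 'u']
```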
To provide additional vertex attributes, as needed by Prim's algorithm for example, an extended class can be used. A Graph will construct any class of vertex or edge passed to its constructor. Now we can define Prim's algorithm as
Add the graph vertices to a minimum-priority queue, the start node with distance zero and the rest with distance infinity.
While there are vertices in the queue:
Dequeue a current-vertex and examine its edges. If the edge-weight to another node is less than that node's distance, and that node is still in the queue, update its distance (which changes its queue position) and set its currentNearest vertex to the current-vertex.
The node at the front of the queue now carries the minimum distance needed to reach it from any node already dequeued, and that distance is the weight of the edge from its currentNearest, so append the edge, front's currentNearest -> front-node, to a list of such edges.
Join up the edges in the edge list by matching start node to end node to get a new graph, the minimum spanning tree of the original.
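The steps above can be sketched roughly as follows. This is not the code in the zip: it swaps the updatable minimum-priority queue for Python's heapq with stale entries skipped on removal, and it uses a plain dictionary of (neighbour, weight) lists rather than Vertex and Edge objects. The names prim_mst, adj and nearest are assumptions for the example, with nearest playing the role of currentNearest.

```python
import heapq
import math


def prim_mst(adj, start):
    """adj maps each vertex to a list of (neighbour, weight) pairs, both directions present."""
    distance = {v: math.inf for v in adj}
    nearest = {v: None for v in adj}  # counterpart of currentNearest
    distance[start] = 0
    in_tree = set()
    heap = [(0, start)]
    mst_edges = []
    while heap:
        d, current = heapq.heappop(heap)
        if current in in_tree:
            continue  # stale entry: this vertex was already dequeued at a lower distance
        in_tree.add(current)
        if nearest[current] is not None:
            mst_edges.append((nearest[current], current, d))
        for other, weight in adj[current]:
            if other not in in_tree and weight < distance[other]:
                distance[other] = weight
                nearest[other] = current
                # Pushing a new entry stands in for changing the queue position.
                heapq.heappush(heap, (weight, other))
    return mst_edges


# Made-up example graph, not Figure 5.20.
adj = {
    'a': [('b', 1), ('c', 4)],
    'b': [('a', 1), ('c', 2), ('d', 5)],
    'c': [('a', 4), ('b', 2), ('d', 3)],
    'd': [('b', 5), ('c', 3)],
}
print(prim_mst(adj, 'a'))  # [('a', 'b', 1), ('b', 'c', 2), ('c', 'd', 3)]
```

Here each edge is appended at the moment its far vertex is dequeued, which amounts to the same edge list as recording the front-of-queue node after each update.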
General application
This project was implemented as a quick guide to Unit 5 graph algorithms. To make it generally applicable some modification might be good. Name mapping to integers looks OK: vertices are held in a lookup table, which gives fast access through the graph structure. Implementing not-directed graphs using paired directed edges is an easy way to do it. Whether this will cause any snags for more advanced work is unknown.
Adding algorithm-specific attributes by sub-classing is not generally applicable, I would think. It could be better to handle these by making the algorithm function add the attributes and delete them afterwards. The current minimum-priority queue is probably more a good laugh than a serious contender.
Single class implications
In current Python an object can be given a new attribute at any time simply by assigning a value to it, and the attribute can subsequently be removed using the del statement (or delattr()).
This works with instances of new-style or old-style classes. Provided the multipurpose Vertex.distance field is retained, so that comparison works as before, the impact on the algorithms from using this strategy is generally minimal. The exception is topological sort, which requires no fewer than four loops through the entire graph to deal with the required incoming edges attribute.
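A minimal sketch of the add-then-delete strategy, assuming a cut-down Vertex class and a hypothetical helper with_temp_attribute (neither is taken from the zip):

```python
class Vertex:
    def __init__(self, name):
        self.name = name
        self.distance = 0  # the multipurpose field is kept on the class


def with_temp_attribute(vertices):
    """Attach currentNearest to every vertex, check it, then remove it again."""
    for v in vertices:
        v.currentNearest = None  # attribute created simply by assignment
    had_attribute = all(hasattr(v, 'currentNearest') for v in vertices)
    for v in vertices:
        del v.currentNearest  # same effect as delattr(v, 'currentNearest')
    return had_attribute


vertices = [Vertex('u'), Vertex('v')]
print(with_temp_attribute(vertices))           # True
print(hasattr(vertices[0], 'currentNearest'))  # False
```

One caveat worth knowing: a class that defines __slots__ will refuse attributes not listed there, so this approach relies on ordinary instances with a __dict__.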
The zip has been updated to include these investigations in TopSort2.py, Prims2.py, etc.
To deal with TopSort requirements I looked at some variations to Graph and Vertex, and included a fromVertex attribute for Edge and an in_edges attribute for Vertex. Initial TopSort incoming is now aVertex.in_order(). These variations are in Graphs3.zip.
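To show why a count of incoming edges matters to topological sort, here is a short sketch using Kahn's method on a plain adjacency dictionary. It may not match the Unit 5 formulation or the code in Graphs3.zip; topological_sort, incoming and the example dag are assumptions for illustration.

```python
from collections import deque


def topological_sort(adj):
    """adj maps each vertex to the list of vertices its outgoing edges point to."""
    # First pass over the graph: count incoming edges for every vertex.
    incoming = {v: 0 for v in adj}
    for v in adj:
        for w in adj[v]:
            incoming[w] += 1
    # Vertices with no incoming edges can be output straight away.
    ready = deque(v for v in adj if incoming[v] == 0)
    order = []
    while ready:
        v = ready.popleft()
        order.append(v)
        for w in adj[v]:
            incoming[w] -= 1  # the edge v -> w is now accounted for
            if incoming[w] == 0:
                ready.append(w)
    if len(order) != len(adj):
        raise ValueError("graph contains a cycle, so no topological order exists")
    return order


dag = {'a': ['b', 'c'], 'b': ['d'], 'c': ['d'], 'd': []}
print(topological_sort(dag))  # ['a', 'b', 'c', 'd']
```

Keeping incoming edges on each vertex, as with the in_edges attribute, would let the initial counts be read directly instead of being computed in a separate pass.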