OU blog

Personal Blogs

David Pennington

TMA02 marked and now only the EMA to go

Edited by David Pennington, Friday, 29 July 2016, 17:14

I can't believe that we are almost there. The EMA is due in early September and it is now the end of July. In my last post I was explaining that I was working at getting just enough to pass both TMAs with a 40% average. I started on the EMA and had a major issue (about using my own code to do the work rather than Python/Pandas) and was worried about my TMA mark. My tutor had to go away so, even though I got the work in early, it was going to be some time before I got the result. I was talking to her about how to approach the TMA with my coding issues when she threw in, "Well, your recent work has been very good!"

When I got the TMA back, amazingly I had got 77% so I even beat the first one. Wow. Maybe I did understand more than I thought. My tutor advised me that, although I could use whatever I wanted to complete the EMA, I should only expect the marker to understand the tools in the course. She also advised me that I should choose two reasonable questions but it is the nature of this type of data that any question may prove to be unanswerable.

I spent many days investigating the bat data using both Excel and my own code and slowly formed some questions that looked reasonable. Everyone seems to be having difficulty incorporating the k-nn or k-means calculations into the final report but, suddenly one night around 1am when I woke up, I could see a use for k-means.

Finally, in the last tutorial, I also became aware of another aspect of all of this. This EMA isn't about coding. It is about analysis and reporting and, although there is a big weighting on the techniques, this doesn't actually mean that I have to write loads of code - just use the tools we have been given (which include Excel) and "write a good report". The bulk of the marks come from this and my tutor has given me some really great advice - both in the tutorial and in my TMA.

Incidentally, I got 60% in the iCMA for answering only four of the eight questions. I did this by re-writing the code in Smalltalk. I now have coded k-nn, k-means and Cosine Distance which means that I have a great handle on what these are doing.
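For anyone wondering what the cosine distance mentioned above actually computes, here is a minimal sketch in Python. It is not my Smalltalk code; the function name and the choice to reject zero vectors are assumptions made purely for illustration.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity for two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # The angle between a vector and the zero vector is not defined.
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)
```

Vectors pointing the same way give a distance close to 0 and vectors at right angles give 1, which is why it is handy for comparing the "shape" of two records regardless of their size.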

Final word. I am getting the data sorted and getting the nuts and bolts of the report under way. I can't see me looking at the last units of this course until I have submitted the EMA. I am interested in the "Semantic Web" but, as it isn't tested in the EMA, it can wait. I don't even need to do any of the last iCMA. 


ICMA45 and TMA02 - thoughts (but no solutions)

Edited by David Pennington, Thursday, 26 May 2016, 14:29

I decided that I had to go back to the MongoDB part as, although I was now conversant with Riak, I didn't know enough about MongoDB to tackle the assignments. In the process, I installed it on my Mac (using Brew) and found, to my surprise, that it worked first time! I was then able to play with the code in the "Definitive Guide" book. I then went back to the Notebooks and got confused - again - as it appears that these use their own MongoDB database. I was expecting them to use the one I installed if it was there listening on the right port. I forgot that the Notebook is running inside Vagrant so couldn't see the other instance. Having messed about like this, I decided to finish off the ICMA - which contains questions that require coding to answer!

I answered all of the usual questions and got 29% - I need 30% in 5 ICMAs so if I could get a few more marks, I could put this to bed and stop worrying about ICMAs for the rest of the course. I went through the coding questions and came to one where they were looking for the month with the fewest accidents. I guessed a month and found that it was right! 10 marks! I ended up with 51.66% so quickly submitted it and now I am ICMA free.

I have now completed Question 1 of TMA 02 so I have plenty of headroom for the rest of the assignment as it isn't due until 16th June.

To bore you with more of my Riak activity, I have been using my son-in-law's expertise to play with the interface from my Smalltalk code into the database. Riak has an HTTP interface, which is what I am using. It also has direct socket access (Protocol Buffers), which requires converting all the data into hex and so on. It has one advantage in that, once connected, Riak keeps the socket open. This should, in theory, provide a much faster interface. However, I must be doing something wrong because, as we now have the GET operation coded using both HTTP and the Protocol Buffers, I did some timings and found that my HTTP interface was almost twice as fast as the direct socket version. I think that we have to examine what our socket stuff is doing as it should be the other way round.

Lastly, I was having problems using JSON as the data format - the JSON interface that I was using comes from the Squeak world and I had to recode it to run under my VisualAge Smalltalk. In the end, I decided that the idea that Riak is a JSON database is a bit of a misnomer as it will accept anything that you give it. I decided to rework the data formats and save all of my objects in CSV format. I already had all the code to do this as it was CSV based before I started with Riak. Anyway, it is all converted and my software is running perfectly! I can now put the Riak stuff to bed for now and, hopefully, pick it up again when I propose my project at the end of the degree (12 months away).


Getting more concerned as time goes on + news of Riak


I am bashing away at this but feeling that I am missing too much as I go. As I mentioned before, my strategy is to crash through the course work and get myself plenty of headroom for the TMA. I am currently at the start of week 15 and have completed part 16 so I am about 2 weeks ahead. Tomorrow, I will start on question one of TMA 02 having submitted ICMA 44 some time ago.

I got 55.67 out of a possible 60 for ICMA44 which gives me another result in the low 90s. 92.78% to be precise. I felt that I could have got 100% if I had tried another go round but I can't see the point in doing that when all I need is 30% to pass the module. So far, I have averaged 85.6% for the ICMAs so I am quite pleased. With 72% for TMA01, I am satisfied that my strategy is working.

Now, to part 15. Wow, what a devil of a part. This part concentrates on extracting data from a MongoDB database and mapping the results. I am finding this incomprehensible. The code is complex and arcane in its presentation and use and there is zero explanation of what is going on. As an example, in 15.1 it shows you how to map accidents in the area of North Wales Police. It does this by giving you a code - 60 - which represents this police area. In the assignment bit, it asks you to map accidents in the Hampshire police area with no indication of the code for Hampshire. In the solution, it shows you how to get the code - but we have never seen this before so I am not sure quite how I was supposed to know how to do it. So it goes on... 

Yet again, I decided to skim the solutions and to make sure I knew where to come back to when/if I need to. I have never had to tackle an OU course in this manner before and I hope not to have to again. I am very fortunate in having long experience in coding object oriented software against SQL and NoSQL databases so, although I am not rigorously trained in the theory, my experience of the practice seems to be getting me through.

Now, to more interesting things. I have been working quite hard - outside of my allocated OU hours - on getting my new Riak database integrated into some of my fun code. Now that I am retired, I can work on my own projects rather than those for other people (well, a lot of the time anyway, as I still get paid consultancy). I have a project that, for convenience's sake, was using CSV files as its data store. I am slowly converting that to use the Riak database I have running on Amazon AWS. I had most of the CRUD working but had a strange thing when trying to delete objects from buckets. I could delete a single object but when I tried to loop around the keys for the bucket, nothing was getting deleted. I ran the single delete through Wireshark and compared it to the same call using Curl and couldn't see any difference. It took me hours to realise that I was switching the parameters to the single delete method when calling from the loop so the HTTP call was badly formed. Did I feel stupid - yes, I did. Still, it is sorted now and I can get on with the integration proper.

(Wireshark records all TCP/IP traffic down a network connection so you can see what is being sent. Curl is a Unix command line program that formats and sends HTTP calls.)
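The swapped-parameter mistake described above is easy to guard against in a language with keyword arguments. The following is a minimal Python sketch, not my Smalltalk code. The function names, the base URL and the `send_delete` callable are all hypothetical, and the `/buckets/<bucket>/keys/<key>` layout is how I understand Riak's HTTP interface to be laid out, so check it against the documentation for your version.

```python
from urllib.parse import quote


def delete_url(base_url, *, bucket, key):
    """Build the URL for deleting one object over HTTP.

    bucket and key are keyword-only, so a caller cannot swap them silently.
    """
    return f"{base_url}/buckets/{quote(bucket, safe='')}/keys/{quote(key, safe='')}"


def delete_all(base_url, bucket, keys, send_delete):
    """Issue one delete per key; send_delete is any callable that takes a URL."""
    for key in keys:
        send_delete(delete_url(base_url, bucket=bucket, key=key))
```

Passing the sender in as a callable also means the loop can be checked by collecting the URLs in a list before anything is pointed at a real database.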

One last thing that I found - one of my classes has lots of entries and I firstly put them individually into a single bucket but I found that creating and saving - and reading and decoding the JSON was taking too long so I now put all of the individual items in a JSON array and save them in one go. Much quicker!
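As a rough illustration of the "one JSON array instead of many objects" idea, here is a small Python sketch. The entry fields are made up, and the actual storing and fetching is left out. It only shows the packing and unpacking.

```python
import json


def pack_entries(entries):
    """Serialise a whole list of entries as one JSON array (one stored value)."""
    return json.dumps(entries)


def unpack_entries(payload):
    """Decode the stored JSON array back into a list of entries."""
    return json.loads(payload)


# Hypothetical entries: one write and one read instead of one per item.
entries = [{"id": 1, "name": "first"}, {"id": 2, "name": "second"}]
payload = pack_entries(entries)
restored = unpack_entries(payload)
```

The trade-off is that changing a single item means rewriting the whole array, which is fine when the items are always loaded and saved together.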


A hard 3rd week but a bit of realisation


I have not found this third week particularly easy. I found Notebook 04.5 particularly time consuming. I am not sure quite where the authors of the course got their timings from but I thought that the following was a bit optimistic:

"Activity 4.5 Notebook - 30 minutes"! At the end, it presents you with a list of tasks:

  • a) Show a count of the number of sales records for each District.
  • b) Show a count of the number of sales records for each Team in each District, including the Team and District margin totals.
  • c) Show the total sales value for each Team in each District summed over the year.
  • d) Show the total sales value for each Team Member in each District over the year, showing the District and Team member margin totals. (Remember you need the team name and salesperson name to identify each person uniquely.)
  • e) Show a bar chart of the number of sales each month. 
  • f) Show a bar chart of the total sales each month.
  • g) Show a scatter plot showing the Item Cost v. the number of Units in each record.
  • h) Add a Season column to the DataFrame. For each sale record, the value for Season will be derived from the month: (11,12,1) are Winter, (2,3,4) are Spring, (5,6,7) are Summer, (8,9,10) are Autumn. From the sales in each Season calculate the number, average, maximum, minimum and total sale amount over the season (that is, from all the sales records grouped by season report the number of records, and the average, maximum, minimum and total sales amounts).

These would prove to take me 1 1/2 hours on their own and I never did finish so quite where the 30 minutes for the whole notebook came from, I have no idea. Perhaps one of the course authors timed themselves? Anyway, I think that you would need to be very adept at Python and pandas to achieve the time. 
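To give a flavour of what task (h) in the list above is asking for, here is a minimal sketch in plain Python rather than pandas, so that the logic is visible. It is not the course solution, and the function name and the (month, amount) input shape are assumptions.

```python
from collections import defaultdict
from statistics import mean

# Month-to-season mapping exactly as given in task (h).
SEASON_BY_MONTH = {
    11: "Winter", 12: "Winter", 1: "Winter",
    2: "Spring", 3: "Spring", 4: "Spring",
    5: "Summer", 6: "Summer", 7: "Summer",
    8: "Autumn", 9: "Autumn", 10: "Autumn",
}


def season_summary(sales):
    """Summarise (month, amount) pairs by season."""
    amounts = defaultdict(list)
    for month, amount in sales:
        amounts[SEASON_BY_MONTH[month]].append(amount)
    return {
        season: {
            "count": len(values),
            "mean": mean(values),
            "max": max(values),
            "min": min(values),
            "total": sum(values),
        }
        for season, values in amounts.items()
    }
```

In pandas the same idea would be to map the month column through the dictionary to make the Season column and then group by it and aggregate with count, mean, max, min and sum. Knowing that is the easy part. Remembering the exact syntax in 30 minutes is another matter.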

I left that notebook and tackled the next one - about Regular Expressions. I found this not quite so difficult but it did require some careful thinking. By this time, I was getting quite depressed about my chances of even getting to do TMA01, let alone complete it and continue. Fortunately, there was a post on the forum which set me back on the path of confidence (well - a bit). Someone mentioned that there wasn't an exam at the end so we had no need to have everything welded into our brains and that we could manage throughout by referring back to the course notes. A sigh of understanding and relief was heard in my household.

On that basis, I tackled the first part of TMA01 and found that I could manage quite well! So, onward and upwards, as the saying goes.

I have been writing successful software for the last 35 years - some of it in mission critical situations such as bank currency trading rooms and so on. This has been done in Basic, which was taught to me by the OU in M251 An algorithmic approach to computing (in 1977); in UCSD Pascal, which was the OU's next teaching language; and, finally, in Smalltalk, which was the OU's third teaching language (replaced by Java and now, to some extent, Python). In all that time I have relied heavily on having documentation to hand as I have never managed to learn by rote all of the syntax of these languages. As an aside, Smalltalk (like pandas) has a very easy syntax but an extremely rich library so I spend some time regularly looking up stuff.

The upshot is, I think, that I should assume that I will do the same with Python, pandas and Regex and not try and have it all in my brain. I have one day left to complete week 4 to maintain my one week buffer so it is eyes down tomorrow for a good stint.

One last comment. Last week, I signed up for TM354 - Software Engineering, which starts in October. I have to say that those who know me from my development background expressed opinions that this would be quite boring given my CV. I felt that it would be a reasonable step to the final project course (TM470). However, it turns out that it has an examination. Ignore what I said above as I would be quite happy to sit this if it wasn't for my stupid arthritis and the problems it brings with continuity of concentration. As a result, I cancelled the application and signed up for TM352 - Web, mobile and cloud technologies. Firstly, this doesn't have an exam component but also it ties in very nicely with my thoughts on a project for TM470. More on this later.


Week One is nearly over


I had planned to be well ahead by now but moving apartments put a stop to that. As it is, I have managed to get through to the end of week 2 and have the first ICMA (computer marked assignment) nearly completed. So far I have answered 4 questions correctly and one question that needed a second go. As I have to score 30% in five of the seven ICMAs I am quite pleased.

First reactions? Well, the Python bits are easy. Programming is what I do and the course is well presented through the iPython notebooks. I have had a few issues with data paths and, although I seem to have problems that others aren't getting, I am able to work my way around them. I don't seem to be able to make a CSV/JSON combination that works with CSVLint. My JSON description file won't be recognised by CSVLint. I have tried getting it through JSONLint but I can't even do that. So, that is in abeyance for now. I am having more difficulty with the theoretical parts but then I always thought that I would. It is, after all, 35 years since I have done any studying and my arthritis gives me issues with concentration. Funnily enough, I have no problem concentrating when I am coding. I think that I regard issues there as opportunities rather than as problems and I can keep going. Reading theoretical descriptions - I don't find that as gripping - so I have to work harder at it.
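One way to narrow down a JSON file that the online checkers refuse is to test it locally first. This is a minimal sketch using Python's standard json module. The function name is made up, and it only checks that the text is well-formed JSON, not that CSVLint will accept it as a valid description of the CSV.

```python
import json


def json_problem(text):
    """Return None if text is well-formed JSON, else where the first error is."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return f"line {err.lineno}, column {err.colno}: {err.msg}"
    return None
```

Single quotes instead of double quotes and trailing commas are typical things that trip a strict JSON parser, and the reported line and column point straight at the first one.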

Reading the course description of databases has caused me to go back and think about the databases that I have worked with over the years. It is quite an interesting list.

  1. Linked Lists - I wrote my first database software back in 1980 when I created a Trading Room Front Office package using North Star Basic on a NS Horizon. For this, I wrote software for a series of linked lists that held transactions, exchange rates, dates, etc. all with an active re-use of deleted data spaces. The main problem with this was the fixed data widths.
  2. More Linked Lists. When I retired from the City and started my own software company we switched to UCSD Pascal and a complete re-write of the system. The product was called "Integrated Dealer Support System" (IDSS). To ensure that the database was robust, I got a guy called Mark Woodman to write the linked list code for me. At the time, he was an Open University lecturer (he is now a Professor at Middlesex University).
  3. In 1990, I switched to Visual Smalltalk (VSE) from Digitalk. With this, I wrote a Barrister support package based firstly on VOSS from LogicArts and then using Tensegrity from Polymorphic. These are/were Object Oriented Databases (OODB) that were integrated into the Smalltalk environment.  I went on to incorporate Tensegrity into my Smalltalk based Trading System which was called Powerdesk Trader.
  4. In later years, I have mostly written software for my own use. Between 2004 and 2012, I owned and ran a shop with my wife and daughter. This shop had a very active web site which, eventually when we closed the physical shop, became our main means of transacting business. The web software was based around a database of my own devising. Built using IBM's (latterly Instantiations) Visual Age Smalltalk - VAST - this utilised the built-in high speed directory system within Windows by having individual directories for all of the main database elements - products, users, sales, etc. Within these directories I used VAST's own object dumper and loader mechanisms to save the various objects directly into their relevant directories with the file names providing the indexing. This worked remarkably well with the performance of the web site matching that of those with much more sophisticated (and expensive) systems.
  5. More recently, I have been building a freight car routing system for my US Outline model railroad. This utilises CSV files for its data. This option allows me to make changes to the data without having to write a lot of data management screens. For instance, when I purchase a new freight car, I can pull up the cars.csv file and add the new details within Excel.
  6. Finally, my serious programming has been for a US based insurance company where my most recent project was to develop an insurance premium calculator based on normalising risk data for the clients. This gets its data from an MS SqlServer RDBMS.

Wow, that's a lot of variation through the years. It will be interesting to see how I really do cope with all the theoretical stuff, given that I have spent 35 years playing with all of the above!
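The directory-based database in item 4 of the list above is simple enough to sketch in a few lines. This is a Python illustration of the idea, not the original Smalltalk. Here pickle stands in for VAST's object dumper and loader, and the class and method names are invented for the example.

```python
import pickle
from pathlib import Path


class DirectoryStore:
    """One directory per kind of object; the file name is the index."""

    def __init__(self, root):
        self.root = Path(root)

    def _path(self, kind, key):
        return self.root / kind / f"{key}.pkl"

    def save(self, kind, key, obj):
        path = self._path(kind, key)
        path.parent.mkdir(parents=True, exist_ok=True)
        with path.open("wb") as fh:
            pickle.dump(obj, fh)

    def load(self, kind, key):
        with self._path(kind, key).open("rb") as fh:
            return pickle.load(fh)

    def keys(self, kind):
        """List the stored keys for one kind, read from the file names."""
        return sorted(p.stem for p in (self.root / kind).glob("*.pkl"))
```

A lookup is just a file open, so the operating system's directory handling does the indexing. The sketch assumes keys are safe to use as file names and does nothing about two writers saving the same object at once.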


The course is getting closer

Edited by David Pennington, Thursday, 14 Jan 2016, 19:58

Getting closer meant that I had to get more serious about learning Python. 

I run a web and e-mail server from home. The e-mail server is for the family domain and there are three web sites that are running on the machine as well. I thought that I had put together a reliable setup when I purchased a brand new Lenovo desktop box running Windows 8 (which was fairly quickly upgraded to Windows 10). However, the box freezes occasionally. I can't find any reason for this so I just have to manage the situation. I find that the Pingdom service, which is supposed to e-mail me when my server goes down, isn't as reliable as I need.

I was casting around for another program to write in Python to extend my knowledge of the language so I thought that I would write my own Ping software. Now a ping is something that gets sent to a web address and reports back if the server at that address responds or not. What my software had to do was to regularly connect to the server and, if the server failed to respond, send me an e-mail. There were a few problems with me doing this in Python.

  1. I had no idea how to connect to a web server
  2. I had no idea how to send an e-mail
  3. I had no idea how to log these actions

So, this was going to test my ability to learn the inner details of Python. It was actually easier than I thought. Python has a rich set of libraries so I included the following in my program:

import smtplib
import requests
import time
import configparser
import os

These gave me everything that I needed. I used these libraries and very quickly I had my web checker program running. It runs in the background as a script on my iMac and has now been running for a few weeks without incident. It has shown me how useful Python can be. It would have been a good bit more complicated to do this simple task in Smalltalk - but then there are many things that I have done in Smalltalk that would be nigh on impossible in Python so, whilst I am pleased that I now know how to code in Python, I won't be giving up on Smalltalk anytime soon!
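For the curious, the heart of a checker like this can be sketched as follows. This is not my actual script. To keep it runnable without installing anything, it uses urllib from the standard library in place of requests, and the function names, addresses and SMTP host are all placeholders.

```python
import smtplib
import urllib.request
from email.message import EmailMessage


def server_is_up(url, timeout=10, opener=urllib.request.urlopen):
    """Return True if the URL answers with a 2xx/3xx status before the timeout."""
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return False


def build_alert(url, sender, recipient):
    """Compose the warning e-mail for a server that did not respond."""
    message = EmailMessage()
    message["Subject"] = f"Server down: {url}"
    message["From"] = sender
    message["To"] = recipient
    message.set_content(f"No response from {url}.")
    return message


def send_alert(message, smtp_host="localhost", smtp_port=25):
    """Hand the message to an SMTP server (host and port are placeholders)."""
    with smtplib.SMTP(smtp_host, smtp_port) as smtp:
        smtp.send_message(message)


def check_once(url, sender, recipient, is_up=server_is_up, send=send_alert):
    """Check the server once and e-mail an alert if it is down."""
    if is_up(url):
        return True
    send(build_alert(url, sender, recipient))
    return False
```

Calling check_once in a loop with time.sleep between checks gives the background script. Making the checker and the sender parameters means the logic can be tried out without a network connection or a mail server.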

My next progression towards the course is a giant backward step. We have decided to move apartments and the move takes place on 23rd January. We are only moving from one floor to another (the new apartment is much bigger than the existing one) but, for various reasons, we are to lose our broadband internet connection for about 10 days - which covers the date of the start of the course! I am hoping that I can use my iPhone as a router for my iMac in the meantime. As we have a nice 4G signal here, that shouldn't be a hardship.

Lastly, I have quite severe Arthritis in a lot of my body so I have registered, through the OU, as disabled. Following an assessment of my problems, I have been awarded a grant that will provide me with equipment that will help me in my studies. I have difficulty sitting for too long in one position and find concentrating for long periods a trial due to pain levels. Yesterday, I had a visit from a representative of an Ergonomics company who has carried out an assessment for a new desk, chair and reading aids. It seems that I am to get a desk that is electrically height adjustable, a fully configurable chair, an iPad holding device and a footrest. All of this is designed to get me into the best possible position for sitting at my iMac for extended periods. It seems to take about 6 weeks for this to happen but it does mean that, soon after the course starts, I will be able to get on with more success. My existing hobby room is way too small to take the new desk along with my other stuff so it is good that I am moving to a bigger apartment as the 2nd bedroom in that one is substantially larger than now. I might even have room for a decent model railway - at last!!!


Getting down to the prep.

Edited by David Pennington, Tuesday, 8 Dec 2015, 11:01

I have been writing Smalltalk software since about 1990ish, firstly with VSE and then, from around 1995, IBM VisualAge (now Instantiations). This means that I am very embedded in the world of Object Oriented Software development (OO) and with the standard Smalltalk paradigm where everything is an object. Part of that experience means that I think in objects and write new classes for everything. This isn't always the case for people new to OO where they tend to write sequential code until they are forced into making a class. As an example, I used to work in a large banking software company where I was in charge of development of a reporting system that was being written in Java. This was the first Java development for the team and I was the only person that had really been exposed to OO (I am ignoring the fact the programmers had all been using C++ but I am guessing that they still basically wrote C code).

I was asking one of the senior guys (he had been in the company some 15 years) how it was getting on and his comment was "Glyn (one of the senior developers) tells me that I have to create more classes than currently. I can't see why. The software works, doesn't it?". 

OK, back to the present. The course that I am taking involves using the Python programming language for a lot of the data manipulation. There is a Python package called Pandas that has a huge library of tools for this. I did a three week on line course in Data Manipulation which gave an introduction into Pandas but felt that I really needed to get to grips with Python before the course starts at the end of January. I was mainly concerned with the way that OO works in Python as it looked very easy to write sequential code and ignore OO altogether.

Recently, I have been getting back into the groove of studying for 8 hours, or so, a week. If I don't get into the habit now then it will be difficult when the time comes. Hence, I have been carrying out a few tasks. These have included a course on Calculus, the Data Manipulation course mentioned above and a home study course in Latin (which recently morphed into Dutch). The calculus course was very trivial and I should have kept working through the home study book that I got for my birthday back in March. The data course was very good. The Latin/Dutch was hard because I have forgotten how to sit and learn vocabulary ready for the next session (shades of French and Latin back in my school days so many years ago!). I decided to write a testing program for the vocabulary. I did this in Smalltalk and it works very well as a web development.

Now - to learn Python. Why not recreate the vocabulary program in Python? Admittedly, it will be a command line program so nowhere near the Smalltalk one for presentation but functionality is everything in Python, rather than making nice windows etc. I spent the last 4 of my 2 hour sessions slowly working through the steps and ensuring that the Python code was fully object oriented. Well, yesterday, I was able to go through 10 questions of Dutch words and get feedback on the right and wrong answers. It still has a way to go but it has proved to me that I "can" do this Python thing. I still have 7 weeks until the course starts so I still have plenty of time to get proficient.

The current Python code can be seen HERE. I will update it as I make further changes.
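To show what "fully object oriented" means for a program this small, here is an illustrative sketch. It is not the code linked above, and the class names, the matching rule (ignoring case and surrounding spaces) and the injected answer function are all assumptions.

```python
import random


class VocabularyItem:
    """One word pair: the foreign word and its English meaning."""

    def __init__(self, foreign, english):
        self.foreign = foreign
        self.english = english

    def is_correct(self, answer):
        return answer.strip().lower() == self.english.lower()


class Quiz:
    """Asks a number of randomly chosen items and keeps the score."""

    def __init__(self, items, rng=None):
        self.items = list(items)
        self.rng = rng or random.Random()
        self.right = 0
        self.wrong = 0

    def ask(self, count, answer_for):
        """answer_for is a callable that takes the foreign word and returns an answer."""
        chosen = self.rng.sample(self.items, min(count, len(self.items)))
        for item in chosen:
            if item.is_correct(answer_for(item.foreign)):
                self.right += 1
            else:
                self.wrong += 1
        return self.right, self.wrong
```

For a command line version, the answer function would simply wrap input, for example `Quiz(items).ask(10, lambda word: input(word + "? "))`.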

Lastly, the really good news is that I have been accepted for a Disabled Students Allowance which may provide me with a new, ergonomic, chair and desk that I can sit at for longer than 45 minutes at a time. I will report back on that one.


Back to the OU after 35 years!

Edited by David Pennington, Friday, 15 Jan 2016, 17:40

I last studied with the OU back in 1980 so all of my previous credits are from that time. I have an original E student code - which amazingly I still remember. Due to Student Loan requirements, I am unable (as explained later - also unwilling) to take the two required prerequisites for the course I have signed up for. This course is TM351 Data management and analysis. 


I studied computer subjects back in the 1970s before the advent of micro-computers but was prepared for them when they finally arrived around 1978. At the time I was a Foreign Exchange Dealer in the City and, up to then, had assumed that the OU course was a side show and not relevant to my daytime occupation. Having attended a one week summer school at Reading University for my T101 Mathematics course, I returned with a program to do some elementary calculations on Money Market problems. My boss liked what I had done and arranged for me to obtain one of the first Z80 micro-computers to design a suite of programs for the dealing room. This was a North Star Horizon. This, along with an Elbit VDU, was used to build a good suite of software. The bank, eventually, bought a Horizon for me to have at home.


I moved jobs and arranged to keep the Horizon. From there I wrote a whole package of front office software for the bank - including a package for our branch in Greece - built in Microsoft Basic on an Apple II. Eventually, I left the City and formed my own software company - in 1985. For the next 20 years I spent my time developing software in the Smalltalk programming language. Smalltalk is the original Object Oriented language - designed at Xerox PARC in the mid-1970s.


Consequently, I saw no reason for me to study OO programming using Java (which was one of the prerequisites) as I was already very competent in this field. After I wrote a CV and obtained a reference from an insurance company in Connecticut, USA (I had been writing custom software for them for the last 20 years!), the course team agreed that I had sufficient background to take the course without the prereqs.


I will still need to get up to speed with Python. To assist this, I am currently working on the Future Learn three week course - “Learn to Code for Data Analysis” which uses Python to analyse spreadsheet data. Once this is complete, I have designed a few projects to fully understand Python - one of which is to convert my existing Smalltalk web program (using Seaside) which tests vocabulary knowledge for those learning Latin!


So. Onwards. More on the Data Analysis course and Python next time.
