Christopher Douce

Git, GitHub and Java

Edited by Christopher Douce, Wednesday 12 November 2025 at 07:35

As a part of my involvement with three different software engineering modules, I’ve recently spent a few days looking at something called Git and GitHub. There are a couple of reasons to do this. The first is to get up to speed with current software engineering technologies and tools. The second reason is to try to get a bit of inspiration for ideas for a software engineering project that uses the Java programming language.

What follows is a set of notes that I have made whilst having a good look around. (I expect I will be updating this post from time to time as I find out more.) It also shares some useful resources that I have collated, along with some links that I may wish to return to later. I have written it in the style that I might write module materials.

Before looking at some repositories, you will look at the basics (this is the style adopted for module materials), and then look at some useful (and related) resources.

The basics

Git is a version control system (VCS). It is an essential software engineering tool that helps you to keep your software safe and secure, whilst also enabling you to collaborate with other software engineers. Your fellow collaborators might be in the same building, or might be located in an entirely different country.

A key idea is that Git can help to save you from yourself. If you make a mistake whilst creating your software, it gives you the ability to return to an earlier version.

An important point is that Git and GitHub are not the same thing. You use the Git software to store your files in a repository. Your repository can be held in GitHub. Here are links to these two key elements:

Through Git, you can access the software files held in a GitHub repository in different ways. You can use what is known as a command line, or you can use a graphical user interface (GUI). You might also access the files through an Integrated Development Environment (IDE).

When learning to use Git, it is recommended that you use what is known as a command line. Think of the command line as a clear and direct way to give instructions (commands) to a computer. Professionals often use command lines, since they are more powerful and more expressive than using a limited set of instructions that can be given through a user interface.

If Git has been installed on your computing platform, you will be able to issue Git instructions through a command line that is available through a terminal window. The idea of a ‘terminal’ dates back to the early days of computing, when users would use the facilities of a larger (and much more expensive) mainframe computer through a ‘dumb terminal’, which is, essentially, a screen and a keyboard without any accompanying computing capabilities. A terminal window is, simply, a way to issue instructions (input) and to receive confirmation that they have been carried out (output).

Some videos

Whilst having a look at all this, I have discovered some useful videos. Here are three that I have found most useful:

Out of all these, I prefer the ‘for dummies’ tutorial. A useful activity is to make a note of the commands that were used in this video.

Some teaching materials

I’ve been made aware of some useful teaching tools and materials. We were made aware of GitHub Classroom during a recent school seminar. There’s a lot to this, and I’ve not spent any time looking at it, but it strikes me as a really powerful tool.

A really useful (and practical) resource is the Git tutorial that is provided by Software Carpentry, which “develops and teaches workshops on the fundamental programming skills needed to conduct research”. It aims “to provide researchers high-quality, domain-specific training covering all aspects of research software engineering”. Since computing is a tool for carrying out research, it offers a useful Git tutorial for novices.

Following on from the earlier comment about Git commands being issued through a terminal window, there is also something called a ‘shell’. A shell is a command line program that runs within a terminal window, and is used to give instructions to a computer. The term comes from the idea that it is a ‘thin cover’ around your computer’s operating system. Some operating systems can have different ‘shells’, and some of them can have unusual names, like Bash. More information about what shells are, and how they can be harnessed, is available through the Software Carpentry The Unix Shell for novices resource.

Some books and articles

A really useful book to look at is:

Ponuthorai, P.K. and Loeliger, J. (2022) Version Control with Git. 3rd edn. O’Reilly Media.

This book is available in the OU library through the O’Reilly Safari bookshelf.

Appendix A, History of Git, is particularly useful and summarises a history that I was never aware of. It begins as follows: “No cautious, creative person starts a project nowadays without a backup strategy. Because data is ephemeral and can be lost easily—through an errant code change or a catastrophic disk crash, say—it is wise to maintain a living archive of all work.” It goes on to highlight its significance and use within teams: “For text and code projects, the backup strategy typically includes version control, or tracking and managing revisions. Each developer can make several revisions per day, and the ever-increasing corpus serves simultaneously as repository, project narrative, communication medium, and team and product management tool.”

The history of Git is introduced through the following paragraph: “Git, a particularly powerful, flexible, and low-overhead version control tool that makes collaborative development a pleasure, was invented by Linus Torvalds to support the development of the Linux kernel, but it has since proven valuable to a wide range of projects.” The appendix presents a brief summary of some influential predecessors.

On the topic of predecessors, a colleague shared the following article, which should be available to all students:

Ruparelia, N.B. (2010) The history of version control. SIGSOFT Software Engineering Notes, Vol 35, No 1, pp.5–9. Available at: https://doi.org/10.1145/1668862.1668876.

On the surface of it, it is quite a dry and difficult read, especially if you are unfamiliar with the background or terminology.

The paper presents a ‘brief taxonomy of version control’. Rather than being a set of categories, this section is a set of important terms and concepts, some of which you may recognise from the earlier video. It also mentions something called IT Service Management (ITSM), which is a topic that is discussed in a later module.

It is worth saying something about the notion of Open Source Software (OSS), since this is a key element of the ‘historical perspective’ discussion. OSS is software where the source code, the lines of code that collectively create the software, is available for other people to use and modify. OSS can, of course, be contrasted with proprietary software, which cannot be viewed or changed by others.

The article summarises the story of the development and refinement of VCSs. You don’t need to worry about how versions of software are saved in a repository. The evolution of the approaches has been driven by developments in computing, such as increases in processing power, storage capacity, and network connectivity. The most important narrative in this story relates to the open source version control systems, rather than the commercial (proprietary) equivalents.

The article, along with the appendix in the O’Reilly text, suggests the emergence of a tension between a proprietary version control system, and the open source community that supports the development of the Linux operating system. The article states ‘not being allowed to see metadata and compare past versions was a major drawback of the community version of BitKeeper, and one that specifically inconvenienced most Linux Kernel developers’ (Ruparelia, 2010, p.7). Metadata is, of course, data about data. One of the advantages of version control systems is that you can add metadata, in the form of explanatory comments, when you make a change to your software. Metadata enables you to keep track of what has changed, and why those changes have been made.
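To make the idea of metadata a little more concrete, here is a minimal Java sketch of a change log that stores an author, a date and an explanatory comment alongside each saved version. It is a toy model only: it says nothing about how Git really stores changes, and every name in it (ChangeLogSketch, Change, commit, contentAt, history) is made up for illustration.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

// Toy illustration only: this is not how Git stores changes internally.
// Every name here (ChangeLogSketch, Change, commit, ...) is hypothetical.
final class ChangeLogSketch {

    // One recorded change: the new content plus metadata about the change.
    static final class Change {
        final String author;
        final LocalDate date;
        final String message; // explanatory comment: why the change was made
        final String content; // the file content after the change

        Change(String author, LocalDate date, String message, String content) {
            this.author = author;
            this.date = date;
            this.message = message;
            this.content = content;
        }

        // A single line that says who changed what, when, and why.
        String summary() {
            return date + " " + author + ": " + message;
        }
    }

    private final List<Change> changes = new ArrayList<>();

    // Record a new version together with its metadata.
    void commit(String author, LocalDate date, String message, String content) {
        changes.add(new Change(author, date, message, content));
    }

    // Return the content as it stood after the given change (0 is the first).
    String contentAt(int version) {
        return changes.get(version).content;
    }

    // One summary line per change, oldest first.
    List<String> history() {
        List<String> lines = new ArrayList<>();
        for (Change change : changes) {
            lines.add(change.summary());
        }
        return lines;
    }

    public static void main(String[] args) {
        ChangeLogSketch log = new ChangeLogSketch();
        log.commit("sam", LocalDate.of(2024, 1, 15),
                "Create permissions file", "users: ann");
        log.commit("sam", LocalDate.of(2024, 3, 2),
                "Add bob at the director's request", "users: ann, bob");

        for (String line : log.history()) {
            System.out.println(line);
        }
        System.out.println("Earlier version: " + log.contentAt(0));
    }
}
```

The point of the sketch is that the comment and date travel with each version, so you can answer both ‘what did this look like before?’ and ‘why did it change?’ from the same record.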

The open source section of Ruparelia’s article ends by highlighting the beginning of the development of Git. It is also useful to look at the ‘version control in the future’ section, which emphasises something called a DVCS, a distributed version control system: ‘DVCS is becoming increasingly popular in the open source community and, over time, will replace centralized system’ (Ruparelia, 2010, p.8). With Git being a DVCS, and becoming the dominant tool, this prediction has come true.

Ruparelia also suggests that a VCS will ‘become increasingly integrated with a) the entire software life-cycle, from requirements capture to defect tracking, and b) the broader configuration and change management tools and processes as defined by ITSM frameworks such as ITIL’ (ibid., p.8). The point here is that it is important to keep track of a variety of digital artefacts within the software development lifecycle: requirements, code and tests. Also, software engineering processes must necessarily adapt and change. To keep track of what is changing, it is useful to place these artefacts under version management.

GitHub has become the world’s largest source code repository, with tens of millions of public projects. This concentration of code has had an interesting effect. Source code has become training data for generative artificial intelligence software, which has led to the development of an ‘AI accelerator’ known as GitHub Copilot. This is, of course, not without concerns about ethics, privacy and security. There are also legitimate concerns about how much energy AI tools consume, which is a current topic of research.

To complement Ruparelia’s article and the earlier video resources, you might find this summary article useful:

Munezero, P. (2024) Top Ten Git Commands for Systems Engineering Digital Artifacts Authors and Analysts, IEEE International Systems Conference (SysCon), Montreal, QC, Canada, pp. 1-3, Available at: https://doi.org/10.1109/SysCon61195.2024.10553438.

Look at the list of commands you have noted down from the videos you have watched. Are there any differences?

Exploring GitHub

I spent about a day looking through GitHub to get some ideas about Java projects.

I quickly noticed a couple of things. The first is that some projects have been given ‘stars’, which I assume relate to their popularity. I soon decided to look through pages of projects until I got through to ‘zero stars’. Secondly, different projects use different software licences.

Software licensing is a topic all of its own, and I’m no expert. Two important licences that I’m aware of are the GNU General Public ‘copyleft’ licence (GPL), and the BSD licence. With the GPL, you can take existing software and code that is published under that licence, modify it, use it, and distribute it (along with its source code), also under the GPL. A significant restriction is that you are not allowed to fold that code into closed-source proprietary software; you can charge for GPL software, but whoever receives it must also receive the source code and the same freedoms. A BSD licence (which is similar to the MIT and Apache licences) is more permissive, in the sense that you can use the code within proprietary software without having to release your own source code. The point here is that if you look at code, you need to also keep in mind its licence.

My starting point was a site called CodeTriage, which highlights Java projects that you can contribute to. It was all a bit overwhelming, but Book Project piqued my attention. This software was, however, a web-based project. Perhaps there were Java projects that ran as a simpler desktop application? From here I decided to look at GitHub topics.

Topics

Here are some topics that I had a browse through. Do note that the numbers that are shared are correct at the time of writing. By way of comparison, I’ve also provided a link to Python topics:

  • Java: nearly 280k public repositories.
  • Python: approaching 570k public repositories.

Turning to the idea of a standalone Java application, I clicked through the following topics. Swing is the name of a cross-platform user interface library:

Whilst browsing through these repositories, I have noticed some other interesting topics (or tags):

I'm assuming that these lists of projects do not include those that are hosted through GitHub Classroom.

Browsing through the repositories

Whilst browsing through the repositories, mainly looking at standalone Java desktop projects, I started to notice some patterns and themes amongst repositories.

At the time of writing, TM354 Software Engineering makes use of a case study, a hotel booking system. Interestingly, there are quite a lot of them. Here are three examples. One of my tasks will be to have a look at these (and other examples) in a bit more detail.

TM354 also contains some examples of other systems, such as library management systems, which are described using the UML modelling language. I found seven different examples of these, written as standalone Java projects, which also made use of a SQL database. Here is a notable example:

I also found different types of management applications. I found an inventory management system, a restaurant management application, a drink ordering tool, a dental surgery application, a school administration system, a gym management tool, and a laboratory management tool.

I also found a good number of games that run as Java Swing applications, such as Connect 4, tic-tac-toe, Sudoku, Othello, Battleships, Minesweeper, Tetris, and Space Invaders. There were also some study tools, such as software to create flash cards and quizzes (which reminds me of a project that is applied in TM112 Introduction to Computing and Information Technology 2).

It is notable that a lot of these GitHub projects are also clearly computing projects for software development or software engineering classes. This takes me to look at a topic that I mentioned earlier: group projects (over 1000 public repositories).

Group Projects

A range of different languages and technologies are used within group-project topic repositories. Without digging too far, it is possible to see repeated patterns. There seem to be management systems of various flavours, games and tools.

Agile tools projects

Reflecting on the idea of tools or utilities takes me to look at another category: agile software development tools. Narrowing this further, I remembered the idea of a Kanban board, an ‘information radiator’ that is used to share project status with agile teams. Since software engineering can be carried out at a distance, Kanban boards have moved from the physical to the virtual. Without really looking very hard, I found six different examples of Java-based Kanban board projects.

Here are three of them:

Project ideas

I really like the idea of a project that relates to the process of software engineering, which is why I went looking for Kanban board projects. Maybe there could be other projects that relate to requirements gathering (requirements engineering), or perhaps even testing. It would be interesting to look in GitHub to see what I can find. I have also noticed that there were quite a few ‘quiz and test’ projects, or projects that relate to flashcards.

All this has led to a collision of ideas. Over ten years ago, the university initiated a project that was known as SocialLearn, ‘a cross-University project to combine the best of social web technologies with those of online social learning’. It was a grand idea, but having a quick read of a paper that was published at the time suggests that it was way too broad and ill-defined. Technology was prioritised over pedagogy, and what was described wasn’t awfully clear. Clear requirements are essential.

Here's my idea: a social quizzing application. One student can challenge another student with questions that relate to their study of a particular module. These questions might be from a ‘question’ library, or they might be ‘one off’ questions written by themselves. Rather than having answers marked by computers, they are ‘assessed’ by fellow students. The quality of questions can be graded. Requirements can be added. Also, an ‘evil genius’, who works in the module team, could post conflicting requirements which everyone has to figure out. Dealing with security is also going to be a necessity.

Another thought is some kind of ‘study buddy’ tool, which might be a combination of a note taking tool and a study schedule planner. When working as a software engineer, I kept a notebook, which was invaluable. Project logs are emphasised in the project module.

A final thought is some kind of simple ecommerce solution, such as a shop front (which reflects some of the projects, such as the inventory management systems that I’ve uncovered during my browse of GitHub). Admittedly, this isn’t very exciting, but it might well depend on what services are offered or sold. It might be for a community café or museum; scenarios that have been used in TM354.

Maybe this is all too complicated. Maybe the answer is a hotel booking system.

Further resources

Before sharing some final thoughts, and having found so many examples of projects within GitHub, it is worth mentioning the following article that also reflects the seminar that I mentioned at the start of this post:

Tu, Y.-C. et al. (2022) GitHub in the Classroom: Lessons Learnt, in J. Sheard and P. Denny (eds) Proceedings of the 24th Australasian Computing Education Conference. New York, NY, USA: ACM, pp. 163–172. Available at: https://doi.org/10.1145/3511861.3511879.

Reflections

The first ever version control system I was exposed to was CVS. A system administrator I worked with used it to keep track of text files that described user permissions. It wasn’t too long before I understood its usefulness. The grumpy director of the research institute where I worked asked: ‘when did you add all those users?’ A quick look at the change log gave exact dates when the change was made. A comment also indicated why the changes were made. The reply was: ‘I added them, because you asked for them to be added two years ago’.

When I changed jobs, I started to use Microsoft Visual SourceSafe (VSS), which is mentioned in the Ruparelia article. Again, I could see its usefulness. We could add labels to entire code bases, enabling us to retrieve all the source code that was a part of a very specific release of software. We could roll back if we needed to. It was a tool that we could use to protect ourselves, from ourselves. VSS has since been discontinued. Microsoft now uses Git.

There are always more things to do. I’ve installed Git. I’ve made a note of all of the commands. Although I’m more familiar with Java, I need to know more about Python, and how it can be used to manipulate XML files and help with audio processing. I’m then going to create a Git repository.

Acknowledgements

Many thanks to Tamara Lopez for kindly sharing some of the resources and articles that are mentioned in this blog.

Christopher Douce

TM470 Resources

Edited by Christopher Douce, Wednesday 5 November 2025 at 16:42

This post shares some resources that might be useful during the course of your project module. Some of what is presented here may be already familiar to you, having studied earlier modules.

What you will need will, of course, vary depending on your project. You will need to make choices about what you need. In your project report, you also need to say something about why you have chosen what you have chosen. Also, do remember that what you find yourself might be even more helpful than what is given here.

Diversity, Equality and Inclusion

In the UK, the Equality Act 2010 is an important piece of legislation, which defines a number of protected characteristics.

When it comes to computing projects, the W3C WAI WCAG is an important resource, especially if your software product makes use of web technologies.

Ethics

Software systems affect people and society. Requirements for software systems can come from people and society. Wherever there are people, there are also ethics. Computing professionals must behave in a way that is ethical. To help with the Legal, Social, Ethical and Professional Issues, the following sets of guidelines are considered to be helpful:

Gantt chart tools

As suggested in earlier posts, it is a good idea to create a Gantt chart for your project to create an overview of what you expect will happen and when.

It is completely up to you how you create your Gantt chart. You might choose to create one using one of the many spreadsheet templates that are available. Alternatively, you could download a software package, such as ProjectLibre Desktop, which is an alternative to the popular Microsoft Project package.

Another approach would be to use one of the many cloud-based Gantt tools that are available, such as GanttPro. These cloud-based products often offer a trial period, after which you have to pay a monthly fee.

Every tool has its own advantages and disadvantages and will have a learning curve if you haven’t used one of these tools before.

Generative AI

GenAI can be considered to be a useful resource, but it must be used with caution. Every student has access to a Microsoft product called Copilot. If AI is used in the creation of code that is used within a software solution, you must document what prompts have been submitted to which language model. If you use AI as a part of a solution to a wider problem, it is important and necessary to discuss the implications of its use, in terms of risks, ethics and bias.

University level guidance is available through this resource: Generative AI in learning, teaching and assessment at the OU. This provides a link to guidance for students. Guidelines for referencing the use of GenAI can be found, in part, on the CiteThemRight website.

Literature review

The literature review is, as suggested earlier, a summary of all your reading that has contributed to the development and completion of your project. Whatever you mention in your literature review section should be used or applied in some way.

In addition to some of the skills resources that follow, the following resources may be useful:

An introduction to software development OpenLearn Badged Open Course (BOC), which contains a very useful section, Finding and reading academic articles.

Srinivasan Keshav’s article entitled How to read a paper offers some practical guidance about how to read and analyse an academic article.

Project management resources

All the guidance you need to complete your project is presented within the module materials (and within this accompanying guide). There are, of course, other resources in the world that can offer some complementary guidance.

One such resource is the Project Management for IT-Related Projects: 3rd edition, published by the British Computer Society (BCS). You don’t need to buy this text, but you may be able to access parts of it through the OU Library.

Although this text is intended for industry professionals, it may be useful for your project. The guidance about project models reflects some of the advice shared in module materials.

Prototyping tools

There are different approaches to prototyping. One of the simplest and most useful tools is, of course, pencil and paper. It is acceptable to draw prototypes of your software product and share these within your project report. An earlier article, TM470 Considering prototyping offers a bit more guidance about the concept of prototyping.

If you wish to use a tool to help you with your prototyping, the following might be useful:

If you wish to go beyond creating prototypes of user interfaces, you could use other tools to create prototype designs of your software system. One such tool is Visual Paradigm, which can be useful for drawing UML diagrams, and diagrams that describe cloud computing infrastructure. Other cloud-based drawing tools could, of course, be used.

Products such as Balsamiq and Visual Paradigm can be used with an academic licence.

Further academic guidance about prototyping can be found in the following text book:

Sharp, H., Rogers, Y. and Preece, J. (2023) Interaction design: beyond human-computer interaction. 6th edition. Milton: Wiley.

This is available through the OU Library.

Risk assessment and management

Accompanying the description of the lifecycle model are two useful sections: risk assessment and risk management. Risk assessment is concerned with considering what the risks to your project might be. Risk management is concerned with approaches to dealing with those risks. The risk assessment approach presented in the module is relatively simple. It considers risks in terms of impact and likelihood, which is sufficient for the needs of your project report.
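As a rough illustration of thinking about risks in terms of impact and likelihood, the following Java sketch scores each risk and lists the highest scoring ones first. The three-point scales, the choice to multiply likelihood by impact, the example risks, and all of the names (RiskSketch, Risk, prioritise) are my own assumptions for the sake of an example; they are not taken from the module materials.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: the 1 to 3 scales, the multiplication, and every name here
// are assumptions made for illustration, not the module's own method.
final class RiskSketch {

    static final class Risk {
        final String description;
        final int likelihood; // 1 = low, 2 = medium, 3 = high
        final int impact;     // 1 = low, 2 = medium, 3 = high

        Risk(String description, int likelihood, int impact) {
            if (likelihood < 1 || likelihood > 3 || impact < 1 || impact > 3) {
                throw new IllegalArgumentException(
                        "likelihood and impact must be between 1 and 3");
            }
            this.description = description;
            this.likelihood = likelihood;
            this.impact = impact;
        }

        // A simple score: a higher number suggests the risk needs attention first.
        int score() {
            return likelihood * impact;
        }
    }

    // Return a new list ordered from the highest score to the lowest.
    // The original list is left unchanged.
    static List<Risk> prioritise(List<Risk> risks) {
        List<Risk> sorted = new ArrayList<>(risks);
        sorted.sort((a, b) -> Integer.compare(b.score(), a.score()));
        return sorted;
    }

    public static void main(String[] args) {
        List<Risk> risks = new ArrayList<>();
        risks.add(new Risk("Laptop failure loses project files", 1, 3));
        risks.add(new Risk("Unfamiliar framework slows development", 3, 2));
        risks.add(new Risk("Scope grows beyond the time available", 3, 3));

        for (Risk risk : prioritise(risks)) {
            System.out.println(risk.score() + "  " + risk.description);
        }
    }
}
```

A table in your report would do the same job; the sketch simply shows that two small numbers per risk are enough to decide which ones deserve a mitigation plan.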

Every project is, however, different in the ways that it needs to take account of risk. A wider resource that might be helpful is the NIST Cybersecurity Framework. There is also ISO 27001, an international standard which concerns information security. Risks can be managed not just in terms of how technology is used, but also how human processes are applied.

Skills, writing and studying

The following links offer some useful practical guidance about writing and studying:

Within the library pages, the following two pages are particularly helpful:

Do also refer to sections in this guide that refer to the writing of your project report and the appropriate uses of Generative AI.

Software development tools

Your choice of tools will depend, in part, on the characteristics of your project, and what skills you wish to develop. An important tool is the integrated development environment (IDE), which often integrates together a text editor, a debugger, and a way to run your software. IDEs can also be connected with version control software (to keep your software safe) and AI assistants, which help you with your learning. Historically, IDEs used to focus on a single programming language, such as Java or Python. Popular IDEs now work with multiple programming languages.

What follows is a useful summary of some popular IDEs:

Apache NetBeans. NetBeans primarily supports the Java programming language but can be used with other languages through extensions.

Eclipse. Eclipse is an enterprise level ‘cloud native’ IDE that supports a wide variety of languages.

IntelliJ. IntelliJ is a popular commercial IDE that is notable for its usability and functionality. A ‘dev toolkit’ pack is available for students.

Microsoft Visual Studio Code. Visual Studio Code is, by many accounts, the most popular IDE. It can be used with multiple languages. VS Code is not to be confused with Microsoft Visual Studio, which is a different, similarly named product.

PyCharm. PyCharm is a Python IDE from JetBrains, who also created IntelliJ. An alternative tool for Python developers is the Jupyter Notebook environment.

Software testing

A thorough project will move through an entire cycle of gathering requirements, implementing those requirements, through to carrying out testing to make sure that requirements have been implemented correctly.

There are a number of tools that can help with software testing (which is not to be confused with usability testing).

In terms of source code, there is a family of unit testing frameworks which is known as xUnit, where the ‘x’ refers to the initial of a programming language. The xUnit.net site relates to a unit testing framework that can be used with a number of different Microsoft .NET languages. JUnit.org is specifically concerned with unit testing of Java code. There is also PyUnit which concerns the testing of Python code.
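To give a feel for what a unit test looks like, here is a minimal sketch written in plain Java, so that it runs without installing anything. The method being tested (totalCost), the hand-written check helper and the hotel pricing scenario are all hypothetical. With JUnit, each check would typically become a method marked with @Test that calls an assertion such as assertEquals, and the framework would find and run the tests for you.

```java
// Sketch only: a hand-rolled check in plain Java so that it runs without any
// library. UnitTestSketch, totalCost, check and rejectsZeroNights are
// hypothetical names chosen for this example.
final class UnitTestSketch {

    // The code under test: the cost of a hotel stay, in pence.
    static int totalCost(int nights, int pricePerNightPence) {
        if (nights < 1) {
            throw new IllegalArgumentException("a stay must be at least one night");
        }
        return nights * pricePerNightPence;
    }

    // Minimal stand-in for a framework assertion such as JUnit's assertEquals.
    static void check(String name, int expected, int actual) {
        if (expected != actual) {
            throw new AssertionError(
                    name + ": expected " + expected + " but got " + actual);
        }
        System.out.println("PASS " + name);
    }

    // A test of error handling: returns true only if invalid input is rejected.
    static boolean rejectsZeroNights() {
        try {
            totalCost(0, 8000);
            return false;
        } catch (IllegalArgumentException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        check("one night", 8000, totalCost(1, 8000));
        check("three nights", 24000, totalCost(3, 8000));

        if (!rejectsZeroNights()) {
            throw new AssertionError("zero nights should be rejected");
        }
        System.out.println("PASS zero nights rejected");
    }
}
```

Each test exercises one small behaviour with known inputs and a known expected result, which is what makes a failure easy to trace back to a specific piece of code.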

Unit testing is testing that operates at a low level. Moving up a level, there are other tools, such as Cucumber. There are other tools out there, such as Selenium, but this takes us beyond the boundaries of a project, and towards testing at an industrial level.

An important point to remember is that the extent of testing that is necessary depends on what the impact of errors might be. Testing is guided by risk.

Version management

It is important to keep the software you create safe. The best way to do this is to use a version management system. The dominant tool used for version management is called Git. Git allows you to save your software into a repository, which can then be hosted on a service such as GitHub. The following introductory videos offer some useful explanations:

To begin to use Git, you must install the Git software on your local computer. If you are using Windows, you will install something called a ‘terminal’, MinTTY, which allows you to execute Git commands and upload your code to GitHub. There is also a graphical user interface version, but in terms of gaining practical experience, it is best to use the terminal (unless, of course, you manage to use Git as a part of your IDE).

By way of further information, the following resource is helpful: Ponuthorai, P.K. and Loeliger, J. (2022) Version Control with Git. 3rd edition. O’Reilly Media. Appendix A, History of Git, offers a lot of useful background information, which is worth a read.

Reflections

It is hoped this blog post offers some useful resources to complement what is shared within the module materials. Since your project is intended to be an individual project, I have not covered tools that support collaboration in an agile environment, such as Jira. Similarly, I have not shared resources that concern cloud computing, since there is no need to consider deployment beyond your own personal computer, unless you decide this is an important element of your project.

A key point is that you must choose resources that you feel you need to use and apply within your project. You also should find the time to write about what these are. You should also say something about the skills you need to acquire to use or apply these resources.

A final note: none of the software products mentioned here are official university recommendations; these are personal opinions (which may, or may not, be useful).
