Christopher Douce

Generative AI and assessment in Computing and Communications


On 13 November 2024 I attended a workshop about Generative AI and assessment for OU computing modules, organised by colleagues Michel Wermelinger (academic co-lead GAI in LTA), Mark Slaymaker (C&C Director of Teaching) and Anton Dil (Assessment Lead of C&C Board of Studies). What follows is a set of notes and reflections relating to the event. I also share some useful links and articles that I need to find the time to follow up on. This summary is, of course, very incomplete; I only give a very broad sketch of some of the discussions that have taken place, since there is such a lot of figuring out to do. Also, for brevity, Generative AI is abbreviated to GenAI.

Introduction and some resources

The workshop opened with a useful introductory session, where we were directed to various resources which we could follow up after the event. I’ll pick out a few:

To get a handle on what public research has been published by colleagues within the OU, it is worth reviewing the ORO papers about Generative AI

The following notable book was highlighted:

There were also a few papers about how GenAI could be used with the teaching of programming:

To conclude this very brief summary, there is also an AL Development Resource which has been produced by Kevin Waugh, available under a Creative Commons Licence.

Short presentations

There were ten short presentations which covered a range of issues, from how GenAI might be used within assessments to facilitate meaningful learning, through to the threats it may pose to academic integrity. What follows are some points that I found particularly interesting.

One of the school’s modules, TM352 Web, Mobile and Cloud Technologies, has a new block that is dedicated to tools that can be used with software development. Over the last couple of years, it has changed quite a bit in terms of the technologies it introduces to learners. Since software development practices are clearly evolving, I need to find the time to see what has found its way into that module. It will, of course, have an influence on what any rewrite of TM354 Software Engineering might contain.

I learnt about a new software project called Llama. From what I understand, Llama is an open source large language model (LLM) that can be installed locally on your desktop computer, where it can be fed your own documents and resources. The thing is: LLMs can make things up and get things wrong. Another challenge is that LLMs need a lot of computing resources to do anything useful. If students are ever required to play with their own LLMs as a part of a series of learning activities, this raises the subject of computing equity: some students will have access to powerful laptops, whereas others might not. Maybe there will come a point when the school may have to deploy AI tools within the OpenSTEM Labs?
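
To make the idea of a locally hosted model a little more concrete, here is a minimal Python sketch of querying one. It assumes a local server (such as Ollama) is already running a Llama-family model and exposing its default HTTP endpoint; the endpoint URL and model name are my assumptions, not anything demonstrated at the workshop.

import requests

def ask_local_model(prompt: str, model: str = "llama3") -> str:
    """Send a prompt to a locally running model and return its text response."""
    response = requests.post(
        "http://localhost:11434/api/generate",  # assumed default local endpoint
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=120,
    )
    response.raise_for_status()
    return response.json().get("response", "")

print(ask_local_model("Summarise the main risks of using LLMs in assessment."))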

Whether you like them or not, a point was made that these tools may begin to find their way closer to the user. Tony Hirst made the point that when models start to operate on your own data (or device) this may open up the possibility of semantically searching sets of documents. Similarly, digital assistants may begin to offer more aggressive help and suggestions about how to write bits of documents. Will the new generation of AI tools and digital assistants be more annoying than the memorable Microsoft Clippy?
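
As an illustration of the semantic-search idea, here is a small Python sketch that ranks documents against a query using sentence embeddings. It assumes the sentence-transformers package and the all-MiniLM-L6-v2 model; the example documents are invented, and none of this tooling was named at the workshop.

from sentence_transformers import SentenceTransformer, util

documents = [
    "TMA03 asks students to evaluate a cloud deployment for a small business.",
    "The tutorial recording covers unit testing and debugging in Python.",
    "Guidance on referencing and academic integrity for computing modules.",
]

model = SentenceTransformer("all-MiniLM-L6-v2")
doc_embeddings = model.encode(documents, convert_to_tensor=True)

query = "How do I cite sources properly?"
query_embedding = model.encode(query, convert_to_tensor=True)

# Rank the documents by cosine similarity to the query and report the best match.
scores = util.cos_sim(query_embedding, doc_embeddings)[0]
best = int(scores.argmax())
print(f"Best match ({scores[best].item():.2f}): {documents[best]}")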

GenAI appears to be quite good at helping to solve well-defined and well-known programming problems. This leads us to consider a number of related issues and tensions. Knowing how to work with programming languages is currently an important graduate skill; programming also develops problem decomposition and algorithmic thinking skills. An interesting reflection is that GenAI may well help certain groups of students more than others.

Perhaps the nature of programming is changing as development environments draw upon the coding solutions of others. Put another way, in the same way that nobody (except for low-level software engineers) knows assembly language these days, perhaps the task of software development is moving to a higher level of abstraction. Perhaps developing software will mean less coding, and more knowing how to combine bits of code together. This said, it is still going to be important to understand what those raw materials look like.

An interesting research question was highlighted by Mike Richards: can tutors distinguish between what has been generated by AI and what has been written by students? For more information about this research, do refer to the article Bob or Bot: Exploring ChatGPT’s answers to University Computer Science Assessment (ORO). A follow-on question is, of course: what do we do about this?

One possible answer to this question may lie in a presentation which shared practice about the conducting of oral assessments, which is something that is already done on the postgraduate M812 digital forensics module. Whilst oral assessments can be a useful way to assess whether learning has taken place, it is important to consider the necessity of reasonable adjustments, to take account of students who may not be able to take part in an oral assessment, whether due to communication difficulties or mental health difficulties.

The next presentation, given by Zoe Tompkins, helped us to consider another approach to assessment: asking students to create study logs (which evidence their engagement), accompanied by pieces of reflective writing. On this point, I’m reminded of my current experience as an A334 English literature student, where I’m required to make regular forum postings to demonstrate independent study (which I feel I’m a bit behind with). In addition to this, I also have an A334 reflective blog. A further reflection is that undergraduate apprentices have to evidence a lot of their learning by uploading digital evidence into an ePortfolio tool, which is then confirmed by a practice tutor. Regular conversations strengthen academic integrity.

This leads on to an important question which relates to the next presentation: what can be done to ensure that written assessments are ‘GenAI proof’? Is this something that can be built in? A metaphor was shared: we’re trying to ‘beat the machine’, whilst at the same time teaching everyone about the machine. One way to try to beat the machine is to use processes, and to refer to contexts that ‘the machine’ doesn’t know about. The context of questions is important.

The final presentation was by one of our school’s academic conduct officers. Two interesting numbers were mentioned. There are 6 points that students need to bear in mind when considering GenAI. If I’ve understood this correctly, there are 19 different points of guidance available for module teams to help them to design effective assessments. There’s another point within all this, which is: tutors are likely to know whether a bit of text has been generated by an LLM.

Reflections

This event reminded me that I have quite an extensive TODO list: I need to familiarise myself with Visual Studio Code, have a good look at Copilot, get up to speed with GitHub for education, look at the TM352 materials (in addition to M813 and M814, which I have been meaning to do for quite a while), and review the new Software Engineering Body of Knowledge (SWEBOK 4.0) that has recently been released. This is in addition to learning more about the architecture of LLMs, upskilling myself when it comes to the ethics, and figuring out more about the different dimensions of cloud computing. Computing has moved on since I was last a developer and software engineer. With my TM470 tutor hat on, we need to understand how and where LLMs might be useful, and more about the risks they pose to academic integrity.

At the time of writing, there is such a lot of talk about GenAI (and AI in general). I do wonder where we are in the Gartner hype cycle (Wikipedia). As I might have mentioned in other blogs, I’ve been around in computing for long enough to know that AI hype has happened before. I suspect we’re currently climbing up the ‘peak of inflated expectations’. With each AI hype cycle, we always learn new things. I’m of the school of thought that the current developments represent yet another evolutionary change, rather than one that offers revolutionary change.

Whilst studying A334, my tutor talked a little about GenAI in an introductory tutorial. In doing so, he shared something about expectations, in terms of what was expected in a good assessment submission. If I remember rightly, he mentioned the importance of writing that answered the question (a context that was specific, not general), demonstrated familiarity with the module materials (by quoting relevant sections of course texts), and clear and unambiguous referencing. Since the module is all about literature, there is scope to say what we personally think a text might be about. These are all the kinds of things that LLMs might be able to do at some level, but not to a degree that is yet thoroughly convincing. To get something convincing, students need to spend time doing ‘prompt engineering’.

This leads us to a final reflection: do we spend a lot of time writing prompts and interrogating an LLM to try to get what we need, or would that time be spent more effectively writing what needed to be written in the first place? If the writing of assessments is all about learning, then does it matter how learning has taken place, as long as the learning has occurred? There is, of course, the important subject of good academic practice, which means becoming aware of what the cultural norms of academic debate and discourse are all about. To offer us a little more guidance, in the coming months I understand there will be some resources about Generative AI available on OpenLearn.

Acknowledgments

Many thanks to the organisers and facilitators. Thanks to all presenters; there were a lot of them!

Addendum

Edited on 27 November 2024 to attribute one of the sessions to Zoe Tompkins. During the event, a session was given by Professor Karen Kear, who demonstrated how Generative AI can struggle with very specific tasks: creating useful image descriptions. Generative AI is general; it doesn't understand the context in which problems are applied.


Generative AI – AL Professional Development


On 23 May 2024 I attended an AL development event (in my capacity as an OU tutor) that was all about Generative AI (abbreviated here as GenAI). This blog sits alongside a couple of other blogs that I shared last year that also relate to GenAI and what it means for education, distance learning, and educational practice.

What follows are some notes that I made during a couple of the sessions I attended, and the points and themes I took away from them. I also share some critical perspectives. Since GenAI is a fast-moving subject, not just in terms of the technology, but in terms of policy and institutional responses, what is presented here is also likely to age quickly.

Opening keynote

The event opened with a keynote by Mychelle Pride which had the subtitle: Generative AI in Learning, Teaching and Assessment. I won’t summarise it at length. Instead, I’ll share some key points that I noted down.

One important point was that AI isn’t anything new. A couple of useful resources were shared, one from the popular press, How AI chatbots like ChatGPT or Bard work – visual explainer (The Guardian) and another from industry: The rise of generative AI: A timeline of breakthrough innovations (Qualcomm).

An interesting use case was shared through a YouTube video: Be My Eyes Accessibility with GPT-4. Although clearly choreographed, and without any indication of whether any of this was ‘live’, one immediately wonders whether this technology is solving the right problems. Maybe this scenario implicitly suggests that visually impaired people should adapt to the sighted world, whereas perhaps a better solution might be for the world to adapt to people with visual impairments? I digress.

There are clear risks. One significant concern lies with the lack of transparency. Tools can be trained with data that contains biases; in computing there’s the notion of GIGO: garbage in, garbage out. There’s also the clear potential that GenAI tools may accept and then propagate misinformation. It is clear that “risks need to be considered, along with the potential opportunities”.

A point was shared from a colleague, Michel Wermelinger, who was quoted as saying “academic conduct is a symptom, not the problem”, which takes us directly to the university’s academic conduct policies about plagiarism.

In this session I learnt a new term: “green light culture”. The point here was that there are a variety of positions that relate to GenAI: in HE there are policy decisions that range from ‘forbid’ to ‘go ahead’.

I made a note of a range of searching questions. One of them was: how might students use Generative AI? It might become a study assistant, it might facilitate language learning, or support with creative projects. Another question was: how could pedagogies be augmented by AI? Also, is there a risk of over dependence in how we use these tools? Could it prevent us from developing skills? How can we assess in a generative AI world? Some answers to this question may be to have project-based assessment, collaborative assessment, to use complex case studies, and to consider the use of oral assessments. 

A further point is that students will be using Generative AI in the future, which means that the university has a responsibility to educate students about it.

Towards the end of the keynote, there was some talk about all this being revolutionary (I’ll say more about this later). This led onto a closing provocative question: what differentiates you (the tutor) from Generative AI?

During the keynote, some interesting resources were shared:

Teaching and learning with AI across the curriculum

The aim of a session by Mirjam Hauck was to explore the connection between AI and pedagogy, and to also consider the issue of ethics.

Just like the previous presentation, there were some interesting resources that were shared. One of them was a TED Talk: How AI could save (not destroy) education.

Another resource was a recent book, Practical Pedagogy: 40 New Ways to Teach and Learn by Mike Sharples which students and staff can access through the OU Library.

I had a quick look at it, just to see what these 40 new ways were. Taking a critical perspective, I realised that the vast majority of these approaches were already familiar to me, in one way or another. These are not necessarily ‘new’ but are instead presented in a new way, in a useful compendium. The text also shares a lot of informal web links, which immediately limits its longevity. It does highlight academic articles, but it doesn’t always cite them within a description of a pedagogy. My view is: do consider this text as something that shares a useful set of ideas, rather than something that is definitive.

During this session, there were some complementary reflections about how GenAI could be linked with pedagogy: it could be used to help with the generation of ideas (but to be mindful that it might be regenerating ideas and bits of text that may be subject to copyright), play a role within a Socratic dialogue, or act as a digital assistant for learning (which was sometimes called an AIDA – an AI digital assistant).

Power was mentioned in this session, with respect to the power that is exerted by the corporations that develop, run, and deploy AI tools. The point I had in my mind during this part of the session was: ‘do be mindful about who is running these products, why, and what they hope to get from them’.

A brief aside…

Whilst I was prepping this blog, I was sent a related email from Hello World magazine, which is written for computing educators. In that email, there was a podcast which had the title: What is the role of AI in your classroom? 

There was an interesting discussion about assessment, and asking the question of ‘how can this help with pedagogy?’ and ‘how can we adapt our own practices?’ A further question is: ‘is there a risk that we dumb down creativity?’

A scholarship question?

A few times this year tutors have been in touch with me to ask the question: ‘I’ve seen a student’s answer in a script that makes me think they may well have used Generative AI. What do I do?’ Copying TMA questions, or any other elements of university materials, into a Generative AI tool represents a breach of university policy, and can potentially be viewed as an academic conduct issue. The question is: what do tutors do about this? At the moment, and without any significant evidence, tutors must mark what they have been given.

An important scholarship question to ask is: how many tutors think they are being presented with assessments that may have been produced by Generative AI tools?

Reflections

There was a lot to take on board during this session. I need to find the time to sit down and work through some of the various resources that were shared, which is (in part) the reason for this blog.

When I was a computing undergraduate I went to a couple of short seminars about the development of the internet. When it came to the topic of the web browser, our lecturer said: “this is never going to catch on; who is going to spend time creating all these web pages and HTML tags?” Every day I make use of a web browser; it is, of course, an important bit of software that is embedded within my phone. This connects with an important point: it is notoriously difficult to predict the future, especially when it comes to how technologies are used. There are often unintended consequences, both good and bad.

Being a former student of AI (late in the last century) I’m aware that the fashions that surround AI are cyclical. With each cycle of hype, there are new technologies and tools. Following an early (modern) cycle of AI, I remember a project called SHRDLU, which demonstrated an imaginary world that users could interact with using natural language. This led to claims that the key challenges had been solved, and that all that needed to be done was to scale everything up. Reality, of course, is a whole lot more complicated.

A really important point to bear in mind is that GenAI (in the general sense) cannot reason. You can’t play chess with it. There are, however, other tools within the AI toolset that can do reasoning. As a postgrad student, I had to write an expert system that helped to solve a problem: to figure out a path through a university campus.

I’ve also been around for long enough to see various cycles of hype regarding learning technologies: I joined when e-learning objects were the fashion of the day, then there was the virtual learning environment, and then there was a craze that related to learning analytics. In some respects, the current generation of AI feels like a new (temporary) craze.

Embedding AI into educational tools isn’t anything new. I remember first hearing about the integration of neural networks in the early 2000s.  In 2009 I was employed on a project that intended to provide customised digital resources for learners who have different requirements and needs.

The bigger the models get, the more data they hoover up, and the greater the potential for these tools to generate nonsense. And here lies a paradox: to make effective use of GenAI within education, you need education.

Perhaps there is a difference between generally available generative AI and generative AI that is aligned to particular contexts. This takes me to an important reflection: no GenAI tool or engine can ever know what your own context is. You might ask it some questions and get a sensible sounding response, but it will not know why you’re asking a question, and what purpose your intended answer may serve. This is why the results produced by a GenAI tool might look terrible, or suspicious, if submitted as a part of an assessment. Context is everything, and assessments relate to your personal understanding of a very particular learning context.

Although the notion of power and digital corporations was mentioned, there’s another type of power that wasn’t mentioned: electrical power. I don’t have figures to hand, but large language models require an inordinate amount of electrical energy to do what they do. Their use has real environmental consequences. It's easy to forget this.

Here is my view: it is important to be aware of what GenAI is all about, but it is also really important not to get carried away and caught up in what could be thought of as technological froth. It’s also important to always remember that technology can change faster than pedagogy. We need to apply effective pedagogy to teach about technology. 

In my eyes, GenAI, or AI in many of its other forms, isn’t a revolution that will change everything, nor an existential threat to humanity; it is an evolution of a set of existing technologies.

It’s important to keep everything in perspective.

Resources

A number of resources were highlighted in this session which are worth having a quick look at:

Acknowledgements

Many thanks to the presenters of this professional development event, and the team that put this event together. Lots to look at, and lots to think about.


Generative AI and the future of the OU


On 15 June 2023 I attended a computing seminar about generative AI, presented by Michel Wermelinger.

In some ways the title of his seminar is quite provocative. I did feel that his presentation related to the exploration of a very specific theme, namely, how generative AI can play a role in the future of programming education; a topic which is, of course, being explored by academics and students within the school.

What follows is a brief summary of Michel's talk. As well as sharing a number of really interesting points and accompanying resources, Michel did a lot of screensharing, where he demonstrated what I could only describe as witchcraft.

Generative AI tools

Michel showed us Copilot, which draws on code submitted through GitHub. Copilot is said to use something called OpenAI Codex. The witchcraft bit I mentioned was this: Michel provided a couple of comments in a development environment, which were parsed by Copilot, which then generated readable and understandable Python code. There was no messing about with internet searches or looking through instruction books to figure out how to do something. Copilot offered immediate and direct suggestions.
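
To illustrate the kind of comment-driven completion described above (and not the code Michel actually demonstrated), here is an invented Python example: the two comments play the role of the developer’s prompt, and the function body is the sort of suggestion a tool such as Copilot might offer.

# Read a CSV file of module results.
# Return the average score for each module code as a dictionary.
import csv
from collections import defaultdict

def average_scores(path: str) -> dict[str, float]:
    scores_by_module: dict[str, list[float]] = defaultdict(list)
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            scores_by_module[row["module"]].append(float(row["score"]))
    return {module: sum(scores) / len(scores)
            for module, scores in scores_by_module.items()}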

Copilot isn’t, of course, the only tool that is out there. A taxonomy of different types of AI tool is now emerging. There are tools where you pay for access. There are tools that are connected with integrated development environments (IDEs) that are available on the cloud, and there are tools where the AI becomes a pair programmer chatbot. There are other tools, such as learning environments that offer both documentation and the automated assessment of programming assignments.

The big tech companies are getting involved. Amazon has something called CodeWhisperer. Apparently Google has something called AlphaCode, which has participated in competitive programming competitions, leading to a paper in Nature which asks whether ChatGPT and AlphaCode are going to replace programmers. There’s also something called StarCoder, which has also been trained on GitHub sources.

AI can, of course, be used in other ways. It could be used to offer help and support to students who have additional requirements. AI could be used to transcribe lectures, and help students navigate across and through learning materials. The potential of AI being a useful learning companion has been a long-held dream, and one that I can certainly remember from my undergraduate days, which were in the last century.

Implications

An important reflection is that Copilot and all these other AI tools are here to stay. It wouldn’t be appropriate to try to ban them from the classroom since they are already being used, and they already have a purpose. Michel also mentioned there is already a textbook which draws on Generative AI: Learn AI-assisted Python programming.

Irrespective of what these tools are and what they do, everyone still needs to know the fundamentals. Copilot does not replace the need to understand language syntax and semantics and know the principles of algorithmic thinking. Developers and engineers need to know what is meant by thorough testing, how to debug software, and to write helpful documentation. They need to know how to set breakpoints, use command prompts, and also know things about version and configuration management.

An important question to ask is: how do we assess understanding? One approach is an increasing use of technical interviews, which can be used to assess understanding of technical concepts. This won’t mean an academic viva, but instead might mean some practical discussions which both help to assess students’ knowledge, and help them to prepare for the inevitable technical interviews which take place in industry.

New AI tools may have a real impact on not only what is taught but how teaching is carried out, particularly when it comes to higher levels of study. This might mean the reformulation of assignments, perhaps developing less explicit requirements to expose learners to the challenge of working with ambiguity, which students must then intelligently resolve.

Since these tools have the potential to give programmers a performative boost, assignments may become bigger and more substantial. Irrespective of how assignments might change, there is an imperative that students must learn how to critically assess and evaluate whatever code these tools might suggest. It isn’t enough to accept what is suggested; it is important to ask the question: “does the code that I see here make sense, or offer any risks, given what I’m trying to do?”

A term that is new to me is: prompt engineering. This is the need to communicate in a succinct and precise way with an AI to get results that are practical and useful within a particular context. To get useful results, you need to be clear about what you want.
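
As a small, invented illustration of the point, here is a vague prompt alongside a refined version of the same request; the module scenario and wording are my own assumptions, not examples from the seminar.

# A vague prompt: the tool has to guess the context, constraints and output format.
vague_prompt = "Write some Python to process student results."

# A refined prompt: the same request with the context and constraints made explicit.
refined_prompt = (
    "You are helping a tutor on an introductory Python module. "
    "Write one Python function that reads a CSV file with the columns "
    "'student_id' and 'score' (0 to 100) and returns the mean score, "
    "rounded to one decimal place. Use only the standard library, "
    "include a docstring, and do not write anything beyond that function."
)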

What is the university doing?

To respond to the emergence of these tools the university has set up something called the Generative AI task and finish group. It will be producing some interim guidance for students and will be offering some guidance to staff, which will include the necessity to be clear about the ethical and transparent use of AI. It is also said to highlight capabilities and limitations. There will also be guidance for award boards and module results panels. The point here is that Generative AI is being looked at.

Michel suggested the need for a working group within the school; a group to look at what papers are coming out, what the new tools are, and what is happening across the sector at other institutions. A thought was that it might be useful to widen it out to other schools, such as the School of Physical Sciences, and any others which make use of any aspect of coding and software development.

Reflections

Michel’s presentation was a very quick overview of a set of tools that I knew very little about. It is now pretty clear that I need to know a lot more about them, since there are direct implications for the practice of teaching and learning, implications for the school, and implications for the university. There is a fundamental imperative that must be emphasised: students must be helped to understand that a critical perspective about the use of AI is a necessity.

Although I described Michel’s demonstration of Copilot as witchcraft, all he did was demonstrate a new technology.

When I was a postgraduate student, a lecturer once told me that one of the most fundamental and important concepts in computing was abstraction. When developers are faced with a problem that becomes difficult, they can be said to ‘abstract up’ a level, to get themselves out of trouble, and towards another way of solving a problem. In some senses, AI tools represent a higher level of abstraction; it is another way of viewing things. This doesn’t, of course, solve the problem that code still needs to be written.

I have also heard that one of the fundamental characteristics of a good software developer or engineer is laziness. When a programmer finds a problem that requires solving time and time again, they invariably develop tools to do their work for them. In other words, why write more code than you need to, when you can develop a tool that solves the problem for you?

My view is that both abstraction and laziness are principles that are connected together.

Generative AI tools have the potential to make programmers lazy, but programmers must gain an appreciation of how and why things work. They also need to know how to make decisions about what bits of code to use, and when.

It takes a lot of effort to become someone who is effective at being lazy.


ChatGPT school seminar


On 19 April 2023, I arrived slightly late for an online seminar about ChatGPT and generative AI. This blog post shares some of the notes that I made during the session. It might be useful to read this post in conjunction with an earlier blog that was written on the same topic, which summarises a workshop organised by the OU Knowledge Media Institute (KMI). These notes are pretty rough-and-ready, since they were edited together a month after the event took place.

Seeking opinions

Mike Richards, from the School of Computing and Communications, began by summarising some research that he had carried out with a number of colleagues. Five tutors were interviewed. When it comes to reviewing and marking assignments, it was noted that tutors are sensitive to changes in formatting style, voice and vocabulary.

Tutors rely on module teams and central systems for plagiarism detection, but they can and do pick up on things themselves. ALs don’t like referring students to disciplinary processes. They are cautious; they usually have a very high level of suspicion before they contact staff tutors and invoke the academic conduct processes. In cases where they identify issues, they take opportunities to make a teaching point to students.

Tutors wish to maintain positive relationships with students, but they are worried about the implications of raising academic conduct referrals and potential professional consequences if they raise unwarranted academic conduct concerns. In reality, there are no consequences for tutors; it is the academic conduct officers who make the decisions.

Key points

During the session, I captured the following important points. The first point was that assessment is vulnerable to ChatGPT. Specifically, highly structured essays are vulnerable, but these types of essays are used to develop student skills.

ChatGPT performs less well with anything to do with reflections about learning, since anything that is produced will not sound genuine.

There is a role for ChatGPT (or generative AI) detection software, but there are issues with detection tools, since they present a high rate of false positives. A detector only gives you a probability that something is synthetic; it doesn’t provide evidence in the way that TurnItIn does.

Tutors are very important. They are able to spot synthetic solutions; they can identify bland, superficial, repetitive and irrelevant materials in a way that automated tools cannot. To assist with this, and to help our tutors, the university needs to provide better plagiarism training.

A recognised issue is that ChatGPT will generate superficially compelling references that are completely fake. Asking ALs to scrutinise the referencing would go some way towards determining whether a chunk of text has been automatically generated. ChatGPT doesn’t currently do referencing, but there is a possibility this might change if it is connected with public databases.

The next step of this project is to write up findings and to have conversations with other faculties. There is also a university working group which aims to generate an assessment authoring guide to mitigate against generative AI. There is, of course, the need to do more studies. There might also be the need to adopt subject or discipline specific approaches. 

The closing thoughts shared during the seminar are important: we need to teach all students about the consequences of AI. Perhaps there needs to be some Open Educational Resources on the topic, perhaps something on OpenLearn that offers a sketch of what it can and cannot do. A closing point was that there are no ‘no-cost’ options. The university needs to carefully consider the role and purpose of assessments. Doing nothing is not an option.

During the discussion session, I noted down a couple of interesting questions: what question types would cause large language models to perform badly enough that we move from caring to not caring? Also, what limits its abilities? ChatGPT writes in generalities. Its responses come from how questions are worded. There is also the issue of concreteness. Assessment tasks are often related to specifics, in terms of activities, texts, module materials, and forum posts. If generative AI cannot access the texts that students need to access and critically evaluate to develop their skills, its uses are, of course, limited.

Reflections

One of the key points that was emphasised was the importance of the tutor. They have such an important role to play in not only identifying instances of potential academic misconduct, but also in educating students about generative AI, and the risks these tools present.

It is also useful to reflect on the point that tutors can spot changes in writing style. There is the possibility that the stylistic quality of generated text is a characteristic that could be used to respond to not only ChatGPT, but also contract cheating. At the time of writing, anti-plagiarism detection tools such as TurnItIn only evaluate individual assignments. In the arms race to ensure academic integrity, the next generation of tools might analyse text across a number of submissions whilst taking into account the characteristics or structure of individual assessments.

I expect there will be a multi-faceted institutional response to generative AI. There will be education: of students, tutors, and module teams. Students will be informed about the ethical risks of using generative AI, and the practical consequences of academic misconduct. Tutors will be provided with more information about what generative AI is, and offered more development to facilitate sessions to help students. Module teams will have an increasing responsibility to develop assessment approaches that proactively mitigate against the development of generative AI. Also, technology will play a role in detecting academic misconduct, and new procedures will be developed to assist academic conduct officers.

Acknowledgements

An acknowledgement is due to Mike Richards and everyone who took part in aspects of research which is summarised here. A thank you goes to Daniel Gooch, who facilitated the event.


ChatGPT and Friends: How Generative AI is Going to Change Everything


On 23 March 2023 the OU Knowledge Media Institute hosted a hybrid event, which had the curious title: How Generative AI is Going to Change Everything. More information about the details of this event is available through a GenAI KMi site.

I think I was invited to this event after sharing the results of a couple of playful ChatGPT experiments on social media, which may have been seen by John Domingue, the OU KMi director. In my posts, I shared fragments of poetry which had been generated about the failures of certain contemporary political figures.

The KMi event was said to be about “ChatGPT and related technologies, such as DALL E 2 and Stable Diffusion” and was described as an “open forum” to “allow participants to first get an understanding of what lies underneath this type of AI (including limitations)” with a view to facilitating discussions and potentially setting up an ethical workshop.

What follows is a very brief summary of some of the presentations, taken from notes I made during each of the talks. Please do view this blog as simply that, a set of notes. Some of these may well contain errors and misrepresentations, since these textual sketches were composed quite quickly. Do feel free to contact individual speakers.

Introduction and basics of ChatGPT/GPT-3/GPT-4

The event was opened by John who described it as a kick-off event, intended to bring people together. He introduced the topic, characterising the GPT projects as a very sophisticated text predictor, with GPT3 being described as “a text predictor on steroids”. An abbreviation that was regularly used was: LLM. This is short for “large language model”; a term that I hadn't heard before.

We were introduced to the difference between the different versions of GPT. An interesting difference being the amount of text these LLMs have processed and how much text they can generate. We were told that GPT2 was released in 2018 and the current version, GPT4, can make use of images (but I’m not quite sure how).

John shared a slide that described something called the OU’s AI agents ecosystem, which had the subtitle of being an AI strategy for the OU.

There were some pointers towards the future. Some of these new-fangled tools are going to find their way into Microsoft 365. I’m curious to learn how these different tools might affect or change my productivity.

What follows is a summary of some of the presentations that were made during the event. Most of the presentations were made over a course of 5 minutes; the presenters had to pack in a lot over a very short amount of time. There is, of course, a risk that I may well have misrepresented some aspects of the presentations, but I hope I have done a fair job in capturing the main points and themes each speaker expressed.

Short presentations

ChatGPT: Safeguards, trustworthiness and social responsibility

The first short presentation was by Shuang Ao from the Knowledge Media Institute. Shuang suggested that LLMs are “uncontrollable, not transparent and unstable” and had limitations in terms of their current ability to demonstrate reasoning and logic. They also may present factual errors, and demonstrate bias and discrimination, which presents real ethical challenges.

But can it make decisions?

Next up was Lucas Anastasiou, also from the Knowledge Media Institute. Lucas had carried out some experiments. ChatGPT can’t play chess at all well, but it does know how to open a game well, since it knows something about chess game opening theory. But how about poker? Apparently there’s something called a poker IQ test. I’m not sure if I remember exactly, but I seem to recall that they’re not great at playing poker. How about a stock portfolio or geo-political forecasting? We were offered a polite reminder that a computer can never be held accountable, but perhaps its users, and developers could be?

ChatGPT attempts OU TMAs

The next speaker was Alistair Willis, School of Computing and Communications. Alistair is a module chair for TM351 Data management and analysis. He asked a simple question, but one that has important implications: can ChatGPT answer one of his TMA questions?

His TMA was a guided investigation, and was split into two parts: a coding bit, and an interpretation bit. The conclusion was that it was good at the coding bit (or, potentially, at helping with the coding bit), but rubbish at the interpretation. Overall, a student wouldn’t get a very high score.

From the module team perspective, a related question was: could it be used to create module materials?

These questions are all very well, but if text and answers can be generated, is there a way to determine whether a fragment of prose was generated by ChatGPT? Apparently, there is a tool which can highlight which bits of text may have been written using ChatGPT.

Five key learnings from our use of Chatbots

Barry Verdin has an interesting role within the OU; he is an assistant director for student support innovation. I have heard of Barry before; he keeps inviting me to meetings about systems thinking, but I keep being too busy to attend (but I do welcome his invitations!). His interest lies in supporting a chatbot that offers support to students. He shared an interesting statistic: the chatbot can answer around 80% of queries. Clearly, AI has the possibility of helping with some types of student enquiries.

Experiments with ChatGPT

It was my turn. I wear a number of hats. I’m a student, an associate lecturer, and a staff tutor.

Wearing a student hat

Whilst wearing my student hat, I’ve been studying a module called A230 Reading and studying literature. When I had completed and submitted one of my Tutor Marked Assignments, I submitted an abridged version of my TMA question to ChatGPT. The question I gave it was: “Compare and contrast Shelley’s Frankenstein with Wordsworth’s Home at Grasmere”. I admit that there was a part of me that took pleasure in asking an artificial intelligence what it thought about Frankenstein.

I found the response that I got interesting. Firstly, it was pretty readable, and secondly, it helped me to understand what I had understood when preparing the assignment. For example, it enabled me to check my own understanding of what literary romanticism was all about. Another point was that there was no way that ChatGPT could have responded to the detailed specifics of the essay question, since we were asked to interpret a very specific section of Wordsworth’s epic (and we have already learnt that ChatGPT isn’t good at logic). The text that we were working with was only available to OU students in a very specific form.

My study of literature helps me to develop specific skills, such as close reading, and adopting a critical approach to texts. Students, of course, also need to show an understanding of module materials too. If large language models don’t have access to those texts, they’re not going to even attempt to quote from them. This means that a vigilant tutor is likely to raise a curious eyebrow if a student submits a neatly written essay which is devoid of quotes from texts, or from module materials.

Wearing a tutor hat

Picking up on the role of a tutor, another hat I wear is as a tutor for M250 Object-oriented Java programming. I confess to doing something similar to Alistair. I fed ChatGPT a part of a TMA question which instructed a student to write bits of code to model a scenario. It did well, but it did too much: it produced bits of code that were not asked for. This said, drawing on my experience of programming (and of teaching) I could understand why it suggested what had been produced.

From the tutor’s perspective, if I had received a copy of what had been produced, I would be pretty suspicious, since I would be asking: “where did our student get all that experience from, when this is module that is all about introducing key concepts?”

Wearing a staff tutor hat

For those who are unfamiliar with the role of a staff tutor, a staff tutor is a tutor line manager. We’re a bit of academic and administrative glue in the OU system which makes things work. We get to deal with a whole number of different issues on a day-to-day basis, and a couple of times a year academic conduct issues cross my desk.

The university has to deal with and work with a number of existing threats to academic integrity, such as well-known websites where students can ask questions from subject matter experts and fellow students. Sometimes solutions to assignments are shared through these sites. Sometimes, these solutions contain obvious errors, which we can identify.

Responses to the threats to academic integrity include the use of plagiarism detection software (such as TurnItIn), the use of collusion detection systems (such as CopyCatch), the vigilance of tutors and module teams, the referral of cases to university Academic Conduct Officers, running of individual support sessions to help students to develop their study skills to ensure they do not accidentally carry out plagiarism, and effective record keeping to tie everything together.

When arriving at this event, one question I did have was: could it be possible to create an AI to detect answers that had been produced by an AI? Alistair’s earlier reference to a checker had partially answered my own question. Further questions are, of course: how should such detection tools be used within an institution, and to what extent should academic policies be adapted and changed to take account of large language models?

Bring textual wishes to life

Christian Nold from the School of Engineering and Innovation (E&I) shared some information about an eSTEeM project with Georgy Holden. Students were encouraged to send postcards about their experience of level 1 study, sharing 3 wishes. The question that I noted down was: how can we use AI tools to generate personas from 3 wishes? Tools such as ChatGPT integrate different bits of text together, and the generated personas could help us to think differently.

CORE-GPT

Matteo Cancellieri and David Pride, both from the Knowledge Media Institute, gave what was pitched as a KMi product announcement: they introduced CORE-GPT. Their project aims to combine open access materials with AI for credible, trustworthy question answering. The aim is to attempt to reduce the number of ‘hallucinations’ (made up stuff) that might be produced through tools such as ChatGPT, drawing on information from open access papers. More information about the initiative is available through a blog article: Combining Open Access research and AI for credible, trustworthy question answering. More information is also available through the CORE website.

ChatGPT and assessment

Dhouha Kbaier from School of Computing and Communications shared some concerns and points about assessment. Dhouha is module chair of TM355 Communications Technology. Following the Covid-19 pandemic, students are assessed through a remote exam. In their exam, students need to draw on discussion materials, and find resources and articles. Educators need to make students aware that there are tools that can detect text generated by large language models, and AI tools can create errors (and hallucinations).

One of the points I noted was: there is the potential need to adapt our assessment approaches. Educators also have a responsibility to do what they can to remove a student’s motivation for cheating. Ultimately, it isn’t in their best interests.

Can students self-learn with ChatGPT?

Irina Rets from the OU Institute of Educational Technology (IET) asked some direct questions, such as: can students learn through ChatGPT? Also, can AI be a teacher? In some respects, these are not new questions; a strand of research that links to AI and education has been running for a very long time. Some further questions were: who gets excluded? Also, what are the learning losses, and learning gains? Finally, how might researchers use these tools?

Chat GPT - Content Creation with AI

Manoj Nanda from the School of Computing and Communications also suggested that AI might be useful for idea generation. Manoj highlighted a couple of tools that I had not heard of before, such as Dall-e2 (OpenAI website) which can generate an image from a textual description. Moving to an entirely different modality, he also highlighted Soundraw.io. Manoj emphasised that a key skill is using appropriate prompts. This relates to an old computing adage: if you put garbage in, you’ll get garbage out (GIGO).

Developing playful and fun learning activities

Nicole Lotz from the School of Engineering and Innovation (E&I) sees tools such as ChatGPT as potentially useful for creative exploration. Nicole is module chair of U101 Design thinking, which is a first level design module. The ethos of the module is all about playfulness, building confidence, and learning through reflection. Subsequently, there may be opportunities to use what ChatGPT might produce as a basis for further reflection, development and refinement.

"I am the artist Riv Rosenfeld" - How ChatGPT is your new neoliberal friend

Tracie Farrell, from the Knowledge Media Institute, works at the intersection between AI and social justice. Tracie asked ChatGPT to write a paragraph about her friend and artist, Riv Rosenfeld. There was a clear error, which was that ChatGPT got their pronouns wrong. An important point is that “ChatGPT doesn’t know your truth”. In other words, the perspective that is generated by large language models comes from what is written or known about you, and this may be at odds with your own perspective. There are clear and obvious risks: marginalised groups are not always as visible. Biases are perpetuated. Some key questions are: who will be harmed, who will be helped, and to what extent (and how) will these emerging tools reinforce inequality?

Discussion

After the short presentations, we went into a plenary discussion. It wasn’t too long before the history of AI was highlighted. John highlighted the two schools of thought about AI: a symbolic camp, and a statistical camp, and suggested that in the future, there might be a combination of the two. This related to the earlier point that these AI tools can’t (yet) do logic very well.

A further comment reflected an age-old intractable problem that hasn’t been solved, and might never be solved, namely: we still haven’t defined what intelligence is. In terms of AI, the measure of intelligence has moved from playing chess, through to having machines do things that humans find intrinsically easy to do, such as assess a visual scene, and communicate with each other using natural language. The key point in the discussion was, of course: we need to ask again, what do we mean by intelligence?

Whenever a technology is discussed, an accompanying discussion of a potential digital divide is never too far away. AI may present its own unique divides: those who know how to use AI tools and can use them effectively, and those who don’t know about them, and are not able to use them. There are clear links to the importance of equity and access.

During the discussion, I noted down the words: “If you’re a novice programmer, what blocks you is your first bug”. In other words, knowing the fundamentals and having knowledge is important. Another phrase I noted down was: “It is perhaps best to view them as fallible assistants”.

Given their fallibility, making judgements about when to trust what an AI tool has produced, and when not to, is really very important. In other words: it is important to think critically, and this is something that only us humans can do.

Reflections

This was a popular event; approximately 250 people attended the first few presentations.

The presentations were quite different to each other. Some explored the question “to what extent might these tools present risks to academic integrity?” Others explored “how can these tools help us with creativity and problem solving?” The important topic of ethics was clearly highlighted. It was also interesting to learn about work being carried out within KMi, and the reference to the emergence of an institutional AI strategy (although I do hold the view that this should be thoroughly and critically evaluated).

I enjoyed the discussion section. In some respects, it felt like coming home. I studied AI as an undergraduate and a postgraduate student over 20 years ago, where the focus was primarily on symbolic AI. At the time, statistical methods, which includes neural networks, was only just beginning to make an appearance. It was really interesting to see the different schools of thought being highlighted and discussed. During the discussion session I shared the following memorable definition: AI is really clever people making really stupid machines to do things that look clever.

I confess to having been around long enough to know of a number of AI hype cycles. When I was a postgraduate student, I learnt about the first generation of AI developments. I learnt about chess and problem solving. I remember that proponents at the time were suggesting that the main problems with AI had been solved, which had the obvious implication that we would soon have our own personal robots to help us with our everyday chores.

The reality, of course, turned out to be different, since some of those very human problems, such as vision, sound and language were a lot harder to figure out. This meant there were no personal robotic assistants, but instead we did get a different kind of personal digital assistant.

Despite my cynicism, one aspect of AI that I do like is that it has been described as “applied philosophy”. When you start to think about AI, you cannot get away from trying to define what intelligence is. In other words, the machine becomes a mirror to ourselves; the computer helps us to think about our own thinking.

I once heard a fellow computer scientist say that one of the greatest contributions of computing is abstraction. In other words, when making sense of a difficult problem, you look at all its elements, and then you go on to create a new representation (or form) of the problem which then enables you to make sense of it all. I remember another computer science colleague saying, “when you get into trouble, abstract your way out of difficulty”. This can also be paraphrased as: “go up a level”.

We’ve all been in that situation when we’ve had multiple search engine tabs open, and we’re eyeballing tens of thousands of different search results. In these circumstances, we don’t know where to begin. Perhaps this is the problem that these large language models aim to resolve: to produce a neat summary of an answer we’re searching for in a neatly digestible format.

To some degree, generative AI can be thought of as “going up a level”, but the way you go up a level may well be driven by the data that is contained within a large language model. That data, of course, might well be incorrect. Even if you do “go up a level” you might be going up in entirely the wrong direction.

All these points emphasise the importance of taking a critical perspective of what all these new-fangled AI tools produce, but this does require those interpreting any results to have developed a critical perspective in the first place. We need a critical perspective to deal with instances where an AI tool might well provide us with not just machine generated “hallucinations” but also misinformation.

During my bit of the talk, I shared a perspective that I feel is pretty important, which is: “the most important thing in education isn’t machines or technologies, it’s people”. When we’re thinking about AI, this is even more true than ever. A screen of text looks like a screen of text. A teacher, tutor or lecturer can tell you not only what is important, but why, and what its consequences might mean to others.

I do feel that it is very easy to get carried away by the seemingly magical results that ChatGPT can produce. I also feel that it is important to view these tools with a healthy dose of AI cynicism and scepticism. If AI is applied philosophy, and this new form of AI enables us to more readily hold up a mirror to ourselves, it is entirely possible that we might not like what we see.

It is entirely possible that generative AI tools may well “read” this summary, and these reflections might well help these uncanny tools answer the question “how do humans perceive generative AI?” I’ll be interested to see what answer it produces.

Returning to the implicit question presented in the title of this event: is generative AI going to change everything? The cynic in me answers: “I doubt it”. It is, however, likely to change some things.

Other resources

A few weeks before this event, I was made aware of another related event which took place on 16 March, entitled Teaching with ChatGPT: Examples of Practice (YouTube playlist). This event was a part of a series of Digitally Enhanced Education Webinars from the University of Kent. These presentations are certainly worth a visit, if only to hear other voices sharing their perspectives about this topic.

After this blog was published, Arosha Bandara sent me a link to the following article: Stephen Wolfram writings: What Is ChatGPT Doing ... and Why Does It Work? It is quite a long read, and it is packed with detail. It's also one of those articles that will take more than a few hours to work through. I'm sharing it here for two reasons: so I know where to find it again, and just in case others might find it of interest.

Acknowledgements

The event was a KMi Knowledge Makers event. Many thanks to John for inviting me, and encouraging me to participate. Many thanks to all the presenters; I hope I have managed to share some of the key points of your presentation, and apologies that I haven’t managed to capture everyone’s presentation. The event was organised by Lucas Anastasiou (PhD Research Student), Shuang Ao (PhD Research Student), Matteo Cancellieri (Lead Developer - Open Research), John Domingue (Professor of Computer Science), David Pride (Research Associate) and Aisling Third (Research Fellow). Thanks are also extended to Arosha for sending me the Wolfram article.

Addendum

A couple of weeks after the event, I was sent a note by a colleague. Someone in KMi may have asked ChatGPT to write a summary of this article. A link to that summary is available through a KMi blog. I have no idea to what extent it may have been edited by humans. This made me wonder how ChatGPT might summarise the summary.
