Personal Blogs

Generative AI and assessment in Computing and Communications

Thursday, 14 Nov 2024, 15:50

Edited by Christopher Douce, Wednesday, 27 Nov 2024, 14:06

On 13 November 2024 I attended a workshop about Generative AI and assessment for OU computing modules, organised by colleagues Michel Wermelinger (academic co-lead GAI in LTA), Mark Slaymaker (C&C Director of Teaching) and Anton Dil (Assessment Lead of C&C Board of Studies). What follows is a set of notes and reflections relating to the event. I also share some useful links and articles that I need to find the time to follow up on. This summary is, of course, very incomplete; I only give a very broad sketch of some of the discussions that took place, since there is such a lot of figuring out to do. Also, for brevity, Generative AI is abbreviated to GenAI.

Introduction and some resources

The workshop opened with a useful introductory session, where we were directed to various resources which we could follow up after the event. I’ll pick out a few:

To get a handle on what public research has been published by colleagues within the OU, it is worth reviewing ORO papers about Generative AI.

The following notable book was highlighted:

Porter, L., Zingaro, D. and Simon, B. (2024) Learn AI-assisted Python programming : with GitHub Copilot and ChatGPT. [First edition]. Shelter Island, NY: Manning Publications.

There were also a few papers about how GenAI could be used with the teaching of programming:

To conclude this very brief summary, there is also an AL Development Resource which has been produced by Kevin Waugh, available under a Creative Commons Licence.

Short presentations

There were ten short presentations which covered a range of issues, from how GenAI might be used within assessments to facilitate meaningful learning, through to the threats it may pose to academic integrity. What follows are some points that I found particularly interesting.

One of the school’s modules, TM352 Web Mobile and Cloud technologies, has a new block that is dedicated to tools that can be used with software development. Over the last couple of years, it has changed quite a bit in terms of the technologies it introduces to learners. Since software development practices are clearly evolving, I need to find the time to see what has found its way into that module. It will, of course, have an influence on what any rewrite to TM354 Software Engineering might contain.

I learnt about a new software project called Llama. From what I understand, Llama is an open source large language model (LLM) engine that can be locally installed on your desktop computer, from where it can be fed your own documents and resources. The thing is: LLMs can make things up and get things wrong. Another challenge is that LLMs need a lot of computing resources to do anything useful. If students are ever required to play with their own LLMs as a part of a series of learning activities, this raises the subject of computing equity: some students will have access to powerful laptops, whereas others might not. Maybe there will be a point when the school may have to deploy AI tools within the OpenSTEM Labs?

Whether you like them or not, a point was made that these tools may begin to find their way closer to the user. Tony Hirst made the point that when models start to operate on your own data (or device) this may open up the possibility of semantically searching sets of documents. Similarly, digital assistants may begin to offer more aggressive help and suggestions about how to write bits of documents. Will the new generation of AI tools and digital assistants be more annoying than the memorable Microsoft Clippy?

GenAI appears to be quite good at helping to solve well-defined and well-known programming problems. This leads us to consider a number of related issues and tensions. Knowing how to work with programming languages is currently an important graduate skill; programming also develops problem decomposition and algorithmic thinking skills. An interesting reflection is that GenAI may well help certain groups of students more than others.

Perhaps the nature of programming is changing as development environments draw upon the coding solutions of others. Put another way, in the same way that nobody (except for low level software engineers) knows assembly language these days, perhaps the task of software development is moving to a higher level of abstraction. Perhaps developing software will mean less coding, and more about how to combine bits of code together. This said, it is still going to be important to understand what those raw materials look like.

An interesting research question was highlighted by Mike Richards: can tutors distinguish between what has been generated by AI and what has been written by students? For more information about this research, do refer to the article Bob or Bot: Exploring ChatGPT’s answers to University Computer Science Assessment (ORO). A follow-on question is, of course: what do we do about this?

One possible answer to this question may lie in a presentation which shared practice about the conducting of oral assessments, which is something that is already done on the postgraduate M812 digital forensics module. Whilst oral assessments can be a useful way to assess whether learning has taken place, it is important to consider the necessity of reasonable adjustments, to take account of students who may not be able to take part in an oral assessment, either due to communication difficulties, or mental health difficulties.

The next presentation, given by Zoe Tompkins, helped us to consider another approach to assessment: asking students to create study logs (which evidence their engagement), accompanied by pieces of reflective writing. On this point, I’m reminded of my current experience as an A334 English literature student, where I’m required to make regular forum postings to demonstrate regular independent study (which I feel that I’m a bit behind with). In addition to this, I also have an A334 reflective blog. A further reflection is that undergraduate apprentices have to evidence a lot of their learning by uploading digital evidence into an ePortfolio tool, which is then confirmed by a practice tutor. Regular conversations strengthen academic integrity.

This leads onto an important question which relates to the next presentation: what can be done to ensure that written assessments are ‘GenAI proof’? Is this something that can be built in? A metaphor was shared: we’re trying to ‘beat the machine’, whilst at the same time teaching everyone about the machine. One way to try to beat the machine is to use processes, and to refer to contexts that ‘the machine’ doesn’t know about. The context of questions is important.

The final presentation was by one of our school’s academic conduct officers. Two interesting numbers were mentioned. There are 6 points that students need to bear in mind when considering GenAI. If I’ve understood this correctly, there are 19 different points of guidance available for module teams to help them to design effective assessments. There’s another point within all this, which is: tutors are likely to know whether a bit of text has been generated by an LLM.

Reflections

This event reminded me that I have quite an extensive TODO list: I need to familiarise myself with Visual Studio Code, have a good look at Copilot, get up to speed with GitHub for education, look at the TM352 materials (in addition to M813 and M814, which I have been meaning to do for quite a while), and review the new Software Engineering Body of Knowledge (SWEBOK 4.0) that has been recently released. This is in addition to learning more about the architecture of LLMs, upskilling myself when it comes to the ethics, and figuring out more about the different dimensions of cloud computing. Computing has moved on since I was last a developer and software engineer. With my TM470 tutor hat on, we need to understand how and where LLMs might be useful, and more about the risks they pose to academic integrity.

At the time of writing, there is such a lot of talk about GenAI (and AI in general). I do wonder where we are in the Gartner hype cycle (Wikipedia). As I might have mentioned in other blogs, I’ve been around in computing for long enough to know that AI hype has happened before. I suspect we’re currently climbing up the ‘peak of inflated expectations’. With each AI hype cycle, we always learn new things. I’m of the school of thought that the current developments represent yet another evolutionary change, rather than one that offers revolutionary change.

Whilst I was studying A334, my tutor talked a little about GenAI in an introductory tutorial. In doing so, he shared something about expectations, in terms of what was expected in a good assessment submission. If I remember rightly, he mentioned the importance of writing that answered the question (a context that was specific, not general), demonstrated familiarity with the module materials (by quoting relevant sections of course texts), and used clear and unambiguous referencing. Since the module is all about literature, there is scope to say what we personally think a text might be about. These are all the kind of things that LLMs might be able to do at some level, but not to a degree that is yet thoroughly convincing. To get something convincing, students need to spend time doing ‘prompt engineering’.

This leads us to a final reflection: do we spend a lot of time writing prompts and interrogating an LLM to try to get what we need, or would that time be spent more effectively writing what needed to be written in the first place? If the writing of assessments is all about learning, then does it matter how learning has taken place, as long as the learning has occurred? There is, of course, the important subject of good academic practice, which means becoming aware of what the cultural norms of academic debate and discourse are all about. To offer us a little more guidance, in the coming months I understand there will be some resources about Generative AI available on OpenLearn.

Acknowledgments

Many thanks to the organisers and facilitators. Thanks to all presenters; there were a lot of them!

Addendum

Edited on 27 November 24, attributing Zoe Tompkins to one of the sessions. During the event, a session was given by Professor Karen Kear, who demonstrated how Generative AI can struggle with very specific tasks: creating useful image descriptions. Generative AI is general; it doesn't understand the context in which problems are applied.

Tags: AI, generative AI, GenAI, genAI, LLM, LLMs, artificial intelligence, workshop

Generative AI and the future of the OU

Sunday, 18 June 2023, 19:33

Edited by Christopher Douce, Tuesday, 20 June 2023, 10:24

On 15 June 2023 I attended a computing seminar about generative AI, presented by Michel Wermelinger.

In some ways the title of his seminar is quite provocative. I did feel that his presentation relates to the exploration of a very specific theme, namely, how generative AI can play a role in the future of programming education; a topic which is, of course, being explored by academics and students within the school.

What follows is a brief summary of Michel's talk. As well as sharing a number of really interesting points and accompanying resources, Michel did a lot of screensharing, where he demonstrated what I could only describe as witchcraft.

Generative AI tools

Michel showed us Copilot, which draws on code submitted through GitHub. Copilot is said to use something called OpenAI Codex. The witchcraft bit I mentioned was this: Michel provided a couple of comments in a development environment, which were parsed by Copilot, which then generated readable and understandable Python code. There was no messing about with internet searches or looking through instruction books to figure out how to do something. Copilot offered immediate and direct suggestions.
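To give a flavour of this comment-driven way of working, here is a small hand-written illustration. It is not actual Copilot output, and both the task and the function name `most_common_words` are made up for the example: the developer types the comment, and an assistant might propose a function along these lines.

```python
import re
from collections import Counter

# Prompt comment a developer might type:
# Return the n most common words in a text, ignoring case and punctuation.
def most_common_words(text: str, n: int = 3) -> list[tuple[str, int]]:
    # Lower-case the text and keep only runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    # Counter.most_common sorts by frequency, highest first.
    return Counter(words).most_common(n)


if __name__ == "__main__":
    print(most_common_words("The cat and the hat and the bat.", 2))
```

Even with a suggestion this short, the reader still has to decide whether it does what was wanted: should an apostrophe count as part of a word, and what should happen with an empty string?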

Copilot isn’t, of course, the only tool that is out there. There are now a bunch of different types of AI tools, or a taxonomy of tools, which are emerging. There are tools where you pay for access. There are tools that are connected with integrated development environments (IDEs) that are available on the cloud, and there are tools where the AI becomes a pair programmer chatbot. There are other tools, such as learning environments that offer both documentation and the automated assessment of programming assignments.

The big tech companies are getting involved. Amazon has something called CodeWhisperer. Apparently Google has something called AlphaCode, which has participated in competitive programming competitions, leading to a paper in Nature which asks whether ChatGPT and AlphaCode are going to replace programmers. There’s also something called StarCoder, which has also been trained on GitHub sources.

AI can, of course, be used in other ways. It could be used to offer help and support to students who have additional requirements. AI could be used to transcribe lectures, and help students navigate across and through learning materials. The potential of AI being a useful learning companion has been a long held dream, and one that I can certainly remember from my undergraduate days, which were in the last century.

Implications

An important reflection is that Copilot and all these other AI tools are here to stay. It wouldn’t be appropriate to try to ban them from the classroom since they are already being used, and they already have a purpose. Michel also mentioned there is already a textbook which draws on Generative AI: Learn AI-assisted Python programming.

Irrespective of what these tools are and what they do, everyone still needs to know the fundamentals. Copilot does not replace the need to understand language syntax and semantics and know the principles of algorithmic thinking. Developers and engineers need to know what is meant by thorough testing, how to debug software, and to write helpful documentation. They need to know how to set breakpoints, use command prompts, and also know things about version and configuration management.

An important question to ask is: how do we assess understanding? One approach is an increasing use of technical interviews, which can be used to assess understanding of technical concepts. This won’t mean an academic viva, but instead might mean some practical discussions which both help to assess students’ knowledge, and help them to prepare for the inevitable technical interviews which take place in industry.

New AI tools may have a real impact on not only what is taught but how teaching is carried out, particularly when it comes to higher levels of study. This might mean the reformulation of assignments, perhaps developing less explicit requirements to expose learners to the challenge of working with ambiguity, which students must then intelligently resolve.

Since these tools have the potential to give programmers a performance boost, assignments may become bigger and more substantial. Irrespective of how assignments might change, there is an imperative that students must learn how to critically assess and evaluate whatever code these tools might suggest. It isn’t enough to accept what is suggested; it is important to ask the question: “does the code that I see here make sense, or offer any risks, given what I’m trying to do?”

A term that is new to me is: prompt engineering. This is the need to communicate in a succinct and precise way to an AI to get results that are practical and useful within a particular context. To get useful results, you need to be clear about what you want.

What is the university doing?

To respond to the emergence of these tools the university has set up something called the Generative AI task and finish group. It will be producing some interim guidance for students and will be offering some guidance to staff, which will include the necessity to be clear about the ethical and transparent use of AI. It is also said to highlight capabilities and limitations. There will also be guidance for award boards and module results panels. The point here is that Generative AI is being looked at.

Michel suggested the need for a working group within the school; a group to look at what papers are coming out, what the new tools are, and what is happening across the sector at other institutions. A further thought was that it might be useful to widen it out to other schools, such as the School of Physical Sciences, and any others which make use of any aspect of coding and software development.

Reflections

Michel’s presentation was a very quick overview of a set of tools that I knew very little about. It is now pretty clear that I need to know a lot more about them, since there are direct implications for the practice of teaching and learning, implications for the school, and implications for the university. There is a fundamental imperative that must be emphasised: students must be helped to understand that a critical perspective about the use of AI is a necessity.

Although I described Michel’s demonstration of Copilot as witchcraft, all he did was demonstrate a new technology.

When I was a postgraduate student, a lecturer once told me that one of the most fundamental and important concepts in computing was abstraction. When developers are faced with a problem that becomes difficult, they can be said to ‘abstract up’ a level, to get themselves out of trouble, and towards another way of solving a problem. In some senses, AI tools represent a higher level of abstraction; it is another way of viewing things. This doesn’t, of course, solve the problem that code still needs to be written.

I have also heard that one of the fundamental characteristics of a good software developer or engineer is laziness. When a programmer finds a problem that requires solving time and time again, they invariably develop tools to do their work for them. In other words, why write more code than you need to, when you can develop a tool that solves the problem for you?

My view is that both abstraction and laziness are principles that are connected together.

Generative AI tools have the potential to make programmers lazy, but programmers must gain an appreciation about how and why things work. They also need to know how to make decisions about what bits of code to use, and when.

It takes a lot of effort to become someone who is effective at being lazy.

Tags: generative AI, AI, artificial intelligence, programming, coding, software engineering, software development, Copilot

ChatGPT school seminar

Saturday, 20 May 2023, 11:04

Edited by Christopher Douce, Sunday, 21 May 2023, 09:49

On 19 April 2023, I arrived slightly late for an online seminar about ChatGPT and generative AI. This blog post shares some of the notes that I made during the session. It might be useful to read this post in conjunction with an earlier blog on the same topic, which summarises a workshop organised by the OU Knowledge Media Institute (KMI). These notes are pretty rough-and-ready, since they were edited together a month after the event took place.

Seeking opinions

Mike Richards, from the School of Computing and Communications, began by summarising some research that he had carried out with a number of colleagues. Five tutors were interviewed. When it comes to reviewing and marking assignments, it was noted that tutors are sensitive to changes in formatting style, voice and vocabulary.

Tutors rely on module teams and central systems for plagiarism detection, but they can and do pick up on things themselves. ALs don’t like referring students to disciplinary processes. They are cautious; they usually have a very high level of suspicion before they contact staff tutors and invoke the academic conduct processes. In the cases where they identify issues, they take opportunities to make a teaching point to students.

Tutors wish to maintain positive relationships with students, but they are worried about the implications of raising academic conduct referrals and potential professional consequences if they raised unwarranted academic conduct concerns. Of course, there are no consequences for tutors. It is, of course, the academic conduct officers who make the decisions.

Key points

During the session, I captured the following important points. The first point was that assessment is vulnerable to ChatGPT. Specifically, highly structured essays are vulnerable, but these types of essays are used to develop student skills.

ChatGPT performs less well with anything to do with reflections about learning, since anything that is produced will not sound genuine.

There is a role for ChatGPT (or generative AI) detection software, but there are issues with detection tools, since they present a high rate of false positives. Detectors only give you a probability that something is synthetic, and don’t provide evidence in the way that TurnItIn does.
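To see why false positives matter so much, here is a minimal sketch of the underlying arithmetic (Bayes' rule). All of the numbers are hypothetical and chosen only for illustration; they are not measurements of any real detector, and the function name is my own.

```python
def probability_flag_is_correct(prevalence: float,
                                sensitivity: float,
                                false_positive_rate: float) -> float:
    """Chance that a flagged submission really is synthetic (Bayes' rule)."""
    # Share of all submissions that are synthetic AND get flagged.
    true_flags = prevalence * sensitivity
    # Share of all submissions that are genuine BUT still get flagged.
    false_flags = (1 - prevalence) * false_positive_rate
    return true_flags / (true_flags + false_flags)


if __name__ == "__main__":
    # Hypothetical figures: 5% of submissions are synthetic, the detector
    # catches 90% of them, and wrongly flags 10% of genuine submissions.
    print(round(probability_flag_is_correct(0.05, 0.90, 0.10), 2))
```

With these made-up figures, roughly two out of every three flagged submissions would be genuine student work, which is one reason why a flag on its own cannot be treated as evidence.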

Tutors are very important. They are able to spot synthetic solutions; they can identify bland, superficial, repetitive and irrelevant materials in a way that automated tools cannot. To assist with this, and to help our tutors, the university needs to provide better plagiarism training.

A recognised issue is that ChatGPT will generate superficially compelling references that are completely fake. Asking ALs to scrutinise the referencing would go some way to determine whether a chunk of text has been automatically generated. ChatGPT doesn’t currently do genuine referencing, but there is a possibility this might change if it is connected with public databases.

The next step of this project is to write up findings and to have conversations with other faculties. There is also a university working group which aims to generate an assessment authoring guide to mitigate against generative AI. There is, of course, the need to do more studies. There might also be the need to adopt subject or discipline specific approaches.

The closing thoughts shared during the seminar are important: we need to teach all students about the consequences of AI. Perhaps there needs to be some Open Educational Resources on the topic, perhaps something on OpenLearn that offers a sketch of what it can and cannot do. A closing point was that there are no ‘no-cost’ options. The university needs to carefully consider the role and purpose of assessments. Doing nothing is not an option.

During the discussion session, I noted down a couple of interesting questions: what question types would cause large language models to perform sufficiently badly that we would move from caring to not caring? Also, what limits its abilities? ChatGPT writes in generalities. Its responses come from how questions are worded. There is also the issue of concreteness. Assessment tasks are often related to specifics, in terms of activities, texts, module materials, and forum posts. If generative AI cannot access the texts that students need to access and critically evaluate to develop their skills, its uses are, of course, limited.

Reflections

One of the key points that was emphasised was the importance of the tutor. They have such an important role to play in not only identifying instances of potential academic misconduct, but also in educating students about generative AI, and the risks these tools present.

It is also useful to reflect on the point that tutors can spot changes in writing style. There is the possibility that the stylistic quality of generated text is a characteristic that could be used to respond to not only ChatGPT, but also contract cheating. At the time of writing, anti-plagiarism detection tools such as TurnItIn only evaluate individual assignments. In the arms race to ensure academic integrity, the next generation of tools might analyse text across a number of submissions whilst taking into account the characteristics or structure of individual assessments.
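As a rough sketch of what comparing style across a number of submissions might involve, the following toy example compares two crude features of a new submission against a student's earlier work. This is my own illustration, not how TurnItIn or any real tool works; the feature names and the function names `style_features` and `style_shift` are assumptions, and real stylometry would use far richer features.

```python
import re
from statistics import mean


def style_features(text: str) -> dict[str, float]:
    """Two crude style measures: sentence length and vocabulary richness."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[a-z']+", text.lower())
    return {
        # Average number of words per sentence.
        "avg_sentence_length": len(words) / max(len(sentences), 1),
        # Distinct words divided by total words (type-token ratio).
        "vocabulary_richness": len(set(words)) / max(len(words), 1),
    }


def style_shift(earlier_texts: list[str], new_text: str) -> dict[str, float]:
    """Relative change of each feature against the mean of earlier work.

    Assumes the earlier texts are non-empty, so the baseline is not zero.
    """
    baseline = [style_features(t) for t in earlier_texts]
    new = style_features(new_text)
    shifts = {}
    for name, value in new.items():
        base = mean(f[name] for f in baseline)
        shifts[name] = (value - base) / base
    return shifts


if __name__ == "__main__":
    earlier = ["I like code. It is fun.", "We test code. It works."]
    new = ("Notwithstanding the aforementioned considerations the "
           "implementation demonstrates considerable sophistication throughout.")
    print(style_shift(earlier, new))
```

A large shift would only ever be a prompt for a conversation with the student; writing style changes for many legitimate reasons.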

I expect there will be a multi-faceted institutional response to generative AI. There will be education: of students, tutors, and module teams. Students will be informed about the ethical risks of using generative AI, and the practical consequences of academic misconduct. Tutors will be provided with more information about what generative AI is, and offered more development to facilitate sessions to help students. Module teams will have an increasing responsibility to develop assessment approaches that proactively mitigate against the development of generative AI. Also, technology will play a role in detecting academic misconduct, and new procedures will be developed to assist academic conduct officers.

Acknowledgements

An acknowledgement is due to Mike Richards and everyone who took part in the research which is summarised here. A thank you goes to Daniel Gooch, who facilitated the event.

Tags: ChatGPT, seminar, generative AI, AI, artificial intelligence, assessment, academic conduct

ChatGPT and Friends: How Generative AI is Going to Change Everything

Saturday, 25 Mar 2023, 14:15

Edited by Christopher Douce, Sunday, 2 Apr 2023, 10:37

On 23 March 2023 the OU Knowledge Media Institute hosted a hybrid event, which had the curious title: How Generative AI is Going to Change Everything. More information about the details of this event is available through a GenAI KMi site.

I think I was invited to this event after sharing the results of a couple of playful ChatGPT experiments on social media, which may have been seen by John Domingue, the OU KMi director. In my posts, I shared fragments of poetry which had been generated about the failures of certain contemporary political figures.

The KMi event was said to be about “ChatGPT and related technologies, such as DALL E 2 and Stable Diffusion” and was described as an “open forum” to “allow participants to first get an understanding of what lies underneath this type of AI (including limitations)” with a view to facilitating discussions and potentially setting up an ethical workshop.

What follows is a very brief summary of some of the presentations, taken from notes I made during each of the talks. Please do view this blog as simply that, a set of notes. Some of these may well contain errors and misrepresentations, since these textual sketches were composed quite quickly. Do feel free to contact individual speakers.

Introduction and basics of ChatGPT/GPT-3/GPT-4

The event was opened by John who described it as a kick-off event, intended to bring people together. He introduced the topic, characterising the GPT projects as a very sophisticated text predictor, with GPT3 being described as “a text predictor on steroids”. An abbreviation that was regularly used was: LLM. This is short for “large language model”; a term that I hadn't heard before.

We were introduced to the differences between the versions of GPT. An interesting difference is the amount of text these LLMs have processed and how much text they can generate. We were told that GPT2 was released in 2018 and the current version, GPT4, can make use of images (but I’m not quite sure how).

John shared a slide that described something called the OU’s AI agents ecosystem, which had the subtitle of being an AI strategy for the OU.

There were some pointers towards the future. Some of these newfangled tools are going to find their way into Microsoft 365. I’m curious to learn how these different tools might affect or change my productivity.

What follows is a summary of some of the presentations that were made during the event. Most of the presentations were made over a course of 5 minutes; the presenters had to pack in a lot over a very short amount of time. There is, of course, a risk that I may well have misrepresented some aspects of the presentations, but I hope I have done a fair job in capturing the main points and themes each speaker expressed.

Short presentations

ChatGPT: Safeguards, trustworthiness and social responsibility

The first short presentation was by Shuang Ao from the Knowledge Media Institute. Shuang suggested that LLMs are “uncontrollable, not transparent and unstable” and had limitations in terms of their current ability to demonstrate reasoning and logic. They also may present factual errors, and demonstrate bias and discrimination, which presents real ethical challenges.

But can it make decisions?

Next up was Lucas Anastasiou, also from the Knowledge Media Institute. Lucas had carried out some experiments. ChatGPT can’t play chess at all well, but it does know how to open a game well, since it knows something about chess opening theory. But how about poker? Apparently there’s something called a poker IQ test. I’m not sure if I remember exactly, but I seem to recall that it isn’t great at playing poker. How about a stock portfolio or geo-political forecasting? We were offered a polite reminder that a computer can never be held accountable, but perhaps its users and developers could be?

ChatGPT attempts OU TMAs

The next speaker was Alistair Willis, School of Computing and Communications. Alistair is a module chair for TM351 Data management and analysis. He asked a simple question, but one that has important implications: can ChatGPT answer one of his TMA questions?

His TMA was a guided investigation, and was split into two parts: a coding bit, and an interpretation bit. The conclusion was that it was good at the coding bit (or, potentially, helping with the coding bit), but rubbish at the interpretation. Overall, a student wouldn’t get a very high score.

From the module team perspective, a related question was: could it be used to create module materials?

These questions are all very well, but if text and answers can be generated, is there a way to determine whether a fragment of prose was generated by ChatGPT? Apparently, there is a tool which can highlight which bits of text may have been written using ChatGPT.

Five key learnings from our use of Chatbots

Barry Verdin has an interesting role within the OU; he is an assistant director student support innovation. I have heard of Barry before; he keeps inviting me to meetings about systems thinking, but I keep being too busy to attend (but I do welcome his invitations!) His interest lies in supporting a chatbot that offers support to students. He shared an interesting statistic that the chatbot can answer around 80% of queries. Clearly, AI has the possibility of helping with some types of student enquiries.

Experiments with ChatGPT

It was my turn. I wear a number of hats. I’m a student, an associate lecturer, and a staff tutor.

Wearing a student hat

Whilst wearing my student hat, I’ve been studying a module called A230 Reading and studying literature. When I had completed and submitted one of my Tutor Marked Assignments, I submitted an abridged version of my TMA question to ChatGPT. The question I gave it was: “Compare and contrast Shelley’s Frankenstein with Wordsworth’s Home at Grasmere”. I admit that there was a part of me that took pleasure in asking an artificial intelligence what it thought about Frankenstein.

I found the response that I got interesting. Firstly, it was pretty readable, and secondly, it helped me to understand what I had understood when preparing the assignment. For example, it enabled me to check my own understanding of what literary romanticism was all about. Another point was that there was no way that ChatGPT could have responded to the detailed specifics of the essay question, since we were asked to interpret a very specific section of Wordsworth’s epic (and we have already learnt that ChatGPT isn’t good at logic). The text that we were working with was only available to OU students in a very specific form.

My study of literature helps me to develop specific skills, such as close reading, and adopting a critical approach to texts. Students, of course, also need to show an understanding of module materials. If large language models don’t have access to those texts, they’re not going to even attempt to quote from them. This means that a vigilant tutor is likely to raise a curious eyebrow if a student submits a neatly written essay which is devoid of quotes from texts, or from module materials.

Wearing a tutor hat

Picking up on the role of a tutor, another hat I wear is that of a tutor for M250 Object-oriented Java programming. I confess to doing something similar to Alistair. I fed ChatGPT a part of a TMA question which instructed a student to write bits of code to model a scenario. It did well, but it did too much: it produced bits of code that were not asked for. This said, drawing on my experience of programming (and of teaching), I could understand why it suggested what had been produced.

From the tutor’s perspective, if I had received a copy of what had been produced, I would be pretty suspicious, since I would be asking: “where did our student get all that experience from, when this is a module that is all about introducing key concepts?”

Wearing a staff tutor hat

For those who are unfamiliar with the role of a staff tutor, a staff tutor is a tutor line manager. We’re a bit of academic and administrative glue in the OU system which makes things work. We get to deal with a whole number of different issues on a day-to-day basis, and a couple of times a year academic conduct issues cross my desk.

The university has to deal with and work with a number of existing threats to academic integrity, such as well-known websites where students can ask questions of subject matter experts and fellow students. Sometimes solutions to assignments are shared through these sites. Sometimes, these solutions contain obvious errors, which we can identify.

Responses to the threats to academic integrity include the use of plagiarism detection software (such as TurnItIn), the use of collusion detection systems (such as CopyCatch), the vigilance of tutors and module teams, the referral of cases to university Academic Conduct Officers, running of individual support sessions to help students to develop their study skills to ensure they do not accidentally carry out plagiarism, and effective record keeping to tie everything together.

When arriving at this event, one question I did have was: could it be possible to create an AI to detect answers that had been produced by an AI? Alistair’s earlier reference to a checker had partially answered my own question. Further questions are, of course: how should such detection tools be used within an institution, and to what extent should academic policies be adapted and changed to take account of large language models?

Bring textual wishes to life

Christian Nold from the School of Engineering and Innovation (E&I) shared some information about an eSTEeM project with Georgy Holden. Students were encouraged to send postcards about their experience at level 1 study, sharing 3 wishes. The question that I have noted down was: how can we use AI tools to generate personas from 3 wishes? Tools such as ChatGPT integrate different bits of text together, and the generation of personas could help us to think differently.

Core-GPT

Matteo Cancellieri and David Pride, both from the Knowledge Media Institute, gave what was pitched as a KMi product announcement: they introduced CORE-GPT. Their project aims to combine open access materials with AI for credible, trustworthy question answering. The aim is to attempt to reduce the number of ‘hallucinations’ (made up stuff) that might be produced through tools such as ChatGPT, drawing on information from open access papers. More information about the initiative is available through a blog article: Combining Open Access research and AI for credible, trustworthy question answering. Further information is available through the CORE website.

ChatGPT and assessment

Dhouha Kbaier from School of Computing and Communications shared some concerns and points about assessment. Dhouha is module chair of TM355 Communications Technology. Following the Covid-19 pandemic, students are assessed through a remote exam. In their exam, students need to draw on discussion materials, and find resources and articles. Educators need to make students aware that there are tools that can detect text generated by large language models, and AI tools can create errors (and hallucinations).

One of the points I noted was: there is the potential need to adapt our assessment approaches. Educators also have a responsibility to do what they can to remove a student’s motivation for cheating. Ultimately, it isn’t in their best interests.

Can students self-learn with ChatGPT?

Irina Rets from the OU Institute of Educational Technology (IET) asked some direct questions, such as: can students learn through ChatGPT? Also, can AI be a teacher? In some respects, these are not new questions; a strand of research that links to AI and education has been running for a very long time. Some further questions were: who gets excluded? Also, what are the learning losses, and learning gains? Finally, how might researchers use these tools?

Chat GPT - Content Creation with AI

Manoj Nanda from the School of Computing and Communications also suggested that AI might be useful for idea generation. Manoj highlighted a couple of tools that I had not heard of before, such as DALL-E 2 (OpenAI website), which can generate an image from a textual description. Moving to an entirely different modality, he also highlighted Soundraw.io. Manoj emphasised that a key skill is using appropriate prompts. This relates to an old computing adage: if you put garbage in, you’ll get garbage out (GIGO).

Developing playful and fun learning activities

Nicole Lotz from the School of Engineering and Innovation (E&I) sees tools such as ChatGPT as potentially useful for creative exploration. Nicole is module chair of U101 Design thinking, which is a first level design module. The ethos of the module is all about playfulness, building confidence, and learning through reflection. Subsequently, there may be opportunities to use what ChatGPT might produce as a basis for further reflection, development and refinement.

"I am the artist Riv Rosenfeld" - How ChatGPT is your new neoliberal friend

Tracie Farrell, from the Knowledge Media Institute, works in the intersection between AI and social justice. Tracie asked ChatGPT to write a paragraph about her friend and artist, Riv Rosenfeld. There was a clear error, which was that ChatGPT got their pronouns wrong. An important point is that “ChatGPT doesn’t know your truth”. In other words, the perspective that is generated by large language models comes from what is written or known about you, and this may be at odds with your own perspective. There are clear and obvious risks: marginalised groups are not always as visible. Biases are perpetuated. Some key questions are: who will be harmed, and who will be helped, and to what extent (and how) will these emerging tools reinforce inequality?

Discussion

After the short presentations, we went into a plenary discussion. It wasn’t too long before the history of AI was highlighted. John highlighted the two schools of thought about AI: a symbolic camp, and a statistical camp, and suggested that in the future, there might be a combination of the two. This related to the earlier point that these AI tools can’t (yet) do logic very well.

A further comment reflected an age-old intractable problem that hasn’t been solved, and might never be solved, namely: we still haven’t defined what intelligence is. In terms of AI, the measure of intelligence has moved from playing chess, through to having machines do things that humans find intrinsically easy to do, such as assessing a visual scene, and communicating with each other using natural language. The key point in the discussion was, of course: we need to ask again, what do we mean by intelligence?

Whenever a technology is discussed, an accompanying discussion of a potential digital divide is never too far away. AI may present its own unique divides: those who know how to use AI tools and can use them effectively, and those who don’t know about them, and are not able to use them. There are clear links to the importance of equity and access.

During the discussion, I noted down the words: “If you’re a novice programmer, what blocks you is your first bug”. In other words, knowing the fundamentals and having knowledge is important. Another phrase I noted down was: “It is perhaps best to view them as fallible assistants”.

Given their fallibility, making judgements about when to trust what an AI tool has produced, and when not to, is really very important. In other words: it is important to think critically, and this is something that only we humans can do.

Reflections

This was a popular event; approximately 250 people attended the first few presentations.

The presentations were quite different to each other. Some explored the question “to what extent might these tools present risks to academic integrity?” Others explored “how can these tools help us with creativity and problem solving?” The important topic of ethics was clearly highlighted. It was also interesting to learn about work being carried out within KMi, and the reference to the emergence of an institutional AI strategy (although I do hold the view that this should be thoroughly and critically evaluated).

I enjoyed the discussion section. In some respects, it felt like coming home. I studied AI as an undergraduate and a postgraduate student over 20 years ago, where the focus was primarily on symbolic AI. At the time, statistical methods, which include neural networks, were only just beginning to make an appearance. It was really interesting to see the different schools of thought being highlighted and discussed. During the discussion session I shared the following memorable definition: AI is really clever people making really stupid machines to do things that look clever.

I confess to having been around long enough to know of a number of AI hype cycles. When I was a postgraduate student, I learnt about the first generation of AI developments. I learnt about chess and problem solving. I remember that proponents at the time were suggesting that the main problems with AI had been solved, which had the obvious implication that we would soon have our own personal robots to help us with our everyday chores.

The reality, of course, turned out to be different, since some of those very human problems, such as vision, sound and language were a lot harder to figure out. This meant there were no personal robotic assistants, but instead we did get a different kind of personal digital assistant.

Despite my cynicism, one aspect of AI that I do like is that it has been described as “applied philosophy”. When you start to think about AI, you cannot get away from trying to define what intelligence is. In other words, the machine becomes a mirror to ourselves; the computer helps us to think about our own thinking.

I once heard a fellow computer scientist say that one of the greatest contributions of computing is abstraction. In other words, when making sense of a difficult problem, you look at all its elements, and then you go on to create a new representation (or form) of the problem which then enables you to make sense of it all. I remember another computer science colleague saying, “when you get into trouble, abstract your way out of difficulty”. This can also be paraphrased as: “go up a level”.

We’ve all been in that situation when we’ve had multiple search engine tabs open, and we’re eyeballing tens of thousands of different search results. In these circumstances, we don’t know where to begin. Perhaps this is the problem that these large language models aim to resolve: to produce a neat summary of an answer we’re searching for in a neatly digestible format.

To some degree, generative AI can be thought of as “going up a level”, but the way you go up a level may well be driven by the data that is contained within a large language model. That data, of course, might well be incorrect. Even if you do “go up a level” you might be going up in entirely the wrong direction.

All these points emphasise the importance of taking a critical perspective of what all these new-fangled AI tools produce, but this does require those interpreting any results to have developed a critical perspective in the first place. We need a critical perspective to deal with instances where an AI tool might well provide us with not just machine generated “hallucinations” but also misinformation.

During my bit of the talk, I shared a perspective that I feel is pretty important, which is: “the most important thing in education isn’t machines or technologies, it’s people”. When we’re thinking about AI, this is even more true than ever. A screen of text looks like a screen of text. A teacher, tutor or lecturer can tell you not only what is important, but why, and what its consequences might mean to others.

I do feel that it is very easy to get carried away by the seemingly magical results that ChatGPT can produce. I also feel that it is important to view these tools with a healthy dose of AI cynicism and scepticism. If AI is applied philosophy, and this new form of AI enables us to more readily hold up a mirror to ourselves, it is entirely possible that we might not like what we see.

It is entirely possible that generative AI tools may well “read” this summary, and these reflections might well help these uncanny tools answer the question “how do humans perceive generative AI?” I’ll be interested to see what answer it produces.

Returning to the implicit question presented in the title of this event: “how is generative AI going to change everything?” The cynic in me answers: “I doubt it”. It is, however, likely to change some things.

Other resources

A few weeks before this event, I was made aware of another related event which took place on 16 March, entitled Teaching with ChatGPT: Examples of Practice (YouTube playlist). This event was a part of a series of Digitally Enhanced Education Webinars from the University of Kent. These presentations are certainly worth a visit, if only to hear other voices sharing their perspectives about this topic.

After this blog was published, Arosha Bandara sent me a link to the following article: Stephen Wolfram writings: What Is ChatGPT Doing ... and Why Does It Work? It is quite a long read, and it is packed with detail. It's also one of those articles that will take more than a few hours to work through. I'm sharing it here for two reasons: so I know where to find it again, and just in case others might find it of interest.

Acknowledgements

The event was a KMi Knowledge Makers event. Many thanks to John for inviting me, and encouraging me to participate. Many thanks to all the presenters; I hope I have managed to share some of the key points of your presentations, and apologies that I haven’t managed to capture everyone’s presentation. The event was organised by Lucas Anastasiou (PhD Research Student), Shuang Ao (PhD Research Student), Matteo Cancellieri (Lead Developer - Open Research), John Domingue (Professor of Computer Science), David Pride (Research Associate) and Aisling Third (Research Fellow). Thanks are also extended to Arosha for sending me the Wolfram article.

Addendum

A couple of weeks after the event, I was sent a note by a colleague. Someone in KMi may have asked ChatGPT to write a summary of this article. A link to that summary is available through a KMi blog. I have no idea to what extent it may have been edited by humans. This made me wonder how ChatGPT might summarise the summary.

Tags: academic conduct, large language model, LLM, generative AI, chatGPT, kmi, AI, artificial intelligence
