18 months of living with LLMs – what are we learning?

This blog is written by Dr Imogen Casebourne, who is DEFI’s Innovation Lab lead. She has an MSc in Artificial Intelligence (1995), for which she developed an AI program that wrote short stories (Sharples & Pérez y Pérez, 2023, pp. 8-10). She is currently co-editing a book for Springer on AI and Education and recently co-authored a paper on AI and Collective Intelligence. In 2023, she designed and taught a short introductory course on AI and Education for students at the Faculty of Education, University of Cambridge. In this blog she gives an overview of developments in Large Language Models (LLMs) over the past 18 months and considers their potential impact on education.

It is 18 months since OpenAI released ChatGPT, built on GPT-3.5, and since then large language models (LLMs) have been increasingly integrated into business processes and social media. What are we learning about their potential impact on education?

LLMs in General	Quarter	LLMs in Education
LLM integration into business practices – e.g. Coca-Cola, Duolingo, Moderna. Microsoft integrates ChatGPT into Copilot. Google integrates its own LLM into Google Docs. Meta announces its LLaMA LLM. Anthropic launches its Claude LLM. Open letter raises concerns about dangers of AI. Initial guidance from governments and international bodies.	Q1 2023	Concerns about essay writing and homework. Concerns LLMs are not ‘explainable’. Concerns that LLM writing can’t be reliably detected by plagiarism software. Initial guidance from educational institutions.
Lawsuits allege that content has been used in initial training of LLMs without permission from copyright holders and without financial recompense. International/governmental guidance on use continues to be refined.	Q2 2023	Concerns about bias, representation and ethical issues related to training. Merlyn Mind launches an LLM for education.
Additional lawsuits initiated, involving multiple LLM models and providers.	Q3 2023	Ongoing research into LLMs in education.
Increasing open source LLM initiatives.	Q4 2023	Ongoing research into issues of ethics and safety. Exploration of how LLMs might be used in education.
Grok LLM launched by Elon Musk. Meta announces LLaMA is integrated into existing products such as Instagram and WhatsApp. Updated UK governmental guidance. AWS announces partnership with Anthropic to help organisations integrate LLMs into business processes.	Q1 2024	Multiple organisations launch courses to help people understand AI. Ongoing research into ethics and safety as well as educational potential.
OpenAI announces Media Manager, intended to enable creators to comprehensively opt out of having their work used in training models. Apple releases new iPad containing chip intended to support functioning of AI. Microsoft releases Surface PC optimised for processing AI.	Q2 2024	Ongoing research into ethics and safety as well as educational potential. ChatGPT Edu launches. Google announces LearnLM and the Illuminate experiment.

Some implications for education

AI literacy – the need to teach about LLMs and generative AI

Fake text, imagery and video have been an issue on social media for a while, and generative AI lowers barriers to generating such material. This creates a danger that people may more frequently encounter fake and harmful material.

People need to understand what generative AI is and specifically what LLMs are, how, when and where they might encounter them, what content they generate, whether to trust them and when it is appropriate to use them. For example, LLM training data does not include all human knowledge and may be unrepresentative, so an LLM-based chatbot may be missing important information. An LLM-based chatbot may also be biased and lack understanding of context. Finally, a need to teach people about generative AI implies a need to teach educators how to teach about generative AI and perhaps how to teach with generative AI.

LLMs at work – implications for the curriculum

As LLMs are integrated into business processes, jobs could change or be eliminated, with implications for aspects of future curriculums. Some skills might be allowed to atrophy (just as we no longer commonly calculate complex statistics with pencil and paper, instead using tools such as SPSS), but it may be important to retain others, including, for example, formulating questions, making decisions and working collaboratively.

LLMs may remove opportunities to learn via an apprenticeship model (because it may seem easier to have an LLM do entry-level tasks) at the same time as increasing the need for experts to review LLM output. It is not clear how the human expertise needed in future will be developed if entry-level tasks and roles are eliminated, which is a serious challenge for educators.

LLMs and academic integrity

LLMs can write convincing essays. This raises questions about the future of traditional written assignments and has led academic journals and academic institutions to develop new guidelines on appropriate use.
Having AI write an assignment and claiming it as student work is academic misconduct. However, there may be ways in which LLMs can actively support learning, some of which are discussed below.

LLMs as a form of assistive technology?

Spellcheckers have been with us for a while, and tools such as Grammarly (AI but not an LLM) have been used as assistive technology. LLMs might be adopted in some forms of assistive technology, helping with elements of writing for individuals with special educational needs, for non-native speakers, and in English Language Learning.

Custom LLMs offering new dialogic approaches?

AI offers new educational opportunities, alongside risks. AI Intelligent Tutoring Systems (ITS) have existed for some time, based on a different approach to AI which doesn’t raise the same questions about academic integrity. LLM chatbots might act as ITS if educators can be confident that they will answer appropriately and accurately.

Currently, and especially with free versions, this is not always the case. For example, ChatGPT 3.5 generates fake references, and when we asked it to walk through standard statistical procedures we found it consistently made errors in calculations (subscription versions now respond more accurately). However, organisations such as the Khan Academy have developed custom chatbots intended to reliably guide students, so we may see an increase in AI- and LLM-powered chatbots.

DEFI recently experimented with a webinar format (we believe invented at LSE) where writers listen to readers debate their article before responding to their thoughts and questions. These were necessarily small scale. Could LLMs offer a scalable opportunity for dialogue with an author and text? TeachSmart and Digital Don are custom GPTs created by book authors to enable learners to engage in dialogue with the text. Both demonstrate ways in which future learners might engage in dialogue with AI.

To investigate the potential and limitations of this, we worked with DEFI founder Professor Rupert Wegerif to test an interactive version of his new book, co-authored with Dr Louis Major: ‘The Theory of Educational Technology: a dialogic foundation for design’. It is worth noting that OpenAI is explicit that content entered into the free (3.5) version is used to train future models unless users enable privacy settings across every browser. Our experiment was therefore based on a premium subscription with a contractual undertaking that content would not be used to train future models. We tested a preview version which was not released publicly.

As instructed, the GPT restricted answers to the text, politely informing users if a question was beyond the scope of the book. It gave a reasonable overview and could accurately point to references. However, it did not always provide the fullest response. Additionally, when asked about a related concept not directly addressed in the text, it offered pointers to common themes, making it apparent that it was answering in part based on associations gained from its initial training. This raises questions about whether the bias inherent in LLMs (from their initial training) can be eliminated when they are used for custom purposes. Also, because it produces the most statistically likely response, unless carefully prompted it may guide users to the obvious answer and lose subtle points of nuance.

LLMs work by predicting the most probable response to a prompt. To do this, they are ‘trained’ on large amounts of data to develop a model of how words and sentences relate to each other, and they may ‘learn’ bias from this process. We used GPT-4 customised via prompt engineering. There are older models (e.g. Davinci and Ada in the case of OpenAI), which ‘know’ less, and some custom GPTs are trained using models not available to the public via a chatbot. It isn’t currently clear to us whether using an older model would be sufficient to reduce or eliminate bias, or whether custom GPTs may in future be trained from the bottom up (as one educational provider, Merlyn Mind, recently announced).
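
As a rough illustration of the prediction idea described above, here is a toy sketch in Python. It is not how GPT-4 or any real LLM is implemented: real models use neural networks trained on vast amounts of text, whereas this sketch simply counts which word follows which in four invented sentences. The corpus, the function names and the skew in the data are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_model(corpus):
    """Count, for each word, which words follow it in the training text."""
    follows = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current_word, next_word in zip(words, words[1:]):
            follows[current_word][next_word] += 1
    return follows


def most_probable_next(model, word):
    """Return the most frequent follower of `word` and its estimated probability."""
    counts = model.get(word.lower(), Counter())
    if not counts:
        return None, 0.0  # the word never appeared in the training text
    total = sum(counts.values())
    best_word, best_count = counts.most_common(1)[0]
    return best_word, best_count / total


# Hypothetical, deliberately skewed training data (invented for this sketch).
corpus = [
    "the nurse said she was tired",
    "the nurse said she was late",
    "the nurse said she was busy",
    "the nurse said he was busy",
]

model = train_bigram_model(corpus)
print(most_probable_next(model, "said"))
```

Because three of the four invented sentences use ‘she’, the most probable continuation of ‘said’ is ‘she’, with an estimated probability of 0.75. The skew in the training text becomes the model's default answer, and the less frequent alternative is never offered, which is a small-scale version of both the bias and the loss of nuance discussed above.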

After we conducted this experiment, we learned that Google had announced an experiment of their own, with a very similar goal of enabling learners to enter into discussions with papers. We will watch this space with interest and have signed up to join the wait list to participate.

In summary

Unless state or international regulators act to roll back aspects of this trend, LLMs seem set to be increasingly embedded into every aspect of digital life, which raises profound questions for education.

We may continue to see LLMs used as educational tools, potentially in the form of chatbots acting as dialogic coaches, but there are concerns and risks and we should proceed with caution.

It seems unlikely that books will be released in large numbers on the OpenAI platform specifically, but it is possible that publishers might in future offer interactive versions (in the same way that audio and ebook versions accompany paperbacks) or license content to be used in this way. We might see GPTs (assuming appropriate data security and respect of copyright) embedded into future ebook readers or learning management or library systems.

It will be important to continue to scan for, explore and discuss such developments and remain vigilant to potential implications. In a future blog we will consider what the future might hold for LLMs and education.
