What is generative AI and how does it work?
What is generative AI?
Generative AI is a type of artificial intelligence (AI) that creates new content in response to user instructions (prompts).
It uses a large language model (LLM), a type of computer programme that can analyse large amounts of information, to learn from its training data and recognise patterns in content such as text, images or code. By finding patterns in the data, it learns the underlying structures, styles, relationships and rules. For text, it learns grammar, facts and writing style; for images, it learns shapes, colours, textures and how objects look.
In response to the user's prompt, it creates new content using the patterns it has learnt from the training data, predicting which word, image or sound is most likely to come next in the output it generates.
For example, asking a generative AI tool “what words do you associate with Granny Smith apples?” may produce words like ‘green’, ‘fruit’, ‘tart’, ‘crunchy’ and ‘crisp’, because those words are commonly used when describing Granny Smith apples, and the tool will likely have been trained on many materials using them. It may also provide additional information on recipes, cultivation or gardening advice.
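To make the prediction step concrete, here is a toy sketch in Python of next-token prediction. It is not how a real LLM is built (real systems use transformer networks trained on vast amounts of data); it simply counts which word follows which in a tiny made-up corpus and picks the most frequent continuation, which is the same basic idea.

    from collections import Counter, defaultdict

    # Tiny made-up training corpus (an assumption, for illustration only).
    corpus = ("granny smith apples are green . granny smith apples are tart . "
              "granny smith apples are crisp and crunchy .").split()

    # Count, for each word, which words follow it and how often.
    following = defaultdict(Counter)
    for current_word, next_word in zip(corpus, corpus[1:]):
        following[current_word][next_word] += 1

    def predict_next(word):
        """Return the most frequently observed continuation of `word`."""
        return following[word].most_common(1)[0][0]

    print(predict_next("apples"))  # -> 'are' (follows 'apples' every time)
    print(predict_next("are"))     # -> 'green' (ties broken by first occurrence)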
Generative AI tools specialise in different tasks, so it is important to use the right tool for the job.
What data does generative AI use?
LLMs are often trained on data from the open web. This excludes non-public data held on secure networks, for example paywalled content or content that requires authentication to access. More specialist LLMs may be trained on licensed content or content specific to the job, e.g. publisher websites or internal guidelines. Although LLMs are primarily trained on unstructured data, they can also learn from structured data such as spreadsheets and metadata. Structured data is organised into a predefined format that makes it easier for computers to search and analyse, e.g. library catalogue (OPAC) records.
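The difference between unstructured and structured data can be shown with a short Python sketch using made-up records. The free-text note has no predefined fields, while the catalogue-style record (like an OPAC entry) has named fields a computer can search directly:

    # Unstructured data: free text with no predefined fields.
    unstructured = "A well-thumbed 1998 hardback on anatomy, shelved in medicine."

    # Structured data: the same information in a predefined format,
    # like a library catalogue (OPAC) record. Field names are assumptions.
    structured = {
        "title": "Anatomy Handbook",
        "year": 1998,
        "format": "hardback",
        "subject": "medicine",
    }

    # Structured data supports precise searching with no language understanding.
    if structured["year"] < 2000 and structured["subject"] == "medicine":
        print("Matched:", structured["title"])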
When AI outputs are factually incorrect, misleading or entirely fabricated, this is sometimes called a hallucination. Hallucinations occur when the LLM is missing information, or has been trained on incorrect or disputed information. Because answers are based on probability, incorrect outputs may also occur when something doesn't follow the likely predicted pattern. To help minimise hallucinations, some LLMs are trained on synthetic data, which is data generated by the LLM itself to mimic real data. Making your prompt clearer by expanding on the instruction, trying synonyms or pointing the LLM to the source of the information may also help.
Retrieval Augmented Generation (RAG) also helps to reduce hallucinations by giving the LLM additional information to use when it predicts what should come next in an output. This additional information may come from an organisation's internal data, such as documents, emails and datasets.
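A minimal sketch of the RAG idea is shown below, assuming a toy keyword retriever and a placeholder in place of a real LLM call. The point is the shape of the process: fetch relevant internal passages, then include them in the prompt so the model grounds its answer in them rather than guessing.

    # Hypothetical internal documents; the retriever and LLM call are placeholders.
    internal_documents = [
        "Policy 12: Annual leave requests must be submitted 4 weeks in advance.",
        "Policy 7: Remote working requires line-manager approval.",
    ]

    def retrieve(question, documents, top_n=1):
        """Toy retriever: rank documents by words shared with the question."""
        question_words = set(question.lower().split())
        return sorted(documents,
                      key=lambda d: len(question_words & set(d.lower().split())),
                      reverse=True)[:top_n]

    def ask_llm(prompt):
        # Placeholder for a real LLM call; no actual model or API is used here.
        return "[LLM answer grounded in the prompt below]\n" + prompt

    question = "How far in advance must annual leave be requested?"
    context = "\n".join(retrieve(question, internal_documents))
    print(ask_llm(f"Answer using only this context:\n{context}\n\nQuestion: {question}"))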
How does it learn?
Learning can be supervised or unsupervised. In supervised learning, you provide examples alongside the accurate answers. The model then uses these to compare its predictions and learn from the differences. This method improves accuracy because you are providing clear guidelines for it to follow. It is useful for tasks like classification, image recognition, spam detection and regression (predicting trends). However, it takes time to label the data and train the model, it has limited flexibility, and it copes poorly when presented with something it hasn't seen before.
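As a minimal sketch of supervised learning, the snippet below trains a simple spam detector, assuming the scikit-learn library is installed. The example messages and labels are made up; the key point is that every training example comes paired with the accurate answer:

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB

    # Made-up training examples, each paired with the correct label.
    messages = ["win a free prize now",
                "claim your free money today",
                "meeting moved to 3pm",
                "minutes from today's meeting"]
    labels = ["spam", "spam", "not spam", "not spam"]

    # Turn the text into word counts, then train on examples plus answers.
    vectoriser = CountVectorizer()
    model = MultinomialNB().fit(vectoriser.fit_transform(messages), labels)

    # Classify a message the model has not seen before.
    print(model.predict(vectoriser.transform(["claim a free prize"])))  # -> ['spam']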
Unsupervised models look for similarities in the data, such as clusters or structures, and try to make sense of them. They can be used to simplify data whilst keeping important patterns, or to detect anomalies in data that don't fit a pattern. Generative AI uses this model. It is less predictable and harder to evaluate, and will often need human input to check and make sense of the results.
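A minimal sketch of unsupervised learning, again assuming scikit-learn is installed, is k-means clustering. No answers are provided; the algorithm groups the made-up points purely by similarity, and a human must decide what the clusters mean:

    from sklearn.cluster import KMeans

    # Two visibly separate groups of 2-D points, but no labels are given.
    points = [[1, 1], [1, 2], [2, 1], [8, 8], [8, 9], [9, 8]]

    # Ask for two clusters; the model finds the grouping on its own.
    kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)
    print(kmeans.labels_)  # e.g. [0 0 0 1 1 1] -- same number = same cluster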
Some key terms
Data: the content used to train the large language model. Synthetic data is generated by the LLM using algorithms to overcome the limitations of the original training data.
Fine-tuning: additional training of a pre-trained LLM on a smaller, task-specific dataset to increase accuracy and reduce hallucinations.
Hallucinations: when the LLM gets it wrong, usually because of missing or incorrect data.
Inference: prediction of the next token (word or part of a word) in a sequence.
RAG (retrieval augmented generation): extra information provided at the inference stage to improve accuracy and help minimise hallucinations.
Tokens: words or parts of words consumed in training and in creating outputs. LLMs predict the next token in a sequence (see the sketch after this list).
Transformer model: the neural network architecture used by the LLM to understand and make sense of the training data.
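To illustrate tokens, the sketch below uses a deliberately naive tokeniser that splits long words into five-character chunks. Real LLM tokenisers use learned sub-word rules that differ by model, so the exact splits here are only indicative of the idea that rare or long words become several tokens:

    def naive_tokenise(text):
        """Pretend sub-word tokeniser: break long words into 5-character chunks."""
        tokens = []
        for word in text.split():
            if len(word) <= 5:
                tokens.append(word)
            else:
                tokens.extend(word[i:i + 5] for i in range(0, len(word), 5))
        return tokens

    print(naive_tokenise("hallucination happens"))
    # -> ['hallu', 'cinat', 'ion', 'happe', 'ns']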
See also:
Generative AI - a basic overview. University Hospitals Plymouth NHS Trust.
Large language models - a basic overview. University Hospitals Plymouth NHS Trust.