Glossary of AI Terminology


Artificial General Intelligence (AGI)
Chunking
Context window
Deep Learning (DL)
Embeddings
Few-shot learning
Foundation Model (FM)
GAN (Generative Adversarial Network)
LLM (Large Language Model)
Machine Learning (ML)
NLP (Natural Language Processing)
Neural Network (NN)
Prompt Engineering
Reinforcement Learning from Human Feedback (RLHF)
Supervised Learning
Token
Tokeniser
Unsupervised Learning
Vector Database
Zero-shot learning
Elementary
Artificial General Intelligence (AGI): Imagine if a robot could do everything you can: from playing football to painting pictures, and even making jokes. That's AGI! It can understand, learn, and do lots of different things, just like humans.
Chunking: Imagine you're playing with a big box of Lego. If you have to build something, it's easier if you group similar pieces together. You put all the blue pieces in one pile, the yellow ones in another, and so on. That's what 'chunking' in computers is like. It's about grouping similar things together to make them easier to understand.
Context window: Imagine reading a book but only being able to remember the last few words you've read. That's what a context window in AI is like: a spotlight in the dark that can only light up a certain number of words at a time.
Deep Learning (DL): Remember when you learned to ride a bike? At first, you had to think about every little thing, like pedalling and balancing. After a while, you could do it automatically, almost as if your brain had layers: one for pedalling, one for balancing, and one for steering. Deep Learning is like that. It's a way for computers to learn things in layers, each one learning something different.
Embeddings: Imagine every word is a different toy in a huge toy box. Each toy has its own special spot in the box that tells you more about it, like its colour, size, or what it does. Embeddings do the same, but with words or other things in the computer's brain.
Few-shot learning: Imagine if you could learn to play a game really well just by watching someone else play it a few times. That's few-shot learning! The AI can learn from just a few examples.
Foundation Model (FM): You know how you use Lego bricks to build all sorts of different things? A foundation model is like the base or first few layers of your Lego construction that other things can be built upon.
GAN (Generative Adversarial Network): Imagine you and your friend having a drawing competition where you make a picture and your friend tries to tell whether it's real or fake. You keep improving your drawings based on your friend's feedback. That's what a GAN does, but with computer programs.
LLM (Large Language Model): Imagine you have a giant toy robot that loves to read books. After reading lots of books, it starts to understand and use language just like humans do. That's what an LLM does, but with huge amounts of digital text instead of physical books.
Machine Learning (ML): Imagine you're teaching your puppy to sit. You say "sit" and when the puppy sits, you give it a treat. The puppy learns what "sit" means from your instructions and rewards. This is a lot like machine learning: computers learn to do something (like recognising a photo or understanding speech) by being trained with lots of examples.
NLP (Natural Language Processing): NLP is like giving computers a secret decoder ring for human language. It helps them understand what we say and write, and respond to us in our own words.
Neural Network (NN): A neural network is like a team of brainy ants. Each ant doesn't know much on its own, but together they can solve big problems. Each ant, or "neuron", takes in some information, does a tiny bit of thinking, and then passes its results on to the next ants. By working together, they can do things like recognise pictures or understand speech.
Prompt Engineering: Think about a teacher who phrases a question in a special way that helps you give a better answer. That's what prompt engineering is like. It's about asking the AI in the right way to get the best answer.
Reinforcement Learning from Human Feedback (RLHF): Imagine if you played a video game and every time you made a mistake, a friend corrected you. That's RLHF! The AI learns from human feedback, just like you learn from your friend.
Supervised Learning: Imagine your teacher helping you understand your maths homework by showing you how to solve lots of similar problems. That's supervised learning! The AI learns by being shown examples with the right answers.
Token: A token in AI is like a piece in a puzzle. Just as a big puzzle is made of many small pieces, a sentence or paragraph in AI is made of many tokens, usually words or parts of words.
Tokeniser: Imagine a machine that chops up a big chocolate bar into little pieces so you can share it with friends. A tokeniser in AI is like that! It chops up text into smaller parts called tokens.
Unsupervised Learning: Imagine learning to play a game without anyone teaching you the rules, just by trying it out and working it out yourself. That's unsupervised learning! The AI learns without being told the right answers.
Vector Database: Think of a vector database as a huge digital library where each book (vector) has a specific place. This makes it easy to find exactly the book (vector) you want, quickly.
Zero-shot learning: Imagine you've trained a dog to sit, stay, and roll over. One day, you ask it to fetch, and it does it without ever being taught! That's zero-shot learning, where the AI can do tasks without having seen examples of them.
Intermediate
Artificial General Intelligence (AGI): AGI is a type of AI that can understand, learn, and apply knowledge across a wide range of tasks, similar to human intelligence.
Chunking: Chunking is a method used in computer science where data or information is broken down into smaller 'chunks' or groups to make it easier to process or understand. It's like studying for exams: you might break the material down into smaller sections, or 'chunks', to understand it better.
Context window: A context window in AI is the fixed-size span of input, such as words or sentences, that the model can take into account at one time when making a decision.
Deep Learning (DL): Deep Learning is a subfield of machine learning that uses layered neural networks to learn from data. It's like a multi-layered filter: the first layer might recognise simple shapes, the next layer recognises complex shapes built from those simple shapes, and so on, until the whole picture is understood.
Embeddings: Embeddings are like a map of words or items, where each word is placed so that its position shows how similar it is to other words. It's a way for AI to capture the 'meaning' behind words or items.
Few-shot learning: Few-shot learning is a method in AI where the machine learns to perform a task accurately from only a small number of examples.
Foundation Model (FM): A foundation model in AI is a general-purpose model, much like the core structure of a building. You can add to it or modify it for various specific tasks.
GAN (Generative Adversarial Network): A GAN is like an AI art contest. One part of the AI tries to make convincing 'fakes' (like creating a new image), and the other part tries to tell whether they're real or fake. Both parts learn from each other.
LLM (Large Language Model): An LLM is like an advanced version of your language class, but for computers. It uses AI to understand, learn, and generate human language from the massive amount of text it has studied.
Machine Learning (ML): Machine Learning is a part of artificial intelligence in which computers learn and make decisions without being explicitly programmed to do so. It's like studying for a test: the more problems you solve, the better you get. Computers use algorithms to process data, learn from it, and make predictions or decisions.
NLP (Natural Language Processing): NLP is the study of how computers can understand and use human languages. It's like teaching a computer to understand the language we speak and write.
Neural Network (NN): A Neural Network is an artificial intelligence model inspired by the human brain. It has "neurons" that process information, learn from it, and pass on what they've learned to other neurons. It's like a relay race: each runner (neuron) passes the baton (information) to the next, gradually getting to the finish line (the solution).
Prompt Engineering: Prompt engineering is the process of carefully crafting the inputs to an AI system to get better outputs.
Reinforcement Learning from Human Feedback (RLHF): RLHF is a technique where AI systems learn from human feedback, improving their performance based on what they did right or wrong.
Supervised Learning: Supervised learning is a method in AI where a model is trained on a labelled dataset, meaning each piece of data in the training set has a known output.
Token: A token in AI, often used in natural language processing, is a single unit of data. It could be a word, a character, or a subword, depending on the language and task.
Tokeniser: A tokeniser is a tool in AI that breaks text down into smaller pieces, or 'tokens', that the system can understand and process.
Unsupervised Learning: Unsupervised learning is a type of AI where the model learns patterns in data without being given any specific outputs to aim for, for example by clustering similar data together.
Vector Database: A vector database is like an organised storage system where each piece of data (a vector) is stored in a way that makes it easy and fast to find exactly what you're looking for.
Zero-shot learning: Zero-shot learning is when an AI system can accurately perform tasks it hasn't explicitly been trained on, using knowledge it has learnt from other tasks.
Advanced
Artificial General Intelligence (AGI): AGI is a form of AI with cognitive capabilities equivalent to those of a human, capable of understanding, learning, and applying knowledge across any intellectual task that a human being can.
Chunking: Chunking is a strategy used in artificial intelligence and cognitive science to break complex data down into manageable, understandable units, or 'chunks'. The technique is widely used in machine learning, where it assists with pattern recognition and data compression.
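As a rough sketch of the idea in Python (not code from this glossary), the snippet below splits a passage into overlapping word-level chunks; the chunk size and overlap are arbitrary illustrative values, and real systems often chunk by tokens, sentences, or semantic boundaries instead.

```python
# Illustrative only: split a long text into overlapping word-level chunks,
# a common pre-processing step before embedding or retrieval.
def chunk_text(text, chunk_size=40, overlap=10):
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break
    return chunks

sample = "word " * 100                       # stand-in for a real document
for i, chunk in enumerate(chunk_text(sample)):
    print(i, len(chunk.split()), "words")
```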
Context window: In natural language processing (NLP), the context window is the fixed-size segment of input, typically measured in tokens, that a model can attend to when generating predictions. It is fundamental to understanding and predicting elements within a sequence.
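A minimal sketch, assuming a toy window of eight tokens (real models attend to thousands of tokens or more): anything that falls outside the window is simply unavailable to the model.

```python
# Illustrative only: a model with a context window of n tokens can only "see"
# the most recent n tokens when predicting the next one.
def visible_context(tokens, window_size=8):
    return tokens[-window_size:]             # everything earlier is effectively forgotten

tokens = "the quick brown fox jumps over the lazy dog near the river bank".split()
print(visible_context(tokens))
# ['over', 'the', 'lazy', 'dog', 'near', 'the', 'river', 'bank']
```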
Deep Learning (DL): Deep Learning, a subset of Machine Learning, leverages neural networks with many layers (deep neural networks). These layers progressively transform an input to produce an output, capturing increasingly complex patterns along the way. Architectures include Convolutional Neural Networks (CNNs) for image tasks and Recurrent Neural Networks (RNNs) for sequential data such as time series.
Embeddings: Embeddings are a form of representation in which items (such as words) are mapped to vectors of real numbers. Similar items sit closer together in this vector space, which allows models to capture semantic relationships between them.
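A minimal illustration with hand-made three-dimensional vectors (real embeddings are learnt by a model and typically have hundreds or thousands of dimensions), compared with cosine similarity:

```python
# Illustrative only: tiny hand-made "embeddings" and cosine similarity,
# a common measure of how close two vectors are in the embedding space.
import math

embeddings = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.0, 0.9, 0.4],
}

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

print(cosine_similarity(embeddings["cat"], embeddings["dog"]))  # high: related meanings
print(cosine_similarity(embeddings["cat"], embeddings["car"]))  # low: unrelated meanings
```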
Few-shot learning: Few-shot learning is a machine learning setting in which the goal is to design models that can learn useful information from a small number of examples, typically on the order of one to ten training examples.
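One common modern flavour is in-context few-shot learning, where the handful of examples is placed directly in the prompt rather than used to update the model's weights. The sketch below assembles such a prompt; the reviews and labels are invented, and no particular model or API is assumed.

```python
# Illustrative only: build a "few-shot" prompt from a handful of worked examples
# so a language model can infer the task before answering the final query.
examples = [
    ("The film was a delight from start to finish.", "positive"),
    ("I want my money back.", "negative"),
    ("An instant classic.", "positive"),
]
query = "The plot made no sense at all."

prompt = "Classify the sentiment of each review.\n\n"
for text, label in examples:
    prompt += f"Review: {text}\nSentiment: {label}\n\n"
prompt += f"Review: {query}\nSentiment:"

print(prompt)   # this prompt would then be sent to a language model of your choice
```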
Foundation Model (FM): Foundation models are large-scale models pre-trained on broad data that serve as a starting point for many downstream tasks. They can be fine-tuned for specific tasks, such as text classification or object detection.
GAN (Generative Adversarial Network): A Generative Adversarial Network consists of two neural networks: a generator that creates data samples, and a discriminator that attempts to distinguish between real and generated samples. Through this competition, both networks improve.
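A toy sketch of the adversarial loop, assuming PyTorch and a one-dimensional target distribution; the network sizes, learning rates, and step count are arbitrary demonstration values rather than a recipe.

```python
# Illustrative only: a toy GAN that learns to mimic samples drawn from a normal
# distribution with mean 4. The generator maps noise to samples; the
# discriminator scores samples as real (1) or fake (0).
import torch
import torch.nn as nn

torch.manual_seed(0)

generator = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))
discriminator = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())

loss_fn = nn.BCELoss()
g_opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=1e-3)

for step in range(2000):
    real = torch.randn(64, 1) * 1.25 + 4.0   # samples from the "real" distribution
    fake = generator(torch.randn(64, 8))     # the generator's attempts

    # Train the discriminator to score real samples as 1 and fakes as 0.
    d_opt.zero_grad()
    d_loss = (loss_fn(discriminator(real), torch.ones(64, 1))
              + loss_fn(discriminator(fake.detach()), torch.zeros(64, 1)))
    d_loss.backward()
    d_opt.step()

    # Train the generator to produce samples the discriminator scores as real.
    g_opt.zero_grad()
    g_loss = loss_fn(discriminator(fake), torch.ones(64, 1))
    g_loss.backward()
    g_opt.step()

# After training, generated samples should drift towards a mean of roughly 4.
print(generator(torch.randn(1000, 8)).mean().item())
```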
LLM (Large Language Model): A Large Language Model is an AI model trained on very large datasets of text. It is used for tasks involving understanding and generating human language, such as text prediction, translation, and summarisation.
Machine Learning (ML): Machine Learning is a subfield of AI focused on developing algorithms and statistical models that enable computers to perform specific tasks without explicit instructions, relying instead on patterns and inference drawn from data. Approaches include supervised, unsupervised, semi-supervised, and reinforcement learning.
NLP (Natural Language Processing): Natural Language Processing covers the techniques that allow computers to understand, generate, and respond in human languages. It spans many subfields, including sentiment analysis, machine translation, and named entity recognition.
Neural Network (NN): A Neural Network is a computational model used in Machine Learning and Deep Learning, designed to simulate the behaviour of biological neurons and nervous systems. It is composed of layers of interconnected nodes, or "neurons", which process information and pass it on to the next layer, and its weights are optimised via methods such as gradient descent and back-propagation.
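A minimal sketch of these ideas, assuming NumPy: a small two-layer network trained with hand-written back-propagation and gradient descent to learn the XOR function.

```python
# Illustrative only: a tiny fully connected neural network trained by gradient
# descent and back-propagation on the XOR problem.
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1, b1 = rng.normal(size=(2, 8)), np.zeros((1, 8))   # input layer -> hidden layer
W2, b2 = rng.normal(size=(8, 1)), np.zeros((1, 1))   # hidden layer -> output layer

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

learning_rate = 1.0
for _ in range(10000):
    # Forward pass: each layer transforms the previous layer's output.
    hidden = sigmoid(X @ W1 + b1)
    output = sigmoid(hidden @ W2 + b2)

    # Backward pass (back-propagation): push the error back through the layers.
    d_output = (output - y) * output * (1 - output)
    d_hidden = (d_output @ W2.T) * hidden * (1 - hidden)

    # Gradient descent: nudge every weight against its gradient.
    W2 -= learning_rate * hidden.T @ d_output
    b2 -= learning_rate * d_output.sum(axis=0, keepdims=True)
    W1 -= learning_rate * X.T @ d_hidden
    b1 -= learning_rate * d_hidden.sum(axis=0, keepdims=True)

print(output.round(2).ravel())   # should approach [0, 1, 1, 0]
```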
Prompt Engineering: Prompt engineering is a technique in NLP where input queries (prompts) are designed in a particular way to draw the desired outputs from a language model. It is about optimising how questions are put to the AI system.
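A small illustration of the difference a carefully engineered prompt can make; both prompts are invented examples and no particular model is assumed.

```python
# Illustrative only: the same request asked two ways. The second prompt adds a
# role, constraints, and an output format, which typically yields more usable answers.
naive_prompt = "Tell me about our sales data."

engineered_prompt = (
    "You are a data analyst reporting to a non-technical audience.\n"
    "Summarise the attached quarterly sales data in exactly three bullet points,\n"
    "mention the single biggest risk, and avoid jargon.\n"
    "If any figure is missing, say so rather than guessing."
)

print(naive_prompt)
print("---")
print(engineered_prompt)
```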
Reinforcement Learning from Human Feedback (RLHF): RLHF is a machine learning strategy in which an agent learns to make decisions by receiving feedback from humans. It is a form of reinforcement learning where the reward signal originates from human evaluators.
Supervised Learning: Supervised learning is a paradigm in machine learning where a model learns from a dataset in which the correct output (label) is known for each instance. The model generalises from this data to predict outputs for unseen instances.
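A minimal sketch, assuming scikit-learn and an invented hours-studied dataset: the model is fitted to labelled examples and then predicts outputs for instances it has never seen.

```python
# Illustrative only: supervised learning on a toy labelled dataset
# (hours studied -> pass or fail).
from sklearn.linear_model import LogisticRegression

X = [[1], [2], [3], [4], [6], [7], [8], [9]]   # inputs: hours studied
y = [0, 0, 0, 0, 1, 1, 1, 1]                   # known labels: 0 = fail, 1 = pass

model = LogisticRegression()
model.fit(X, y)                                # learn from the labelled data

print(model.predict([[2.5], [7.5]]))           # predictions for unseen instances
```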
Token: A token, in the context of NLP, is the smallest unit of processing. The process of breaking text down into tokens is called tokenisation. Depending on the granularity, a token can be a word, a subword, or a character.
Tokeniser: A tokeniser is a component in NLP that takes raw input text and splits it into tokens: smaller units of text that the model can understand and process. The granularity of the tokens depends on the tokeniser, varying from words to subwords or characters.
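A deliberately simple word-level tokeniser sketch; production tokenisers for large language models usually split text into subword units instead.

```python
# Illustrative only: a naive tokeniser that splits text into words and
# punctuation marks using a regular expression.
import re

def tokenise(text):
    return re.findall(r"\w+|[^\w\s]", text.lower())

print(tokenise("Tokenisers chop text into smaller pieces, don't they?"))
# ['tokenisers', 'chop', 'text', 'into', 'smaller', 'pieces', ',', 'don', "'", 't', 'they', '?']
```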
Unsupervised Learning: Unsupervised learning is a type of machine learning in which models learn from data without any labels. The model identifies structures, patterns, or intrinsic properties of the input data on its own.
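A minimal sketch, assuming scikit-learn: k-means receives unlabelled points and discovers two clusters on its own.

```python
# Illustrative only: unsupervised clustering. No labels are given; the algorithm
# groups the points purely from their structure.
from sklearn.cluster import KMeans

X = [[1.0, 1.1], [1.2, 0.9], [0.8, 1.0],   # one natural group of points
     [8.0, 8.2], [7.9, 8.1], [8.3, 7.8]]   # another natural group of points

model = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = model.fit_predict(X)

print(labels)   # e.g. [0 0 0 1 1 1] (the cluster numbering may be swapped)
```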
Vector Database: A vector database is a specialised data management system for the efficient storage and retrieval of high-dimensional vector data. It is often used together with embeddings in various AI applications.
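A toy in-memory stand-in (not a real vector database, which would use approximate nearest-neighbour indexes to search millions of vectors quickly) showing the core idea of storing vectors and retrieving the closest matches.

```python
# Illustrative only: store (id, vector) pairs and return the items whose vectors
# lie closest to a query vector.
import math

class ToyVectorStore:
    def __init__(self):
        self.items = []                      # list of (item_id, vector) pairs

    def add(self, item_id, vector):
        self.items.append((item_id, vector))

    def search(self, query, top_k=2):
        return sorted(self.items, key=lambda item: math.dist(query, item[1]))[:top_k]

store = ToyVectorStore()
store.add("doc-about-cats", [0.9, 0.1])
store.add("doc-about-dogs", [0.8, 0.2])
store.add("doc-about-cars", [0.1, 0.9])

print(store.search([0.85, 0.15]))            # the two animal documents come back first
```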
Zero-shot learning: Zero-shot learning refers to a machine learning scenario in which the model must handle tasks it has seen no examples of during training. The model extrapolates from its training data to make predictions about unseen categories.

Seeking AI Advice?

For an informal yet private conversation regarding your organisation’s requirements, or to further understand our AI education, strategy, and innovative services, please feel free to contact us.

Why not savour a cup of tea, relish a coffee, or engage in a round of pétanque whilst exploring the potential of AI?