Definitions
Artificial Intelligence can be defined as:
- An entity that performs behaviors a person might reasonably call intelligent if a human were to do something similar.
It's crucial to remember that AI's definition is human-centered and based on our perception of intelligent behavior.
What is Machine Learning?
Machine Learning is:
- A means by which to create behavior by taking in data, forming a model, and then executing the model.
What is a Model?
A model is a simplification of some complex phenomenon. For example, a model car is a smaller, simpler version of a real car.
A model car might be useful for some things, but we can't drive it to the store.
We create models to predict data all the time. A linear or logistic regression is a simple model. Prior to about 2017, most of what we called "AI" was about prediction. Amazon predicting what you would want to buy or predicting what elements on your social media feed would get the most engagement. People tried to develop models to predict language, but they weren't very good at it.
Attention is All You Need
In 2017, a paper, called "Attention is All you Need" introduced a new type of model called a transformer model. At a very basic level, transformers allowed models to pay attention to context, which is very important for language. Older models would only look at the previous word in a sentence and wouldn't do a very good job at predicting language because they didn't have the ability to see larger context.
A more detailed explanation of transformer models can be found here.
Amazing Autocomplete
Keep in mind that these models are still just doing predictions. They are trying to predict what word comes next in a sentence. The way they make those predictions is through a training process ingesting billions of documents from the internet. These models use machine learning to create a model of how language words.
The model predicts words like "dog" or "cat" should come next here, but not "banana" or "pineapple." They still just function like a very advanced version of autocomplete.
The Age of Large Language Models (LLMs)
Large language models first started to appear in 2019-2020, and they weren't particularly interesting because they could write at about a middle school level. This was until late 2022 when ChatGPT was released. ChatGPT was much larger than previous models and appeared to create a new kind of AI that was more capable than we expected. Including exhibiting pretty good performance on standardized tests of knowledge.
USMLE = United States Medical Licensing Exam
These models were tested on medical knowledge, and they performed incredibly well — and remember, these are general models. They are not specifically trained on medical knowledge. General models (that are much larger) actually do better than many models developed just for a specific task. And these models are only getting better...
Generative Artificial Intelligence
We have been talking primarily about large language models but understand that transformer architecture made a whole series of other models possible. There are forms of AI that can generate audio, images, and video as well as take these as inputs (multimodal models). All of these models together are referred to as Generative Artificial Intelligence because they create new content.
Types of Large Language Models (LLMs)
When we talk about different forms of AI, it can get quite confusing quickly to differentiate between the companies that develop models, the interfaces we use to interact with the models, and the models themselves. The following is a list of LLM providers. You will note that a given company has several models. Take OpenAI, for instance, they have four main text models that I have listed (GPT-4o, GPT-4o mini, GPT-4, GPT-3.5). Additionally, they have "reasoning" models like o1 and o1-mini, which basically perform chain-of-thought prompting (interacting-with-llms-prompt-engineering) for you. The public interface that most people use to access these models is ChatGPT, which, as of this writing, gives you access to GPT-4o and GPT-4o mini (two different models).
The version of these models that ChatGPT uses when you log on may change as new updates arrive. If you want greater control for what version of a model you are using, you can access these models through an Application Programming Interface (API). This will give you select a particular model version and gives you greater control over the model behavior. This is probably not necessary for a basic user but should be considered if doing research using AI models.
List of LLM Providers:
- ChatGPT (OpenAI)
- https://chatgpt.com/
- Models: o1, o1-mini, GPT-4o, GPT-4o mini, GPT-4, GPT-3.5
- Claude (Anthropic)
- https://claude.ai/new
- Models: Claude 3 Opus, Claude 3 Sonnet, Claude 3.5 Haiku, Claude 3.5 Sonnet
- Gemini (Google)
- https://gemini.google.com/app
- Models: Gemini 2.0 Flash, Gemini 1.5 Flash, Gemini 1.5 Pro, Gemini 1.0 Pro, Gemini 1.0 Ultra
- Llama (Meta)
- https://www.meta.ai/
- Models: Llama 3 series (8B, 70.6B, 405B). B=billions, the number of parameters (size) of the model)
- Grok (xAI)
- Available on https://x.com (no free tier available)
- Models: Grok-1.5, Grok-2, Grok-2 mini
- Deepseek
- https://www.deepseek.com
- Models: Deepseek-V3
There are thousands of other models available trained for specific applications. You can view many of these models at https://huggingface.co/.
Pricing/Cost/Access
Access to many of the basic models that these companies offer is free, however there are limits to usage with free tiers. All of these companies offer a paid tier that is ~$20 per month that will give you access to more features and models and a higher usage cap. We suggest trying a month of a paid plan as you get started to see the full capabilities these models offer.
If accessing through the API, cost is assigned by the amount of use.