Introduction to Generative
2K views
Jun 23, 2024
Explore the fascinating world of Generative AI with this introductory video. Learn how Generative AI works, from creating realistic images and videos to generating human-like text and music. This video covers the foundational concepts, key technologies, and real-world applications of Generative AI, making complex ideas accessible for beginners. Discover how these powerful tools are transforming industries such as art, entertainment, and technology. Whether you're a tech enthusiast, student, or professional, this guide provides a comprehensive overview to help you understand and harness the potential of Generative AI.
View Video Transcript
0:00
If you're brand new to AI or if you just haven't had time to keep up with all the latest development
0:04
and you've been hearing a lot of buzz about it, in this one video, I want to show you everything you need to know about AI to keep you up to date
0:11
So back in November of 2022, a relatively unknown company called OpenAI released a free website called ChatGPT
0:20
Now within one month of that, over a hundred million people in the world try ChatGPT
0:26
So what is ChatGPT exactly? Well, in the simplest term, it was just a chatbot that could help you write text or you could answer your questions
0:34
And that was the very first moment where AI went mainstream. And AI has been around for a long time, but it really never had a moment like the moment that ChatGPT gave it
0:45
Now, this new wave of AI that ChatGPT started is called Generative AI
0:50
And that means it could generate things like text, it could generate computer code, it could generate images, videos, and audio
0:57
too. Now, there are a lot of different AI tools, though, since chat GPT came out, that could
1:03
generate one or more of these things. So at the very top of the list, the apps that could
1:09
generate text include chat GPT. That was the first mainstream. But after that, Microsoft also
1:15
introduced an AI chatbot, and it was inside of the Bing search engine. It was called BingChat
1:21
But later, almost a year later, they rebranded it to co-pilot. So that's what you'll see now
1:26
Google released one called Google Bard. Now, ChatGPT got a whole different competitor called Claude
1:34
That's by a different company, a startup called Anthropic. Now, Meadow also released one, X.com released one
1:40
So all the big companies basically created their own generative AI model
1:45
Now, all these generative AI models that could produce text as the output
1:50
they are called large language models. So generative AI, think of that as the big category of AI
1:56
and within that you have the large language models that fit into this category
2:01
So how do these large language models work exactly? How come they seem so intelligent
2:06
So these large language models basically get trained for months, sometimes for years, on a massive amount of text data
2:14
So this text data could be public information, for example. It could be from different websites
2:20
It could be from textbooks. Sometimes it's from private information that those specific companies have access to
2:26
Now, this could be millions, sometimes billions of word of text that trains these models
2:32
And once they get trained, basically because they've seen so much words out there
2:38
they become an incredible guessing machine. So technically what they're doing is they're guessing what words comes after another word
2:46
I know it seems very simple, but this is technically in the background how they're working
2:51
and I'm simplifying it obviously. but to really go to the essence of how these work
2:56
they are making educated guess on what word comes after an other word Now this process of training these models with all those words it actually costs tens of millions of dollars typically So these companies are usually the bigger companies that can do this
3:12
because how much it takes to actually train these AI models, these large language models
3:17
But after this initial set of training, these large language models are created
3:23
and at that point, they're referred to as foundational models. So GPT, the foundational model behind chat GPT was created this way
3:33
And after this set of training, they go through another whole set of training called fine tuning
3:38
Now, during that fine tuning stage, you could get these large language models to respond a certain way
3:44
They could have a certain persona or they could have a more specific domain of knowledge
3:50
Now, all the companies that I mentioned so far have these foundational models
3:54
but there are multiple business models behind it that's relevant to regular people using these
4:00
So META, for example, has a large language model they developed called Lama
4:05
But they decided to open source this model, meaning any developer out there
4:10
or any business owner can use it to build apps and AI tools on top of Lama free of charge
4:18
Now, OpenAI, the company behind ChatGPT, their foundational model is GPT, and that other company I mentioned, Anthropic, has Clod
4:28
Now, they are not open sourced, but they have something called an API
4:32
And with an API, you could basically pay them to use their technology
4:37
So this means a lot of people could use the technology behind Chat, GPT, and behind Claude
4:43
the two big language models out there to actually build any type of AI app that they want
4:49
But for everyday people, which I'm making this video for, we don't need to know about APIs
4:54
We're just going to use the regular version of chat GPT and Cloud and other models
4:58
But in the case of chat GPT and cloud, both of these companies also have a paid version
5:03
That is $20 a month. And the only reason you want to upgrade to the paid version, something that I've done
5:08
is they usually give you the best version of that model as a paid upgrade
5:14
So the $20 a month gets you a better version of chat GPT that actually is trained on more data
5:21
and it could give you better responses. And it just generally seems more knowledgeable
5:26
And sometimes there's other limitations like how much data you could feed it to get the information out
5:32
So sometimes you have to pay that upgrade to get the best version of those large language models
5:37
But they all do have a free version too. And then we have Google Bard and we have X.com's GROC and we have that Microsoft co-pilot
5:46
So Bard and GROC are available. GROC is a paid upgrade. Bard currently is free
5:51
and they have a different model running in the background to power these
5:56
So on Bard is called Gemini, for example. And Microsoft Copilot is actually a partner with OpenAI So every time you use CoPilot you technically using the latest version of chat GPT Okay so now you have a good understanding
6:10
of how all this came about, what large language models are, what companies are in play
6:15
So what's the best way to actually use large language models? Well, there's many, many, many use cases
6:21
that are very practical for people in day-to-day activity. For example, we all probably write at least one email every single day
6:29
So these models could write that email for you. You could do that inside a BART
6:33
You could do that inside a chat GPT. BART, for example, connects through extensions with your Gmail
6:39
So you could actually read your Gmail and help you reply by drafting that reply
6:44
You could proofread and rewrite any existing text you have to make it more formal
6:50
to make a shorter, more professional, more friendly. You could translate any language
6:54
You could brainstorm. So a lot of times I'll have a back and forth conversation with chat, GPT
6:58
to come up with a solution to a problem. You could create tables and spreadsheet
7:03
You could write code even if you're not a developer. You could take a screenshot, for example, of a webpage and say, write me code to get
7:11
something that looks like this. This is something you could do with a paid version of chat GPT, for example
7:17
Some of these platforms even have tools where they could yze PDFs that you could give
7:22
it with all kinds of different data that just wouldn't be possible, almost becoming a data
7:26
scientist with these AI tools. And some of these will do a better job than others
7:31
So sometimes you might want to use chat GPT if you're writing or summarizing text
7:36
but maybe when you're doing research, Google Bard is going to do a better job. So it's always best to try multiple models, the free version of these multiple models
7:44
depending on what you're doing day to day. If you're heavy on research and require heavy internet browsing
7:51
Bard and co-pilot usually do a better job. but in some cases if you're doing heavy writing or emailing
7:58
and all kinds of different things chat gpte could outperform those so try it for yourself
8:03
to see which one fits your day to day the best now the way you use these AI tools
8:08
is you simply type in a text prompt in these messaging apps which is really what they are
8:13
and you wait for a reply so the text you input into this chat box is called a prompt
8:20
so the better you get at the prompt the better the output
8:24
from these AI models. And this whole science is called prompt engineering
8:31
simply learning how to craft the right message the right way so these AI chat bots could actually give you a better response
8:39
So that's the part about large language models. Obviously one of the biggest parts of generative AI
8:44
but there's a whole other side to generative AI as well. The other technology that falls in the category of generative AI
8:51
is called diffusion models. So unlike the large, language models we covered. Diffusion models are really designed to create images, videos, and audio
9:01
again from a text prompt Now these are trained in a similar way as the large language models that we covered but they not trained on text They actually trained on images and sounds So the leading apps in this world could take a text prompt and then give you an image or a video
9:17
or an audio file like a music file as the output. Now, in the text to image AI tools right now, the leading company is called Mid Journey
9:26
but OpenAI has a really good tool too called Dolly, and that is available inside a chat GPT
9:31
with the paid upgrade. but meta, Google, Adobe, really most of the big companies out there have some really great text
9:38
to image models as well. And there's one company I want to specifically mention called Stability AI because they are
9:46
the only one that has an open source model that is really good and that is called stable diffusion
9:52
So you'll see a lot of different apps that could turn a text prompt into an image that use that
9:59
open source technology called stable diffusion. And Stability AI, the company that opens source stable diffusion actually has two really good apps
10:06
One is called Dream Studio. And the other one, which is really popular, is called ClipDrop
10:11
And since video is created by a sequence of images, this technology can be used to create generative video as well
10:19
So you have runway, you have Kiber, and you have PICA, and those are some of the companies that are working on text-to-video AI generation
10:27
And in the world of audio, because this diffusion model could generate music and audio
10:31
there's a company called 11 Labs that could create very human-like voices from a simple text prompt
10:39
in a lot of different accents and languages. I'm one of the 11 Labs voices
10:44
I can speak over 20 languages. And with 11 Labs, you can clone your own voice too
10:49
And there are companies like HeyGen that can clone you too. Yeah, just then, that was not me
10:54
That was actually HeyGen and I used that to clone myself. It cloned my voice and it cloned that video
11:00
so I just typed in that sentence and he made that video. Now, as you could imagine, since a lot of these big companies behind this tech have made their tech available
11:09
there are thousands of AI apps built for very specific use cases
11:15
Really, the big companies, for example, Open AI, their mission is to create what's called AGI
11:21
That means artificial general intelligence. They want to build an AI that could do everything
11:26
The smaller companies, though, that are using that technology behind the same
11:30
scenes are more focused on more specific AI tools. So I've put together a list of the top 50 AI tools that I think are worth a try
11:40
And I'll include that as the video that I recommend watching next. And if you want a deeper dive, my team and I have spent the last year creating a Netflix
11:47
style e-learning platform all about generative AI and all the top tools
11:52
So if you want to learn how to effectively use chat GPT, if you want to learn prompt engineering
11:57
if you want to learn the top 50 AI tools, we have entire. courses on nearly all of them on our platform called Skill Leap AI. So I'll leave a link below in the
12:06
description if you want to learn more about that. I hope you found this video useful and I'll see you on
12:10
the next one
#Intelligent Personal Assistants
#Machine Learning & Artificial Intelligence
#Online Media
#Open Source