From 3D model to AI on the Edge - Azure For Sure Ep. 5
4K views
Nov 6, 2023
Join this live session with Stephen Simon and Goran Vuksic around Synthetic Data on March 10 at 9:00 AM Pacific Time (US). ABOUT SESSION Synthetic data is an inexpensive alternative to real-world data that is used to train and improve AI models. In order to train accurate AI models, a large amount of data is needed. With the use of realistic 3D models you can easily create synthetic data for AI classification and object detection. In this session you will see how synthetic data can be created, and how this data can be used to train AI models that will be used on the Edge. Azure Percept will be used for a demo of how a model trained with synthetic data works on the Edge. #AzureForSure #Azure #live
0:00
Thank you
0:29
We'll be right back
0:59
Thank you
2:00
Hi everyone and welcome back to another episode of Azure For Sure
2:09
I'm your host, Simon, and we are back once again. If you're someone who's joining us for the very first time,
2:14
we stream this show, Azure For Sure, every Thursday at 9 a.m. Pacific time
2:21
And some of you may know that this is only a 10-episode series
2:24
and we are almost halfway through. And I couldn't be more excited for today's episode
2:28
because today we're going to host Goran, who is a Microsoft AI MVP
2:33
We'll get into it later on and talk to Goran about what he's going to cover today
2:38
But if you're someone who hasn't joined all the past episodes, in case you have missed some of those,
2:45
wherever you are watching, we are streaming all the episodes over there
2:49
Maybe C# Corner, maybe Cloud Summit, maybe my LinkedIn profile, and a couple more destinations
2:53
You can find all the links over there. And yeah, I think it's time to go ahead and bring our speaker
3:00
Today, we're going to talk about from 3D model to AI on the edge. I definitely have no idea what the content is going to be about
3:06
That is why I'm so excited. So yeah, Goran is an AI MVP
3:11
I am hosting him for the very, very first time. So I'm really excited
3:15
So without any further ado, let's bring our guest into the live show
3:25
Thank you
3:55
Hi, Goran. Welcome to the Azure For Sure live show
4:13
Hello, and thank you. It's great to be here. So, Goran, quickly, since I'm hosting you for the very first time, where are you actually joining us today from?
4:24
From Malmö, Sweden. Yeah. Okay. That's great. That's where I live. I work in Copenhagen in Denmark, which is just across the bridge
4:35
commuting between two countries each day. Yeah. And I'm originally from Croatia. That's great
4:41
So what time is it at your place now? It is 6 p.m. All right
4:46
6 p.m. Mine is 10:30 p.m. I just had my dinner. So, Goran, you are a Microsoft AI MVP,
4:53
right? So quickly, I know, I follow you on LinkedIn, I know you do a lot of stuff with these robots. I
5:00
saw the picture in the beginning, you have these robots and all this stuff, right? So what do you do
5:04
actually in this community and in your workday? So first, in my workday I work as
5:13
an engineering manager in a company named Pandora. Maybe you heard about it, the most popular jewelry brand in
5:20
the world, and I'm leading the data infrastructure team. And yeah, in my free time I'm a tech guy
5:29
who likes to, you know, write blog posts, experiment with technology, try to do a lot of
5:35
cool stuff. I often use some toys in my projects. LEGO sets are my favorite, you will see some
5:43
today, and so on. And I attend a lot of conferences, workshops, and such
5:52
Yeah, I mean, definitely you're involved in a lot of cool stuff, I see there. Goran, today you're
5:58
going to talk about from 3D model to AI on the edge. I'm really excited about it, because I've never
6:03
seen a session title that has 3D and AI together, so I'm really excited about it. I'm not going to take
6:10
much of your time, Goran. I see you're already sharing the screen. I'm going to add it to the
6:13
stream so everybody else can see it. And the next 25 to 30 minutes are all yours. Thank you. Okay. So I
6:20
gave a quick introduction to myself. So I will just add that you can find me on Twitter and
6:28
LinkedIn. Feel free to connect. Feel free to let me know what kind of cool AI IoT projects you work
6:34
on. And yeah, if you enjoyed this session, maybe you want to join some of the upcoming ones
6:42
So today we will talk a bit about computer vision and AI in general. Then we will
6:55
slowly build the story through some basic examples that I will show you, and go into the
7:03
story about synthetic data: what it is, how it's used, why it is important, and why it is actually
7:10
the future of AI. So, computer vision. I will not take long about it, everybody knows what it is, and
7:19
there are some common tasks we are trying to do with it: image classification, telling, okay, is this
7:29
picture a cat or a dog; object detection, figuring out objects in the image; OCR; facial recognition;
7:39
pose estimation, and so on. So our computers nowadays are able to see
7:44
and we can define different tasks for them to do that. And through this talk, we will focus on object detection
7:55
So let's imagine we have this picture, right? With object detection, we could, for example
8:00
identify that the glasses are here. If you use this picture in some standard
8:06
Azure Computer Vision, it would probably also detect the dog, detect the laptop in the background, and so on
8:16
So with object detection, we are trying to find, recognize some object
8:22
that is in the picture and find the bounding box where this object exactly is placed, right? And you can use, as I said, Computer Vision for some
8:34
standard tasks. It is trained to recognize thousands of general objects. But what if you need something
8:42
custom? If you want to recognize this LEGO Batman, for example, right? Then you need to train your own model, and you can do it with Custom Vision. I will show you one project that I did a long time ago
8:55
and it's called Where's Chewy? We are looking for this LEGO minifigure Chewbacca on this board,
9:04
trying to recognize it. And this is something that the Computer Vision model is not trained for, right?
9:11
We are using some custom objects over here and trying to do that
9:18
If you want to do it on your own, it's super simple
9:23
Just go to customvision.ai, create a project there, just give it a name, define a resource
9:29
You would like to do object detection in a general domain, and you can start with your project
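For reference, creating such a project can also be done programmatically. This is a minimal sketch against the Custom Vision training REST API (Node.js 18+ with global fetch); the region, training key, and domain ID are placeholders, and the "General (object detection)" domain ID can be looked up via the /domains endpoint.

```javascript
// Minimal sketch of creating a Custom Vision project via the training
// REST API. <region>, <training-key>, and <detection-domain-id> are
// placeholders you would fill in from your own Azure resource.
const ENDPOINT = 'https://<region>.api.cognitive.microsoft.com';

const res = await fetch(
  `${ENDPOINT}/customvision/v3.0/training/projects` +
    `?name=WheresChewy&domainId=<detection-domain-id>`,
  { method: 'POST', headers: { 'Training-Key': '<training-key>' } }
);
const project = await res.json();
console.log(project.id); // used by all later training calls
```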
9:35
So how does it work? First, you need to upload images of this object that you are trying to recognize. You need to tag the images, define where in those
9:48
images the object is. You need to train the model, and then you test. Pretty simple, probably most of
9:55
you did it already. Just a quick walkthrough: we are uploading images. You see I took a lot of
10:04
pictures of the Chewbacca minifigure standing on the LEGO catalog, just to have some different backgrounds so
10:12
Custom Vision can more easily identify where the object is. Then for each of those pictures, we need to define,
10:23
okay, where is our Chewbacca on it, right? Define our own bounding box and give it a tag,
10:34
Chewy, as you can see in this picture. And once you do it for all the images
10:40
you can train the model. And over here on the top right, you
10:44
can see there is a quick test. So immediately in this web interface
10:50
you can click the quick test. You can select some picture that this model
10:56
hasn't seen before and try with that picture to identify where the object is
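The same upload-train-test loop can be scripted. Here is a minimal sketch using the Custom Vision training REST API from Node.js; the endpoint, project ID, tag ID, key, and file names are placeholders, error handling is omitted, and for object detection you would additionally post bounding-box regions for each image, which is simplified away here.

```javascript
// Minimal sketch of the upload -> train -> quick-test flow (Node.js 18+).
import { readFile } from 'node:fs/promises';

const BASE =
  'https://<region>.api.cognitive.microsoft.com/customvision/v3.0/training/projects/<project-id>';
const headers = {
  'Training-Key': '<training-key>',
  'Content-Type': 'application/octet-stream',
};

// 1. Upload a tagged image of the object
await fetch(`${BASE}/images?tagIds=<chewy-tag-id>`, {
  method: 'POST',
  headers,
  body: await readFile('chewy_01.jpg'),
});

// 2. Train a new iteration of the model
await fetch(`${BASE}/train`, { method: 'POST', headers });

// 3. Quick test with an image the model has not seen before
const res = await fetch(`${BASE}/quicktest/image`, {
  method: 'POST',
  headers,
  body: await readFile('unseen_board.jpg'),
});
const { predictions } = await res.json();
// Each prediction carries a tagName, probability, and boundingBox
console.log(predictions);
```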
11:02
This was the first test, and it used a really small amount of images
11:09
And as you can see, detections are not correct, right? It's detecting something, but that is not Chewbacca
11:17
And on the bottom right, you can see that it's only 20% certain about it
11:23
So what we need to do now, this red line: we went through the flow, we tested the model, we are not
11:30
happy with the result, we go back, we upload more images, we tag them, again train the model, and we test it
11:37
again. Uploading more images will give us a better result. As you can see over here, Chewbacca is
11:47
identified correctly with 40.8 percent certainty, the probability that this detection is correct.
11:57
Giving this model more and more images will make it more accurate, right? We will more easily
12:06
identify where the object is that we are looking for. But if we step a bit back and
12:17
think, OK, how this actually works, and look at the whole picture, we see that it's really time-consuming
12:25
to take those images, to upload them, and to tag them. This is where most of our time is going
12:34
So is there something that we could do to make this process easier
12:41
And this is something where synthetic data can help us out. And by definition, synthetic data
12:52
provides this inexpensive alternative to real-world data that is used to create accurate AI models
13:02
So with the use of 3D models that are photorealistic, we could generate a lot of images
13:13
and train our model with those images. So we could shorten this time for data collection and tagging
13:19
We could also, in production, minimize costs for data collection
13:25
We can even reduce the bias in training data and get more accurate AI detections
13:32
That's for sure. So imagine, for example, you want to detect some specific object
13:40
And nowadays, in whatever industry we talk about, you will most likely have a 3D model of your product
13:50
So you could use the 3D model to generate images and to train the model
13:57
And synthetic data is not something that I came up with or that is my idea
14:03
Far from it. Synthetic data is a really hot topic
14:09
You could see it on Gartner's hype cycle last year. And there is this research that you see now on the screen
14:20
It was published two years ago, and it predicts that the use of synthetic data will be three to four times higher in 2030 than the use of real data
14:33
Because with real data, when you are taking pictures, you need to think about GDPR and many other things
14:42
And with synthetic data, you can generate those pictures and train your models
14:50
Okay, so this sounds great, right? But what kind of solutions do we have out there?
14:57
Unity is working on their solution called Perception. It allows you to bring in the 3D model of your object
15:09
And then with this Perception package, like a plugin for Unity, you can generate the pictures
15:18
And over here on the right side, you see this object in the middle and a lot of other objects in the background, right
15:28
And it should look like something on the left here once you take the picture. And Perception
15:37
will automatically tag the object that you are looking for, right, and generate
15:46
the pictures for you, and you can use these pictures to train your model. There is also a
15:53
solution from NVIDIA. It is called Omniverse Replicator, and this is like a simulation
16:03
framework that will allow you to generate synthetic data. It is mostly focused on
16:11
autonomous vehicles and on training robots. So for situations like robots in warehouses
16:19
and such, this is really great, you can check that out also. But just as every presentation
16:29
needs to have a slide with a bold statement, mine has one also, and I'll tell you that we can
16:38
generate synthetic data even with JavaScript. Don't get me wrong, JavaScript is the most popular
16:45
programming language in the world. Actually, Python is; JavaScript is the most used, sorry. But the thing is,
16:53
my take on it is always: if it works in the browser, then it works anywhere. You know, if you can
17:00
do it in a browser, you don't need some super fancy tools for it. You can do it
17:11
on your own, and I'll show you how. There are several libraries, like Three.js
17:21
in this example, that allow you to use 3D objects in a browser, so you can render them
17:29
by use of WebGL. You can find Three.js on its official website, threejs.org, and you will find a link to GitHub,
17:43
and you can find a lot of examples there. So let me switch now to my browser, where I have this
17:50
page open. Over here you can see some examples that people created, but also
18:02
over here you have official examples that you can open up. And now you can see here that a 3D model is
18:12
loaded in the browser, and you can also notice that this model is animated. That's pretty cool
18:18
Over here you can see a lot of different examples of what Three.js is capable of doing
18:28
It can define different effects, lighting, and so on, which can also help you generate those pictures that you would like to use correctly
18:44
So when you train the model, one of the things you need to keep in mind is, okay,
18:53
lighting out there in the real world can be different, and with the use of
18:59
some options that Three.js is offering, you can simulate such things. Okay, so also over here
19:11
you can see light from a different angle. So let me show you how it works live,
19:20
actually. I will switch to the terminal quickly, and over here I will start Three.js locally
19:30
Now it's running, and I will switch to Visual Studio Code. Here I have one Three.js
19:45
script that we will go through. I will explain to you what it's doing, and we'll see how
19:54
it looks in a browser. It's pretty simple, and this is not something that I wrote;
20:04
I just added a few lines of code. This is the standard example that you can find if
20:10
you check Three.js. We have some container element, and we are adding the camera, setting
20:21
up the camera, and setting up the scene. We will also, for that scene, set up the hemisphere light
20:27
and directional lights; all those are standard things. In the scene we will have a 3D object added.
20:35
And we don't want this object to be flying somewhere in the air, so we want to define the
20:43
ground, one mesh that is the ground, and we will put a grid on that ground just to make it a bit more
20:50
visible. We will load one model, and in this case I'm using R2-D2. If you can see over here what I'm
21:01
holding in my hand, this is a LEGO R2-D2, and what I have actually is a realistic 3D model of it. So I'll be
21:09
able to use a 3D model to generate a lot of images, train the model, and recognize this object
21:20
So the model has never seen the real object. It will just use the synthetic images
21:26
So getting back to the code, we load this model. And we add it to our scene
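As a reference point, here is a minimal sketch of the setup just described, assuming Three.js with its FBXLoader addon. The container ID, model path, and numeric values are illustrative, not the exact demo code.

```javascript
// Minimal sketch: camera, scene, hemisphere and directional lights,
// a ground mesh with a grid, and an FBX model loaded into the scene.
import * as THREE from 'three';
import { FBXLoader } from 'three/examples/jsm/loaders/FBXLoader.js';

const container = document.getElementById('container');

const camera = new THREE.PerspectiveCamera(
  45, window.innerWidth / window.innerHeight, 1, 2000);
camera.position.set(100, 200, 300);

const scene = new THREE.Scene();

// Hemisphere and directional lights, standard for this kind of scene
const hemiLight = new THREE.HemisphereLight(0xffffff, 0x444444);
hemiLight.position.set(0, 200, 0);
scene.add(hemiLight);

const dirLight = new THREE.DirectionalLight(0xffffff);
dirLight.position.set(0, 200, 100);
scene.add(dirLight);

// Ground mesh plus a grid so the model does not appear to float
const ground = new THREE.Mesh(
  new THREE.PlaneGeometry(2000, 2000),
  new THREE.MeshPhongMaterial({ color: 0x999999, depthWrite: false }));
ground.rotation.x = -Math.PI / 2;
scene.add(ground);

const grid = new THREE.GridHelper(2000, 20);
scene.add(grid);

// Load the FBX model (path hypothetical) and add it to the scene
const loader = new FBXLoader();
loader.load('models/r2d2.fbx', (object) => scene.add(object));

const renderer = new THREE.WebGLRenderer({ antialias: true });
renderer.setSize(window.innerWidth, window.innerHeight);
container.appendChild(renderer.domElement);
renderer.setAnimationLoop(() => renderer.render(scene, camera));
```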
21:34
I will soon uncomment the other parts so you can see what they are about
21:41
Okay, so when we load the scene, we see our 3D model
21:46
I'm able to rotate it around, zoom in, out, and check it out
21:51
That is our 3D model of R2-D2. Okay, it's not really realistic
21:56
Something is missing. Our textures, right? So if we go back to the code, and now I uncomment the textures,
22:05
we have a texture for the head and one for the body. This 3D object consists of two elements
22:14
So we are looking for a child that is the head, and we will apply the head texture; to the body, we will apply the body texture
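Continuing the sketch above, applying the two textures by child name might look like this. The child names 'head' and 'body' and the texture paths are assumptions about this particular model.

```javascript
// Replace the earlier loader callback: apply one texture per named child.
const textureLoader = new THREE.TextureLoader();
const headTexture = textureLoader.load('textures/r2d2_head.png');
const bodyTexture = textureLoader.load('textures/r2d2_body.png');

loader.load('models/r2d2.fbx', (object) => {
  object.traverse((child) => {
    if (!child.isMesh) return;
    if (child.name === 'head') {
      child.material = new THREE.MeshPhongMaterial({ map: headTexture });
    } else if (child.name === 'body') {
      child.material = new THREE.MeshPhongMaterial({ map: bodyTexture });
    }
  });
  scene.add(object);
});
```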
22:22
Okay, I will save this, go back to the browser, and there it is, R2-D2, right?
22:31
Looks really realistic. Now let's go back to the code. What else are we able to do?
22:38
As you can see over here in the scene, there is this ground mesh with the grid
22:42
We don't actually want that. So we would like to use some kind of texture in the background
22:48
And here I will add the texture. I'm using standard Star Wars background
22:57
wallpapers that you can find on the internet. But before adding it, I want to remove the ground and the grid.
23:08
So when I save this and reload, I can see my R2-D2, right, and it's using the Star Wars background. So
23:21
one more thing we could do to make this more fun is to animate our camera. There is one
23:34
small function that is animating the camera, so I will uncomment this, save, head back to the page,
23:44
reload, and you can see that now the camera is animated and the background is fixed. That means if we take
23:54
screenshots with this, we can generate a lot of pictures of this object from many different
24:03
angles, right? And those pictures we can use for the training.
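Those two tweaks, swapping the ground and grid for a background image and orbiting the camera, might look roughly like this, continuing the same sketch; the wallpaper path, orbit radius, height, and speed are illustrative.

```javascript
// Drop the ground plane and grid, use a flat background image instead
scene.remove(ground);
scene.remove(grid);
scene.background = new THREE.TextureLoader().load('backgrounds/starwars.jpg');

// Orbit the camera so every rendered frame shows a new angle
let angle = 0;
renderer.setAnimationLoop(() => {
  angle += 0.01; // slow orbit around the model
  camera.position.set(Math.sin(angle) * 300, 150, Math.cos(angle) * 300);
  camera.lookAt(scene.position);
  renderer.render(scene, camera);
});
```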
24:14
So how is this done? There is a really nice framework called Playwright, and it allows you to do a lot of things: cross-browser, cross-platform, different
24:30
languages, and it is great for testing your web pages. You can find it on the website
24:40
playwright.dev. I will let you explore the possibilities with it. Maybe you've already
24:48
seen it, it's really popular. But if you use it to take screenshots, and, for example, in your code you
24:55
also change the backgrounds, you can generate a lot of those pictures and you can
25:00
train your model. So basically here you can see I'm changing different Star Wars backgrounds.
25:07
I can train the model and try to recognize the object. I can run the quick test here in the browser.
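A screenshot loop with Playwright could look like the following minimal sketch; the local URL, viewport, delay, frame count, and output paths are assumptions.

```javascript
// Minimal sketch: load the local Three.js page and screenshot the
// orbiting model at intervals to build a synthetic training set.
import { chromium } from 'playwright';

const browser = await chromium.launch();
const page = await browser.newPage({ viewport: { width: 1280, height: 720 } });
await page.goto('http://localhost:8080'); // wherever the scene is served

for (let i = 0; i < 200; i++) {
  await page.waitForTimeout(100); // let the animated camera move on
  await page.screenshot({ path: `dataset/r2d2_${String(i).padStart(3, '0')}.png` });
}

await browser.close();
```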
25:12
But I could also test it somewhere else. How about at the edge? AI is going to the edge.
25:18
There are devices nowadays that are capable of processing AI models directly at the edge
25:26
and uploading just the data points to the cloud. One such device is Azure Percept
25:32
It was released last year by Microsoft, and you can find it on the official website
25:40
This is how it looks. Over here, we have a central unit with the camera and audio array
25:49
The camera is on the right, the audio array is on the left. In the Azure portal, you can find Percept Studio
25:58
And there you can easily train your models or publish the standard models
26:03
You will have the option to publish standard models for object detection, people detection, and so on
26:09
You could try out the voice templates like inventory in a warehouse, for example
26:16
But you could also go there and test the model that you just trained in Custom Vision. So basically here, through the simple UI, you can publish that model and test it on the device. You can connect to the device via the browser
26:32
and see the video of what is happening. And here is the video of exactly that,
26:39
figuring out this object, our R2-D2, based only on synthetic images. For this purpose it
26:51
used, I think, around 200 pictures or something like that, and it's already detecting with high
26:57
accuracy. Pretty cool, I believe. I also want to show you quickly one project I work on in my
27:05
free time: a web tool called Synthetic AI Data. It's a project supported by Microsoft for Startups
27:13
Microsoft for Startups can give you Azure credits if you have some startup project
27:18
This is mine. I applied there and got accepted. If you are working on some project, please check it out
27:25
It is a really great support program for startups. What I'm building in my free time is, as I said, one web tool where you can go in, upload the model,
27:39
generate those pictures, and integrate them directly into Custom Vision
27:46
So, for example, over here you can see, just give me a second to turn this off.
27:53
OK, here we have a realistic model of a sea turtle. So through the tool you can easily pick the turtle,
28:04
apply the texture, basically what you've seen now in the code. You can choose different
28:12
backgrounds that you would like, different lighting effects, camera rotation. You can generate it and
28:19
you can integrate it into Azure Custom Vision. And those are the test results. With only the use of a
28:26
model that is photorealistic, on a real picture, this is the detection in the Custom Vision test
28:38
So, as I said, a simple tool that I am building in my free time: you upload a model, configure options,
28:48
and download the generated data. It should be released soon. If you want to follow for updates, feel free
28:56
to check out the LinkedIn page. So hopefully through this short session I gave you some ideas
29:04
about why AI is going toward synthetic data and how this synthetic data, for example,
29:11
can be generated for computer vision. So thank you. Absolutely great, Goran. And I was just watching
29:22
from behind the scenes that you've been using so many tools. There was Azure, there was
29:26
Unity, there was Three.js, there was VS Code, and then towards the end you had this amazing
29:32
Azure hardware that you just showed, Percept, right? Is that right, Azure Percept, right? Yes,
29:37
yes, yes. So, quick question, you know, I told you, right, there might be some dumb
29:42
questions that I might be asking you. So you did have a physical model, right? One could take a picture
29:48
of that and upload it and train it, right? But what you actually did is that you wrote
29:54
code, right, and you created a 3D model. Is that what really happened? No, no, I'm using
30:00
an already-made 3D model. But you can also imagine it this way: many of us already have 3D printers,
30:07
right? Yes. You are using a 3D model to print it out, and now you would like to recognize the object that
30:13
you 3D printed. So, you know, why not use the same model and generate the pictures from it, train the
30:21
AI, and you are immediately able to recognize it. Yeah, that's cool, that's cool. So, and all the
30:28
textures that you added, that was all happening through the code, right? I can use any texture that
30:34
I want. Yes, definitely. Perfect. And also towards the end, Goran, when you were talking about the
30:40
startup that you have, where you can upload your model and upload any texture. So what are the
30:46
file extensions of these models? I did see you share some of the libraries, and people can
30:51
download them. But what are some of the extensions that one can upload? That is the thing with 3D
30:59
models: there are so many different standards, I would say. What you're seeing at the moment is
31:07
FBX models that work; it's optimized for them at the moment. Perfect. So I did see you also had
31:17
Unity over there. So people who are working with Unity also spend a good amount of time in Blender, right? So
31:23
if there's a 3D character in Blender, I can use that and go ahead and train my
31:28
model, and then I can use Custom Vision to recognize it, right? That can happen? Yeah, Unity
31:35
has a solution, an extension, that can generate pictures for you, so you can also try that out.
31:43
NVIDIA is also building a solution for that. So yeah, many solutions are popping up,
31:51
because this is becoming more and more of a hot topic. We need to be able to generate data for AI training.
32:01
Yeah. For the simple reason that, you know, if you want to recognize this small guy
32:09
you need to take a lot of pictures from different angles, in different lighting, to get the high accuracy
32:17
And that's what we are looking for. Yeah, I get it, right? Sometimes, I mean, lighting is very important, sometimes the camera
32:26
angle; you need to make sure that you're taking pictures from all different angles, and it can be
32:29
hectic sometimes. Yeah, Unity, I love the tool Unity, I've been using it a lot with C#
32:35
and all, so I love that. I think many people are watching, and once we put it out there,
32:39
people will love it. Now, Goran, a quick question. I know we're only 32 minutes into
32:45
this session, Goran, you sounded very passionate about this entire ecosystem, right? This 3D object
32:52
this modeling, gaming, cloud and all. But there might be a few things that might be very challenging
32:58
for you, right? I mean, during the process, what is the most challenging task that you have faced
33:05
Like, creating this, I'm not sure, maybe it was too easy for you. I'll twist it a
33:18
bit, I'll say it this way: I usually say that I'm a very visual guy, and, you know, when
33:25
I try to present something, or even at meetings, I often draw some things and things like that.
33:33
So playing around with 3D models, applying textures, changing lighting, this and that, rendering different scenes and such, it's really fun
33:43
And I enjoy it, really. So that's great. You definitely do, Goran
33:49
So before we leave, I'll ask you one final question. And I keep on saying it's a final question
33:53
But apart from all this tech stuff, Goran, coding, 3D modeling, AI, cloud, what do you do in your free time
34:00
I know you are very creative, but still, do you like cooking, or do you like playing music, anything
34:05
around that? Yeah, I'm a father, I have a teenager who is 14, he's interested
34:14
also in tech, and I spend time with my wife and him. And yeah, I also like music, I'm on the
34:25
heavy metal side of things, I would say. And yeah, in short, that's me. I like to travel
34:33
and such, and hopefully we are getting back to more in-person conferences, so I look forward to that.
34:41
Yeah, there are a few going on, I'm also looking forward to it. So, Goran, thank you so much once
34:45
again. That was an absolute pleasure having you. I was really excited to host you, and we'd love
34:50
to have you back when you're available. And yeah, thank you, and hope to see you soon.
34:55
Bye-bye. Thank you. Thank you, everyone. We'll see you next week. Bye.