#716 - AI-Generated Amazon Product Videos
Ecom Podcast

Summary

"Learn how AI expert Andrew Bell uses AI to transform still photos into cinematic product videos, enhancing brand presence and increasing conversion rates without the high cost of traditional video production."

Full Content

Speaker 2: In this episode of the Serious Sellers Podcast, we have AI expert Andrew Bell join us, and he is talking about how you can create videos from still photos using AI.

Unknown Speaker: How cool is that? Pretty cool, I think. Hello, everybody, and welcome to another episode of the Serious Sellers Podcast by Helium 10. I'm your host, Bradley Sutton, and this is a show that's completely BS-free, unscripted and unrehearsed, organic conversation about serious strategies for serious sellers of any level in the e-commerce world. We have a really exciting webinar. I know you guys are really excited for this one. We have Andrew Bell, who has been doing all of our AI webinars, and he's an AI expert who really understands e-commerce as well, which is really helpful for all of us: he's not just showing us a concept, he's going to show us how to use it for e-commerce. So I'm going to go ahead and just bring Andrew on. Hello, Andrew.

Speaker 1: Hey, how's everyone doing?

Unknown Speaker: Yeah, good.

Speaker 2: And I'm sure they're very excited to hear about this AI. I know for me, especially, it's really expensive to do a lot of this stuff without AI, so I think a lot of people are very happy to know that you can create cool videos from a still picture using AI. And so I'm excited to hear this, and I know people are definitely excited in the comments. So do you want to go ahead and take it away?

Speaker 1: Yeah, let's do it. I want to get right into it, and as a start, I want to be clear about what we're doing. I'm not talking here about AI-generated UGC, user-generated content. Honestly, most of that looks bad. It's uncanny, it's inauthentic, and it often hurts brand trust more than it helps. So what we're doing here is completely different. And I'm not going to tell you that AI is going to replace your studio. If you have the resources, I think you should keep going with that. But we're using, and I think this is important, AI as a studio, not the studio. So it's not a substitute for real people, for influencers and things like that. It's about turning your existing lifestyle imagery, the assets you've already invested in, into cinematic brand motion content. If you have a video studio and a photography studio, you should definitely use those resources. But a lot of people only have something where they can take pictures, right? Or maybe they can't take pictures, so they use AI to take their products and put them into lifestyle scenes. But the hardest thing to do is probably to put something into video. And we know video converts even more than static images. So how can we turn those into cinematic, brand-driven motion content? Again, this isn't about pretending to have influencers or fake people holding your product. It's about giving your product its own presence: the lighting, the motion, the story, all designed to make your product look like it belongs in a professional commercial environment, not a fabricated influencer clip. So if you've been seeing those AI UGC videos floating around, that's not what we're talking about today. This is about quality, storytelling, and speed, and no shortcuts. And so, one thing I'm going to do before we dig right into examples is share two GPTs that I built for you guys. One is more simple in the way that it does things.
So, for example, it'll take a lifestyle image and bring it to life within a 10-15 second clip by staying within that lifestyle image, whereas the other GPT is going to actually create a scene from scratch that's relevant to your product. And so I'm really excited to get into this. But first, I'd like to start with some tips that I believe are really good. So, first thing is to start simple. You want to begin with short, focused prompts that describe a single subject, simple motion, and clear lighting. Then from there, add complexity gradually, right? You don't want to just say, oh, create it, it's studio-ready. No, you want to iterate and iterate and iterate, because even in a studio, when you do videos, a lot of the work is actually the editing. So here, you're actually getting the creative assets, and from there, you're able to put those together using editing software and things like that.

Speaker 2: Thinking about selling on TikTok Shop? Or maybe you're already in it and you're ready to scale. Unlock all of Helium 10's brand new TikTok Shop tools with our Diamond plan. Everything from bulk Amazon-to-TikTok listing conversions to instant Amazon MCF fulfillment. Best of all, use the code TT10 to get 10% off Diamond for six months, even if you've used a coupon before. So go ahead and upgrade and let Helium 10 do all the heavy lifting for you so you can focus on what really matters. For more info on our new TikTok Shop offerings, visit h10.me/tiktok. I'll see you there.

Speaker 1: So we're going to go right into looking at the Lifestyle Image to Cinematic Motion GPT. Basically what you do is you upload an image of a certain product. So let's say it's this one, a water bottle. And here it'll create five different prompts that are hyper-relevant to this image. So for example here, option one, Arctic Refresh: cool, diffused light ripples across the metallic bottle surface as condensation beads form and slowly trickle downward. A hand enters frame to lift the bottle, ice cubes inside clinking softly; the water in the background ripples subtly and animates; the camera remains static. And the intent is to emphasize purity, endurance, and the sensory chill of long-lasting cold. So all these tips I'm going to recommend might seem like, oh gosh, do I have to be a cinematographer? No, but I'm giving you GPTs that'll help you form prompts that'll bring images to life. And you see several different kinds here. So, soft daylight from the left brightens the bottle surface; a hiker's hand clips the carabiner to the backpack loop, testing the grip before setting it down again. And then you see option three, Contrast of Elements, then option four, Hydration Focus, and then option five, Explorer's Pause, here. And then if I want to go with a lifestyle image of a different bottle and I click this here, it'll come up with another five as well. So you can put as many images as you want in here and it'll produce these prompts for you. So here you can make this image go: okay, a gentle hand reaches down to grab the bottle while the skateboard here slowly spins in the background; subtle reflections glimmer on the metal surface; motion is fluid and minimal. The intent is to convey motion readiness and youthful energy. And we're going to actually go and test a lot of these too. So, condensation beads appear briefly on the bottle, catching light before fading. With these, you might see some and go, ah, no, that wouldn't work. But then others you're like, yeah, that definitely fits what I'm going to do.
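To make the "start simple, then layer complexity" tip concrete, here is a minimal sketch in plain Python of building a prompt up in passes instead of writing one giant prompt at once. The base prompt and refinement lines are illustrative only, not output from Andrew's GPTs:

```python
# Illustrative only: layering detail onto a short base prompt one pass at a time,
# mirroring the "start simple, then add complexity gradually" tip from the webinar.

base_prompt = "Stainless water bottle on a kitchen counter, soft morning light, camera static."

# Each refinement is something you'd only add after the simpler version already works.
refinements = [
    "Condensation beads form on the bottle and slowly trickle downward.",
    "A hand enters frame from the right and lifts the bottle; ice cubes clink softly.",
    "Background window light ripples subtly; keep the camera static throughout.",
]

def build_iterations(base: str, layers: list[str]) -> list[str]:
    """Return the sequence of prompts to try, shortest first."""
    prompts = [base]
    for layer in layers:
        prompts.append(prompts[-1] + " " + layer)
    return prompts

for i, prompt in enumerate(build_iterations(base_prompt, refinements), start=1):
    print(f"--- iteration {i} ---\n{prompt}\n")
```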
So we're going to look at prompts specifically from here, and then we're also going to do prompts from this as well. So what you do is you either describe your product or you upload the product itself. So let's do the same one here, and we'll create an even more in-depth prompt. So here it's going to say, okay, format and look, let's do 15 seconds. It's going to give you the aspect ratio, the way it's going to capture it, the frame rate, motion, texture, lenses and filtration, grade, palette, lighting and atmosphere, location and framing, wardrobe, props, textures, and sound. And this is actually based on what OpenAI, the creators of ChatGPT and Sora 2, actually gave in a guide. So this prompt is based on that. Then it gives you an optimized shot list. It'll give you a 15-second scene here, broken up into four seconds, then another four seconds, another four seconds, and then another three seconds. It'll give camera notes and then the finishing touch as well. So without further ado, let's try this prompt here in Sora 2. We're going to go right here like this, and we're going to go, say, 15 seconds. You want to follow the format there, so just go to portrait, and then you have 15 seconds, and then we're going to hit enter there. And one thing I really suggest, because you want to iterate, is to have multiple scenes going at once. So I'd go ahead and do another one to test that. Say portrait here, enter here. And now we're going to go back to one of these. So what we're going to do is take this image right here, and we're going to take one of these prompts here. So let's say... let's just try this one here. So what we'll do is upload the image here in Sora 2 and then we'll do the prompt here like this. For this one it really doesn't matter much which duration you pick; I recommend portrait just because this image is square and it'll actually be better if you go more vertical this way. In most cases with Sora, and this is important too, you can't use images that already have people in them. However, you can use images that have people in them as long as they aren't showing the face and things like that. That's the big thing. So we're going to create here. And I'm going to show you something funny. What I mean by iterate is you start with something like this from a lifestyle image, and I'm going to show you some bloopers of what happened. So I do the video. You can see the hand going down like this, right? And then you ask it to do it again, and it's just a hand reaching out really awkwardly. And look, it doesn't even have a head. So these are things you have to continue to work with. And one thing I've done with the prompt is, with the GPTs, I've made it much better. And you can see this one is definitely much better. You can see someone going to the mirror, looking, and you can put text in the background. You've got to watch it too. Again, I'm kind of showing you the inside look of, this is what it looks like to do it. It's going to take time. So here, notice it's not as good. Here, the problem is the girl's walking up and, look, she's not even showing up in the mirror. So it looks good that she's walking in front of the mirror, it shows the size of the mirror, et cetera, but the problem here is it's not actually showing her face. Whereas with this one, it does. She comes in, looks at herself like this.

Unknown Speaker: A little better, right?

Speaker 1: So there are things like that you can do.
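The in-depth prompt the GPT produces follows a consistent structure: duration, aspect ratio, frame rate, lens and filtration, grade, lighting, location, and a shot list broken into 4 + 4 + 4 + 3 second beats. Here is a rough sketch of that structure in Python, useful if you want to keep your own prompt templates outside ChatGPT; the field names and example values are illustrative and not the GPT's exact schema:

```python
from dataclasses import dataclass, field

@dataclass
class Shot:
    seconds: int
    description: str

@dataclass
class CinematicPrompt:
    # Field names are illustrative; they mirror the categories the GPT walks through.
    duration_seconds: int = 15
    aspect_ratio: str = "9:16"          # portrait, as in the Sora 2 walkthrough
    frame_rate: str = "24 fps"
    lens: str = "35mm, light diffusion filtration"
    grade_palette: str = "cool neutrals, soft highlights, lifted blacks"
    lighting: str = "soft daylight from camera left, subtle urban reflections"
    location_framing: str = "contemporary urban plaza with reflective architecture"
    shots: list[Shot] = field(default_factory=list)

    def to_prompt(self) -> str:
        """Flatten the structured spec into a single prompt string."""
        beats = "; ".join(f"{s.seconds}s: {s.description}" for s in self.shots)
        return (
            f"{self.duration_seconds}s, {self.aspect_ratio}, {self.frame_rate}. "
            f"Lens: {self.lens}. Grade: {self.grade_palette}. "
            f"Lighting: {self.lighting}. Location: {self.location_framing}. "
            f"Shot list: {beats}."
        )

spec = CinematicPrompt(shots=[
    Shot(4, "wide establishing shot, bottle on plaza bench, skateboard idle"),
    Shot(4, "medium close-up over the shoulder, hand reaches for the bottle"),
    Shot(4, "tight close-up, condensation on bottle, light brushing the surface"),
    Shot(3, "slow dolly-in to logo, camera settles, hold"),
])
print(spec.to_prompt())
```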
So here, this is actually just for a general water bottle. This is the scene that we were looking at together: 15 seconds, the aspect ratio, and this is how the scene goes. It shows the water coming off this way like this. And obviously it's going to require a lot of editing and stuff. And this is one actually done with the photo. And this is something you want to continually iterate on. So when she drinks, she's drinking the water, right? You notice, okay, it could have put that back down; it's probably a little bit too fast. But this is something you want to do over and over again. And the fact that it got it like this in just one try, that's actually hard to do. The intent here was to convey effortless style and hydration as part of an active urban lifestyle. So you notice I was able to do that with the actual lifestyle image there. And so this one is almost ready to go. One thing... all right, here we go. So it could be for another water bottle that you have. And here I didn't upload an image of a water bottle, but it shows you what you can be doing too with the prompt itself. And you see consistently, it's going through the same scenes here, same character, everything. And so what you'll be able to do eventually is actually put your product in there too, which is super important. Going next, I want to show you something you can actually do. I am curious, who here has messed with the image generation and videos on Amazon specifically? So for Creative Studio, if you can't find it, go to Campaign Manager, and under there you should see Creative Studio. And once you hit Creative Studio, it should give you an option to generate images or video. Hit Video, and you'll be able to do it from there. Firefighter costume for kids? I think that would probably be a hard one to do, right? Let's do dog treat container, spray bottle, blanket. Let's try a blanket. Blanket sounds good. Blanket here. Let's try this one. Hope for a good one here. So what you do is you generate the videos. One thing I noticed is you actually have to wait. One thing I was hoping for is that you could generate as many as you want — generate here, go back, do another product — but it turns out you have to wait for it here like this. So as that's going, I'm actually going to share one of my GPTs in the chat. So let's do this one here. I kind of want to show you. Let's do one more example like this, and let's try this example just to show you can do that. This is the image that I used in those videos that I was telling you about. And this is how I made the GPT better: I would go in, I would generate with a prompt, and I'd say, okay, that's not working, okay, do this one, and then the next one and the next one. So this GPT is actually the product of that, right? It's the continued iteration that I did to see how good the video would do. And by the end, this is what I got. And just so you know, this is not a GPT I just created for the webinar. Yes, you guys will exclusively get it. But more importantly, I'm going to continue to iterate on this. So you'll be able to see over the next month, okay, there's going to be a 2.0, there's going to be a 3.0, a 4.0. One thing about my GPTs is I don't just put one out like, okay, here it is, I'm going to send it to you and I'm not going to worry about it. I'm going to continually update these for you guys so that you can have them. And they should be gated for the next three months or so.
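Because the workflow is essentially "queue a few variations, wait, review, adjust," it can help to keep a simple log of what you tried. A minimal sketch, assuming a placeholder generate_clip() that you would swap for whatever you actually use (the Sora or Veo web apps, Amazon's Creative Studio, or an API if you have access); nothing here calls a real service:

```python
import csv
from datetime import datetime

def generate_clip(prompt: str) -> str:
    """Placeholder: swap in whatever tool you actually use to render the clip.
    Returns a path/URL for the finished video. Hypothetical, not a real API call."""
    return f"clips/{abs(hash(prompt)) % 10_000}.mp4"

prompt_variants = [
    "Soft golden light drifts through the curtain; a person enters the mirror's reflection.",
    "Same scene, but the camera holds static and the on-screen text stays centered.",
    "Same scene, slower motion; the reflection clearly shows the subject's face.",
]

# Keep a running log of what you tried and what you thought of it, so each
# prompt revision builds on the last one instead of starting over from scratch.
with open("iteration_log.csv", "a", newline="") as f:
    writer = csv.writer(f)
    for prompt in prompt_variants:
        clip = generate_clip(prompt)
        writer.writerow([datetime.now().isoformat(timespec="seconds"), prompt, clip, "notes: "])
        print(f"queued -> {clip}")
```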
Like I said, you'll have exclusive access to them for the first three months and the next three versions as well. So you can see here: a soft golden light drifts through the curtain — you can see the curtain there — gently brightening the room. A person enters the reflection in the mirror. I mean, these are things you wouldn't even necessarily think of. You can't expect someone to be a cinematographer. I know all the prompting guides out there will tell you that, and that's good, you should have those tips. But most importantly, I think it's fair to say not everybody has that skill. In fact, I would believe the vast majority of people don't have that skill. And so it helps to have someone who went through the work to do this, so you don't have to spend the time on it. You can focus on your business, focus on what you're doing, and someone creates these things for you that reflect those best practices. And so it gives you the intent. The intent of each of these is, okay: here, evoke timeless elegance and self-assurance; here, the intent is to suggest nostalgia, hidden stories within refined surroundings; here, the intent is to convey the poetic passage of time and serene domestic intimacy — I don't know if I would do serene domestic intimacy; here, the intent is to highlight beauty and simplicity and the touch of human care; here, evoke classic sophistication and contemplative stillness. See, I like those. And these are things that you'll continue to go through, and every option, you'll notice, is better with each iteration. All right, so let's go back to this tab, and here is one of the videos. So, some motion is better than no motion — I think that's the big thing here. And it has the text correctly on there. This one's better. And they're only six seconds long, but that's about what you want sometimes for a Sponsored Brands ad, let's say, especially if you don't have the equipment or the money to actually invest those resources in a video. Again, if you have the resources to do that, if you have the video assets and the ability to produce video assets like that at scale with a studio, I definitely think you should do that. But one thing you can do too is put things together like that. You can use AI, studio-driven video generation, with the real assets that you have. Okay, so here's another one with a person in it. Okay, that doesn't make any sense, does it? You can see here — this is one thing you have to watch for too — like, why would you have your coffee right over your blanket like that? I don't know. Maybe some people do. It just doesn't seem natural. And then let's look at this one again here. Zoom in. Cozy moments, perfectly wrapped. And here's another version with the human. So it had a little bit extra there at the end like that. I'm kind of hoping here that they're the same ones, you know what I mean? And then you can do this: you hit generate more and you see, okay, what else can it create? I definitely recommend doing this. In fact, if you have the chance with your product specifically, I would spend maybe an hour just going through and producing and producing and producing, and see what you can get from it. See if it's just, okay, it's all the same stuff, it's just repeating — but there might be a ton of assets where you're like, oh, you know what, I can put that together, I can put that together. So if you generate for up to an hour like that, you're getting well over 50 little clips within the hour that you can put together.
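If an hour of generating leaves you with 50-plus six-second clips, a quick rough cut does not have to wait for a full editing session. Below is a small sketch that stitches downloaded clips with ffmpeg's concat demuxer from Python; it assumes ffmpeg is installed, the clips share the same codec and resolution (typical when they come from one tool), and the folder and file names are illustrative:

```python
import pathlib
import subprocess

clip_dir = pathlib.Path("clips")   # wherever you saved the generated clips (illustrative path)
clips = sorted(p for p in clip_dir.glob("*.mp4") if p.name != "rough_cut.mp4")

# ffmpeg's concat demuxer reads a small manifest listing the inputs in order.
manifest = clip_dir / "clips.txt"
manifest.write_text("".join(f"file '{c.name}'\n" for c in clips))

# -c copy avoids re-encoding, so the rough cut is nearly instant; re-encode later
# in your editor if you need transitions, music, or text overlays.
subprocess.run(
    ["ffmpeg", "-y", "-f", "concat", "-safe", "0",
     "-i", manifest.name, "-c", "copy", "rough_cut.mp4"],
    check=True,
    cwd=clip_dir,
)
print(f"stitched {len(clips)} clips into {clip_dir / 'rough_cut.mp4'}")
```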
You can go into your editing software, whatever you use, and put those together seamlessly. So I would definitely recommend doing that. We'll see what this generates here, and in the meantime, I'd like to talk a little about some other things that are important. So if you decide, you know, Andrew, this is all great, but I think I would like to prompt on my own — what kind of stuff should you do? One thing you want to do is replace any vague terms with specific framing. You don't want anything weak like, oh, cinematic look, or close-up, or, hey, the camera moves. Instead, have things like wide shot, low angle, medium close-up, slight angle from behind, a slow dolly-in from eye level. These are effective framing examples: wide establishing shot at eye level, medium close-up shot over the shoulder, aerial wide shot with a slight downward angle, tight close-up on hands, macro detail. And remember, my cinematic prompt generator is able to do this kind of thing for you. You'll notice it goes through everything: the aspect ratio, capture, frame rate, motion, texture, lenses, the goal of it, highlights, mids, blacks, the palette, lighting and atmosphere. So, what's the key light? Here, the ambiance: daylight reflected from modern glass facades. The fill balance, the practicals, the atmospherics: clean open air with subtle depth from urban reflections. Here, you have the direction of it, and then you have the location and framing as well, a contemporary urban plaza with reflective architecture. Then you have the framing: okay, the wide shot, subject and environment, a sense of urban leisure. Then you have a mid: human gesture, checking phone, skateboard is idle. Then you have a close shot: condensation on bottle, light brushing the surface. You actually notice that too in the video here. When you look at it, it goes through that scene — and then watch right there, like that. So you see the level of detail I captured there, where it said condensation on bottle, light brushing surface. And it's not going to be 100% perfect, but again, the more specific you are, the better it's going to be. I'm not saying overload it and make it crazy big, but these are actually the best practices that come from OpenAI on how to do video generation. And this is kind of what I expected from Amazon — it's the same kind of thing. It just regenerated some of the videos like this, so it doesn't look like it's going to really give you anything new; it's going to be very similar to what it was before. And unfortunately, right now with videos, you can't really give your own prompt per se. It doesn't necessarily let you do that. But it is good to have the ability. It's free, number one, and you don't have to worry about going to Sora 2 and doing all this stuff — you can do it right within your Amazon ads. And think about this: you have six videos that you can test. Say, hey, I want to actually test all these videos and see which ones do better in Sponsored Brands or Sponsored Display video ads. I'm kind of wrapping up here. Again, be specific about movement. Like I've said before, you want to describe things like the depth of field, the motion, and the timing. So for example, instead of saying a cyclist moves quickly, say the cyclist pedals three times, brakes, and stops at the crosswalk.
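That "replace vague terms with specific framing" advice can be treated as a checklist you run over a draft prompt before submitting it. A toy sketch follows; the substitution table is just a starting point drawn from the examples above, not an official or exhaustive list:

```python
import re

# Vague phrasing -> more specific cinematography language, based on the examples above.
REPLACEMENTS = {
    r"\bcinematic look\b": "wide establishing shot at eye level, soft diffused daylight",
    r"\bclose-up\b": "tight close-up on hands, macro detail on the product surface",
    r"\bcamera moves\b": "slow dolly-in from eye level",
    r"\bmoves quickly\b": "pedals three times, brakes, and stops at the crosswalk",
}

def flag_and_rewrite(prompt: str) -> str:
    """Point out vague wording in a draft prompt and suggest a more specific rewrite."""
    rewritten = prompt
    for pattern, specific in REPLACEMENTS.items():
        if re.search(pattern, rewritten, flags=re.IGNORECASE):
            print(f"vague: {pattern} -> consider: {specific}")
            rewritten = re.sub(pattern, specific, rewritten, flags=re.IGNORECASE)
    return rewritten

draft = "Cinematic look, the camera moves toward the bottle, then a close-up as a cyclist moves quickly."
print(flag_and_rewrite(draft))
```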
The reason you want that kind of specificity is because OpenAI is now way better at getting down to the physics of things. And you want to anchor realism as well: use descriptors like handheld, jitter, or overcast afternoon to ground a video in a believable style. You want to keep the number of characters small and the motion simple, because complex interactions can reduce the fidelity of the result. And then the most important thing is to iterate. Don't expect the first generation to be perfect. What I showed you on Sora — the multiple generations of that here — is, I think, the most important lesson in video generation. It's not going to be perfect at first. It only gets better if you iterate, and that's what those GPTs are built on as well. Other than that, I encourage two things: iteration and patience. That's what it takes to do video generation, iteration and patience. If you remember two things from here, it's iteration and patience when it comes to generating videos. And of course, use these GPTs as well to help. I actually don't recommend prompting from scratch, but if you want to, you will have a guide that I'm going to be providing, which I've put together along with the two GPTs, and possibly my new GPT I created specifically for Sora 2. But these should also work in Veo 3.1, which, by the way, if people don't know, is the video generation model in Gemini, from Google. And then remember, Sora is from OpenAI, the creators of ChatGPT. And I do not recommend Grok. Grok is not there yet in terms of the technology. You know, playing around with them, I think, is great. Runway is good too. I don't know anything about DaVinci; I'll have to look into that. Yeah, I definitely recommend Sora 2 and Veo 3, and you can try the video generation in Amazon as well, which is good.

Speaker 2: Awesome. That's great information. Thank you so much, Andrew.

Speaker 1: Somebody asked, how do you get access to Sora 2? Right now, from what I understand, it is invite-only, but you can get Sora 1, and a lot of the same prompting techniques apply. I recommend first going to sora.com — I recommend going through openai.com first to get to it — and then you will be able to sign up for free there. But then you can also do Sora 2, because if you're on Sora 2, you can technically go back to the old Sora. So you can sign up for Sora. Go ahead and go to sora.com; I think the exact link is actually https, and then sora.chatgpt.com is what you want to type in. And if you have it to where it's only invite-only, you'll eventually get an invite very soon. If you sign up, I know you can get early access to that, especially as they're releasing it to more and more people. But you can definitely do the first Sora. And if you have trouble doing that, I definitely recommend reaching out to me on LinkedIn, because I feel led to do this: if you guys are having trouble with that, like logging in and getting into it, I am actually free and I want to help troubleshoot that for you. So reach out to me on LinkedIn — my name is Andrew Bell — and I'll help you troubleshoot that and get into a video generation model. All right.

Unknown Speaker: Somebody was asking about Top View AI.

Speaker 1: I don't know that one. Top View AI — I'm going to write it down though. That and DaVinci.

Speaker 2: And somebody else said, what about Runway?

Speaker 1: Yes, Runway. Runway is good.
The prompts do not need to be as long with Runway. In fact, it's advised that you keep them much shorter. But if you're going to do a product video, I highly recommend Sora, and Veo 3 is good too. In fact, if you're on Google Gemini, let me just go ahead — I'm going to share one more example here. What we're going to do is go to Gemini, take a lifestyle image, and actually create a video out of that. I'm going to show you a cool trick here that you can do. So let's upload an image here; say it's this one. You go here, to Create videos with Veo. And then just say this: bring this image to life within this scene. This is a hack that I think is really important, because we talk about all these big prompts, right? But this is actually one that you can do. You have less control over it, but you'll notice it actually works; it's pretty cool. So go to Gemini, go to Create videos with Veo, and then again, upload an image of your lifestyle content, one of your creative assets, and then put "Bring this image to life within this scene." So you give it a second there to create the video, and you should be able to create multiple videos at once as well. So people know, you can obviously go here and go to upload files. And you can take this and say, make the bottle blue. And then once you do that, it'll come back with an image, and the image should have this bottle blue. And if it makes it blue, then from there, you can create a video off that as well. So it's generating the image here, and you notice the blue here like this.

Speaker 2: And then what you want to do from there, you just have to keep telling it to keep everything the same color and looking the same in that prompt?

Speaker 1: Yeah, exactly. All you have to do is say, make the bottle blue, and it'll know instinctively to just change the color of the bottle. So I could keep going here; I could say, okay, go green now. And then, say that it went green like that, the next one I could say, okay, go aqua blue.

Unknown Speaker: Like this, aqua blue.

Speaker 1: Then what I can do is download it like this, start a new chat, and then upload that same image. All right, let's see what it did. There. So you can see that it does very well. Wow. Yeah, I'm very impressed with Veo 3. The only problem with Veo 3 is the clips aren't as long, and you don't have as much creative freedom either. But this to me is good enough, and you can actually go to what's called Flow — and maybe that's another one we can do — where you put it together as a storyboard and you can do as many scenes as you want. So the next one could be, okay, now I want her to pick up the bottle, and so the next one would be she picks up the bottle here. And ideally, what you want to do is, okay, you have this scene, and then you create another scene that's her picking up the bottle, and then you could put the scenes together, right? You could find a way to put that together as separate shots. So it's creating the video here, "bring this image to life." So it just made it a different thing — that's because you cannot go from image to video in the same chat on Gemini. You have to stay with the image, download the image, and then you can go to a separate chat and do this and go to the video. So if you don't have access to Sora 2, my second recommendation is to go to Gemini and use their video generation model. And you'll find it right here at the bottom. When you click this right here, you see Deep Research, Create videos with Veo, Create images.
You want to go to Create videos with Veo, and that's where you'll be able to generate them. It's a little bit harder for that motion, whereas with a phone it's a lot easier to do. But her — she's very hesitant. Do I pick this water up or not? Do I pick it up? No confidence. You don't want that in your commercial. You don't want that in your product video. But something like this, to me, is usable. Tell me if I'm wrong here in the chat, but this is definitely something that's usable. Especially since it comes from this lifestyle image, I don't see any real distinctions here. Yeah, I mean, it does a phenomenal job.

Speaker 2: I think that's all the questions we have. I think that was pretty straightforward, pretty good information, so people can start doing some of this stuff. I mean, where do you think most e-commerce sellers could use these videos? What do you think they're best for?

Speaker 1: Oh, definitely best for Sponsored Brands videos, I'd say number one.

Speaker 2: Okay.

Speaker 1: Product videos — since product videos are much longer, I think there's opportunity for that. Like I said, Google has the ability, it's called Flow, where you can do multiple video generations into a storyboard. I've been able to get a video actually up to — I should share it too in that link — up to two minutes. I've been able to get a consistent product video. But here's the thing: it's taken way longer than it would to just shoot it on your own. Yeah. But I just wanted to see what it can do. And the fact that it can do it, that's evidence that it's only going to get better from here. This is the worst things are ever going to be.

Speaker 2: Yeah.

Speaker 1: For sure.

Unknown Speaker: That is a good point, because then people can start doing... we've been doing an advertising series, and one of the things you could do is take one of these still photos and kind of change the background. For example, Destine always uses the example of protein powder — well, protein powder for bodybuilders. If you show a picture or video of someone who's just a normal person, not a bodybuilder, it's not going to really answer what the person is searching for. But if you had a bodybuilder with the protein, maybe drinking a protein shake, then they're going to be more likely to click on it. So you can do a lot with these videos, which is really cool. Well, thanks again, Andrew. This is great. We really appreciate you. It looks like people want some more webinars in the future, so we'll try to plan some more content around all this stuff. And thanks again for joining. And everyone, thanks for all of your nice comments and questions, and we'll see you all on the next webinar. Bye everyone.

Speaker 1: Thanks guys.

