Introduction
Creative teams today are no longer split neatly between design and video. With the rise of AI-powered tools, these two disciplines are increasingly interconnected, sharing assets, workflows, and even creative decision-making. This evolving ecosystem is what we can call the AI creative stack: a layered combination of strategy, design, motion, generative AI, and distribution tools that work together to deliver faster, more scalable, and more consistent creative output.
In this blog, we’ll explore:
- What the AI creative stack looks like in practice
- How design and video teams collaborate inside this stack
- Where AI adds the most value (and where humans still lead)
- A real-world example: the AI-driven Swiggy campaign work, where design and video came together seamlessly
What Is the AI Creative Stack?
The AI creative stack is not a single tool; it’s a system of tools and workflows that supports the entire creative lifecycle.
At a high level, it includes:
- Strategy & Ideation Layer: Brand guidelines, campaign objectives, messaging frameworks, references, and prompts.
- Design Layer: Static visuals, key frames, layouts, typography, colors, brand assets, and visual rules.
- AI Generation Layer: Image generation, video generation, motion enhancement, lip-sync, upscaling, background replacement, and variation creation.
- Video & Motion Layer: Editing, timing, transitions, pacing, camera movement, and storytelling.
- Review, Iteration & Distribution Layer: Feedback loops, rapid iterations, platform-specific exports, and performance optimization.
What makes this stack powerful is how tightly connected these layers are, especially design and video.
The New Relationship Between Design and Video Teams
Traditionally:
- Designers finished static creatives
- Video teams animated or recreated them later
With AI, that handoff is no longer linear.
Design Now Sets the Rules of Motion
Design teams are responsible for:
- Character appearance and consistency
- Costumes, props, logos, packaging, and typography
- Composition and camera framing references
These elements become anchors for AI video generation. If the design is strong and precise, the AI output becomes predictable and brand-safe.
Video Teams Orchestrate Time and Emotion
Video teams focus on:
- Micro-actions (entering, stopping, smiling, delivering, reacting)
- Facial expressions and eye lines
- Duration control (6s, 10s, 15s formats)
- Platform-first storytelling
Instead of animating frame by frame, they now direct AI like a virtual production crew, using prompts, constraints, and reference images.
Where AI Fits In (and Where It Doesn’t)
AI is not replacing creativity; it’s compressing execution time.
AI Is Excellent At:
- Creating multiple variations quickly
- Maintaining visual consistency across formats
- Converting static designs into short videos
- Scaling UGC-style content
- Speeding up experimentation
Humans Are Still Critical For:
- Campaign strategy and storytelling
- Brand judgment and taste
- Cultural nuance and emotion
- Deciding what not to generate
The AI creative stack works best when humans define intent and rules, and AI handles repetition and scale.
Case Study: Swiggy AI Video Campaign (DE Campaign)
The Challenge
For the Swiggy DE campaign, the goal was to:
- Create engaging, realistic delivery-person videos
- Maintain strict brand consistency (logos, bags, uniforms)
- Produce multiple video variations quickly
- Ensure outputs were usable across platforms
This required tight coordination between design precision and AI-driven video generation.
Design Team’s Role
The design layer did the heavy lifting upfront and continued to play a critical role throughout execution, not just in visuals but also in conversion-focused elements.
Key design responsibilities included:
- Locking the exact look of the Swiggy delivery executive
- Finalizing uniform colors, bag design, logo placement
- Defining framing rules (doorstep position, camera angle, distance)
- Ensuring all text and logos were clear and undistorted
Designing Eye-Catching CTAs
A crucial part of the designers’ contribution was creating high-impact CTAs (calls to action) that worked seamlessly across AI-generated videos.
Designers focused on:
- Bold, brand-aligned CTA typography for instant readability
- High-contrast color combinations aligned with Swiggy brand guidelines
- Clear hierarchy between character action and CTA placement
- Safe-zone positioning so CTAs remained visible across all aspect ratios (horizontal, vertical, square)
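One way to reason about safe-zone positioning is to find the region of the master frame that survives every crop. The sketch below assumes a 1920×1080 master and simple center crops to each aspect ratio; both are illustrative assumptions, since AI reframing may shift the crop window rather than center it. The function names are hypothetical.

```python
from fractions import Fraction

# Assumed master frame size and target aspect ratios (width / height).
MASTER_W, MASTER_H = 1920, 1080
RATIOS = {
    "horizontal": Fraction(16, 9),
    "vertical": Fraction(9, 16),
    "square": Fraction(1, 1),
}


def center_crop_size(width, height, ratio):
    """Largest centered crop of the given aspect ratio inside the frame."""
    if ratio < Fraction(width, height):
        return (height * ratio, Fraction(height))  # limited by frame height
    return (Fraction(width), width / ratio)        # limited by frame width


def shared_safe_zone(width, height, ratios):
    """Centered rectangle (left, top, right, bottom) visible in every crop."""
    sizes = [center_crop_size(width, height, r) for r in ratios]
    safe_w = min(w for w, _ in sizes)
    safe_h = min(h for _, h in sizes)
    left = (width - safe_w) / 2
    top = (height - safe_h) / 2
    return tuple(float(v) for v in (left, top, left + safe_w, top + safe_h))


print(shared_safe_zone(MASTER_W, MASTER_H, RATIOS.values()))
# (656.25, 0.0, 1263.75, 1080.0)
```

Under these assumptions, a CTA placed inside the central 607.5-pixel-wide column stays visible in all three formats. In practice, designers would likely add an inner margin on top of this.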
Video & AI Execution
Using AI video tools, the video team directed shots in which the delivery executive:
- Enters the frame and stops at the doorstep
- Holds the Swiggy Instamart bag exactly as designed
- Smiles naturally at the camera
- Performs minimal, controlled motion to avoid distortion
Key AI constraints included:
- No change in face or identity
- No distortion of logos or packaging
- Strict positional consistency
- Natural but subtle expressions
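The actions and constraints above could be captured as a reusable shot specification that renders to prompt text. This is a minimal sketch: `ShotSpec` and its fields are hypothetical and do not correspond to any particular AI video tool's API, and the 6-second default is just one of the formats mentioned earlier.

```python
from dataclasses import dataclass, field


@dataclass
class ShotSpec:
    """Hypothetical container for one AI video shot; not a real tool's API."""
    subject: str
    actions: list[str]
    constraints: list[str] = field(default_factory=list)
    duration_s: int = 6  # assumed default; the post mentions 6s, 10s, 15s

    def to_prompt(self) -> str:
        """Render the spec as plain prompt text: actions first, then rules."""
        lines = [
            f"Subject: {self.subject}",
            f"Duration: {self.duration_s}s",
            "Actions:",
        ]
        lines += [f"  {i}. {a}" for i, a in enumerate(self.actions, start=1)]
        lines.append("Constraints:")
        lines += [f"  - {c}" for c in self.constraints]
        return "\n".join(lines)


doorstep_shot = ShotSpec(
    subject="Swiggy delivery executive (match reference image)",
    actions=[
        "Enters the frame and stops at the doorstep",
        "Holds the Swiggy Instamart bag exactly as designed",
        "Smiles naturally at the camera",
        "Performs minimal, controlled motion",
    ],
    constraints=[
        "No change in face or identity",
        "No distortion of logos or packaging",
        "Strict positional consistency",
        "Natural but subtle expressions",
    ],
)
print(doorstep_shot.to_prompt())
```

Keeping the constraints in one structure means every variation of the shot inherits the same brand rules, and only the actions or duration change.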
Dialogue Video Challenges
In addition to action-based shots, dialogue videos proved to be one of the most challenging parts of the AI execution.
Dialogue-focused AI videos require a higher level of precision because they involve:
- Accurate lip-sync with spoken words
- Natural mouth movement and facial expressions
- Consistent character identity throughout speech
- Zero distortion of uniforms, logos, and delivery bags while talking
Through close collaboration between the design and video teams, these challenges were gradually resolved, resulting in clean, believable, brand-safe dialogue videos that met campaign standards.
Adaptation Process (Sizes + Vernaculars)
Each approved master video was quickly adapted into horizontal, vertical, and square formats through AI-powered resizing and reframing. In parallel, vernacular versions in five languages, Bengali (BN), Marathi (MR), Tamil (TA), Telugu (TE), and Kannada (KN), were created using AI voice-overs, with the same size adaptations applied to each language. This enabled rapid, large-scale localization without re-editing.
This resulted in a highly scalable output where a single creative idea expanded into dozens of localized, platform-ready assets with minimal turnaround time.
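The sizes × languages expansion can be sketched as a simple cross product. The example below is illustrative only: the naming scheme, the `master` label for the original-language version, and the `doorstep_smile` ID are assumptions, not the campaign's actual conventions.

```python
from itertools import product

# Formats and language codes named in the post; "master" stands for the
# original-language version and is an assumed label.
FORMATS = ["horizontal", "vertical", "square"]
VERNACULARS = ["BN", "MR", "TA", "TE", "KN"]


def adaptation_matrix(master_id, include_master_language=True):
    """List every (language, format) deliverable for one master video."""
    languages = (["master"] if include_master_language else []) + VERNACULARS
    return [
        f"{master_id}_{lang}_{fmt}"
        for lang, fmt in product(languages, FORMATS)
    ]


deliverables = adaptation_matrix("doorstep_smile")
print(len(deliverables))  # 18: (1 master language + 5 vernaculars) x 3 formats
```

Under these assumptions, each master yields 15 vernacular assets plus 3 original-language sizes, so even a handful of masters quickly reaches dozens of deliverables.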
Final Results Powered by AI
With the AI creative stack fully in place, the final outcomes clearly demonstrated the impact of this approach.
- A large number of master videos were produced for the campaign
- Each master video was further scaled into a set of platform-specific adaptations (sizes × languages)
- The entire delivery was completed far faster than a traditionally directed ad production process would have allowed
By leveraging AI for generation, variation, and iteration, while keeping creative control firmly with the design and video teams, the campaign achieved both speed and scale without compromising brand quality.
This approach proved that AI is not just a cost-saving tool, but a creative multiplier, enabling teams to do more in less time and respond faster to campaign needs.
Key Takeaways for Creative Teams
- Design First, Generate Later: Strong design foundations reduce AI unpredictability.
- Think in Systems, Not Assets: Build reusable characters, poses, and motions.
- Constrain AI to Protect the Brand: Fewer freedoms often lead to better results.
- Video Is Direction, Not Just Editing: Prompting, timing, and expression are the new craft.
- Collaboration Beats Speed: AI rewards teams that communicate clearly.
Conclusion
The AI creative stack is reshaping how design and video teams work together. What used to be a sequential process is now deeply collaborative and iterative. Design defines the visual truth, video defines movement and emotion, and AI bridges the gap at unprecedented speed.
The Swiggy AI campaign is a strong example of this future: when teams align early, respect each other’s craft, and use AI with intention, the result is not just faster content but better, more scalable storytelling.
As AI tools continue to evolve, the teams that win will not be the ones generating the most content, but the ones who build the smartest creative stacks.

