HEYGEN
As part of the Avatars squad at HeyGen, I led the design of Looks, an evolution of our Photo Avatars feature. Previously, users could only create a single avatar from one photo, limiting creativity and engagement. With Looks, we re-architected avatars into a multi-look system that gave people the freedom to generate endless variations of themselves in different scenarios.
My role focused on translating a highly technical backend (LoRA training, 1-to-few generation, motion synthesis) into a clear, usable experience. I partnered closely with my PM, who shared the PRD and initial vision, and with research/engineering to shape flows that balanced explainability, cost transparency, and user delight.

Before “Looks”
• Users could create only one avatar from a single uploaded photo.
• That avatar was locked to one outfit, pose, and background (whatever photo they uploaded).
• This made avatars feel repetitive and hard to reuse across different contexts (professional vs casual, ads vs personal use).
• For a product like HeyGen, where the North Star metric is # of videos created per user, this was a growth ceiling → fewer avatars = fewer videos.
Why "Looks"
The hypothesis: if we could expand avatars from a single static look into a system of multiple looks, people would not only get more creative value but also make more videos, which ties directly to our business goals around engagement and credit usage.
• If users could generate multiple variations of themselves (different outfits, poses, and backgrounds), they would naturally create more videos.
• “Looks” = different appearances of the same avatar, while the face identity remains consistent.
• More looks = more creative freedom → more reasons to generate videos → direct lift in the North Star metric.

Challenges
• Model training requirements: the model needed a minimum set of photos for consistent results, so clear upload guidance was required (e.g. “10 photos for best results, 20 for higher resemblance”).
• Latency & waiting states: training could take ~3 minutes, and generations up to 5 minutes.
• Explaining new concepts: terms like “training a model” or “generating looks” were unfamiliar to most users.
• Page density: balancing the new Generate card with the existing avatar and voice cards required multiple layout iterations to prevent visual overload.
I explored two directions:
• Modal flow: a focused experience, but it blocked users while they waited.
• Sidebar card flow (chosen): brought the generation card into the Looks page as a right-side panel. This allowed multitasking: users could browse or edit existing looks while new ones generated in the background (see the sketch below).
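To make that tradeoff concrete, here is a minimal sketch of the background-polling pattern the sidebar enables. Everything here (the LookGenerationJob shape, fetchJobStatus, watchJob, the /api/looks path) is a hypothetical illustration, not HeyGen’s actual implementation.

```typescript
// Minimal sketch of the non-blocking sidebar flow. All names here
// (LookGenerationJob, fetchJobStatus, watchJob, the /api/looks path)
// are hypothetical, not HeyGen's actual implementation.

type JobStatus = "queued" | "generating" | "done" | "failed";

interface LookGenerationJob {
  id: string;
  status: JobStatus;
  progress: number;      // 0-100, drives the sidebar progress bar
  resultUrls: string[];  // populated once generation finishes
}

// Hypothetical API call returning the current job state.
async function fetchJobStatus(jobId: string): Promise<LookGenerationJob> {
  const res = await fetch(`/api/looks/jobs/${jobId}`);
  return res.json();
}

// Poll in the background and update only the sidebar card, so the rest
// of the Looks page stays interactive while generation runs.
function watchJob(
  jobId: string,
  onUpdate: (job: LookGenerationJob) => void
): () => void {
  const timer = setInterval(async () => {
    const job = await fetchJobStatus(jobId);
    onUpdate(job);
    if (job.status === "done" || job.status === "failed") {
      clearInterval(timer);
    }
  }, 5000);
  return () => clearInterval(timer); // caller can stop watching early
}
```

The key design property is that the watcher touches only the sidebar card’s state; nothing else on the page is blocked, which is what made the sidebar beat the modal.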


Because this was a brand-new concept, I needed to guide users clearly:
• Training a model can sound intimidating, so I added a “See how it works” link to a Help Center demo for first-time users.
• Placed inline guidance text: “Upload at least 10 photos for ideal results, 20 for highest resemblance.”
• Added an upload progress bar showing how many photos were uploaded and whether they’d hit the “ideal” threshold (10+ photos).
• Displayed a time estimate (“~3 minutes”) and a percentage progress bar to set expectations (both sketched after this list).
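A rough sketch of the guidance and progress logic behind those states. The photo thresholds and copy mirror what we shipped; the function names, the UploadState shape, and the 95% progress cap are my assumptions for illustration.

```typescript
// Sketch of the upload-guidance and training-progress logic. Thresholds
// and copy mirror the shipped guidance; names and the 95% cap are assumed.

const MIN_PHOTOS = 10;  // "ideal results" threshold
const BEST_PHOTOS = 20; // "highest resemblance" threshold

interface UploadState {
  count: number;
  percentToIdeal: number; // drives the upload progress bar
  message: string;        // inline guidance text
  canTrain: boolean;
}

function uploadGuidance(count: number): UploadState {
  const percentToIdeal = Math.min(100, Math.round((count / MIN_PHOTOS) * 100));
  let message: string;
  if (count < MIN_PHOTOS) {
    message = `Upload at least ${MIN_PHOTOS} photos for ideal results (${count}/${MIN_PHOTOS}).`;
  } else if (count < BEST_PHOTOS) {
    message = `Nice! Add up to ${BEST_PHOTOS} photos for the highest resemblance.`;
  } else {
    message = "You've hit the range for the highest resemblance.";
  }
  return { count, percentToIdeal, message, canTrain: count >= MIN_PHOTOS };
}

// Time-based percentage for the ~3 minute training wait, held just
// below 100% until the backend actually reports completion.
const TRAIN_ESTIMATE_MS = 3 * 60 * 1000;

function trainingProgress(startedAtMs: number, nowMs: number): number {
  const pct = ((nowMs - startedAtMs) / TRAIN_ESTIMATE_MS) * 100;
  return Math.min(95, Math.round(pct));
}
```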

Next came the generation experience. We tested two layouts:
• I initially placed the Generate card on the left side of the grid as a persistent card, but it competed visually with the navigation and users couldn’t dismiss it.
• Moving it to the right side as a collapsible panel created clearer hierarchy and reduced clutter.


Another design decision was around prompting. Early tests showed users struggled to come up with effective prompts on their own. Without guidance, output quality was inconsistent, and many users abandoned the flow.
Solutions:
• “Try Sample” button → instant example prompts.
• “Looks Packs” → curated sets of prompts around themes (Vacation, Professional, or Old Money) → 20–30 cohesive variations with one click (a sketch of the pack structure follows).
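For illustration, here is how a pack might be modeled as data: a theme plus a curated prompt list that fans out into a batch of generation requests with one click. The schema and sample prompts are assumptions, not HeyGen’s production data.

```typescript
// Illustrative model of a curated "Looks Pack": a theme plus a prompt
// list expanded into a batch of generation requests with one click.
// The schema and sample prompts are assumptions, not production data.

interface LooksPack {
  theme: string;     // e.g. "Vacation", "Professional", "Old Money"
  prompts: string[]; // 20-30 curated prompts per pack
}

const vacationPack: LooksPack = {
  theme: "Vacation",
  prompts: [
    "relaxing on a beach at sunset, linen shirt",
    "hiking a coastal trail, backpack and sunglasses",
    "reading by a hotel pool in a summer outfit",
    // ...curated up to 20-30 cohesive variations
  ],
};

// One click expands the pack into individual generation requests, all
// anchored to the same trained avatar so face identity stays consistent.
function expandPack(avatarId: string, pack: LooksPack) {
  return pack.prompts.map((prompt) => ({ avatarId, prompt, theme: pack.theme }));
}
```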


Results
• Higher engagement: users shifted from creating 1–2 photo avatars to generating 20–30 looks per avatar (around 50,000 new looks were being generated every month).
• Increased video creation: more looks led to more avatar videos per user, our key success metric.
• Revenue lift: credit usage and upgrades increased after gating training for free users and introducing usage limits.
• Lower friction: Looks Packs and sample prompts reduced cognitive load, making generation accessible to non-technical users.
• Scalable foundation: the new system set the stage for advanced features like outfit customization and combining photo + video avatars.
Learnings
• Explainability matters as much as functionality: the success of “Looks” wasn’t just in the outputs, it was in making a complex AI workflow feel approachable. Clear guidance and progress states directly improved trust and adoption.
• People need inspiration: users freeze when asked to be creative from scratch. By giving them curated packs and samples, we reduced cognitive load and boosted engagement.
• Latency isn’t just technical, it’s emotional: users don’t mind waiting if they know what’s happening. Showing progress estimates turned frustration into anticipation.
• Integration beats isolation: the sidebar solution showed me that embedding new features into existing workflows drives higher adoption and feels more natural.