From: redpointai

Generative video AI tools are transforming content creation by enabling the generation of video footage using artificial intelligence, rather than traditional camera filming [00:04:36]. Haen, an AI video platform, aims to make video production faster and more affordable, serving individuals who may be camera-shy or businesses without access to expensive equipment [00:05:06].

Evolution of Video Production

Historically, video production involved filming with a camera and then extensive post-production editing [00:04:03]. Generative AI makes it possible to generate footage directly, potentially replacing the need for physical cameras [00:04:29]. This shift could also revolutionize editing, moving away from traditional timeline editors, which were necessitated by the high cost of camera footage [00:06:15]. Future editing experiences are envisioned to be vastly different, possibly involving text-to-video generation and document-like script editing [00:06:50].

Haen’s Capabilities and Use Cases

Haen raised funding at a $500 million valuation from investors including Benchmark [00:00:04]. The company experienced a viral moment when its video translation tool was used to dub the Argentinian President’s speech at the World Economic Forum into different languages [00:00:51]. This showcased the “magic moment” of speaking in various languages with natural voice and expression [00:01:45].

Haen serves over 40,000 customers with three primary use cases [00:09:37]:

  • Create: Users can generate videos by typing text, using custom avatars or stock avatars, eliminating the need for a camera [00:09:48].
  • Localize: Existing videos can be localized into over 175 languages and dialects, preserving voice tone, facial expression, and lip-sync [00:10:01].
  • Personalize: A single video can be personalized into over 100,000 variations based on customer demographics, industry, and problems faced [00:10:19].
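The Personalize use case can be pictured as template expansion: one script with placeholders, filled in once per combination of audience attributes. The sketch below is illustrative only and is not Haen's API; the template text, the placeholder names, and the `personalize` helper are all assumptions made for the example.

```python
from itertools import product
from string import Template

# Hypothetical script template; the placeholder names are illustrative only.
SCRIPT = Template("Hi $name, teams in $industry often struggle with $problem.")


def personalize(names, industries, problems):
    """Yield one script per combination of audience attributes."""
    for name, industry, problem in product(names, industries, problems):
        yield SCRIPT.substitute(name=name, industry=industry, problem=problem)


variants = list(personalize(
    names=["Ana", "Ben"],
    industries=["retail", "logistics"],
    problems=["slow onboarding", "low reply rates", "high churn"],
))

print(len(variants))  # one variant per combination: 2 * 2 * 3
print(variants[0])
```

Because the variant count is the product of the attribute lists, a handful of attributes with modest value lists is enough to reach the scale of variations described above.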

Haen is designed for the “99% of users who are not professional players” [00:11:18], such as marketers who write content but lack video production skills [00:11:29]. The mission is to enable visual storytelling for everyone, especially those without access to expensive cameras or sophisticated software [00:11:40]. Key to user adoption is demonstrating the diverse use cases across different verticals like marketing, sales, customer support, and training [00:12:47].

Technical Aspects and Challenges

Avatar Quality

A critical aspect of generative video AI is the quality and engagement of the generated avatars [00:07:47]. A good avatar must deliver its message convincingly, which takes more than accurate lip-sync [00:13:56]. It also needs realistic head movement, eyebrow expressions, and body motion and gestures that match the script [00:14:17].

Creating an avatar requires submitting video footage (e.g., 30 seconds to 2 minutes) so the AI model can learn and mimic the individual’s full talking style, including all bodily movements and expressions [00:17:04]. Haen’s AI 3.0 model can render full bodies and aims to incorporate gestures as the next step [00:21:32]. Ongoing work includes continuously improving the model architecture to capture diverse variations and dimensions [00:16:18].

Synchronous Generation and Performance

While much of today’s generative video creation is asynchronous, synchronous, real-time generative streaming has significant potential [00:18:53]. Haen offers a beta interactive avatar that can attend Zoom meetings in real time [00:19:15]. The main technical challenge for real-time generation is keeping inference fast as models become larger and more complex [00:19:42]. There is optimism that real-time AI video generation, even on-device, will be possible within 12 months [00:20:46]. This capability would enable new use cases like personalized video ads tailored to individual preferences [00:20:17].
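The inference-speed constraint can be made concrete with simple arithmetic: to stream in real time, each frame must be generated within the time the frame is on screen. The sketch below assumes a 25 frames-per-second target and made-up per-frame inference times; neither figure comes from the interview.

```python
def frame_budget_ms(fps: float) -> float:
    """Time available to produce one frame at the given frame rate."""
    return 1000.0 / fps


def is_real_time(per_frame_inference_ms: float, fps: float) -> bool:
    """True if a frame can be generated at least as fast as it is displayed."""
    return per_frame_inference_ms <= frame_budget_ms(fps)


TARGET_FPS = 25  # assumed streaming frame rate, for illustration only

print(frame_budget_ms(TARGET_FPS))     # budget per frame in milliseconds
print(is_real_time(35.0, TARGET_FPS))  # hypothetical smaller model
print(is_real_time(60.0, TARGET_FPS))  # hypothetical larger model
```

Under these assumptions, a model that needs 60 ms per frame misses the 40 ms budget, which is why larger models push real-time generation toward faster inference or faster hardware.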

Integration with other AI models and Brand Personalization

Haen primarily focuses on business videos, prioritizing control, consistency, and quality [00:23:24]. The company believes in an orchestration engine approach, combining text, script, voice, music, avatar footage, and background generation, rather than pixel-by-pixel, frame-by-frame generation, which can be less controllable [00:23:00]. Haen aims to integrate with text-to-video partners, using their output as a base layer while building its own orchestration engine on top to provide a holistic video experience [00:24:00].
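One way to picture the orchestration approach is as a scene made of independent, named layers, where any one layer can be swapped without regenerating the others. This is a minimal sketch of that idea, not Haen's engine; the `Layer` and `Scene` classes, the field names, and the source identifiers are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Layer:
    kind: str        # e.g. "background", "avatar", "voice", "music"
    source: str      # hypothetical asset or generator identifier
    start: float     # seconds from the start of the scene
    duration: float  # seconds


@dataclass
class Scene:
    script: str
    layers: List[Layer] = field(default_factory=list)

    def add(self, layer: Layer) -> "Scene":
        self.layers.append(layer)
        return self

    def length(self) -> float:
        """Scene length is the end time of the last layer to finish."""
        return max((l.start + l.duration for l in self.layers), default=0.0)

    def replace_source(self, kind: str, source: str) -> int:
        """Swap the source of every layer of one kind; return how many changed."""
        changed = 0
        for layer in self.layers:
            if layer.kind == kind:
                layer.source = source
                changed += 1
        return changed


scene = (
    Scene(script="Welcome to our product tour.")
    .add(Layer("background", "text-to-video:office", 0.0, 12.0))
    .add(Layer("avatar", "avatar:presenter", 0.0, 10.0))
    .add(Layer("voice", "voice:presenter", 0.0, 10.0))
    .add(Layer("music", "music:ambient", 0.0, 12.0))
)

# Only the background changes; the avatar, voice, and music layers are untouched.
scene.replace_source("background", "text-to-video:studio")
print(scene.length())
```

The contrast with frame-by-frame generation is that here the background could come from a text-to-video partner as a base layer while the remaining layers stay fixed, which is the kind of control and consistency the passage above emphasizes.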

Another key area for future development is “brand personalization” within video [00:25:03]. Similar to how ChatGPT can generate text in a specific brand tone, video AI models could learn a company’s visual style, color palette, and opening/closing elements from existing videos and integrate them into new generated content [00:25:51]. This would involve disassembling video into components and assembling them with user-inputted brand memory [00:26:16].

Business Strategy and Market Positioning

Haen’s business model is built on serving enterprise customers who demand high quality and brand consistency [00:31:41]. A key challenge is integrating the technology into existing day-to-day workflows [00:32:09]. For marketing use cases, integration with CRM and go-to-market tools is crucial, as demonstrated by Haen’s partnership with HubSpot [00:32:27].

Regarding competition with incumbents like Snap and TikTok, Haen believes it’s creating a new market rather than directly competing in the old one [00:28:22]. Traditional platforms are built around mobile cameras and creators who use them [00:29:12]. Haen, however, aims to make the camera obsolete and enable video creation without a camera [00:29:30].

A potential dilemma for platforms like TikTok is how to balance and recommend AI-generated content alongside creator-based content [00:30:00]. If AI content becomes 50% of the platform, it could reduce views for existing creators, potentially necessitating new platforms specifically for AI-generated content [00:30:15]. Haen’s mission is not to be a consumption platform, but to build creative tools, though a new platform for AI-generated content is a possible future opportunity [00:31:10].

Capital and Growth

The AI category has different financial models compared to traditional software companies because of significant GPU and talent costs, meaning marginal costs are not close to zero [00:37:29]. However, AI-native companies and teams are highly efficient, allowing for faster growth trajectories and potentially requiring less capital than expected to build a great AI company [00:38:50]. Haen plans product development 12 months ahead, anticipating future model capabilities and cost reductions [00:41:00].

Ethical and Security Considerations

Trust and safety are critical for Haen, especially when serving large enterprise customers [00:34:01]. Policies include:

  • Avatar Creation: Requires video consent from the person whose avatar is being created, verified by advanced AI to match the person in the footage [00:34:11]. Dynamic generated passwords with short expiry times (e.g., 10-15 seconds) add a secure layer to prevent unauthorized avatar creation [00:34:25].
  • Content Moderation: A hybrid system of AI model review and human moderation is in place to prevent hate speech, misinformation, and political campaign content [00:35:01].
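The short-lived password idea in the Avatar Creation policy can be sketched with a time-windowed code, similar in spirit to one-time passwords: the server derives a code from a secret and the current time window, and a code read out in the consent video is only accepted while that window is open. This is an assumption about how such a scheme could work, not a description of Haen's implementation; the function names, the 15-second window, and the 6-digit format are all illustrative.

```python
import hashlib
import hmac
import secrets

WINDOW_SECONDS = 15  # assumed expiry, taken from the 10-15 second range above


def consent_code(secret: bytes, session_id: str, now: float) -> str:
    """Derive a 6-digit code bound to one session and one time window."""
    window = int(now // WINDOW_SECONDS)
    message = f"{session_id}:{window}".encode()
    digest = hmac.new(secret, message, hashlib.sha256).digest()
    number = int.from_bytes(digest[:4], "big") % 1_000_000
    return f"{number:06d}"


def verify(secret: bytes, session_id: str, spoken_code: str, now: float) -> bool:
    """Accept the code only if it matches the window containing `now`."""
    expected = consent_code(secret, session_id, now)
    return hmac.compare_digest(expected, spoken_code)


secret = secrets.token_bytes(32)  # hypothetical per-deployment secret
code = consent_code(secret, "session-123", now=30.0)

print(verify(secret, "session-123", code, now=44.0))  # same 15-second window
print(verify(secret, "session-123", code, now=45.0))  # next window: rejected unless the codes happen to collide
```

A fixed window means a code issued late in a window expires sooner than 15 seconds; a production scheme would more likely record the issue time per session, which this sketch leaves out for brevity.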

Haen also engages in IP partnerships with actors who allow their likeness to be used for stock avatars [00:35:49]. The ability to generate new voices and persistent AI-generated persons opens up possibilities for creating new intellectual property, such as AI influencers [00:36:10].

The Future of Generative Video AI

Joshua Xu envisions that by 2030, everyone will have a “video agency in their pocket” [00:48:30]. Users would interact with a product like Haen as if talking to a personal video agency that translates their ideas into footage and edited content, refined through feedback loops [00:48:04].

The power of creative tools like generative video AI lies in opening up new use cases that are currently unimaginable [00:49:29]. Just as the mobile camera led to the rise of Instagram, Snapchat, and TikTok, lowering the barrier to video creation will unlock a whole new world of possibilities [00:49:56].

To learn more about Haen and try the product, visit haen.com [00:51:14].