From: redpointai

AI is profoundly transforming the music industry, with Suno leading the charge: it has over 10 million users and recently raised $125 million [00:00:11]. Mikey Shulman, CEO of Suno, is at the epicenter of this evolution, envisioning a future where AI greatly expands both the market for music and the joy people derive from it [00:40:48].

The Evolving Nature of Music Creation

Suno’s vision for the future of music goes beyond the final product, emphasizing the creative journey and the human experience [00:02:47]. Mikey Shulman highlights how personal creations, like songs made with his son about fantastical situations, resonate deeply not just for the output but for the shared experience of crafting them [00:02:51]. Such songs mark moments in time that will never be forgotten [00:02:57].

Music, much like photographs, serves as a powerful way to relive memories and evoke specific feelings tied to moments in life, and AI can amplify this experience [00:05:22].

Democratizing Music Creation

A core tenet of Suno’s vision is to democratize music creation, making it accessible to everyone, not just trained musicians [00:04:16].

  • Reigniting Imagination: AI tools aim to reintroduce the “play” aspect of music creation that adults often lose, similar to how children naturally interact with music using everyday objects [00:03:55]. This reflects a common theme in AI creator tools: letting individuals actualize their imaginative ideas, even if they were previously told they lacked artistic talent or found complex tools too daunting [00:04:01].
  • Guided Creative Experiences: The goal is to offer a “paint by numbers” type of experience for music, providing a guided creative process that humans are hardwired to enjoy [00:04:25]. This counters the cultural pressures that might discourage people from engaging in creative endeavors [00:04:47].

Intuitive Interaction and Overcoming the “Blank Canvas” Problem

While current AI music tools are largely text-driven, Mikey Shulman acknowledges this as a sign of their early stage [00:10:37]. The future will involve more intuitive ways for users to interact with the underlying music model, moving beyond the “blank canvas” or “writer’s block” problem often associated with AI products [00:08:40].

Future interaction methods could include:

  • Humming a melody [00:10:01]
  • Tapping a beat [00:10:06]
  • Using emotions or current mood as prompts [00:11:41]
  • Leveraging visuals or sounds from everyday life, like clinking glasses, to inspire musical creations [00:11:47]

The aim is to allow users to “pour their heart out” into the creation process, making it a more expressive and enjoyable journey [00:13:11].

The Rise of Collaborative Music Creation

A significant future focus for Suno is “multiplayer” experiences, enabling people to make music together [00:14:31].

  • Synchronous and Asynchronous Collaboration: This could involve real-time jam sessions where users express ideas and react to each other, or asynchronous collaboration where parts of songs or musical ideas are shared and modified [00:14:39].
  • Musical Conversation: Music is viewed as a conversation, and just like natural language conversations, musical collaborations can be fluid and dynamic, allowing participants to riff, adapt, extend, or add to each other’s contributions [00:15:20].
  • Joy of Jamming: The goal is to recreate the “most enjoyable moments” of jamming with friends for those who are not instrument experts, leveraging the fact that everyone has musical taste [00:15:52].
  • Live Interactive Performances: The potential extends to live streaming platforms like Twitch, where artists can create music interactively with their audience, transforming traditional digital concerts into engaging, collective experiences [00:16:55]. This could even evolve into interactive sports stadium music where fans and athletes have input [00:17:52].

Future Music Model Capabilities

Mikey Shulman envisions AI music models moving beyond simply replicating human-made pop songs.

“Three and a half minute pop song indistinguishable from what a human would do in a recording studio just to me feels like a somewhat artificially constrained… that’d be kind of disappointing if that’s all we got” [00:38:01]

The true potential lies in creating experiences that evoke feeling and allow for new forms of interaction [00:37:56].

  • Iterative Control: Models need to improve their ability to respond to specific, iterative feedback from users, allowing precise adjustments (e.g., “do that but change X”) [00:23:55].
  • Precise Musical Control: Future models should offer more objective control over elements like tempo (BPM), which models currently treat as a loose guideline rather than a strict setting [00:24:17].
  • Real-time Interaction: A personal aspiration is a Vision Pro app where users can “play air guitar with a band” or conduct a symphony, with the music responding in real time to their movements, creating an immersive and joyful “game” [00:38:23]. This aligns with the broader effort in creative AI tools to make creative processes more intuitive and enjoyable.

Broadening the Horizon of AI in Audio

Mikey Shulman expresses excitement about the growing recognition of audio as a first-class citizen in the AI world, particularly following advancements like GPT-4o [00:29:32].

“I’m so glad that… this wave is happening right now for people to realize that audio is and should be a first class citizen… it is how the vast majority of human communication happens.” [00:29:32]

This broader role for audio AI ties into the future of AI in human communication, and of voice AI in particular. Examples outside music include enhancing customer service or even revolutionizing interactions with home infrastructure (e.g., plumbing systems) through voice commands [00:31:17].

He believes that while current systems such as GPT-4o and ElevenLabs’ products offer impressive interfaces, they are still largely text-driven at their core [00:32:51]. True integration of audio will take longer, suggesting room for diverse, specialized audio models rather than immediate consolidation into one giant multimodal entity [00:33:12]. This perspective ties into the wider discussion of the balance between generalist and specialist AI models.

Industry Growth and Niche Opportunities

The AI music market is seen as “really, really big” and still “so greenfield” [00:39:40]. AI is poised to greatly expand the overall market for music, which is currently far smaller than it could be [00:39:43].

  • Diverse Business Models: There will be room for multiple companies serving different niches [00:40:10]. For example, some companies will focus on tools for professional artists, while others will target background music generation for videos [00:40:24].
  • Consumer-Centric Approach: Suno’s primary focus remains on building experiences for the average person and general consumer, aiming to expand how important and joyful music is in their lives [00:40:54].
  • IP Partnerships and Viral Content: While acknowledging the viral nature of AI-generated content mimicking famous artists (e.g., Taylor Swift singing “Enter Sandman”), Suno has intentionally avoided this due to legal and long-term relevance concerns [00:42:10]. This kind of content is viewed as a “flash in the pan” that won’t define the future of music creation [00:43:39]. Instead, the focus is on enabling users to create music relevant to their own lives and stories [00:43:09]. This approach stands apart from much of the broader discussion of AI’s impact on the music industry, which centers on copyright and artist likeness.