From: redpointai
Percy Liang, a leading AI researcher and co-founder of Together AI, discusses the future potential of AI in various fields, including music, social simulation, and regulated industries, emphasizing its evolving role beyond simple task completion to more ambitious, long-term applications [01:41:00].
AI in Music
The field of AI music, exemplified by companies like Suno and Udio, is progressing, utilizing techniques similar to those in other AI domains, such as large transformers and data processing [00:53:16].
Challenges and Control
Key challenges in AI music include copyright issues, which are a significant hurdle [00:53:28]. Another important aspect is giving artists sufficient control over the music generation process [00:53:38].
“We don’t want to just have unconditional generation or even just textual generation and having it just generate the music felt like it wasn’t giving artists enough control.” [00:53:53]
Tools and Vision
Researchers are developing models like the anticipatory music transformer, which is a generalized infilling model allowing conditioning on various musical events, such as generating harmony from a melody or filling in sections of a score [00:54:12]. This approach aims to create a “co-pilot” for musicians, similar to GitHub Copilot for software engineers, helping composers and artists realize their creative visions [00:54:35]. Personally, Liang envisions AI tools assisting in achieving musical aspirations, especially for complex classical music that requires subtle control and where data is scarce [00:55:00].
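The core idea behind an anticipatory (infilling) model can be illustrated with a minimal sketch. In this hedged, simplified rendering (the function name, event encoding, and `delta` offset are all hypothetical, not from the actual anticipatory music transformer), fixed control events such as a given melody are interleaved into the event stream slightly ahead of their onset, so an autoregressive model can condition on them before generating the surrounding notes:

```python
# Hypothetical sketch of "anticipatory" interleaving: control events
# (e.g., a fixed melody) are placed into the sequence slightly ahead of
# their onset time, so an autoregressive model sees them before it must
# generate the events they constrain. Event encoding is illustrative only.

def interleave_anticipated(events, controls, delta=1.0):
    """Merge ordinary events with control events, where each control is
    ordered as if it occurred `delta` time units before its onset."""
    tagged = [(t, "event", e) for t, e in events]
    tagged += [(t - delta, "control", c) for t, c in controls]
    tagged.sort(key=lambda item: item[0])
    return [(kind, payload) for _, kind, payload in tagged]

events = [(0.0, "C4"), (1.0, "E4"), (2.0, "G4")]   # notes to be generated
controls = [(1.5, "melody:A4")]                    # fixed melody constraint
print(interleave_anticipated(events, controls))
```

With `delta=1.0`, the melody control at time 1.5 is ordered at 0.5, landing between the first two generated notes, which is the sense in which the model "anticipates" the constraint.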
AI in Other Industries and Applications
Generative Agents and Social Dynamics
Liang’s work on generative agents involves creating virtual worlds, akin to The Sims, where AI agents interact, allowing researchers to study complex social dynamics [00:14:14]. These agents, each powered by a language model with specific prompts and grounded in a virtual environment, can move and communicate [02:25:00]. This work has revealed emergent behaviors, such as information diffusion, where agents influence each other (e.g., one agent running for mayor convincing others) [02:37:00].
“Many of the phenomena that you see in social dynamics crop up, like information diffusion…” [02:49:00]
The vision for generative agents extends beyond believable simulations to creating “digital twins of society” [02:48:00]. Such simulations could be used to:
- Run experiments, like testing the impact of a COVID mask policy or a new law [02:59:00].
- Conduct social science studies more efficiently and with greater demographic diversity than traditional methods that rely on college students [02:25:42].
- Allow for unique experimental controls, such as giving an agent both the “treatment” and “control” by wiping its memory and re-running the simulation [02:55:00].
This type of simulation differs from past methods (e.g., physical or agent-based modeling) by allowing for much greater detail in mimicking human behavior due to the capabilities of modern AI models [02:41:00]. Future applications could include simulating personal decisions, such as investment scenarios or even dating [02:40:00].
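The agent architecture described above can be sketched in a few lines. This is a minimal, hypothetical rendering (the `Agent` class, the stubbed language model, and the event strings are all illustrative, not the actual generative-agents implementation): each agent holds a memory of observations and acts through a language-model call, and the treatment/control trick is realized by snapshotting an agent's memory before the intervention and restoring it afterward:

```python
import copy

# Minimal sketch of a generative agent (all names hypothetical): each agent
# accumulates observations in memory and acts via a (stubbed) language model.
class Agent:
    def __init__(self, name, persona):
        self.name = name
        self.persona = persona
        self.memory = []  # observations grounded in the virtual world

    def observe(self, event):
        self.memory.append(event)

    def act(self, llm):
        # A real system would build the prompt from the persona plus
        # retrieved/summarized memories; here we inline everything.
        prompt = f"{self.persona}. Memories: {self.memory}. What do you do?"
        return llm(prompt)

def run_condition(agent, events, llm):
    for e in events:
        agent.observe(e)
    return agent.act(llm)

# Treatment vs. control on the *same* agent: snapshot its memory,
# run the treatment, then use the pristine copy as the control.
base = Agent("Sam", "friendly candidate for mayor")
snapshot = copy.deepcopy(base)

stub_llm = lambda prompt: "campaigns" if "rally" in prompt else "stays home"
treated = run_condition(base, ["attended a rally"], stub_llm)
control = run_condition(snapshot, [], stub_llm)
print(treated, control)
```

The `deepcopy` snapshot is what makes the "give the agent both treatment and control" experiment possible: both branches start from an identical memory state, something unavailable with human subjects.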
Cybersecurity
AI models are being developed and tested on challenging cybersecurity exercises. Cybench, a benchmark put out by Liang’s team, involves Capture the Flag (CTF) cybersecurity challenges that can take human teams over 24 hours to solve [03:43:44].
Robotics
The field of robotics is currently in a “BERT era,” where vision-language models are fine-tuned to create foundation models for robotics [00:50:39]. While these models are effective, they still require fine-tuning for narrow tasks and result in brittle policies, unlike the more robust language models [00:50:52]. Despite the challenges of hardware, there’s optimism for a “ChatGPT moment” in robotics due to increased interest, funding, data collection efforts, and the ability to leverage existing infrastructure and architectural advancements from language and vision models [00:51:07]. The hope is that many “robotics problems” can be factored out into language and vision problems, making development more efficient [00:52:25].
Education
AI holds significant promise for education, with tools acting as excellent teachers and coaches [00:56:20]. Examples include AI explaining math problems or breaking down complex concepts for children [00:56:35].
Scientific Discovery and Research
Liang believes that AI’s future potential lies in scientific discovery and improving researcher productivity [00:59:21]. While many AI applications are driven by commercial needs (e.g., RAG solutions, summarization), the underexplored areas of fundamental science and research productivity are crucial as they can feed into and improve the entire AI ecosystem [00:59:34]. The goal is for AI to extend human knowledge by creating new research, solving open math problems, or discovering “zero-day” exploits in cybersecurity, moving beyond merely mimicking expert human behavior [00:48:51].
Regulated Industries (Healthcare, Finance)
In industries like healthcare and finance, the “black box” nature of Large Language Models (LLMs) presents a challenge, particularly concerning interpretability [00:35:37]. While mechanistic interpretability (understanding individual neurons) is a scientific pursuit, practical interpretability is needed for debugging models and for compliance in regulated sectors where knowing why a model made a decision (e.g., a credit decision or diagnosis) is crucial [00:37:04].
Attributing model predictions to specific training examples via influence functions is one approach, but it is computationally difficult to scale, especially if training data is private [00:37:38]. Chain-of-thought explanations can provide steps, but these explanations may not always reflect the model’s true internal workings [00:38:57]. For true interpretability, there’s a need for transparency, including access to model weights and training data, which was more common in earlier AI research [00:39:45].
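To make the influence-function idea concrete, here is a hedged sketch of a common gradient-similarity approximation (in the spirit of TracIn-style attribution, not the exact method discussed): the influence of a training example on a test prediction is approximated by the dot product of their loss gradients. The tiny linear model and data are entirely hypothetical:

```python
# Hedged sketch: gradient-similarity attribution for a tiny linear model.
# influence(z_train, z_test) ≈ grad_w loss(z_train) · grad_w loss(z_test)
# All model parameters and data points below are hypothetical.

def grad(w, x, y):
    # Squared-error loss L = 0.5 * (w·x - y)^2; its gradient is (w·x - y) * x.
    err = sum(wi * xi for wi, xi in zip(w, x)) - y
    return [err * xi for xi in x]

def influence(w, train_point, test_point):
    g_train = grad(w, *train_point)
    g_test = grad(w, *test_point)
    return sum(a * b for a, b in zip(g_train, g_test))

w = [0.5, -0.25]
train = [([1.0, 0.0], 1.0), ([0.0, 1.0], 0.0)]
test = ([1.0, 0.0], 0.0)

scores = [influence(w, tp, test) for tp in train]
print(scores)  # only the first training point shares the test point's features
```

Even this toy version hints at the scaling problem: a faithful computation needs per-example gradients over the full (possibly private) training set, which is exactly the practical obstacle noted above.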