From: aidotengineer

LinkedIn has embarked on a comprehensive journey to develop and integrate AI agent capabilities into its platform, evolving from simple Generative AI (GenAI) features to complex multiagent systems. This evolution necessitated the creation of a robust GenAI platform to support and unify these advancements [00:00:35].

Early GenAI Product Experiences

In 2023, LinkedIn launched its first formal GenAI feature: Collaborative Articles [00:01:29]. This was a straightforward “prompt-in, string-out” application, leveraging GPT-4 to create long-form articles and invite member comments [00:01:34]. At this stage, the team built core components such as a centralized model access gateway and Python notebooks for prompt engineering, but still operated dual tech stacks (Java for the online product, Python for prompt engineering and offline work) [00:01:57].
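
Purely to make the “prompt-in, string-out” shape concrete, here is a minimal sketch of such a feature routed through a centralized model access gateway; the class, method, and prompt text are hypothetical and not LinkedIn’s actual API.

```python
# Hypothetical names throughout; this only illustrates the "prompt-in, string-out"
# pattern behind a single, centralized model access gateway.
class ModelGateway:
    """One internal entry point through which every LLM call is routed."""

    def __init__(self, model: str = "gpt-4") -> None:
        self.model = model

    def complete(self, prompt: str) -> str:
        # In production this would forward the request to the gateway service;
        # stubbed here so the sketch stays self-contained and runnable.
        return f"[{self.model}] article draft for: {prompt[:60]}..."


def draft_collaborative_article(gateway: ModelGateway, topic: str) -> str:
    # Prompt in, string out: no retrieval, no memory, no tools.
    prompt = f"Write a long-form article inviting member comments on: {topic}"
    return gateway.complete(prompt)


print(draft_collaborative_article(ModelGateway(), "negotiating a job offer"))
```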

By mid-2023, LinkedIn recognized limitations, particularly the inability to inject rich internal data into the product experience [00:02:30]. This led to the development of the second generation of GenAI products, internally referred to as co-pilot or coach systems [00:02:42]. An example is the personalized recommendation feature that analyzes a user’s profile and job description, using a Retrieval Augmented Generation (RAG) process to assess job fit [00:02:51].
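
A hedged sketch of that RAG flow, assuming nothing about LinkedIn’s internal services: retrieve the member’s profile and the job description, then ground the LLM’s fit assessment in the retrieved text. All function and field names are illustrative.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class RetrievedContext:
    profile: str
    job_description: str


def retrieve_context(member_id: str, job_id: str) -> RetrievedContext:
    # In production this would call internal profile and job services; stubbed here.
    return RetrievedContext(profile=f"<profile of {member_id}>",
                            job_description=f"<description of {job_id}>")


def assess_job_fit(llm: Callable[[str], str], member_id: str, job_id: str) -> str:
    ctx = retrieve_context(member_id, job_id)
    prompt = (
        "Given the member profile and the job description below, explain how well "
        "this member fits the role.\n\n"
        f"Profile:\n{ctx.profile}\n\nJob description:\n{ctx.job_description}"
    )
    return llm(prompt)  # llm is any prompt-in, string-out callable, e.g. a gateway client


print(assess_job_fit(lambda p: f"<fit assessment grounded in {len(p)} chars of context>",
                     "member-1", "job-9"))
```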

Key platform capabilities emerged during this phase:

  • Python SDK: Built on top of a popular framework, this SDK orchestrates Large Language Model (LLM) calls and integrates with LinkedIn’s large-scale infrastructure, enabling developers to assemble applications more easily [00:03:13].
  • Unified Tech Stack: Because transferring prompts from Python to Java proved costly and error-prone, LinkedIn began unifying the two stacks [00:03:40].
  • Prompt Management: Investment in a “prompt source of truth” sub-module helped developers version prompts and structure meta-prompts [00:03:51].
  • Conversational Memory: Infrastructure that tracks LLM interactions and retrieved content and injects that history back into the product experience, enabling conversational bot experiences; a minimal sketch follows this list [00:04:08].
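
The sketch below (referenced in the Conversational Memory item) illustrates the idea with an in-memory store: each turn and any retrieved content is recorded and replayed into the next prompt. The storage backend and names are stand-ins, not LinkedIn’s infrastructure.

```python
from typing import Callable, List, Tuple


class ConversationalMemory:
    """Tracks turns and retrieved content so later prompts carry the history."""

    def __init__(self) -> None:
        self.turns: List[Tuple[str, str]] = []  # (role, content)

    def record(self, role: str, content: str) -> None:
        self.turns.append((role, content))

    def render(self) -> str:
        return "\n".join(f"{role}: {content}" for role, content in self.turns)


def chat_turn(llm: Callable[[str], str], memory: ConversationalMemory,
              user_message: str, retrieved: str = "") -> str:
    if retrieved:
        memory.record("retrieval", retrieved)
    memory.record("user", user_message)
    reply = llm(f"{memory.render()}\nassistant:")
    memory.record("assistant", reply)
    return reply


memory = ConversationalMemory()
echo_llm = lambda prompt: f"(reply grounded in {len(memory.turns)} remembered turns)"
print(chat_turn(echo_llm, memory, "Which jobs fit my profile?", retrieved="<profile summary>"))
```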

Evolution to Multiagent Systems

In the last year, LinkedIn launched its first true multiagent system: the LinkedIn HR Assistant [00:04:33]. This system assists recruiters by automating tedious tasks such as job posting, candidate evaluation, and outreach [00:04:42].
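
The source does not describe the assistant’s internals; purely as an illustration of the multiagent shape, the sketch below shows a supervisor routing the three recruiter tasks mentioned above to specialized sub-agent stubs. This is not LinkedIn’s actual design.

```python
from typing import Callable, Dict

# Stand-in sub-agents; a real system would back each with its own prompts, skills, and memory.
SUB_AGENTS: Dict[str, Callable[[str], str]] = {
    "posting": lambda task: f"drafted job posting for: {task}",
    "evaluation": lambda task: f"ranked candidates for: {task}",
    "outreach": lambda task: f"drafted outreach messages for: {task}",
}


def supervisor(task_type: str, task: str) -> str:
    # A production supervisor would let an LLM decompose and route work; here it is a lookup.
    return SUB_AGENTS[task_type](task)


print(supervisor("evaluation", "Staff ML Engineer, Bay Area"))
```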

The platform further evolved to support these advanced AI agent capabilities:

  • Distributed Agent Orchestration: The Python SDK was extended into a large-scale distributed agent orchestration layer, handling distributed execution, retry logic, and traffic shifts for complex agent scenarios [00:05:08].
  • Skill Registry: A key investment was a skill registry, a centralized hub into which developers publish their APIs. Because agents are expected to take actions, skills (APIs) are a critical component; the registry handles skill discovery and invocation, making it easy for applications to call APIs and perform tasks (see the sketch after this list) [00:05:36].
  • Experiential Memory: Beyond conversational memory, the platform introduced experiential memory: a storage system that extracts, analyzes, and infers tacit knowledge from interactions between the agent and the user [00:06:14]. This memory is organized into layers, including working, long-term, and collective memories, improving the agent’s awareness of its surrounding context [00:06:35].
  • Operability: Because agents autonomously decide which APIs or LLMs to call, predicting their behavior is challenging [00:06:50]. LinkedIn invested in an in-house operability solution, built on OpenTelemetry, to capture low-level telemetry data; this data enables replaying agent calls and provides analytics that guide future optimization of agent systems [00:07:08].
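
As a rough illustration of the skill registry (referenced above) and the OpenTelemetry-based operability idea, the sketch below publishes a skill into a central registry and wraps each invocation in a span. It assumes the opentelemetry-api package; the registry API and skill names are hypothetical, not LinkedIn’s implementation.

```python
from typing import Any, Callable, Dict, List

from opentelemetry import trace  # assumes the opentelemetry-api package is installed

tracer = trace.get_tracer(__name__)


class SkillRegistry:
    """Central hub where developers publish skills (APIs) that agents can discover and invoke."""

    def __init__(self) -> None:
        self._skills: Dict[str, Callable[..., Any]] = {}

    def publish(self, name: str, fn: Callable[..., Any]) -> None:
        self._skills[name] = fn

    def discover(self) -> List[str]:
        return sorted(self._skills)

    def invoke(self, name: str, **kwargs: Any) -> Any:
        # Emitting low-level telemetry on every skill call supports replay and analytics.
        with tracer.start_as_current_span(f"skill.{name}"):
            return self._skills[name](**kwargs)


registry = SkillRegistry()
registry.publish("post_job", lambda title: f"posted: {title}")
print(registry.discover())                                    # ['post_job']
print(registry.invoke("post_job", title="Staff ML Engineer"))
```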

LinkedIn GenAI Platform Architecture

The GenAI platform’s components are broadly classified into four layers [00:07:39]:

  1. Orchestration [00:07:44]
  2. Prompt Engineering [00:07:47]
  3. Tools and Skills Invocation [00:07:49]
  4. Content and Memory Management [00:07:50]

This platform serves as a unified interface for a complex GenAI ecosystem, which also includes modeling layers (e.g., fine-tuning open-source models), responsible AI layers (ensuring agent adherence to policies), and core AI/Machine Learning infrastructure [00:07:56]. The platform’s key value proposition is simplifying developer access to this entire ecosystem, for instance, by allowing model switching with a single line of code [00:08:20]. It also provides a centralized point to enforce best practices and governance, ensuring efficient and responsible application development [00:09:12].
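
As an illustration of that single-line switch (with hypothetical configuration keys and model names), application code could resolve models through platform configuration rather than hard-coding a provider:

```python
from typing import Callable

MODEL_CONFIGS = {
    "default": {"provider": "external", "model": "gpt-4"},
    # Switching the application to a fine-tuned in-house model is one line:
    # "default": {"provider": "in-house", "model": "fine-tuned-oss-model"},
}


def get_llm(name: str = "default") -> Callable[[str], str]:
    cfg = MODEL_CONFIGS[name]

    def call(prompt: str) -> str:
        # The platform routes the call to the configured provider via the central gateway.
        return f"[{cfg['provider']}/{cfg['model']}] response to: {prompt}"

    return call


llm = get_llm()
print(llm("Summarize this member profile."))
```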

Why a GenAI/Agent Platform is Critical

LinkedIn believes a dedicated GenAI platform is critical for success because Generative AI systems, particularly agent systems, differ fundamentally from traditional AI systems [00:09:55]. In traditional AI, there’s a clear separation between model optimization and serving phases, allowing AI engineers and product engineers to work on separate tech stacks [00:10:04].

However, in GenAI systems, this line blurs; everyone becomes an engineer who can optimize overall system performance, creating new tooling and best practice challenges [00:10:24]. Agent systems are considered “compound AI systems” (as defined by Berkeley AI Research lab), tackling tasks using multiple interacting components like models, retrievers, or external tools [00:10:49]. The GenAI app platform aims to bridge the skill gap between AI engineers and product engineers [00:11:10].

Building and Scaling AI Agent Solutions

Hiring Principles for Agent Development Teams

Building a team for AI agent development is challenging due to the need for a rare combination of skills [00:12:25]. Ideal candidates are strong software engineers with infrastructure integration skills, developer PM skills for interface design, and AI/data science backgrounds to understand the latest techniques, while remaining hands-on [00:11:55].

LinkedIn’s hiring principles include:

  • Prioritize Software Engineering: Strong software engineering skills are prioritized over AI expertise [00:12:47].
  • Hire for Potential: Given the rapid evolution of the field, potential is valued over outdated experience or degrees [00:13:03].
  • Diversified Teams: Instead of seeking “unicorns,” LinkedIn builds diversified teams comprising full-stack engineers, data scientists, AI engineers, data engineers, fresh graduates from research universities, and individuals with startup backgrounds. Collaboration within these teams helps engineers acquire new skills [00:13:15].
  • Critical Thinking: The team constantly evaluates new open-source packages, engages with vendors, and proactively deprecates existing solutions, acknowledging that current solutions may be outdated within a year [00:14:06].

Technical Recommendations and Key Takeaways

  • Tech Stack Choice: Python is strongly recommended due to its prevalence in research and open-source communities, and its scalability [00:14:37].
  • Key Components to Build:
    • Prompt Source of Truth: A robust version-control system for prompts is critical for operational stability; a sketch follows this list [00:15:03].
    • Memory: Essential for injecting rich data into the agent experience [00:15:26].
    • Skills/APIs: Uplifting internal APIs into easily callable skills for agents, supported by surrounding tooling and infrastructure [00:15:42].
  • Scaling and Adoption Strategies:
    • Solve Immediate Needs: Start by addressing immediate problems (e.g., a simple Python library for orchestration) rather than attempting to build a full-fledged platform initially [00:16:04].
    • Infrastructure and Scalability: Focus on building scalable infrastructure, like leveraging existing messaging infrastructure for the memory layer [00:16:29].
    • Developer Experience (DX): Prioritize DX to ensure developer adoption. Design the platform to align with existing workflows, easing adoption and increasing success [00:16:46].
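
Here is the sketch referenced under Prompt Source of Truth: a minimal, illustrative versioned prompt store in which applications publish, pin, and roll back prompt templates. Class and method names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PromptStore:
    """Single source of truth: prompts are versioned and fetched like code artifacts."""

    _versions: Dict[str, List[str]] = field(default_factory=dict)

    def publish(self, name: str, template: str) -> int:
        versions = self._versions.setdefault(name, [])
        versions.append(template)
        return len(versions)  # 1-based version number

    def get(self, name: str, version: int = 0) -> str:
        versions = self._versions[name]
        return versions[version - 1] if version > 0 else versions[-1]


store = PromptStore()
v1 = store.publish("job_fit", "Assess fit for {job} given {profile}.")
v2 = store.publish("job_fit", "You are a career coach. Assess fit for {job} given {profile}.")
pinned = store.get("job_fit", version=v1)  # production pins v1 until v2 is evaluated
latest = store.get("job_fit")              # experimentation tracks the latest version
print(pinned != latest)
```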

This strategic approach to developing AI agents and agentic workflows emphasizes iterative development, strong foundational engineering, and a focus on both technical excellence and developer productivity to build a robust and scalable GenAI ecosystem. For more technical details, readers can consult the LinkedIn Engineering blog post by Cake S and the presenter [00:17:10].