From: aidotengineer

Bloomberg began seriously investing in large language models (LLMs) in late 2021, though the company has been investing in AI for some 15 to 16 years [00:00:36]. Bloomberg initially dedicated 2022 to building its own large language model, learning a great deal about data organization and evaluation in the process [00:00:50]. With the rise of ChatGPT and of the open-weight and open-source communities, however, Bloomberg pivoted its strategy in 2023 to focus on building on top of existing models [00:01:06].

Bloomberg’s AI Organization

Bloomberg’s AI efforts are structured as a specialized group within engineering, reporting to the Global Head of Engineering [00:01:38]. This group works in cross-functional settings with data counterparts, product teams, and the CTO’s office [00:01:49]. The AI team comprises approximately 400 people across 50 teams, located in London, New York, Princeton, and Toronto [00:01:58].

The company has been actively building products using generative AI for 12 to 16 months, with a significant focus on developing more agentic tools [00:02:16].

Defining Tools vs. Agents

For internal clarity, Bloomberg distinguishes between “tools” and “agents” based on the paper “Cognitive Architectures for Language Agents” [00:03:17].

  • Tool: Refers to the “left-hand side” of the spectrum, implying less autonomy [00:03:20].
  • Agent: Represents the “right-hand side” of the spectrum, characterized by greater autonomy, memory, and the ability to evolve [00:03:30].

Bloomberg’s Data Scale and Principles

As a FinTech company, Bloomberg deals with a massive scale of financial data:

  • They generate and accumulate both unstructured data (news, research, documents, slides) and structured data (reference data, market data) [00:04:17].
  • Daily, they receive 400 billion ticks of structured data, over a billion unstructured messages, and millions of written documents including news [00:04:36].
  • This data spans over 40 years of history [00:04:48].

Due to the nature of finance, certain product principles are non-negotiable:

  • Precision, Comprehensiveness, Speed, Throughput, Availability [00:06:25].
  • Protecting contributor and client data [00:06:30].
  • Transparency [00:06:36].

These principles must be maintained regardless of whether AI is used [00:06:38].

Scaling Agentic Architectures

When scaling LLM-based agents, two primary aspects are crucial:

1. Embracing Fragility with Robust Guardrails

Traditional software, such as a general matrix multiplication API, is robust and well documented [00:10:07]. Machine learning APIs, even when their input and output distributions are carefully characterized, introduce a degree of stochasticity [00:10:41]. With LLMs, and especially with compositions of LLMs into agents, errors multiply significantly, leading to fragile behavior [00:11:18].

For instance, a news sentiment product built in 2009 had predictable input distributions (known news wires, language, and editorial guidelines) and output distributions, allowing for careful testing and assessment of deployment risk [00:11:43]. Downstream consumers were notified of model changes and encouraged to re-test [00:12:41].

In contrast, modern agentic architectures require daily improvements, making traditional batch regression testing and extensive release cycles impractical [00:13:03]. Errors can compound, especially in downstream workflows where data may not be fully exposed [00:14:16].
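
The compounding is easy to quantify with a back-of-the-envelope calculation (an illustration, not a figure from the talk): if each stage of a pipeline is independently correct with probability p, a chain of n stages is correct end-to-end with probability p^n.

```python
# Back-of-the-envelope: a single call that is right 95% of the time
# looks dependable, but composition erodes that reliability quickly.
def chain_reliability(p: float, n: int) -> float:
    """End-to-end success probability of n independent stages."""
    return p ** n

for n in (1, 3, 5, 10):
    print(f"{n:>2} stages at p=0.95 -> {chain_reliability(0.95, n):.2f}")
# 1 -> 0.95, 3 -> 0.86, 5 -> 0.77, 10 -> 0.60
```

Ten composed calls at 95% per-stage accuracy succeed end-to-end only about 60% of the time, which is why each stage has to defend itself rather than trust its upstream.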

The key to scaling is to assume that upstream systems will be fragile and evolving, and to implement independent safety checks and guardrails at each stage [00:14:23]. This lets individual agents evolve faster without requiring extensive coordination and sign-offs from every downstream caller [00:14:58].

Accordingly, Bloomberg’s agentic products are “semi-agentic”: full autonomy is not yet trusted [00:08:57], so essential guardrails, such as prohibiting financial advice and ensuring factuality, are hard-coded and mandatory [00:09:09].
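
A minimal sketch of this pattern, with hypothetical check and stage names (the talk does not describe Bloomberg’s actual implementation): each stage re-validates what it receives and produces, and the mandatory checks cannot be switched off by any agent.

```python
class GuardrailViolation(Exception):
    """Raised whenever a mandatory check fails, halting the pipeline."""

def no_financial_advice(text: str) -> None:
    # Illustrative keyword screen; a real check would be a trained
    # classifier, not a substring match.
    if "you should buy" in text.lower():
        raise GuardrailViolation("output resembles financial advice")

MANDATORY_CHECKS = (no_financial_advice,)  # hard-coded, never optional

def run_stage(stage_fn, payload: str) -> str:
    """Run one pipeline stage, assuming its upstream is fragile.

    The stage never trusts what it is handed: mandatory checks run on
    both input and output, so an upstream agent can change daily
    without every downstream caller having to sign off.
    """
    for check in MANDATORY_CHECKS:
        check(payload)               # independent check on the way in
    output = stage_fn(payload)
    for check in MANDATORY_CHECKS:
        check(output)                # and again on the way out
    return output
```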

2. Rethinking Organizational Structure

Traditional machine learning teams often mirror the factorization of their software stack in their organizational structure [00:15:38]. When building with LLMs and a new tech stack, however, this structure needs to be re-evaluated [00:15:57].

  • Early Stages (Product Discovery): It’s beneficial to have vertically aligned, collapsed teams and software stacks to facilitate rapid iteration and product design discovery [00:16:46]. This fosters fast iteration and sharing of code, data, and models [00:17:01].
  • Later Stages (Optimization and Scale): Once the product design and agent use cases are understood, the organization can transition to a more horizontal structure [00:17:10]. This allows for optimization, performance improvement, cost reduction, increased testability, and transparency [00:17:29]. For instance, common guardrails like filtering financial-advice queries should be owned horizontally and shared across teams to avoid redundant effort [00:17:41] (see the sketch after this list). This also enables breaking monolithic agents down into smaller, more manageable pieces [00:18:12].
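
As a sketch of what such horizontal factoring could look like (module and function names are hypothetical, not Bloomberg’s), a single shared package can own the common checks so that vertical product teams register against it instead of each re-implementing its own financial-advice filter:

```python
# guardrails.py -- hypothetical shared package owned by one horizontal team.
from typing import Callable, List, Optional

_REGISTRY: List[Callable[[str], Optional[str]]] = []

def guardrail(check: Callable[[str], Optional[str]]):
    """Register a check once; every consuming agent picks it up."""
    _REGISTRY.append(check)
    return check

@guardrail
def no_financial_advice(text: str) -> Optional[str]:
    # Stand-in for a real query/response classifier.
    return "financial advice" if "you should buy" in text.lower() else None

def validate(text: str) -> List[str]:
    """Run all registered checks; return the violations found."""
    return [v for check in _REGISTRY if (v := check(text)) is not None]
```

Each product team then calls validate() at its own pipeline boundaries, keeping the check’s logic, testing, and transparency obligations in one place.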

Example: Research Analyst Agent

For a research analyst, a current agent architecture looks like this (a minimal code sketch follows the list):

  • An agent deeply understands the user’s query and session context [00:18:28].
  • It determines the necessary information and dispatches to a tool, often with an NLP front end, to fetch data [00:18:34].
  • Answer generation has its own agent, with strict rules for well-formed answers [00:18:43].
  • Non-optional guardrails are called at multiple points, ensuring no autonomy where core principles are concerned [00:18:50].
  • The system builds upon years of traditional and modern data management techniques [00:18:59].
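
Read as code, the flow might look like the following sketch; every name is a hypothetical stand-in reconstructed from the description above, not Bloomberg’s implementation.

```python
from dataclasses import dataclass

@dataclass
class Intent:
    topic: str

def understand_query(query: str, session: list) -> Intent:
    """Agent 1: deep understanding of the query plus session context."""
    # Stub: a real system would run an LLM over the query and session.
    return Intent(topic=query.strip().lower())

def dispatch_tool(intent: Intent) -> str:
    """Dispatch to a data-fetching tool, often fronted by NLP."""
    # Stub: a real system would hit structured and unstructured stores.
    return f"[retrieved documents about {intent.topic}]"

def generate_answer(intent: Intent, evidence: str) -> str:
    """Agent 2: answer generation under strict well-formedness rules."""
    return f"Summary for '{intent.topic}': {evidence}"

def mandatory_guardrails(text: str) -> None:
    """Non-optional checks; no agent has the autonomy to skip them."""
    if "you should buy" in text.lower():
        raise ValueError("guardrail violation: financial advice")

def research_analyst_pipeline(query: str, session: list) -> str:
    intent = understand_query(query, session)
    evidence = dispatch_tool(intent)
    mandatory_guardrails(evidence)   # guardrails called at multiple points
    answer = generate_answer(intent, evidence)
    mandatory_guardrails(answer)
    return answer

print(research_analyst_pipeline("semiconductor earnings outlook", []))
```

Note how the guardrail runs after both retrieval and generation: no agent in the chain can bypass it, mirroring the “no autonomy where core principles are concerned” rule above.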

This approach reflects the understanding that to successfully scale and deploy AI agents, organizations must adapt their technical and structural mindsets to prioritize resilience and disciplined factorization.