From: aidotengineer
Bloomberg began seriously investing in large language models (LLMs) in late 2021, though the company has been investing in AI more broadly for roughly 15 to 16 years [00:00:36]. Bloomberg spent 2022 building its own large language model, learning about data organization and evaluation in the process [00:00:50]. With the rise of ChatGPT and of the open-weight and open-source communities, however, Bloomberg pivoted its strategy in 2023 to building on top of existing models [00:01:06].
Bloomberg’s AI Organization
Bloomberg’s AI efforts are structured as a specialized group within engineering, reporting to the Global Head of Engineering [00:01:38]. This group works in cross-functional settings with data counterparts, product teams, and the CTO’s office [00:01:49]. The AI team comprises approximately 400 people across 50 teams, located in London, New York, Princeton, and Toronto [00:01:58].
The company has been actively building products using generative AI for 12 to 16 months, with a significant focus on developing more agentic tools [00:02:16].
Defining Tools vs. Agents
For internal clarity, Bloomberg distinguishes between “tools” and “agents” based on the paper “Cognitive Architectures for Language Agents” [00:03:17].
- Tool: Refers to the “left-hand side” of the spectrum, implying less autonomy [00:03:20].
- Agent: Represents the “right-hand side” of the spectrum, characterized by greater autonomy, memory, and the ability to evolve [00:03:30].
Bloomberg’s Data Scale and Principles
As a FinTech company, Bloomberg deals with a massive scale of financial data:
- They generate and accumulate both unstructured data (news, research, documents, slides) and structured data (reference data, market data) [00:04:17].
- Daily, they receive 400 billion ticks of structured data, over a billion unstructured messages, and millions of written documents including news [00:04:36].
- This data spans over 40 years of history [00:04:48].
Due to the nature of finance, certain product principles are non-negotiable:
- Precision, Comprehensiveness, Speed, Throughput, Availability [00:06:25].
- Protecting contributor and client data [00:06:30].
- Transparency [00:06:36].

These principles must be maintained regardless of whether AI is used [00:06:38].
Scaling Agentic Architectures
When scaling LLM-based agents, two primary aspects are crucial:
1. Embracing Fragility with Robust Guardrails
Traditional software, such as a generalized matrix-multiplication API, is robust and well documented [00:10:07]. Machine learning APIs generally do a good job of specifying their input and output distributions, but they introduce a degree of stochasticity [00:10:41]. With LLMs, and especially with compositions of LLMs into agents, errors multiply significantly, leading to fragile behavior [00:11:18].
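A back-of-the-envelope illustration of why these compositions become fragile (the 95% per-step accuracy below is an assumed figure for illustration, not a measured one):

```python
# Illustrative arithmetic only: assume each LLM-backed step in a pipeline is
# independently correct 95% of the time (an assumed figure, not a Bloomberg number).
per_step_accuracy = 0.95

for steps in (1, 3, 5, 10):
    end_to_end = per_step_accuracy ** steps
    print(f"{steps:>2} chained steps -> ~{end_to_end:.0%} end-to-end reliability")

# Output: roughly 95%, 86%, 77%, and 60% -- small per-step error rates compound quickly.
```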
For instance, a news sentiment product built in 2009 had predictable input (news wires, language, editorial guidelines) and output distributions, allowing for careful testing and deployment risk assessment [00:11:43]. Downstream consumers were notified of model changes and encouraged to re-test [00:12:41].
In contrast, modern agentic architectures require daily improvements, making traditional batch regression testing and extensive release cycles impractical [00:13:03]. Errors can compound, especially in downstream workflows where data may not be fully exposed [00:14:16].
The key to scaling is to factor in that upstream systems will be fragile and evolving, and implement independent safety checks and guardrails at each stage [00:14:23]. This allows individual agents to evolve faster without extensive coordination and sign-offs from every downstream caller [00:14:58].
For example, Bloomberg’s agentic products are “semi-agentic” because full autonomy is not yet trusted [00:08:57]. Essential guardrails, such as prohibiting financial advice or ensuring factuality, are hard-coded and mandatory [00:09:09].
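A minimal sketch of this semi-agentic pattern, assuming hypothetical check functions (the talk does not describe a concrete API; names like `no_financial_advice` and `guarded` are invented here for illustration):

```python
from typing import Callable

class GuardrailViolation(Exception):
    """Raised when a mandatory check fails; the pipeline stops rather than guessing."""

def no_financial_advice(text: str) -> None:
    # Hypothetical check: a real system would likely call a dedicated classifier here.
    if "you should invest" in text.lower():
        raise GuardrailViolation("output resembles financial advice")

def has_supporting_source(text: str) -> None:
    # Hypothetical factuality check, e.g. require the answer to reference retrieved documents.
    if "[source:" not in text:
        raise GuardrailViolation("answer carries no supporting source")

MANDATORY_CHECKS: list[Callable[[str], None]] = [no_financial_advice, has_supporting_source]

def guarded(stage: Callable[[str], str]) -> Callable[[str], str]:
    """Wrap any agent stage so its output is checked independently of upstream behavior."""
    def run(payload: str) -> str:
        out = stage(payload)
        for check in MANDATORY_CHECKS:
            check(out)  # not skippable by the model
        return out
    return run
```

Because each stage is wrapped with its own independent checks, an upstream agent can be reprompted or retrained frequently without every downstream consumer having to re-certify it.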
2. Rethinking Organizational Structure
Traditional machine learning teams often mirror their software factorization in their organizational structure [00:15:38]. However, when building with LLMs and new tech stacks, this structure needs to be re-evaluated [00:15:57].
- Early Stages (Product Discovery): It’s beneficial to have vertically aligned, collapsed teams and software stacks to facilitate rapid iteration and product design discovery [00:16:46]. This fosters fast iteration and sharing of code, data, and models [00:17:01].
- Later Stages (Optimization and Scale): Once the product design and agent use cases are understood, the organization can transition to a more horizontal structure [00:17:10]. This enables performance optimization, cost reduction, better testability, and transparency [00:17:29]. For instance, common guardrails such as filtering financial-advice queries should be managed horizontally across teams to avoid redundant effort (a sketch follows this list) [00:17:41]. It also enables breaking monolithic agents down into smaller, more manageable pieces [00:18:12].
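One way to picture that horizontal recommendation in code: a single, centrally owned filter that every vertical team's agent imports rather than re-implements. The module layout, function names, and placeholder logic below are hypothetical:

```python
# --- shared_guardrails: owned and tested by one horizontal team (hypothetical module) ---
ADVICE_MARKERS = ("should i buy", "which stock", "allocate my portfolio")

def blocks_financial_advice(query: str) -> bool:
    """One central implementation instead of a separate copy per vertical team."""
    q = query.lower()
    return any(marker in q for marker in ADVICE_MARKERS)

# --- a vertical team's agent simply imports and reuses it ---
def handle_research_query(query: str) -> str:
    if blocks_financial_advice(query):
        return "I'm not able to provide investment advice."
    return f"[team-specific research pipeline would run for: {query!r}]"

print(handle_research_query("Which stock should I buy this quarter?"))
```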
Example: Research Analyst Agent
For a research analyst, a current agent architecture looks like this (a simplified code sketch follows the list):
- An agent deeply understands the user’s query and session context [00:18:28].
- It determines the necessary information and dispatches to a tool, often with an NLP front end, to fetch data [00:18:34].
- Answer generation has its own agent, with strict rules for well-formed answers [00:18:43].
- Non-optional guardrails are called at multiple points, ensuring no autonomy where core principles are concerned [00:18:50].
- The system builds upon years of traditional and modern data management techniques [00:18:59].
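Putting these pieces together, here is a heavily simplified sketch of how such a factored, semi-agentic flow could be wired. All names and the toy logic are illustrative assumptions; the actual Bloomberg components are not public:

```python
class GuardrailViolation(Exception):
    pass

def understand_query(query: str, session_context: dict) -> dict:
    """Agent: interpret the analyst's question in the context of the session."""
    return {"intent": "earnings_summary", "entities": ["ACME Corp"], "raw": query}

def check_query_guardrails(plan: dict) -> None:
    # Illustrative gate: refuse advice-seeking requests before any tool is called.
    if "advice" in plan["raw"].lower():
        raise GuardrailViolation("financial-advice request blocked")

def dispatch_to_tool(plan: dict) -> list[str]:
    """Tool with an NLP front end: fetch the structured/unstructured data the plan needs."""
    return ["[source: doc-1] ACME Corp reported revenue growth last quarter."]

def generate_answer(plan: dict, evidence: list[str]) -> str:
    """Answer-generation agent: compose a well-formed answer strictly from fetched evidence."""
    return f"Summary for {plan['entities'][0]}: " + " ".join(evidence)

def check_answer_guardrails(answer: str) -> None:
    # Illustrative gate: require at least one source marker before anything is shown.
    if "[source:" not in answer:
        raise GuardrailViolation("unsupported answer blocked")

def answer_research_question(query: str, session_context: dict) -> str:
    plan = understand_query(query, session_context)
    check_query_guardrails(plan)      # non-optional, runs before any tool call
    evidence = dispatch_to_tool(plan)
    answer = generate_answer(plan, evidence)
    check_answer_guardrails(answer)   # non-optional, runs before anything reaches the user
    return answer

print(answer_research_question("Summarize ACME Corp's latest earnings", session_context={}))
```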
This approach reflects the understanding that to successfully scale and deploy AI agents, organizations must adapt their technical and structural mindsets to prioritize resilience and disciplined factorization.