From: aidotengineer

Bloomberg, a financial technology (fintech) company, has been investing in AI for some 15 to 16 years [00:00:44]. The company’s AI efforts are organized as a special group reporting to the Global Head of Engineering, collaborating closely with data, product, and CTO teams [00:01:36]. This group comprises approximately 400 people across 50 teams in London, New York, Princeton, and Toronto [00:01:58].

Evolution of AI Strategy

In late 2021, Large Language Models (LLMs) began capturing significant attention [00:00:39]. Bloomberg initially spent 2022 building its own LLM, gaining extensive knowledge in model construction, data organization, evaluation, and performance tuning [00:00:47]. However, with the emergence of ChatGPT and the strong growth of the open-weight and open-source community, Bloomberg strategically pivoted to building on top of available external models and resources [00:01:06].

Defining AI Tools and Agents

Internally, Bloomberg distinguishes between “tools” and “agents” based on concepts from the paper “Cognitive Architectures for Language Agents” [00:03:17]:

  • Tools: Sit at the less autonomous end of the spectrum, performing well-scoped functions when invoked [00:03:20].
  • Agents: Are more autonomous, possessing memory and the ability to evolve [00:03:30].

Non-Negotiable Principles in Finance

As a fintech company serving diverse financial clients, Bloomberg adheres to strict non-negotiable principles for all its products, regardless of whether AI is used [00:06:15]:

  • Precision and Accuracy: Ensuring correctness of information [00:06:25].
  • Comprehensiveness: Providing thorough and complete data [00:06:26].
  • Speed and Throughput: Delivering information quickly and efficiently [00:06:27].
  • Availability: Ensuring consistent access to services [00:06:29].
  • Data Protection: Safeguarding contributor and client data [00:06:30].
  • Transparency: Maintaining clarity throughout the product development and output processes [00:06:36].

Bloomberg manages and processes vast amounts of data daily, including 400 billion ticks of structured data and over a billion unstructured messages, with over 40 years of historical data [00:04:36].

Case Study: Earnings Call Summaries for Research Analysts

Bloomberg has been building products using generative AI tools for 12 to 16 months [00:02:12]. An example product is the automated summary of public company earnings calls for research analysts [00:06:54]. These calls include executive presentations and Q&A segments, with many occurring daily during earnings season [00:07:13].

The strategy involves using AI to answer sector-specific questions of interest to analysts, enabling them to quickly determine if a deeper dive is necessary [00:07:37].

Challenges and Solutions

Developing this product required significant effort in MLOps, particularly in building remediation workflows and circuit breakers [00:08:19]. Given that these summaries are published and errors have an outsized impact, continuous monitoring, remediation, and CI/CD processes are crucial to improve accuracy [00:08:31].
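One way to picture the circuit-breaker idea is the sketch below, with invented names and thresholds (the talk does not describe Bloomberg's actual implementation): after a run of consecutive quality-check failures, the breaker opens and publishing halts, routing summaries to a remediation queue instead of pushing errors out to analysts.

```python
import time

class CircuitBreaker:
    """Halts publishing after repeated failures; reopens after a cooldown."""

    def __init__(self, max_failures: int = 3, cooldown_s: float = 300.0):
        self.max_failures = max_failures
        self.cooldown_s = cooldown_s
        self.failures = 0
        self.opened_at: float | None = None

    def allow(self) -> bool:
        if self.opened_at is None:
            return True
        if time.monotonic() - self.opened_at >= self.cooldown_s:
            self.opened_at = None   # half-open: let one attempt through
            self.failures = 0
            return True
        return False

    def record(self, ok: bool) -> None:
        if ok:
            self.failures = 0       # any success resets the count
        else:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()
```

In a publishing pipeline, each summary's quality-check result feeds `record()`, and `allow()` gates the publish step; when it returns `False`, the summary goes to a human remediation workflow instead.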

Scaling AI Products and Organizational Structure

Scaling AI product development at Bloomberg involves two key aspects:

1. Handling Fragility of AI Systems

Unlike traditional software with well-documented APIs, AI models, and especially compositions of LLMs (agents), introduce stochasticity, and errors compound across the composition [00:10:54]. This leads to fragile behavior [00:11:29].

  • Lessons from News Sentiment Product: Even with controlled inputs (e.g., news wires with editorial guidelines) and clear outputs (e.g., sentiment scores), and robust training data, traditional ML models still required out-of-band communication with downstream consumers about model changes [00:11:43].
  • Challenge with Agentic Architectures: For agentic architectures, the goal is to make improvements daily, not through slow batch regression testing cycles [00:13:02].
  • Strategy: Internal Safety Checks: Instead of relying on upstream systems to be perfectly accurate and stable, Bloomberg’s strategy is to factor in their potential fragility and evolution [00:14:23]. This means building in internal safety checks and guard rails within each agent. This approach allows individual agents to evolve faster without extensive “handshake signals” with every downstream caller, ultimately leading to more resilient systems [00:14:51].
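The "factor in upstream fragility" strategy from the last bullet might look like the following sketch (field names and thresholds are hypothetical): a downstream agent validates whatever it receives at its own boundary rather than trusting the upstream agent to be stable, so the upstream can change daily without a handshake.

```python
REQUIRED_FIELDS = {"ticker", "sentiment", "confidence"}

def validate_upstream(payload: dict) -> dict:
    """Guard rail at the agent boundary: reject malformed upstream output."""
    missing = REQUIRED_FIELDS - payload.keys()
    if missing:
        raise ValueError(f"upstream payload missing fields: {sorted(missing)}")
    if not 0.0 <= payload["confidence"] <= 1.0:
        raise ValueError("confidence out of range")
    if payload["sentiment"] not in {"positive", "neutral", "negative"}:
        raise ValueError(f"unknown sentiment: {payload['sentiment']!r}")
    return payload

def downstream_agent(payload: dict) -> str:
    payload = validate_upstream(payload)  # internal check, not blind trust
    if payload["confidence"] < 0.5:
        return "defer: low confidence"    # degrade gracefully
    return f"{payload['ticker']}: {payload['sentiment']}"
```

Because the check lives inside the consumer, a schema drift or quality regression upstream surfaces as a controlled rejection or deferral rather than a silently wrong answer propagating downstream.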

2. Evolving Organizational Structure for Scaling

Traditional machine learning organization structures often reflect software factorization [00:15:40]. However, with new tech stacks and product types, it’s necessary to rethink this structure [00:15:57].

  • Initial Phase (Vertical Alignment): In the early stages of product design where iteration speed is key and the product’s exact nature is unknown, it’s more effective to have vertically aligned teams [00:16:46]. This encourages fast iteration, and sharing of code, data, and models [00:17:01].
  • Scaling Phase (Horizontal Alignment): As products and agents mature and their use cases become clearer, the organization can shift towards horizontally aligned teams [00:17:18]. This allows for optimization (e.g., increasing performance, reducing cost), improved testability, and transparency [00:17:29].
    • Example: Guard Rails: Functions like “guard rails” (e.g., preventing financial advice or ensuring factual accuracy) are centralized and handled horizontally across teams, rather than each of the 50 teams figuring them out independently [00:17:39].
  • Decomposing Monolithic Agents: Over time, monolithic agents can be broken down into smaller, more specialized pieces, reflected in the organizational structure [00:18:12].
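A centralized, horizontal guard rail might be sketched as below. This is purely illustrative (the pattern list and function names are invented, and real guard rails would be far more sophisticated than substring matching): the point is that every agent's output passes through one shared chokepoint instead of 50 teams each writing their own check.

```python
# Illustrative phrase list only; a production system would use a classifier.
ADVICE_PATTERNS = ("you should buy", "you should sell", "we recommend buying")

def violates_no_advice_rule(text: str) -> bool:
    lowered = text.lower()
    return any(pattern in lowered for pattern in ADVICE_PATTERNS)

def apply_guardrails(candidate_answer: str) -> str:
    """Shared horizontal chokepoint every agent's output passes through."""
    if violates_no_advice_rule(candidate_answer):
        return "I can't provide investment advice."
    return candidate_answer
```

Centralizing the check means a policy update (say, a new prohibited pattern) ships once and immediately covers every team's agents, which also aids testability and transparency.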

Current Research Agent Architecture Example

A research agent at Bloomberg today exhibits a semi-agentic architecture [00:18:24]. It processes user queries, understands session context, identifies needed information, and dispatches to relevant tools [00:18:28]. Components like query understanding and answer generation are factored out as their own agents [00:18:39]. Non-optional guard rails are implemented at multiple points, ensuring no autonomy in critical areas, and it builds upon years of traditional and modern data management techniques [00:18:52].
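The flow just described can be sketched end to end, with all component names invented for illustration: query understanding and answer dispatch are factored into separate functions, and guard rails run unconditionally at both the input and output, with no autonomy over whether they fire.

```python
def input_guardrail(query: str) -> None:
    """Non-optional: always runs before any agent logic."""
    if not query.strip():
        raise ValueError("empty query")

def understand_query(query: str, session: list[str]) -> dict:
    """Factored-out agent: track session context and pick a tool."""
    session.append(query)
    tool = "earnings_summaries" if "earnings" in query.lower() else "search"
    return {"tool": tool, "query": query}

def dispatch(intent: dict) -> str:
    tools = {
        "earnings_summaries": lambda q: f"summary for: {q}",
        "search": lambda q: f"search results for: {q}",
    }
    return tools[intent["tool"]](intent["query"])

def output_guardrail(answer: str) -> str:
    """Non-optional: always runs before the answer reaches the user."""
    banned = {"buy", "sell"}
    if any(word in banned for word in answer.lower().split()):
        return "response withheld by guard rail"
    return answer

def research_agent(query: str, session: list[str]) -> str:
    input_guardrail(query)
    intent = understand_query(query, session)
    answer = dispatch(intent)
    return output_guardrail(answer)
```

The "semi-agentic" character shows up in the shape of the code: the agent chooses among tools in the middle, but the guard rails at either end are fixed control flow it cannot skip.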