From: aidotengineer

Bloomberg, a fintech company with over 40 years of history, has been investing in AI for approximately 15 to 16 years [00:00:44]. The company built its own large language model in 2022, but after the emergence of ChatGPT it pivoted its strategy to building on top of open-weight and open-source models [00:01:06].

The company’s AI efforts are organized as a specialized group reporting to the Global Head of Engineering [00:01:42]. This group collaborates closely with its data counterparts, product teams, and the CTO in cross-functional settings [00:01:49]. Bloomberg’s decades of aggregating and enhancing financial data are a significant asset: the company handles 400 billion ticks of structured data and over a billion unstructured messages daily, on top of more than 40 years of historical data [00:04:36].

Building Generative AI Products in Finance

Bloomberg has been building products using generative AI, along with more agentic tools, for roughly 12 to 16 months [00:02:16].

Following the framing of “cognitive architectures for language agents,” the company distinguishes “tools,” which have limited autonomy, from “agents,” which are more autonomous, possess memory, and can evolve [00:03:17].

Non-Negotiable Product Principles

Given its role in finance, Bloomberg’s products adhere to several non-negotiable principles regardless of AI integration, including factuality and never offering financial advice [00:06:21].

These principles inform the design of its financial AI tools and the strategies its teams follow when deploying AI agents.

Use Case: Supporting Research Analysts

One primary focus for generative AI applications is supporting financial research analysts [00:05:12]. Research analysts are typically experts in specific sectors like AI, semiconductors, or electric vehicles [00:05:27]. Their daily activities include:

  • Search and discovery [00:05:40]
  • Summarization (especially of unstructured data) [00:05:42]
  • Working with structured data and analytics [00:05:46]
  • Communication with colleagues [00:05:52]
  • Building models, which requires data normalization and programming [00:06:00]

Earnings Call Summaries

In 2023, Bloomberg identified an opportunity to use AI to summarize public company earnings calls [00:06:54]. During earnings season, many such calls occur daily [00:07:17]. The goal was to generate transcripts and then use AI to answer common questions of interest to analysts within specific sectors, allowing them to quickly assess whether a deeper dive is needed [00:07:30].
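The talk does not show implementation details; as a rough sketch of this pattern, one could prompt a model with a fixed list of sector questions against each transcript. All names below (the OpenAI client, MODEL, SECTOR_QUESTIONS) are hypothetical stand-ins, not Bloomberg’s actual stack:

```python
# Minimal sketch: answer a sector's standard questions from an earnings-call
# transcript. All names here are hypothetical stand-ins.
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4o-mini"  # placeholder; Bloomberg builds on open-weight models

SECTOR_QUESTIONS = {
    "semiconductors": [
        "What guidance was given for next quarter's revenue?",
        "Were there any comments on fab capacity or supply constraints?",
    ],
}

def summarize_call(transcript: str, sector: str) -> dict[str, str]:
    """Answer each of the sector's standard questions from the transcript."""
    answers = {}
    for question in SECTOR_QUESTIONS[sector]:
        response = client.chat.completions.create(
            model=MODEL,
            messages=[
                {"role": "system",
                 "content": "Answer strictly from the transcript. "
                            "If the transcript does not say, reply 'not discussed'."},
                {"role": "user",
                 "content": f"Transcript:\n{transcript}\n\nQuestion: {question}"},
            ],
        )
        answers[question] = response.choices[0].message.content
    return answers
```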

Developing this product involved significant work on:

  • Performance: Initial out-of-the-box performance for precision, accuracy, and factuality was not high [00:08:11].
  • MLOps: Extensive work on remediation workflows and circuit breakers was necessary, since published summaries have outsized impact and cannot contain errors (a minimal sketch of such a circuit breaker follows this list) [00:08:21].
  • Monitoring: Constant monitoring, remediation, and CI/CD processes ensure summaries become more accurate over time [00:08:38].
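As a minimal sketch of what such a circuit breaker might look like, with every name a hypothetical stand-in: if the rolling failure rate of recent summaries exceeds a threshold, automatic publication halts and summaries are routed to human remediation.

```python
# Circuit-breaker sketch; all names are hypothetical stand-ins for
# Bloomberg's internal checks and publication hooks.
from collections import deque

def factuality_score(summary: str) -> float:
    """Stub for an automated factuality check (e.g., claim verification)."""
    return 1.0

def publish(summary: str) -> None: ...
def queue_for_review(summary: str) -> None: ...

class SummaryCircuitBreaker:
    """Trips when the rolling failure rate of recent summaries gets too high."""
    def __init__(self, window: int = 50, max_failure_rate: float = 0.02):
        self.recent = deque(maxlen=window)  # rolling pass/fail record
        self.max_failure_rate = max_failure_rate

    def record(self, passed: bool) -> None:
        self.recent.append(passed)

    def tripped(self) -> bool:
        if not self.recent:
            return False
        return self.recent.count(False) / len(self.recent) > self.max_failure_rate

def maybe_publish(summary: str, breaker: SummaryCircuitBreaker) -> None:
    passed = factuality_score(summary) >= 0.9
    breaker.record(passed)
    if passed and not breaker.tripped():
        publish(summary)
    else:
        queue_for_review(summary)  # remediation: humans fix before release
```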

Challenges in Scaling Generative AI

Addressing Fragility and Stochasticity

When building agentic architectures, especially compositions of LLMs, errors can multiply, leading to fragile behavior [00:11:25]: if, say, each of five chained calls is 95% reliable, the pipeline as a whole succeeds only about 77% of the time (0.95^5 ≈ 0.77). Unlike traditional machine learning models with well-defined input and output distributions, LLMs introduce more stochasticity [00:10:47].

For instance, an agent tasked with retrieving structured data like US CPI for the last five quarters might dispatch to a tool that returns incorrect data due to a minor input error (e.g., monthly vs. quarterly data) [00:13:31]. If a downstream workflow only displays the answer without the underlying table, it becomes difficult to catch such compounding errors [00:14:04].
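One downstream defense is to validate what a tool returns against what was actually asked, before the data reaches answer generation. A minimal sketch, with hypothetical request and response types:

```python
# Hypothetical downstream check: verify a tool's structured-data response
# matches the request before it reaches answer generation.
from dataclasses import dataclass

@dataclass
class SeriesRequest:
    series: str       # e.g., "US_CPI"
    periodicity: str  # "monthly" | "quarterly"
    points: int       # e.g., last 5 quarters

@dataclass
class SeriesResponse:
    series: str
    periodicity: str
    values: list[float]

def validate_response(req: SeriesRequest, resp: SeriesResponse) -> None:
    """Fail loudly instead of letting a silent mismatch compound downstream."""
    if resp.series != req.series:
        raise ValueError(f"wrong series: asked {req.series}, got {resp.series}")
    if resp.periodicity != req.periodicity:
        # The monthly-vs-quarterly slip from the example above is caught here.
        raise ValueError(f"asked {req.periodicity}, got {resp.periodicity}")
    if len(resp.values) != req.points:
        raise ValueError(f"asked {req.points} points, got {len(resp.values)}")
```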

To mitigate this, Bloomberg implements guardrails [00:09:09]: non-optional, coded checks that apply to every agent [00:09:20]. For example, Bloomberg does not offer financial advice, so any query asking for investment advice trips a guardrail [00:09:12]. The broader approach is to assume upstream systems will be fragile and evolving, and to build safety checks downstream to ensure resilience [00:14:23]. This lets individual agents evolve faster without extensive “handshake signals” between teams every time an improvement is made [00:15:00].
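As a minimal sketch of such a non-optional check (the talk does not show Bloomberg’s implementation, which would presumably use a trained classifier rather than the deliberately naive keyword heuristic here):

```python
# Guardrail sketch. A production system would use a trained classifier;
# the keyword heuristic is a deliberately naive stand-in.
ADVICE_MARKERS = ("should i buy", "should i sell", "what stock", "invest in")

def is_financial_advice(query: str) -> bool:
    q = query.lower()
    return any(marker in q for marker in ADVICE_MARKERS)

def guarded_answer(query: str, answer_fn) -> str:
    """Non-optional gate: every agent's answer path goes through this check."""
    if is_financial_advice(query):
        return "I can't provide investment advice."
    return answer_fn(query)
```

Because the check lives outside any single agent, one team can own and harden it for everyone, which anticipates the horizontal factoring described in the next section.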

Organizational Structure for Scaling

Scaling generative AI also requires rethinking organizational structure, moving beyond traditional machine learning team factorizations [00:15:40].

  • Initial Phase: In the beginning, when product design is uncertain and fast iteration is key, it’s easier to use vertically aligned teams (often called “full-stack” or “product teams”) that share code, data, and models [00:16:46]. This allows teams to quickly build and figure things out [00:16:58].
  • Maturity Phase: Once a product or agent’s use case and capabilities are well understood, and multiple agents are being built, the organization can move toward a more horizontally aligned structure [00:17:10]. This enables optimization, higher performance, lower cost, better testability, and transparency [00:17:29]. For example, guardrails are handled horizontally, so that each of 50 teams does not have to independently figure out how to refuse thinly veiled requests for financial advice [00:17:41]. This shift also involves breaking down monolithic agents into smaller, more specialized pieces [00:18:12].

Current Agentic Architecture for Research Analysts

Bloomberg’s current architecture for knowledge agents in financial research is “semi-agentic” [00:08:57]: the company does not yet trust agents with complete autonomy, so some pieces are autonomous while others are not [00:09:03].

For a research analyst, the agent workflow involves the following steps (a minimal end-to-end sketch follows the list):

  1. Query Understanding: An agent deeply understands the user’s query, session context, and what information is needed to answer it [00:18:28]. This is factored out as its own agent and reflected in the organizational structure [00:18:39].
  2. Tool Dispatch: The agent figures out which domain to dispatch the query to and uses a tool (with an NLP front end) to fetch the data [00:13:38].
  3. Answer Generation: A separate agent handles answer generation, with strict rigor around what constitutes a well-formed answer [00:18:43].
  4. Guardrails: Non-optional guardrails are called at multiple points in the process to ensure compliance with principles like factuality and avoiding financial advice [00:18:52].
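Read as a pipeline, the workflow might look like the sketch below; every function is a hypothetical stand-in for a separately owned agent, with the guardrail invoked at both the query and answer stages:

```python
# Semi-agentic pipeline sketch. Each step stands in for a separately owned
# agent; all names and signatures are hypothetical.
TOOLS = {"structured_data": lambda intent: [271.0, 273.4, 275.1, 277.0, 279.2]}

def check_guardrails(text: str) -> None:
    """Non-optional checks (factuality, no financial advice); stubbed here."""
    if "should i buy" in text.lower():
        raise ValueError("guardrail tripped: financial advice refused")

def understand_query(query: str, session: dict) -> dict:
    """Agent 1: work out what is being asked, using session context."""
    return {"domain": "structured_data", "intent": query, "session": session}

def dispatch(plan: dict) -> list[float]:
    """Agent 2: route to the right domain tool and fetch the data."""
    return TOOLS[plan["domain"]](plan["intent"])

def generate_answer(plan: dict, data: list[float]) -> str:
    """Agent 3: compose a well-formed answer, showing the underlying data."""
    return f"{plan['intent']} -> {data}"

def answer(query: str, session: dict) -> str:
    check_guardrails(query)              # guardrail: before any work happens
    plan = understand_query(query, session)
    data = dispatch(plan)
    response = generate_answer(plan, data)
    check_guardrails(response)           # guardrail: before delivery
    return response
```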

This architecture leverages years of traditional and modern data wrangling, including the evolution from sparse to dense and hybrid indices [00:18:59].
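To make the sparse-to-dense-to-hybrid progression concrete, here is a toy scoring sketch (not Bloomberg’s implementation): a keyword-overlap score stands in for a BM25-style sparse index, cosine similarity over embeddings stands in for a dense index, and the two are fused by a weighted sum.

```python
# Toy hybrid retrieval: combine a sparse keyword score (stand-in for BM25)
# with a dense embedding similarity via a weighted sum.
import numpy as np

def sparse_score(query: str, doc: str) -> float:
    """Term-overlap stand-in for a real BM25/sparse index score."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / max(len(q), 1)

def dense_score(q_vec: np.ndarray, d_vec: np.ndarray) -> float:
    """Cosine similarity between (hypothetical) learned embeddings."""
    return float(q_vec @ d_vec / (np.linalg.norm(q_vec) * np.linalg.norm(d_vec)))

def hybrid_score(query: str, doc: str,
                 q_vec: np.ndarray, d_vec: np.ndarray,
                 alpha: float = 0.5) -> float:
    # alpha interpolates between pure sparse (1.0) and pure dense (0.0) ranking.
    return alpha * sparse_score(query, doc) + (1 - alpha) * dense_score(q_vec, d_vec)
```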