From: aidotengineer
Bloomberg, a fintech company with over 40 years of history, has been investing in AI for approximately 15 to 16 years [00:00:44]. It initially built its own large language model in 2022 but, after the emergence of ChatGPT, pivoted its strategy to build on top of open-weight and open-source models [00:01:06].
The company’s AI efforts are organized as a specialized group reporting to the Global Head of Engineering [00:01:42]. This group collaborates closely with their data counterparts, product teams, and the CTO in cross-functional settings [00:01:49]. Bloomberg’s extensive data aggregation and enhancement for fintech capabilities are a significant asset, handling 400 billion ticks of structured data and over a billion unstructured messages daily, with more than 40 years of historical data [00:04:36].
Building Generative AI Products in Finance
Bloomberg has been building products using generative AI and more agentic tools for 12 to 16 months [00:02:16].
The company distinguishes the two along a spectrum of autonomy, in the spirit of the "cognitive architectures for language agents" framing: "tools" sit at the less autonomous end, while "agents" are more autonomous, possess memory, and can evolve [00:03:17].
Non-Negotiable Product Principles
Given its role in finance, Bloomberg’s products adhere to several non-negotiable principles, regardless of AI integration [00:06:21]:
- Precision [00:06:25]
- Comprehensiveness [00:06:25]
- Speed [00:06:27]
- Throughput [00:06:27]
- Availability [00:06:27]
- Protection of contributor and client data [00:06:30]
- Transparency [00:06:34]
These principles drive the design of Bloomberg's financial AI tools and frame the challenges the company encounters when deploying AI agents.
Use Case: Supporting Research Analysts
One primary focus for generative AI applications is supporting financial research analysts [00:05:12]. Research analysts are typically experts in specific sectors like AI, semiconductors, or electric vehicles [00:05:27]. Their daily activities include:
- Search and discovery [00:05:40]
- Summarization (especially of unstructured data) [00:05:42]
- Working with structured data and analytics [00:05:46]
- Communication with colleagues [00:05:52]
- Building models, which requires data normalization and programming [00:06:00]
Earnings Call Summaries
In 2023, Bloomberg identified an opportunity to use AI to summarize public company earnings calls [00:06:54]. During earnings season, many such calls occur daily [00:07:17]. The goal was to generate transcripts and then use AI to answer common questions of interest to analysts within specific sectors, allowing them to quickly assess whether a deeper dive is needed [00:07:30].
Developing this product involved significant work on:
- Performance: Initial out-of-the-box performance for precision, accuracy, and factuality was not high [00:08:11].
- MLOps: Extensive work on remediation workflows and circuit breakers was necessary, as published summaries cannot contain errors due to their outsized impact [00:08:21].
- Monitoring: Constant monitoring, remediation, and CI/CD processes ensure summaries become more accurate over time [00:08:38].
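The remediation-and-circuit-breaker idea above can be sketched in a few lines. This is an illustrative Python sketch, not Bloomberg's implementation: it assumes a hypothetical automated quality score per summary and halts publication when too many recent summaries fail checks, routing them to human remediation instead.

```python
from collections import deque

class SummaryCircuitBreaker:
    """Trips when too many recent summaries fail automated checks."""

    def __init__(self, window: int = 20, max_failures: int = 3):
        self.recent = deque(maxlen=window)  # rolling window of pass/fail
        self.max_failures = max_failures

    def record(self, passed_checks: bool) -> None:
        self.recent.append(passed_checks)

    @property
    def open(self) -> bool:
        """When open, nothing is published until humans intervene."""
        return sum(1 for ok in self.recent if not ok) >= self.max_failures

def publish(summary: str, quality_score: float,
            breaker: SummaryCircuitBreaker, threshold: float = 0.9) -> bool:
    """Publish only if this summary passes checks and the breaker is closed."""
    passed = quality_score >= threshold
    breaker.record(passed)
    if breaker.open or not passed:
        return False  # route to the remediation workflow instead
    return True
```

The rolling window means accuracy can improve over time: as remediated fixes ship and failures age out of the window, publication resumes automatically.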
Challenges in Scaling Generative AI
Addressing Fragility and Stochasticity
When building agentic architectures, especially compositions of LLMs, errors can multiply, leading to fragile behavior [00:11:25]. Unlike traditional machine learning models with well-defined input/output distributions, LLMs introduce more stochasticity [00:10:47].
For instance, an agent tasked with retrieving structured data like US CPI for the last five quarters might dispatch to a tool that returns incorrect data due to a minor input error (e.g., monthly vs. quarterly data) [00:13:31]. If a downstream workflow only displays the answer without the underlying table, it becomes difficult to catch such compounding errors [00:14:04].
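The CPI example suggests a concrete defensive pattern: validate a tool's output against what the query actually asked for, rather than trusting it blindly. The sketch below is hypothetical (the `SeriesResult` shape and the numeric values are illustrative, not real CPI data), showing a downstream check that catches a monthly-vs-quarterly mismatch before it can compound.

```python
from dataclasses import dataclass

@dataclass
class SeriesResult:
    """Output of a hypothetical structured-data tool."""
    series_id: str
    frequency: str                     # e.g. "monthly" or "quarterly"
    points: list[tuple[str, float]]    # (period, value) pairs

def validate_series(result: SeriesResult, expected_freq: str,
                    expected_points: int) -> SeriesResult:
    """Reject tool output that silently drifted from the request."""
    if result.frequency != expected_freq:
        raise ValueError(
            f"tool returned {result.frequency} data, expected {expected_freq}"
        )
    if len(result.points) < expected_points:
        raise ValueError("tool returned too few data points")
    return result

# A correct quarterly response passes (values are made up for illustration)...
quarterly = SeriesResult("US_CPI", "quarterly",
                         [("Q1", 1.0), ("Q2", 1.1), ("Q3", 1.2),
                          ("Q4", 1.3), ("Q5", 1.4)])
validate_series(quarterly, "quarterly", 5)

# ...while a monthly series dispatched by mistake is caught before display.
monthly = SeriesResult("US_CPI", "monthly", [("2024-01", 1.0)])
try:
    validate_series(monthly, "quarterly", 5)
except ValueError as err:
    print(err)
```

Because the check lives downstream, it still works when the upstream tool evolves or misbehaves, which is exactly the resilience posture described next.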
To mitigate this, Bloomberg implements “guard rails” [00:09:09]. These are non-optional, coded checks for any agent [00:09:20]. For example, Bloomberg doesn’t offer financial advice, so any query asking for investment advice triggers a guardrail [00:09:12]. The approach is to assume upstream systems will be fragile and evolving, building in safety checks downstream to ensure resilience [00:14:23]. This allows individual agents to evolve faster without extensive “handshake signals” between teams every time an improvement is made [00:15:00].
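A non-optional guardrail of this kind can be as simple as a coded check that runs before any agent sees the query. The sketch below is a minimal illustration, not Bloomberg's actual system: the patterns and refusal message are invented, and a production guardrail would be far more robust than keyword matching.

```python
import re
from typing import Callable, Optional

# Hypothetical patterns for thinly veiled requests for investment advice.
ADVICE_PATTERNS = [
    r"\bshould\s+i\s+(buy|sell|hold)\b",
    r"\bis\s+\w+\s+a\s+good\s+investment\b",
    r"\bwhat\s+stock\s+should\b",
]

def advice_guardrail(query: str) -> Optional[str]:
    """Return a refusal if the query asks for investment advice, else None."""
    lowered = query.lower()
    for pattern in ADVICE_PATTERNS:
        if re.search(pattern, lowered):
            return "I can provide data and analysis, but not investment advice."
    return None

def run_agent(query: str, agent: Callable[[str], str]) -> str:
    # The guardrail is coded in and non-optional for every agent,
    # not left to the agent's own discretion.
    refusal = advice_guardrail(query)
    if refusal is not None:
        return refusal
    return agent(query)
```

Factoring the check out of the agents themselves is what lets a horizontal team own it once, rather than having every product team re-derive it.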
Organizational Structure for Scaling
Scaling generative AI also requires rethinking organizational structure, moving beyond traditional machine learning team factorizations [00:15:40].
- Initial Phase: In the beginning, when product design is uncertain and fast iteration is key, it’s easier to use vertically aligned teams (often called “full-stack” or “product teams”) that share code, data, and models [00:16:46]. This allows teams to quickly build and figure things out [00:16:58].
- Maturity Phase: Once a product or agent’s use case and capabilities are well-understood, and multiple agents are being built, the organization can move towards a more horizontally aligned structure [00:17:10]. This allows for optimization, increased performance, reduced cost, better testability, and transparency [00:17:29]. For example, guard rails are handled horizontally, preventing each of 50 teams from independently figuring out how to refuse thinly veiled financial advice inputs [00:17:41]. This involves breaking down monolithic agents into smaller, more specialized pieces [00:18:12].
Current Agentic Architecture for Research Analysts
Bloomberg’s current architecture for knowledge agents in financial research is “semi-agentic” [00:08:57]. There isn’t full trust for complete autonomy, so some pieces are autonomous while others are not [00:09:03].
For a research analyst, the agent workflow involves:
- Query Understanding: An agent deeply understands the user’s query, session context, and what information is needed to answer it [00:18:28]. This is factored out as its own agent and reflected in the organizational structure [00:18:39].
- Tool Dispatch: The agent figures out which domain to dispatch the query to and uses a tool (with an NLP front end) to fetch the data [00:13:38].
- Answer Generation: A separate agent handles answer generation, with strict rigor around what constitutes a well-formed answer [00:18:43].
- Guard Rails: Non-optional guard rails are called at multiple points in the process to ensure compliance with principles like factuality and avoiding financial advice [00:18:52].
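The workflow above can be sketched as a simple composition. All function names here are illustrative stand-ins for the separately owned agents, wired so that guardrails run at fixed points (on the raw query and on the generated answer) regardless of what the agents in between do.

```python
from typing import Callable, Optional

def research_pipeline(query: str,
                      understand: Callable[[str], dict],
                      dispatch: Callable[[dict], dict],
                      answer: Callable[[dict], str],
                      guardrails: list[Callable[[str], Optional[str]]]) -> str:
    """Semi-agentic flow: autonomous steps bracketed by coded checks."""
    # Non-optional guardrails on the incoming query...
    for guard in guardrails:
        refusal = guard(query)
        if refusal:
            return refusal
    intent = understand(query)    # query understanding: its own agent
    evidence = dispatch(intent)   # tool dispatch via an NLP front end
    draft = answer(evidence)      # answer generation: a separate agent
    # ...and again on the answer before it reaches the user.
    for guard in guardrails:
        refusal = guard(draft)
        if refusal:
            return refusal
    return draft
```

Because each stage is a swappable function behind a stable interface, an individual agent can evolve without renegotiating handshakes with every other team, while the guardrails stay fixed.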
This architecture leverages years of traditional and modern data wrangling, including the evolution from sparse to dense and hybrid retrieval indices [00:18:59].