AI tools in financial research and due diligence

From: aidotengineer

Brightwave, led by founder and CEO Mike Conover, develops a research agent designed to digest large corpora of content in the financial domain [00:00:19]. The technology addresses hard tasks such as quickly gaining conviction in competitive deal processes, spotting critical risk factors, and analyzing market trends at both the sector and individual ticker level [00:00:27].

The Challenge of Manual Financial Research

Financial professionals, particularly junior analysts, often face immense pressure to perform impossible tasks under extremely tight deadlines [00:01:34]. Examples of these demanding tasks include:

Due Diligence: Stepping into data rooms with thousands of pages of content pre-term sheet and needing to quickly assess risks and opportunities [00:00:31].
Earnings Season Analysis: Mutual fund analysts covering 80-120 names must process earnings calls, transcripts, and filings to understand what is happening in the market [00:00:47].
Confirmatory Diligence: Reviewing hundreds of vendor contracts to identify early termination clauses or thematic negotiation patterns across an entire portfolio [00:01:08].

Such tasks are “frankly not a human-level intelligence task” [00:01:21], leading to significant human cost and stress when performed manually [00:02:04].

The Role of AI in Finance Workflows

The integration of AI into finance workflows draws parallels to the early adoption of spreadsheets [00:02:19]. Before computational spreadsheets, accountants manually “ran the numbers,” a cognitively demanding, important, and time-intensive job [00:02:30]. Today, no one wants that manual job because tools have allowed for a substantial increase in the sophistication of thought applied to problems, enabling more effective and efficient work [00:02:44].

Similarly, systems like Brightwave and other knowledge agents are capable of digesting vast volumes of content and performing meaningful work that accelerates efficiency and time-to-value by orders of magnitude [00:03:03].

Brightwave’s Approach and Design Philosophy

Brightwave aims to build a high-fidelity research agent [00:03:33]. A core design problem is how to reveal the thought process of an AI that has considered thousands of pages of content in a useful and legible way, as the final form factor for such products is still evolving beyond simple chat interfaces [00:03:40].

Addressing Model Limitations

Non-reasoning models often perform “greedy local search,” leading to fidelity issues [00:04:10]. For example, if a model has a 5-10% error rate when extracting organizations from an article and these calls are chained, errors compound: the probability that every call in the chain is correct shrinks exponentially with the number of calls [00:04:24]. Winning systems will perform end-to-end Reinforcement Learning (RL) over tool use calls, where the results of API calls feed into the sequence of decisions being optimized, so that the system can make locally suboptimal decisions in order to reach globally optimal outputs [00:04:36]. However, intelligently leveraging tools to achieve global optimality remains an open research problem [00:05:04].
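The compounding effect can be made concrete with a small calculation. This is a minimal sketch that assumes each call fails independently with the same probability, which is a simplification not stated in the talk; the function name is illustrative.

```python
def chain_error_probability(per_step_error: float, steps: int) -> float:
    """Probability that at least one of `steps` chained calls is wrong.

    Assumes every call fails independently with the same probability.
    """
    if not 0.0 <= per_step_error <= 1.0:
        raise ValueError("per_step_error must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    # The whole chain is correct only if every single step is correct.
    return 1.0 - (1.0 - per_step_error) ** steps


if __name__ == "__main__":
    for per_step_error in (0.05, 0.10):
        for steps in (1, 5, 10, 20):
            failure = chain_error_probability(per_step_error, steps)
            print(f"error/step={per_step_error:.0%} steps={steps:>2} "
                  f"chain failure={failure:.1%}")
```

Under these assumptions, a ten-step chain fails roughly 40% of the time at a 5% per-step error rate and roughly 65% of the time at 10%.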

Product Building and User Experience

While the “bitter lesson” suggests that more data, more compute, and better models will dominate [00:05:23], the reality for product development today involves being “circumspect about what is the scope of behaviors that the system the agent is going to engage in” [00:05:41]. This constrains model complexity, reducing the likelihood of degenerate outputs [00:05:48].

Users are generally not expected to become prompting experts, as developing such a skill can take over a thousand hours [00:07:22]. Therefore, verticalized product workflows that provide scaffolding to orchestrate these systems and specify user intent are likely to be enduring [00:07:31].

Archetypal Design Patterns for Autonomous Agents

To design effective autonomous agents, it’s crucial to mimic the human decision-making process [00:08:00]. This involves decomposing complex tasks into steps:

Assess Relevant Documents: Like an analyst looking for public market comparables in SEC filings or earnings call transcripts [00:08:12].
Distill Findings: Extracting information that substantiates hypotheses or investment theses [00:08:32]. Intermediate notes, or “thinking out loud,” are extremely useful at this stage [00:08:55].
Enrich and Error Correct Findings:
- Models can be asked to verify accuracy, e.g., “is this factually entailed by this document?” or “is this an organization?” [00:09:22].
- Performing this validation as a secondary call is often more powerful than within a Chain of Thought JSON, as models can be “primed to be credulous” otherwise [00:09:39].
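The secondary-call pattern in the last step can be sketched as follows. Everything here is hypothetical: `ModelFn` stands for any function that sends a prompt to a language model and returns its reply, and `toy_model` is a trivial substring-matching stand-in so the example runs without a real model. It is not Brightwave's implementation.

```python
from typing import Callable, List

# Hypothetical interface: prompt text in, model reply out.
ModelFn = Callable[[str], str]


def verify_finding(model: ModelFn, finding: str, passage: str) -> bool:
    """Ask, in a separate call, whether the passage entails the finding."""
    prompt = (
        "Answer YES or NO only.\n"
        f"Passage:\n{passage}\n\n"
        f"Claim:\n{finding}\n\n"
        "Is this claim factually entailed by the passage?"
    )
    return model(prompt).strip().upper().startswith("YES")


def filter_findings(model: ModelFn, findings: List[str], passage: str) -> List[str]:
    """Keep only the findings that survive the independent check."""
    return [f for f in findings if verify_finding(model, f, passage)]


def toy_model(prompt: str) -> str:
    """Stand-in for a real model: says YES if the claim appears in the passage."""
    passage = prompt.split("Passage:\n", 1)[1].split("\n\nClaim:\n", 1)[0]
    claim = prompt.split("\n\nClaim:\n", 1)[1].split("\n\nIs this claim", 1)[0]
    return "YES" if claim.lower() in passage.lower() else "NO"


if __name__ == "__main__":
    passage = "Acme Corp acquired Widget Ltd in 2021."
    extracted = ["Acme Corp acquired Widget Ltd", "Acme Corp went bankrupt"]
    print(filter_findings(toy_model, extracted, passage))
```

The point of the design is that the verifier sees only the passage and the claim, not the reasoning that produced the claim, so it is less likely to be primed to agree.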

Through this process, the system synthesizes disparate fact patterns across many documents into a coherent narrative [00:09:55]. Human oversight is extremely important, allowing users to nudge the model with directives or pull interesting threads, leveraging non-digitized information or specific insights [00:10:04].

Applying the Unix philosophy, it’s beneficial to think of these systems not as anthropomorphized agents (e.g., “portfolio manager agent”) but as simple tools that do one thing and work well together, with text as the universal interface [00:10:48]. This approach maintains flexibility as the compute graph’s design needs change [00:10:54].
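One way to picture the Unix-style approach is a set of small functions that each take text and return text, composed into a pipeline. The sketch below is illustrative only; the tool names and the `pipe` helper are assumptions, and in a real system some of the tools would wrap model calls.

```python
from functools import reduce
from typing import Callable

# Every tool has the same shape: text in, text out.
TextTool = Callable[[str], str]


def pipe(*tools: TextTool) -> TextTool:
    """Compose tools left to right, like a shell pipeline."""
    def run(text: str) -> str:
        return reduce(lambda acc, tool: tool(acc), tools, text)
    return run


def keep_lines_mentioning(keyword: str) -> TextTool:
    """Build a tool that keeps only lines containing the keyword."""
    def tool(text: str) -> str:
        return "\n".join(
            line for line in text.splitlines() if keyword.lower() in line.lower()
        )
    return tool


def dedupe_lines(text: str) -> str:
    """Drop repeated lines while preserving first-seen order."""
    seen = set()
    kept = []
    for line in text.splitlines():
        if line not in seen:
            seen.add(line)
            kept.append(line)
    return "\n".join(kept)


def as_bullets(text: str) -> str:
    """Format each non-empty line as a bullet."""
    return "\n".join(f"* {line}" for line in text.splitlines() if line.strip())


if __name__ == "__main__":
    notes = "Capex rose 12%.\nRevenue was flat.\nCapex rose 12%."
    capex_summary = pipe(keep_lines_mentioning("capex"), dedupe_lines, as_bullets)
    print(capex_summary(notes))
```

Because each stage only agrees on text, a stage can be replaced or reordered without touching the others, which is the flexibility the passage above describes.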

Challenges and Considerations for AI Systems

The Latency Trap

The “latency trap” refers to the impact of the feedback loop on a user’s mental model [00:11:59]. The user’s mental model develops from the difference between what they expected a report to contain and what the system actually produced, which acts as the “loss” [00:12:39]. If the feedback loop is long (e.g., 8-20 minutes), users cannot perform many iterations in a day, which slows the development of their facility with the system and product [00:12:49].
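A rough back-of-the-envelope sketch shows how quickly long latencies eat into a day. The eight-hour working day and the five minutes spent reviewing each result are assumptions chosen for illustration; neither figure comes from the talk.

```python
def iterations_per_day(
    latency_minutes: float,
    review_minutes: float = 5.0,     # assumed time to read and react to a result
    working_minutes: float = 480.0,  # assumed eight-hour day
) -> int:
    """Number of complete request/review cycles that fit in one day."""
    if latency_minutes < 0 or review_minutes < 0 or working_minutes < 0:
        raise ValueError("durations must be non-negative")
    cycle = latency_minutes + review_minutes
    if cycle == 0:
        raise ValueError("a cycle must take some time")
    return int(working_minutes // cycle)


if __name__ == "__main__":
    for latency in (0.5, 8, 20):
        print(f"latency={latency} min -> {iterations_per_day(latency)} iterations/day")
```

With these assumed numbers, an 8-minute loop allows 36 iterations per day and a 20-minute loop allows 19, compared with 87 for a 30-second loop.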

Synthesis Limitations

While models boast large context windows (e.g., 100,000 tokens), producing very long (e.g., 50,000 token) coherent and novel responses is difficult in practice [00:13:10]. This is because instruction tuning datasets typically have characteristic output lengths; it’s hard to write 50,000 coherent words as a human demonstration [00:13:31].

This implies a “compression problem” where a large input context window is compressed into a shorter output [00:13:50]. Higher quality and more information-dense outputs can be achieved by decomposing research instructions into multiple sub-themes, allowing the model to be more granular and specific [00:14:27].
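The decomposition idea can be sketched as one model call per sub-theme, with the results assembled afterwards. This is a hypothetical illustration: `ModelFn` stands for any prompt-in, text-out model call, and the prompt wording and function names are assumptions rather than anything described in the talk.

```python
from typing import Callable, Dict, List

# Hypothetical interface: prompt text in, model reply out.
ModelFn = Callable[[str], str]


def research_by_subtheme(
    model: ModelFn, question: str, subthemes: List[str], context: str
) -> Dict[str, str]:
    """Run one focused call per sub-theme instead of one broad call."""
    sections: Dict[str, str] = {}
    for theme in subthemes:
        prompt = (
            f"Sub-theme: {theme}\n"
            f"Overall question: {question}\n"
            "Using only the source material below, write a short, specific "
            "section on this sub-theme.\n\n"
            f"{context}"
        )
        sections[theme] = model(prompt)
    return sections


def assemble_report(question: str, sections: Dict[str, str]) -> str:
    """Join the per-theme sections into a single plain-text report."""
    parts = [question]
    for theme, body in sections.items():
        parts.append(f"{theme}\n{body}")
    return "\n\n".join(parts)


if __name__ == "__main__":
    # Stub model that echoes the first prompt line, so the sketch runs offline.
    stub = lambda prompt: "notes on " + prompt.splitlines()[0]
    sections = research_by_subtheme(
        stub, "How is the company positioned?", ["capex", "litigation"], "source text"
    )
    print(assemble_report("How is the company positioned?", sections))
```

Each call compresses the same input into a narrower output, so the model has room to be specific about one theme at a time.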

Furthermore, the presence of recombinative reasoning demonstrations in instruction tuning and post-training corpuses is low [00:14:43]. It’s easy for models to internalize a fixed corpus and generate variations (e.g., new epilogues for The Great Gatsby), but true synthesis involves weaving together disparate fact patterns from multiple documents, which is challenging [00:15:05].

Managing Complex Real-World Situations

Current models face limitations in managing complex real-world situations, such as:

Temporality: Understanding the sequence and impact of events, like how pro forma financial statements change after a merger or acquisition [00:15:50].
Addendums: Propagating evidentiary passages with metadata that contextualizes their importance and relation to other evidence, especially for contract addendums [00:16:07].
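One possible shape for passages that carry this kind of metadata is sketched below. The field names (`effective_date`, `supersedes`) and the `in_force` helper are assumptions made for illustration, not a description of how Brightwave stores evidence.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Passage:
    passage_id: str
    document: str
    text: str
    effective_date: date
    supersedes: Optional[str] = None  # id of the passage this one amends, if any


def in_force(passages: List[Passage], as_of: date) -> List[Passage]:
    """Return passages effective on `as_of` that no later addendum has replaced."""
    active = [p for p in passages if p.effective_date <= as_of]
    superseded = {p.supersedes for p in active if p.supersedes is not None}
    return sorted(
        (p for p in active if p.passage_id not in superseded),
        key=lambda p: p.effective_date,
    )


if __name__ == "__main__":
    original = Passage(
        "c1", "vendor_contract", "Termination requires 90 days notice.",
        date(2022, 1, 1),
    )
    addendum = Passage(
        "a1", "addendum_1", "Termination requires 30 days notice.",
        date(2023, 6, 1), supersedes="c1",
    )
    for day in (date(2022, 12, 31), date(2024, 1, 1)):
        print(day, [p.text for p in in_force([original, addendum], day)])
```

Keeping the date and the supersession link attached to each passage lets downstream steps answer "what applied at this point in time" rather than treating every passage as equally current.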

Brightwave Product Features and UI/UX

Brightwave aims to reveal the AI’s thought process through an interface that acts more like a “surface” [00:16:40]. Key features include:

Details on Demand: Users can click on a citation to get additional context about the document and understand what the model was thinking [00:17:40].
Structured Interactive Outputs: Allows users to “pull the thread” on a specific point, like inquiring about rising capital expenditure (capex) [00:17:52].
Text Interrogation: Users can highlight any passage of text and ask for more information or implications, similar to how OpenAI’s Canvas allows increasing the reading level of a passage [00:18:00].
Audit Trail: The system treats its discovered findings as a high-dimensional data structure, with the report being one view [00:18:45]. Users need to be able to “turn over that cube” to see the “receipts” or audit trail of the system’s analysis, by clicking into documents or viewing laid-out findings like a fundraising timeline or ongoing litigation [00:18:59].
“Magnifying Glass for Text”: This allows analysts to drill in and get additional details on demand for specific items that catch their attention, such as patent litigation or a critical supply chain disruption [00:19:26].
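The idea of findings as a data structure with the report as just one view can be sketched with a flat list of records that is regrouped on demand. The `Finding` fields and helper names are hypothetical and chosen for illustration, and the sample records are invented.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Finding:
    claim: str
    theme: str     # e.g. "litigation" or "fundraising"
    document: str  # where the evidence came from
    passage: str   # the supporting text, i.e. the "receipt"


def view_by(findings: List[Finding], dimension: str) -> Dict[str, List[Finding]]:
    """Group the same findings along a chosen dimension (one face of the cube)."""
    grouped: Dict[str, List[Finding]] = defaultdict(list)
    for finding in findings:
        grouped[getattr(finding, dimension)].append(finding)
    return dict(grouped)


def receipts(findings: List[Finding], claim: str) -> List[str]:
    """Return the source passages behind a given claim."""
    return [f"{f.document}: {f.passage}" for f in findings if f.claim == claim]


if __name__ == "__main__":
    findings = [
        Finding("Patent suit pending", "litigation", "10-K", "The company is a defendant in..."),
        Finding("Series B closed", "fundraising", "press_release", "The company raised..."),
        Finding("Patent suit pending", "litigation", "court_filing", "Complaint filed..."),
    ]
    print({theme: len(items) for theme, items in view_by(findings, "theme").items()})
    print(receipts(findings, "Patent suit pending"))
```

A report, a timeline, and a per-document audit trail are then different groupings of the same records rather than separate artifacts.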

The final form factor for this class of products is still evolving, representing an interesting design problem [00:19:47].
