From: redpointai
The landscape of enterprise AI adoption is evolving rapidly, with several deployment models emerging [00:00:46]. Some approaches lean toward pure consulting, others toward out-of-the-box products; a middle ground is expected to win out in the long term, since enterprises will need some degree of support to integrate this new, complex technology [00:01:19].
The Need for Custom AI Models
AI agents require extensive access to enterprise data—such as emails, chat, calls, CRM, ERP, and HR software—to effectively drive automation [00:01:58]. This presents unique challenges:
- Privacy Concerns: This level of access raises significant privacy issues, which are more pronounced for AI agents than for other types of enterprise software [00:02:27].
- Diverse Software Ecosystems: Each company utilizes a unique combination of software, necessitating custom setup to bring all relevant context together and integrate it into the AI model [00:02:43] [00:02:57].
While general models are becoming extraordinary, there remains a fundamental need for custom models to address specific business or domain contexts [00:10:52] [00:11:02]. Data not readily available on the web, such as manufacturing data, customer transactions, or detailed personal health records, necessitates custom model development to close this knowledge gap [00:11:15] [00:11:29].
Cohere’s Approach to Custom Models
Cohere, for example, partners with organizations that possess this proprietary data to create custom models accessible only to them, enabling deep specialization within those domains [00:11:43] [00:11:47].
The company recognized that customers were repeatedly building similar applications on top of its models [00:18:23]. Because these applications were often built by internal AI teams rather than product teams, they frequently lacked user-friendliness [00:18:48].
This led to the development of “North,” an application platform designed to solve the entire deployment problem for enterprises [00:18:57]. North aims to provide a consumer-grade product experience that is:
- Intuitive and low-latency [00:19:06].
- Customizable in UI, branding, data connections, and tool access [00:19:17].
- Capable of integrating third-party models, like Llama fine-tunes, if desired [00:19:30].
Cohere’s strategy is to be vertically integrated, optimizing its generative model, Command, for specific enterprise use cases such as interacting with ERP and CRM systems [00:21:00] [00:21:10]. This deep integration between model development and customer needs is considered crucial for product quality [00:21:31].
Data Strategies for Custom Models
Human evaluation remains the gold standard for assessing model usefulness [00:13:03] [00:13:15]. However, generating training data, especially for specialized domains like medicine, is cost-prohibitive with human labor alone [00:13:47].
The ability of general models to “chitchat” and converse has unlocked significant potential for synthetic data generation [00:14:14] [00:14:16]. This allows for training models in specific domains like medicine using a much smaller pool of human data, then expanding it through synthetic data generation [00:14:24].
For domains with verifiable outcomes, such as code and math, it is easier to filter synthetic data by checking results [00:14:49]. Currently, the overwhelming majority of data Cohere generates for new models is synthetic [00:15:11] [00:15:18].
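To make the filtering idea concrete, here is a minimal sketch for a verifiable domain (arithmetic). The `sample_candidate` stub stands in for a model call, and only candidates that pass a mechanical check enter the dataset; all names are illustrative and do not reflect Cohere’s actual pipeline.

```python
# Minimal sketch of synthetic-data filtering in a verifiable domain
# (arithmetic). `sample_candidate` is a stand-in for a model call; the
# verifier recomputes the answer, so only correct pairs survive into
# the training set.
import random

def sample_candidate(problem: str) -> str:
    """Hypothetical generator: proposes an answer, sometimes wrong."""
    truth = eval(problem)  # safe here: problems are constructed locally
    return str(truth + random.choice([0, 0, 0, 1, -1]))

def verify(problem: str, candidate: str) -> bool:
    """Verifiable outcome: recompute the result and compare."""
    return str(eval(problem)) == candidate

def build_synthetic_set(n: int) -> list[tuple[str, str]]:
    dataset = []
    while len(dataset) < n:
        a, b = random.randint(1, 99), random.randint(1, 99)
        problem = f"{a} + {b}"
        candidate = sample_candidate(problem)
        if verify(problem, candidate):  # the filter: keep only checked pairs
            dataset.append((problem, candidate))
    return dataset

print(build_synthetic_set(3))
```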
Strategic Uses of AI in Enterprises
Current enterprise use cases for generative AI with product-market fit fall into two broad groups:
- Vertical Applications: Examples include making note-taking and form-filling easier for doctors by passively listening to doctor-patient interactions [00:04:57].
- General Categories:
  - Customer Support: The technology and the need are aligned, leading to rapid adoption across verticals like telco, healthcare, and financial services [00:05:24] [00:05:31].
  - Research Augmentation: Agents can perform months’ worth of research in hours, significantly boosting productivity for knowledge workers in fields like banking and wealth management [00:05:47] [00:05:51]. This capability allows models to read vastly more information than humans and provide robust, cited research for auditing (a sketch of this pattern follows the list) [00:06:42] [00:06:50].
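As a rough illustration of that cited-research pattern, the sketch below keeps a citation attached to every extracted claim so the output can be audited. The `Source`/`Claim` types and the `extract` stub are assumptions for illustration, not any particular product’s API.

```python
# Hedged sketch of the cited-research pattern: every claim in the output
# keeps a pointer to its source so a human can audit the result.
from dataclasses import dataclass

@dataclass
class Source:
    url: str
    text: str

@dataclass
class Claim:
    statement: str
    citation: str  # URL of the supporting source

def extract(source: Source) -> Claim:
    """Stand-in for a model call that pulls out a key finding."""
    return Claim(statement=source.text[:80], citation=source.url)

def research(sources: list[Source]) -> list[Claim]:
    # An agent can read far more sources than a human could; each claim
    # carries its citation, so the finished report is auditable.
    return [extract(s) for s in sources]

report = research([Source("https://example.com/q3", "Revenue grew 12% year over year...")])
print(report[0].citation)
```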
Reasoning models, which can dynamically allocate computational effort based on problem complexity, represent a significant advancement [00:08:00]. They have unlocked the ability for models to reflect on and understand why initial attempts failed, enabling them to find alternative paths to solutions [00:16:47]. This has transformed previously impossible tasks into solvable ones [00:16:24] [00:16:36].
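One way to picture this reflect-and-retry behavior is a loop that feeds each failure back into the next attempt. This is a hedged sketch, not a specific vendor API: `attempt` and `check` stand in for a model call and a task-specific verifier, and the round budget is the knob for allocating more compute to harder problems.

```python
# Illustrative reflect-and-retry loop: attempt, check, and on failure feed
# the error back so the next attempt can take a different path. Easy tasks
# return in round one; hard tasks consume more of the compute budget.
from typing import Callable, Optional

def solve_with_reflection(
    task: str,
    attempt: Callable[[str, list[str]], str],
    check: Callable[[str], Optional[str]],  # error message, or None on success
    max_rounds: int = 4,
) -> Optional[str]:
    failures: list[str] = []  # accumulated notes on why earlier attempts failed
    for _ in range(max_rounds):
        candidate = attempt(task, failures)
        error = check(candidate)
        if error is None:
            return candidate      # verified solution found
        failures.append(error)    # "reflect": record the failure mode
    return None                   # compute budget exhausted
```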
Looking ahead, deep research-style use cases are ready for prime time and are expected to be integrated into nearly every enterprise [00:23:42]. Additionally, mundane back-office tasks in finance, legal, and sales are expected to come online as automation infrastructure like North becomes more widespread [00:24:22] [00:24:57].
Future of Model Specialization
There is a debate over whether the path leads to “one model to rule them all” or to a world of specialized models [00:10:19] [00:10:24]. While general models have become powerful, custom models remain vital for capturing domain-specific information not found on the web [00:10:52] [00:11:02].
It is unlikely that every single team within an organization will have its own fine-tuned model [00:12:17]. Instead, organizations might have a handful of models [00:12:15].
Initially, enterprises may adopt a federated approach, with different teams purchasing specialized applications [00:26:37]. However, this will likely lead to a consolidation phase, driven by the unsustainable maintenance burden of integrating disparate applications and data sources [00:27:00] [00:27:09]. The long-term goal is a single platform plugged into everything, capable of accomplishing diverse automation objectives [00:27:10].
New generations of foundation models are expected to emerge for specific domains, such as biology, chemistry, and material science [00:32:13]. These will require substantial capital investment to leverage specialized, siloed data [00:33:00] [00:34:43].
Impact of Continued Model Improvement
A key missing capability in current models is the ability to learn from experience and user interaction [00:08:46] [00:08:58]. This “learning from experience” through continuous feedback is expected to unlock significant potential [00:44:46]. If models could retain feedback across interactions, users would be more invested, as the models would grow with them, learning preferences and becoming more personalized [00:46:05]. This might involve storing interaction history in a queryable database, always available as context to the model [00:45:48].
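A minimal sketch of that queryable-history idea follows, using SQLite purely for illustration; the schema and keyword-match retrieval are assumptions, and a production system might use an embedding index instead.

```python
# Minimal sketch of the queryable-history idea: persist feedback from past
# interactions and retrieve the relevant pieces as context for new requests.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE feedback (user_id TEXT, note TEXT)")

def remember(user_id: str, note: str) -> None:
    """Persist a preference or correction the user gave the model."""
    db.execute("INSERT INTO feedback VALUES (?, ?)", (user_id, note))

def recall(user_id: str, topic: str) -> list[str]:
    """Pull prior feedback relevant to the current request."""
    rows = db.execute(
        "SELECT note FROM feedback WHERE user_id = ? AND note LIKE ?",
        (user_id, f"%{topic}%"),
    ).fetchall()
    return [r[0] for r in rows]

remember("ada", "prefers summaries as bullet points")
# Prepend recalled notes to the prompt so the model "grows with" the user.
print(recall("ada", "summaries"))
```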
The “scale is all you need” hypothesis is breaking down, with diminishing returns on capital and compute [00:09:21] [00:09:26]. The future will require smarter and more creative approaches to achieve the next level of technological advancement [00:09:35]. While test-time compute still demands significant computational resources, the increasing abundance and decreasing cost of compute, along with diverse hardware options, are positive trends for the industry [00:39:06] [00:39:23] [00:39:53].