From: redpointai

Organizations face a crucial decision when deploying AI models: train their own, finetune existing ones, or rely on prompt engineering alone [00:00:09]. The optimal approach depends on the specific use case, data availability, and desired performance.

The AI Model Development Journey

The journey of adopting AI models for an enterprise is often iterative, starting small and gradually scaling up based on justified return on investment (ROI) [00:09:01].

The recommended path generally begins with:

  1. Prompting an existing model [00:08:21]. This acts as a litmus test to see if AI is suitable for the use case [00:08:31].
  2. Retrieval Augmented Generation (RAG) [00:09:27]. This involves bringing specific enterprise data to bear, as generic models won’t have inherent knowledge of internal data [00:09:31] (see the retrieval sketch after this list).
  3. Finetuning [00:09:50]. If initial value is demonstrated, finetuning can “bake in” more knowledge, offering better quality in a smaller package and potentially reducing inference costs [00:09:56] (a data-preparation sketch also follows the list).
  4. Continued Pre-training [00:10:03]. This involves further training an existing model on a large corpus of domain-specific data.
  5. Pre-training from scratch [00:10:07]. This is the most significant undertaking and is “not for the faint of heart” [00:10:08].
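
To make steps 1 and 2 concrete, here is a minimal sketch of prompting plus retrieval. Everything in it is an illustrative assumption: the toy document corpus, the bag-of-words relevance score, and the call_llm stub (in practice the retriever would use learned embeddings and the stub would call a hosted model).

```python
# Minimal prompting + RAG sketch. The corpus, the scoring function, and
# the call_llm stub are illustrative assumptions, not a vendor API.
from collections import Counter
import math

DOCS = [  # stand-in for internal enterprise documents
    "Refunds over $500 require manager approval within 48 hours.",
    "Support tickets are triaged by severity: P0, P1, P2.",
    "Employee travel must be booked through the internal portal.",
]

def score(query: str, doc: str) -> float:
    """Toy relevance score: cosine similarity over word counts."""
    q, d = Counter(query.lower().split()), Counter(doc.lower().split())
    dot = sum(q[w] * d[w] for w in q)
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(sum(v * v for v in d.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k most relevant documents for the query."""
    return sorted(DOCS, key=lambda d: score(query, d), reverse=True)[:k]

def call_llm(prompt: str) -> str:
    """Stub: in practice, this would call a hosted or self-hosted model."""
    return f"[model response to {len(prompt)} chars of prompt]"

question = "Who has to approve a $600 refund?"
context = "\n".join(retrieve(question))
print(call_llm(f"Answer using only this context:\n{context}\n\nQ: {question}"))
```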
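
Step 3 mostly begins as a data problem: harvesting the prompts and answers that worked in production into supervised pairs. A sketch, assuming a JSONL prompt/completion layout (a common convention; the exact schema depends on your training stack):

```python
# Sketch of preparing finetuning data from logged interactions.
# The examples are hypothetical; check the schema your stack expects.
import json

logged_interactions = [
    {"question": "Who approves a $600 refund?",
     "answer": "A manager must approve refunds over $500 within 48 hours."},
    {"question": "How are support tickets prioritized?",
     "answer": "Tickets are triaged by severity: P0, P1, then P2."},
]

with open("train.jsonl", "w") as f:
    for ex in logged_interactions:
        f.write(json.dumps({"prompt": ex["question"],
                            "completion": ex["answer"]}) + "\n")
```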

Each step up the chain requires a greater upfront investment, but it can significantly improve the cost-quality trade-off; with sufficient model usage, the investment justifies itself [00:18:30].
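
As a back-of-the-envelope illustration (all numbers hypothetical): if finetuning costs $50,000 up front but yields a smaller model that is cheaper per request, the break-even point is simply the upfront cost divided by the per-request savings.

```python
# Hypothetical break-even calculation for a finetuning investment.
upfront_cost = 50_000.00        # one-time finetuning spend (assumed)
cost_per_request_base = 0.010   # large generic model (assumed)
cost_per_request_tuned = 0.002  # smaller finetuned model (assumed)

savings_per_request = cost_per_request_base - cost_per_request_tuned
break_even_requests = upfront_cost / savings_per_request
print(f"Break-even at {break_even_requests:,.0f} requests")  # 6,250,000
```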

When to Build Custom Models

The decision to train or finetune a custom model becomes relevant in several scenarios:

  • Language Specificity: Off-the-shelf models may not perform well in languages where less training data is available, such as Japanese or Korean, leading companies to build their own models tuned for these languages [00:16:26].
  • Domain Specificity: For highly specialized tasks, like protein modeling, generic language models are insufficient [00:17:13].
  • Performance Requirements: When a model needs to be extremely fast and specific, such as code completion served to millions of users, custom development can create a “Lamborghini of models,” purpose-built for speed and for the task at hand [00:17:27].
  • Cost Optimization: Pre-training or finetuning can lead to a more efficient model for inference, either providing the same quality at a lower cost or higher quality at the same cost [00:18:08].

Challenges and Considerations

Data and Evaluation

A common mistake is waiting for perfect data or perfect evaluations before starting AI development [00:10:17]. The real measure of data quality and evaluation effectiveness is how the model performs in real-world scenarios [00:10:24]. It’s recommended to:

  • Start with minimal data preparation to get the model interacting [00:10:46].
  • Build the “crappiest model” possible and a quick evaluation [00:10:51].
  • Iterate on data, model, and evaluation based on real-world testing [00:11:03].
  • Treat initial evaluations as proxies for the real world; direct human testing (even with a single friend) is crucial [00:11:25].
  • Start with simple evaluations, even just five examples with graded responses, to calibrate LLM judges (see the calibration sketch after this list) [00:13:03].
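
A sketch of that calibration loop (the examples, the human grades, and the llm_judge stub are all assumptions): grade a handful of responses by hand, have the judge grade the same ones, and only trust the judge at scale once agreement is high.

```python
# Calibrating an LLM judge against a tiny hand-graded eval set.
graded_examples = [
    {"q": "Who approves a $600 refund?", "r": "A manager.",       "human_pass": True},
    {"q": "Who approves a $600 refund?", "r": "Nobody.",          "human_pass": False},
    {"q": "How are tickets triaged?",    "r": "By severity.",     "human_pass": True},
    {"q": "How are tickets triaged?",    "r": "Randomly.",        "human_pass": False},
    {"q": "How is travel booked?",       "r": "Internal portal.", "human_pass": True},
]

def llm_judge(question: str, response: str) -> bool:
    """Stub: in practice, prompt a model to grade the response pass/fail."""
    return response.lower() not in ("nobody.", "randomly.")

agreement = sum(llm_judge(e["q"], e["r"]) == e["human_pass"]
                for e in graded_examples) / len(graded_examples)
print(f"Judge agrees with humans on {agreement:.0%} of examples")
# Only lean on the judge for larger eval sets once this rate is acceptable.
```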

Databricks, for example, is developing a product to help users create meaningful evaluation sets quickly, aiming for dozens of examples in an afternoon, rather than relying solely on synthetic data generation [00:13:50].

Cost-Quality Trade-off

Scaling models is a game of “incredible investment” [00:36:28]. While large companies like Meta invest heavily in foundation models, other organizations should focus on the remaining gaps in the ecosystem that determine whether customers succeed [00:37:06]. These include finetuning, RAG, and compound AI systems that connect different pieces of technology [00:38:05].
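
A compound AI system in this sense is mostly orchestration code around separate components. A minimal sketch, with every component a hypothetical stub:

```python
# Sketch of a compound AI system: retriever, generator, and verifier
# chained together. Each component here is a hypothetical stub.

def retrieve(question: str) -> str:
    return "Refunds over $500 require manager approval."  # stub retriever

def generate(question: str, context: str) -> str:
    return "A manager must approve it."                   # stub generator

def verify(answer: str, context: str) -> bool:
    return "manager" in answer.lower()                    # stub checker

def answer_pipeline(question: str) -> str:
    context = retrieve(question)
    draft = generate(question, context)
    if not verify(draft, context):
        return "Escalating to a human: answer failed verification."
    return draft

print(answer_pipeline("Who approves a $600 refund?"))
```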

New finetuning techniques are emerging to address common challenges, such as working with fragmented data where key pieces expected for traditional finetuning might be missing [01:03:32].

Future of Model Development

The field is in a phase of experimentation with new products and applications [00:46:04]. The focus is on providing customers with maximum choice at minimal cost and helping them select the best options and deploy them reliably [00:32:23]. This approach acknowledges that “it’s not AutoML” but aims to simplify the process of navigating many potential model configurations [00:31:30].

For a specific application, once product-market fit is achieved, it is rarely necessary to “go back to the drawing board” on fundamental model architecture; the focus shifts to optimizing quality and cost [00:19:28].