From: redpointai

Finetuning AI models involves adapting a pre-trained model to a specific task or dataset. This process can significantly improve model performance and efficiency for particular use cases [05:50:00].

When to Finetune vs. Use Base Models

For developers and enterprises, deciding whether to use a base model (like GPT-3.5 or GPT-4) or to finetune is a key consideration [04:46:00].

  • GPT-3.5 Turbo vs. GPT-4: GPT-3.5 Turbo is highly capable for many use cases, but it may require more effort to craft a precise prompt [05:07:00]. GPT-4 generally performs better when prompts involve more than three or four instructions [05:21:00]; a comparison sketch follows this list. Following the DevDay price drops, GPT-4 has become more affordable, leading many to use it directly rather than invest in extra prompt engineering [05:33:00].
  • Performance: Some early customers using finetuned GPT-3.5, combined with prompt engineering, have achieved GPT-4 level performance [05:56:00].
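
As an illustration of the trade-off above, the sketch below sends the same multi-instruction prompt to both models so their adherence can be compared on a concrete use case. It assumes the OpenAI Python SDK (v1.x) and an OPENAI_API_KEY in the environment; the model names and the instruction set are placeholders, not recommendations.

```python
# Minimal sketch: the same multi-instruction system prompt sent to GPT-3.5 Turbo
# and GPT-4 so the two can be compared on a given task. Illustrative only.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SYSTEM_PROMPT = (
    "You are a support assistant. "
    "1) Answer in two sentences or fewer. "
    "2) Cite the relevant policy section. "
    "3) Never promise refunds. "
    "4) Escalate billing disputes to a human."
)

def ask(model: str, question: str) -> str:
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content

question = "I was double-charged last month. What should I do?"
for model in ("gpt-3.5-turbo", "gpt-4"):
    print(model, "->", ask(model, question))
```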

Benefits of Finetuning

Finetuning offers several advantages for specific applications:

  • Improved Performance: Finetuning can help models achieve better performance for particular tasks [06:00:00].
  • Token Savings: For example, by finetuning GPT-3.5 for function calling, developers can teach the model the function definitions themselves, so it can “hallucinate” the right function calls without the schemas being sent in every request, eliminating the need to pay for those function tokens with every call [06:50:00]. This can lead to significant cost savings [07:17:00]; a sketch of the approach follows this list.
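
The following is a hedged sketch of that idea: fine-tune GPT-3.5 on examples that already contain the desired function calls, then call the tuned model without resending the function schemas on every request. It assumes the OpenAI Python SDK (v1.x) and the legacy function_call message shape; the exact JSONL schema, the get_weather function, and the model snapshot are illustrative and should be checked against OpenAI’s current fine-tuning documentation.

```python
# Sketch: fine-tune GPT-3.5 on records containing function calls so the tuned
# model learns the signatures instead of reading them from the prompt.
import json
from openai import OpenAI

client = OpenAI()

# One illustrative training record; a real dataset would contain many such lines.
record = {
    "messages": [
        {"role": "user", "content": "What's the weather in Paris?"},
        {
            "role": "assistant",
            "function_call": {
                "name": "get_weather",
                "arguments": json.dumps({"city": "Paris", "unit": "celsius"}),
            },
        },
    ]
}
with open("function_calls.jsonl", "w") as f:
    f.write(json.dumps(record) + "\n")

# Upload the data and start the fine-tuning job.
training_file = client.files.create(
    file=open("function_calls.jsonl", "rb"), purpose="fine-tune"
)
job = client.fine_tuning.jobs.create(
    training_file=training_file.id, model="gpt-3.5-turbo"
)

# Once the job finishes, the tuned model can be called WITHOUT a functions list,
# saving the schema tokens on every request:
# response = client.chat.completions.create(
#     model=job.fine_tuned_model,
#     messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
# )
```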

Challenges of Finetuning

While beneficial, finetuning presents challenges:

  • Data Requirements: Finetuning requires substantial, high-quality data [07:42:00]. For custom models, billions of tokens might be needed [08:49:00]; a sizing sketch follows this list.
  • Expertise: Effective finetuning demands a level of machine learning understanding, and poor data quality can lead to suboptimal results [10:44:00].
  • Cost: Creating custom models can be very expensive, potentially costing millions of dollars [08:17:00].
  • Dependency Concerns: Some enterprises are wary of taking dependencies on venture-backed open-source companies, preferring to rebuild the infrastructure themselves [20:07:07].
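
As a small aid to the data-requirements point above, the sketch below estimates how many tokens a candidate chat-format training file contains before committing to a finetuning or custom-model effort. The file name, record shape, and choice of the cl100k_base tokenizer are assumptions.

```python
# Rough sizing check: count tokens in a JSONL file of chat-format records.
import json
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

def count_tokens(path: str) -> int:
    total = 0
    with open(path) as f:
        for line in f:
            record = json.loads(line)
            for message in record.get("messages", []):
                # Some messages (e.g., function calls) may have no text content.
                total += len(enc.encode(message.get("content") or ""))
    return total

tokens = count_tokens("training_data.jsonl")
print(f"{tokens:,} tokens")  # custom-model projects may require billions
```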

Custom Models: An Advanced Approach

Custom models represent a deeper form of finetuning, often involving OpenAI’s research teams directly assisting companies.

  • Requirements: These models are typically suited for domains where base models lack sufficient training data, such as legal or medical fields, and where companies possess extensive proprietary data [07:44:00].
  • Cost and Accessibility: Custom models are currently very expensive, limiting their accessibility to companies with significant capital expenditure [08:23:00].
  • Expert Support: OpenAI’s research teams provide hands-on assistance to companies training these custom models, addressing the need for world-class machine learning expertise [11:15:15].
  • Comparison to Open Source: Open-source models (like Llama) offer more granular customization and ownership of the model weights [15:52:00], but large proprietary models are expected to remain superior in performance due to their scale and engineering investment [15:03:00]. Open-source models also permit finetuning approaches beyond what OpenAI currently offers as standard, including reinforcement learning from human feedback (RLHF) [16:16:00]; a sketch of this kind of customization follows this list.
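
To ground the open-source comparison, here is a minimal sketch of the kind of granular control open weights allow: loading Llama weights locally and attaching LoRA adapters before training on proprietary data, using the Hugging Face transformers and peft libraries. The model ID, LoRA hyperparameters, and training setup are assumptions for illustration, not anything prescribed in the discussion above.

```python
# Sketch: parameter-efficient finetuning of open weights with LoRA adapters.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_id = "meta-llama/Llama-2-7b-hf"  # assumes access to the gated weights
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Only small adapter matrices are trained; the base weights stay frozen and
# remain fully under the team's control.
lora = LoraConfig(
    r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM"
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()

# From here a standard training loop (e.g., transformers.Trainer or trl's
# SFTTrainer) would run over the proprietary dataset; RLHF-style stages such
# as reward modeling would require additional tooling.
```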

Future of Finetuning and Custom Models

Finetuning on enterprise data and custom-model offerings are expected to evolve in several ways:

  • Accessibility: OpenAI aims to make custom models more accessible and affordable through API offerings [10:24:00].
  • Advanced Techniques: The company anticipates supporting more diverse finetuning and training techniques, such as RLHF [16:07:00].
  • Base Model Improvements: Continued improvements in base models will make them more steerable, which could reduce the need for custom models in some cases, though a need will likely always remain for specialized domains [09:37:00].
  • Efficiency: Custom models may become more compute-efficient by removing unnecessary data from training sets, letting training focus on the data that matters for the target domain [10:02:00].

As the AI ecosystem progresses, the goal is to provide a comprehensive platform where developers can find all necessary tools for building, including finetuning, inference, and various model development steps [00:59:22].