From: redpointai

The ability of AI models to learn from experience and personalize their interactions is a crucial, yet currently missing, property of intelligence [00:08:48]. Humans learn from experience, starting as novices and becoming experts over time through practice and feedback [00:08:50], and AI models should have the same capability to learn from their real-world experience and from human feedback [00:08:56]. Today, however, models often forget past interactions entirely, so the time users spend giving feedback is wasted [00:45:06].

The Need for Learning from Experience

When a model doesn’t remember previous chats or feedback, every new interaction is like starting with a new intern who has forgotten everything [00:45:08], [00:46:27]. This lack of persistent learning makes the user experience frustrating and limits deeper engagement [00:45:10]. If models could learn from feedback and avoid repeating past mistakes, users would be more invested, turning the model from a basic intern into a personalized “me 2.0” that understands their preferences and needs [00:46:05].

While the exact implementation doesn’t fully exist yet, a likely approach involves storing interaction history in a queryable database, making it always available to the model during generation [00:45:47]. This capability is expected to significantly unlock what can be built with AI [00:45:59].
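
The exact mechanism isn't described, but a minimal sketch of the "queryable interaction history" idea might look like the following, assuming a local SQLite store and naive keyword retrieval (the schema, helper names, and retrieval method are illustrative; a production system would more likely use embedding-based search). The point is that feedback persisted once is pulled back into the prompt on every later generation.

```python
# Sketch: persist interactions and feedback, then query them at generation time.
import sqlite3, time

db = sqlite3.connect("interaction_history.db")
db.execute("""CREATE TABLE IF NOT EXISTS interactions (
    ts REAL, user_msg TEXT, model_msg TEXT, feedback TEXT)""")

def remember(user_msg, model_msg, feedback=""):
    """Store one exchange plus any feedback the user gave on it."""
    db.execute("INSERT INTO interactions VALUES (?, ?, ?, ?)",
               (time.time(), user_msg, model_msg, feedback))
    db.commit()

def recall(message, limit=5):
    """Naive per-word keyword match; a stand-in for semantic search over history."""
    rows = []
    for word in set(message.lower().split()):
        rows += db.execute(
            "SELECT user_msg, model_msg, feedback FROM interactions "
            "WHERE lower(user_msg) LIKE ? OR lower(feedback) LIKE ? "
            "ORDER BY ts DESC LIMIT ?",
            (f"%{word}%", f"%{word}%", limit)).fetchall()
    return list(dict.fromkeys(rows))[:limit]  # de-duplicate, keep recency order

def build_prompt(new_message):
    """Make the relevant slice of history available to the model at generation time."""
    memory = "\n".join(
        f"Earlier exchange: user: {u!r} | model: {m!r} | feedback: {f!r}"
        for u, m, f in recall(new_message))
    return f"{memory}\n\nCurrent request: {new_message}"

# Feedback given once should shape every later generation.
remember("Summarize this report.", "Here is a three-page summary...",
         feedback="Too long, I always want a single paragraph.")
print(build_prompt("Summarize this quarterly report."))
```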

Custom and Specialized Models

While general models are extraordinary and synthetic data can considerably close certain gaps, custom models remain important [00:11:56]. They address the fundamental context about specific businesses or domains that is often missing from models trained solely on public web data [00:10:57]. The web contains vast amounts of information about humanity, history, culture, and science, but it lacks specific proprietary data [00:11:04].

Cohere partners with organizations that hold this unique proprietary data to create custom models accessible only to them, making those models highly effective within their specific domains [00:11:43], [00:11:51]. An organization won't need hundreds of models, but a handful of specialized ones may be necessary [00:12:05]. Custom models are relevant when the data type differs from what the pre-trained model has been exposed to, necessitating fine-tuning or even basic pre-training [00:12:23].
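
For a concrete picture of the fine-tuning or continued pre-training path, here is a minimal sketch using an open checkpoint with the Hugging Face transformers library; the checkpoint name, the toy corpus, and the hyperparameters are placeholder assumptions, not a description of Cohere's actual pipeline.

```python
# Sketch: continued pre-training of a causal LM on a small proprietary corpus.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in for any open pre-trained checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Hypothetical in-domain documents; in practice this data never leaves the organization.
domain_texts = [
    "Internal incident report: pump P-114 tripped on low suction pressure...",
    "Claims adjuster notes: policy 88-231, water damage, deductible applied...",
]

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
model.train()
for epoch in range(3):
    for text in domain_texts:
        batch = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
        # Labels equal to input_ids gives the standard next-token (causal LM) loss.
        outputs = model(**batch, labels=batch["input_ids"])
        outputs.loss.backward()
        optimizer.step()
        optimizer.zero_grad()

model.save_pretrained("custom-domain-model")
```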

Impact on Model Development

The field has moved beyond simple RLHF (reinforcement learning from human feedback) data toward expert data labelers who encode reasoning, and toward synthetic data generation [00:12:43], [00:12:50]. Human evaluation remains the gold standard for judging model usefulness, especially in evaluation loops, because humans are the intended users [00:13:03], [00:13:15].
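
As a purely illustrative picture of what such a human evaluation loop can reduce to, the sketch below assumes annotators record pairwise preferences between a baseline and a candidate model, summarized as a win rate; the record format and model names are made up.

```python
# Sketch: summarizing pairwise human preferences as a win rate.
from collections import Counter

# Hypothetical annotation records: which model's answer the human preferred.
annotations = [
    {"prompt": "Explain eligibility for plan B.", "preferred": "model_candidate"},
    {"prompt": "Draft a claim denial letter.",    "preferred": "model_baseline"},
    {"prompt": "Summarize this lab report.",      "preferred": "model_candidate"},
]

counts = Counter(a["preferred"] for a in annotations)
total = sum(counts.values())
win_rate = counts["model_candidate"] / total
print(f"Candidate preferred in {win_rate:.0%} of {total} comparisons")
```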

However, generating human data for specialized domains is prohibitively expensive; teaching a model medicine that way might require something like 100,000 doctors [00:13:47]. The ability of models to "chitchat" unlocked a degree of freedom for synthetic data generation, which can then be applied to specific domains like medicine with a much smaller pool of human experts (e.g., 100 doctors) [00:14:14], [00:14:26]. That trusted human data can seed a thousandfold more synthetic lookalike data [00:14:39]. In verifiable domains like code and math, it is easier to check results and filter the garbage out of synthetic data [00:14:49]. Today, the overwhelming majority of the data Cohere generates for new models is synthetic [00:15:08].
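
Because verifiable domains allow mechanical filtering, a generate-then-verify loop is one plausible shape for this kind of pipeline; in the sketch below the `generate` stub, the seed task, and the loop size are illustrative assumptions rather than Cohere's actual method.

```python
# Sketch: expand one trusted seed task into verified synthetic examples.
import os, subprocess, sys, tempfile, textwrap

def generate(prompt):
    # Placeholder for a real LLM call; returns a canned solution here so the
    # sketch runs end to end.
    return textwrap.dedent("""
        def median(xs):
            xs = sorted(xs)
            n = len(xs)
            mid = n // 2
            return xs[mid] if n % 2 else (xs[mid - 1] + xs[mid]) / 2
    """)

def passes_tests(solution_code, test_code):
    """Run the candidate plus its tests in a subprocess; garbage fails and is dropped."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(solution_code + "\n" + test_code)
        path = f.name
    try:
        result = subprocess.run([sys.executable, path],
                                capture_output=True, timeout=10)
        return result.returncode == 0
    finally:
        os.remove(path)

seed_task = {  # one trusted, human-written task with verifiable tests
    "prompt": "Write a function median(xs) returning the median of a list.",
    "tests": textwrap.dedent("""
        assert median([1, 3, 2]) == 2
        assert median([1, 2, 3, 4]) == 2.5
    """),
}

synthetic_dataset = []
for _ in range(100):  # in practice the expansion is thousandfold per seed
    candidate = generate(seed_task["prompt"])
    if passes_tests(candidate, seed_task["tests"]):
        synthetic_dataset.append({"prompt": seed_task["prompt"],
                                  "completion": candidate})

print(f"Kept {len(synthetic_dataset)} verified synthetic examples")
```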