On its own, a pre-trained large language model (LLM) supports only small-scale, generalized tasks. For an enterprise deploying an LLM to perform a specialized task, such as answering patient inquiries, generic responses aren’t enough.
An enterprise-scale use case requires a model that is fine-tuned on a substantial amount of human-generated data specific to that need in order to create value. Otherwise, enterprises deploying generalized models risk overinvesting in a tool that under-delivers.
While orchestrating the creation of a fine-tuning dataset is costly and complex, enterprises investing heavily in an AI model can’t afford to skip fine-tuning. In fact, 80-90% of enterprise AI projects reportedly fail because they either never reach the proof-of-concept phase or never progress beyond it, suggesting that models aren’t performing as desired.
In this blog post, we explore how LLMs are pre-trained and why fine-tuning is a critical stage in enterprise AI deployment. Let’s dive in.
How LLMs Are Pre-Trained
Large language models are pre-trained on billions of unlabeled and unstructured data points. To make sense of that data, the model has to go through a machine learning process.
Machine learning is commonly divided into three main types:
- Supervised Learning: The model learns from provided examples of inputs and their corresponding outputs, much like a student learning from a textbook with solutions.
- Unsupervised Learning: The model finds patterns or groups within the data on its own, like a child sorting toys without being told which categories to sort them into.
- Reinforcement Learning: The model learns by interacting with an environment and receiving feedback (rewards or penalties) for its actions, similar to teaching a pet new tricks with treats or scolding.
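To make the three approaches a little more concrete, here is a minimal Python sketch of each one on toy data. Every function name and value below is hypothetical and chosen purely for illustration. Real LLM training relies on neural networks at a vastly larger scale, so treat this as an analogy in code rather than a description of any actual training pipeline.

```python
# Toy illustrations only: all names and data here are hypothetical.

def supervised_predict(examples, x):
    """Supervised: answer with the label of the closest labeled example."""
    _, label = min(examples, key=lambda pair: abs(pair[0] - x))
    return label


def unsupervised_groups(values, gap):
    """Unsupervised: start a new group wherever the jump between
    neighboring sorted values is larger than `gap`. No labels are given."""
    ordered = sorted(values)
    if not ordered:
        return []
    groups = [[ordered[0]]]
    for value in ordered[1:]:
        if value - groups[-1][-1] > gap:
            groups.append([value])
        else:
            groups[-1].append(value)
    return groups


def reinforcement_choice(rewards, rounds):
    """Reinforcement: try every action once, then keep repeating the
    action with the best average reward so far. `rewards` stands in
    for the environment's feedback (treat = positive, scolding = negative)."""
    totals = {action: 0.0 for action in rewards}
    counts = {action: 0 for action in rewards}
    for _ in range(rounds):
        untried = [action for action in rewards if counts[action] == 0]
        if untried:
            action = untried[0]
        else:
            action = max(rewards, key=lambda a: totals[a] / counts[a])
        totals[action] += rewards[action]
        counts[action] += 1
    # The action chosen most often is the learned behavior.
    return max(rewards, key=lambda a: counts[a])


print(supervised_predict([(1, "low"), (10, "high")], 8))
print(unsupervised_groups([1, 2, 3, 10, 11, 25], gap=3))
print(reinforcement_choice({"sit": 1.0, "bark": -1.0}, rounds=10))
```

The supervised function needs labeled pairs to work at all, the unsupervised one discovers its groups from the values alone, and the reinforcement one only learns by acting and seeing what feedback comes back.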
For an LLM, the overarching goal of these techniques is to create new data or content that is coherent, realistic, and aligned with the characteristics of the training data. Because the training data comes from such a broad source (the internet), the result is a vanilla model that can only perform generalized tasks.
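One way to picture how a model produces content that mirrors its training data: pre-training is commonly framed as next-word prediction over raw text. The sketch below is a deliberately tiny stand-in that counts which word follows which, then generates text from those counts. The corpus and the function names are made up for illustration, and real LLMs use neural networks rather than simple counts.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count which word follows which in raw text (no human labels needed)."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following


def generate(following, start, length):
    """Extend `start` by repeatedly picking the most frequent next word."""
    output = [start]
    for _ in range(length):
        candidates = following.get(output[-1])
        if not candidates:
            break  # the model has never seen anything follow this word
        output.append(candidates.most_common(1)[0][0])
    return " ".join(output)


# Hypothetical toy corpus; a real pre-training corpus is internet-scale.
corpus = (
    "take one tablet daily . take one capsule daily . take one tablet daily ."
)
model = train_bigrams(corpus)
print(generate(model, "take", 3))
```

The output can only echo patterns that appear in the corpus. That is the same reason a model trained on broad internet text ends up general-purpose: it reproduces what it has seen most often, not what a specific business needs.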
Why Vanilla Models Need to Be Fine-Tuned
Vanilla models on their own can perform small-scale business tasks like content generation. An enterprise-scale task, by contrast, is a complex process that needs specialized outputs from a fine-tuned model at multiple points and requires the safe handling of proprietary data.
Think of a vanilla model as if it were a new hire at your company. It’s unlikely that a new employee would be expected to perform high-value, domain-specific tasks prior to onboarding, training, and on-the-job experience.
Like a new hire, a vanilla model falls short of meeting most enterprises' needs right out of the box for the following reasons:
- Lack of Specialization: Vanilla models are general-purpose and basic. Business problems, however, come with specific challenges and nuances that take specialized knowledge to address effectively.
- Limited Adaptiveness: While it might work well for basic or introductory scenarios, a vanilla model won’t adapt well to diverse or unseen data. Businesses operate in dynamic environments where data can vary widely, and a model’s (and an employee’s) ability to adapt is crucial.
- Lack of Explainability: In business contexts, being able to explain why a specific prediction or classification was made is just as important as the output itself. Stakeholders might need to explain these decisions to others in the organization, customers, or even regulators. Newcomers and vanilla models don’t offer this level of interpretability.
- Compliance Concerns: Especially in sectors like finance and healthcare, models must adhere to industry regulations. Models and people need to be trained compliantly, or else an enterprise could be exposed to the consequences of non-compliance.
Where Generalized Models Go Wrong
Let’s break down a practical example using a realistic business use case. In this scenario, a national pharmacy chain deploys a chatbot built on a vanilla model alone on its website and mobile app.
The chatbot is designed to assist users in understanding their medication, its side effects, interactions with other medications, and general health advice. Because the model hasn’t been fine-tuned, the pharmacy chain runs these risks:
- Vanilla LLMs may be able to provide generalized medical information, but can’t accurately represent specific dosages, contraindications, or recent medical findings that illuminate health risks to patients.
- The model may fail to catch nuance within the context of a user asking about medications or describing symptoms, leading to incorrect or inappropriate medical advice that exposes the company to legal concerns.
- Foundation models are meant for generalized audiences, and will likely lack the sensitivity needed to appropriately interact with a patient in distress.
- A foundation model may be more likely to inadvertently misuse a patient’s health data, harming their privacy and exposing the company to even more legal concerns.
Realizing that such a model threatens to erode patient trust and do more harm than good, the pharmacy chain may reach a point where it abandons the model altogether. Fine-tuning is how enterprises avoid having to limit their use of these powerful tools.
A fine-tuned model, trained on data produced by experts in the medical field, would comfortably handle patient inquiries and provide better, expert-backed answers. Speak to our team to learn more about how we fine-tune LLMs for specialized use cases and support enterprises in deploying AI models for complex business processes.