From: redpointai

Snowflake’s head of AI, Baris Gocken, leads the company’s extensive AI initiatives, including products like Cortex AI, Data Cloud, and the open-source LLM Arctic [00:00:03]. Snowflake’s focus is on building the core infrastructure that supports AI tools in production [00:00:15].

Snowflake’s AI Product Portfolio

Snowflake offers a broad AI product portfolio designed to support various enterprise AI use cases [00:10:21].

Cortex

Cortex is Snowflake’s core inference engine and managed service for running Large Language Models (LLMs) [00:10:29]. It supports Snowflake’s own Arctic LLM, as well as models from providers like Mistral and Meta [00:03:11].

Arctic LLM

Arctic is Snowflake’s own open-source LLM, specifically designed for enterprise needs [00:02:01].

  • Development: The development began around December, driven by customer demand for BI-type experiences using AI, particularly text-to-SQL functionality [00:00:59]. It was developed by a relatively small team of researchers, including founders from DeepSpeed and vLLM, in approximately three to four months [00:01:34].
  • Optimization: Arctic features an innovative architecture that makes it highly efficient for both training and inference. It was built at roughly 1/8 the cost of similar models [00:02:27]. Significant effort also went into refining the “data recipe” to ensure optimal performance [00:02:47].
  • Purpose: While capable of general tasks, Arctic is optimized for SQL co-piloting and high-quality chatbot functionality, rather than being a general-purpose model like GPT-4 [00:02:01], [00:26:17].
  • Usage: Customers leverage Arctic for various use cases, including AI for BI, building chatbots for unstructured data (documents, PDFs), and extracting data from semi-structured text like sales call logs and customer support tickets [00:03:25].

Cortex Analyst

Cortex Analyst is Snowflake’s BI product, designed to answer business intelligence questions through natural language [00:05:12].

  • System Design: It is a complex system combining three to four LLMs working in concert [00:05:14]. It includes “self-healing” mechanisms: it generates SQL, checks its validity, and knows when to ask for clarification or abstain from answering [00:05:25].
  • Quality: Although it achieves 90-95% quality in internal evaluations, it is not yet reliable enough for critical financial reports, so “human-in-the-loop” validation is still necessary [00:05:37]. It also incorporates a feedback loop allowing customers to create “verified queries” to increase confidence [00:06:11].
  • Challenges: The primary difficulty lies in accurately handling complex SQL joins, particularly with large, messy real-world datasets containing tens of thousands of tables and hundreds of thousands of ambiguously named columns [00:06:39]. LLMs may hallucinate column names, get joins wrong, or generate unexecutable queries [00:07:03]. Snowflake consciously prioritizes precision over recall, choosing not to answer all questions to maintain high quality [00:07:55].
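The generate-validate-abstain loop described above can be sketched in a few lines. This is an illustrative toy, not Snowflake’s implementation: the schema, the draft SQL strings, and the naive column-extraction regex are all made up for demonstration.

```python
"""Toy sketch of a "self-healing" text-to-SQL loop: generate SQL, validate it
against the known schema, fall through to another draft on errors, and abstain
rather than return a bad query (prioritizing precision over recall)."""
import re

# Hypothetical schema; a real system would read this from the catalog.
SCHEMA = {"orders": {"order_id", "customer_id", "amount", "created_at"}}

def referenced_columns(sql: str) -> set:
    """Naively extract the column names between SELECT and FROM."""
    m = re.search(r"select\s+(.*?)\s+from", sql, re.IGNORECASE | re.DOTALL)
    if not m:
        return set()
    return {c.strip() for c in m.group(1).split(",")}

def validate(sql: str, table: str) -> list:
    """Return problems, e.g. column names the model hallucinated."""
    cols = SCHEMA.get(table, set())
    return [c for c in referenced_columns(sql) if c not in cols]

def answer(question: str, drafts: list, table: str = "orders"):
    """Try model drafts in order; return the first valid one, else abstain."""
    for sql in drafts:
        if not validate(sql, table):
            return sql  # valid against the schema: safe to execute
    return None  # abstain instead of guessing

# The first draft hallucinates a "revenue" column; the loop recovers.
drafts = ["SELECT revenue FROM orders", "SELECT amount FROM orders"]
print(answer("What is the total order amount?", drafts))
# → SELECT amount FROM orders
```

A production system would add clarification requests and execution checks on top of this schema validation, but the abstain-by-default shape is the same.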

Cortex Search

Cortex Search focuses on high-quality search capabilities, crucial for Retrieval Augmented Generation (RAG) applications [00:10:39].

  • Arctic Embed: Snowflake’s proprietary embedding model, Arctic Embed, is state-of-the-art, a quarter the size of OpenAI’s embedding model, yet achieves higher benchmark scores [00:11:00].
  • Hybrid Search: Cortex Search employs a hybrid approach, combining vector search with lexical keyword search to reduce hallucinations [00:12:14].
  • Use Cases: It supports external RAG applications, internal productivity tools, and modernizing traditional enterprise search [00:12:01].
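Hybrid retrieval of the kind described above can be sketched as a weighted blend of two scores. This is a toy stand-in, not Cortex Search’s implementation: the documents are invented, and the “vector” score is a term-count cosine rather than a real embedding model such as Arctic Embed.

```python
"""Minimal sketch of hybrid retrieval: blend semantic (vector) similarity
with exact lexical overlap, so keyword matches anchor the results."""
import math
from collections import Counter

DOCS = {
    "d1": "snowflake cortex runs llms next to your data",
    "d2": "hybrid search combines vector and keyword retrieval",
    "d3": "arctic is an efficient open source llm",
}

def keyword_score(query: str, doc: str) -> float:
    """Lexical signal: fraction of query terms present in the document."""
    q, d = query.lower().split(), set(doc.lower().split())
    return sum(t in d for t in q) / len(q)

def vector_score(query: str, doc: str) -> float:
    """Toy 'embedding' similarity: cosine over term-count vectors.
    A real system would use an embedding model instead."""
    qv, dv = Counter(query.lower().split()), Counter(doc.lower().split())
    dot = sum(qv[t] * dv[t] for t in qv)
    norm = (math.sqrt(sum(v * v for v in qv.values()))
            * math.sqrt(sum(v * v for v in dv.values())))
    return dot / norm if norm else 0.0

def hybrid_search(query: str, alpha: float = 0.5) -> str:
    """Blend both signals; the lexical term curbs 'semantically close'
    but wrong hits that lead to hallucinated answers."""
    scored = {
        doc_id: alpha * vector_score(query, text)
                + (1 - alpha) * keyword_score(query, text)
        for doc_id, text in DOCS.items()
    }
    return max(scored, key=scored.get)

print(hybrid_search("vector keyword search"))  # → d2
```

The `alpha` weight is the usual tuning knob in such systems: higher favors semantic recall, lower favors exact-match precision.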

Enterprise AI Adoption and Governance

Snowflake’s strategy leverages its existing data platform to facilitate AI adoption by focusing on data governance, security, and integrated solutions [00:13:07].

Data Governance and Security

A major advantage for Snowflake is its established data governance framework [00:13:10]. All Snowflake models run within the Snowflake cloud, right next to the data, ensuring security and compliance [00:16:37].

  • Granular Access Controls: Snowflake’s platform is built with granular access controls from the ground up [00:17:02]. This means AI systems, including search, automatically respect existing data permissions, preventing data leakage and ensuring appropriate responses based on user roles (e.g., an HR chatbot providing different answers based on the inquirer’s access) [00:17:09]. Many customers spend years establishing their data governance, which now directly benefits their AI deployments [00:18:52].
  • Cortex Guard: Using underlying technologies like Llama Guard, Cortex Guard provides an additional layer of security, addressing concerns around unscripted LLM responses and ensuring alignment with company policies and brand [00:27:14].
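The permission-respecting behavior described above (e.g., the HR chatbot example) amounts to filtering retrieval by the asking user’s access rights before the LLM ever sees the data. A minimal sketch, with made-up roles and documents rather than Snowflake’s actual access-control model:

```python
"""Sketch of permission-aware RAG: retrieved documents are filtered by the
user's role, so the prompt sent to the LLM can never contain data the user
is not allowed to read, preventing leakage upstream of the model."""

# Hypothetical documents tagged with the roles allowed to read them.
DOCS = [
    {"id": 1, "text": "Company holiday calendar", "allowed_roles": {"employee", "hr"}},
    {"id": 2, "text": "Individual salary bands", "allowed_roles": {"hr"}},
]

USER_ROLES = {"alice": "hr", "bob": "employee"}

def retrieve(query: str, user: str) -> list:
    """Return only documents the user's role may read; the LLM prompt is
    assembled from this filtered set."""
    role = USER_ROLES[user]
    return [d["text"] for d in DOCS if role in d["allowed_roles"]]

# The same question yields different context depending on who asks.
print(retrieve("compensation", "bob"))    # → ['Company holiday calendar']
print(retrieve("compensation", "alice"))  # → ['Company holiday calendar', 'Individual salary bands']
```

Because the filter runs at retrieval time, existing governance rules carry over to AI answers with no extra prompt engineering.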

Transition to Production

While many enterprise AI use cases are still in the Proof of Concept (POC) phase, there is a clear transition to production for internal productivity and analysis applications [00:15:03]. Widespread external deployments require more confidence, product innovation (e.g., mechanisms for users to check answers), and further technological advancements to mitigate risks like hallucination, especially in regulated industries [00:15:17].

Model Selection and Optimization

Snowflake advises customers to start small with large off-the-shelf models combined with RAG solutions for initial POCs [00:19:23].

  • Fine-tuning: For production systems, fine-tuning smaller models is recommended for latency and cost advantages [00:19:38]. Snowflake makes fine-tuning simple [00:21:10].
  • Custom Models: For companies with large, unique datasets, particularly in regulated industries (e.g., healthcare with specialized language), building custom pre-trained models makes sense [00:19:49]. Snowflake supports customers in this process, leveraging its efficient Arctic development approach [00:19:54].
  • Model Popularity: Customers often select models based on brand recognition rather than solely on capability, though this is changing with better evaluation frameworks becoming available [00:21:36].
  • Cost: While LLMs can be expensive, the rapid decline in costs and the internal-focused nature of current high-volume enterprise AI use cases mean cost is not yet a significant blocker for production deployment [00:25:20].
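The recommended POC starting point, an off-the-shelf model grounded with RAG, can be sketched as simple prompt assembly. Everything here is illustrative: `call_llm` is a placeholder, not a real API, and the instruction wording is an assumption.

```python
"""Minimal sketch of the RAG-first POC pattern: ground a large off-the-shelf
model in retrieved passages before investing in fine-tuning."""

def build_prompt(question: str, passages: list) -> str:
    """Constrain the model to retrieved context to reduce hallucination."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below. Say 'I don't know' otherwise.\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

def call_llm(prompt: str) -> str:
    """Placeholder for a hosted model call (e.g., a Cortex inference call)."""
    return f"(model response to {len(prompt)} chars of prompt)"

passages = ["Arctic was built at roughly 1/8 the cost of similar models."]
prompt = build_prompt("How cheap was Arctic to train?", passages)
print(call_llm(prompt))
```

Once the POC proves out, the same prompt-plus-retrieval harness can be pointed at a smaller fine-tuned model to win back latency and cost.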

Infrastructure and Acquisitions

Snowflake’s AI infrastructure is bolstered by strategic acquisitions and an emphasis on efficient model deployment.

vLLM Integration

Snowflake benefits from having the founders of the vLLM project on staff, which enables robust support for new models [00:22:36]. The team continuously optimizes the inference stack for efficiency, including multi-node inference and fine-tuning, and upstreams these improvements to vLLM [00:23:03]. This ensures that new, large models like Llama 3.1 405B can run efficiently [00:22:56].

Strategic Acquisitions

  • TruEra: Snowflake acquired TruEra, which provides the open-source Trulens platform for LLM evaluation and observability [00:14:10]. This acquisition is crucial for customers to evaluate and monitor their AI systems at scale, especially as they move from POCs to production [00:14:42].
  • Neeva: The acquisition of Neeva was instrumental, bringing in Snowflake’s current CEO and contributing core technology to Cortex Search and its underlying embedding model [00:28:56].

Future of AI at Snowflake

Snowflake aims to be a leading AI platform, enabling users to interact with both structured and unstructured data using natural language [00:34:21].

  • Agents: Snowflake plans to enable agents to plug into the system, recognizing the emerging importance of agentic systems for complex workflows like BI analysis [00:32:47], [00:39:17].
  • Ecosystem Play: Leveraging its “Data Cloud” model, Snowflake makes it easy for customers to integrate and build AI applications on existing data, including third-party datasets from the financial ecosystem, ensuring governance and security [00:34:46].
  • Partnerships: Snowflake partners with model providers (e.g., Mistral, Reka, AI21) and companies offering end-to-end solutions (e.g., Landing AI) to allow applications to run securely within the Snowflake environment, accessing data directly without needing to move it [00:29:33].
  • Internal Use: Snowflake extensively uses LLMs internally for sales call summaries, employee assistants querying internal documents, and optimizing its SQL engine and Marketplace [00:30:20].

Comparison with Databricks

Snowflake differentiates itself by its “single product” ethos, aiming for highly integrated and easy-to-use AI functionalities [00:31:37]. Snowflake notably integrated AI into SQL from the outset with Cortex [00:32:49]. High quality systems like Cortex Analyst and Cortex Search, along with a strong governance foundation, are key differentiators [00:32:04].

Opportunities in AI Infrastructure

While end-to-end platforms are growing in importance, there remain significant opportunities for startups in various aspects of the AI infrastructure stack due to the market’s massive growth and ongoing innovation [00:33:33]. The inference stack, in particular, continues to see rapid advancements leading to cost reductions [00:35:52]. There is a shift towards customers wanting full end-to-end products rather than just building blocks [00:45:30].

Baris Gocken’s Insights

  • Underhyped/Overhyped: Evaluation is currently underhyped, while agents are overhyped in the short term, though they are expected to match the hype in the future [00:40:39].
  • Biggest Surprise: The core challenge of defining the exact problem to solve in text-to-SQL proved to be a critical and informative process [00:41:20].
  • Open vs. Closed Source: Open-source models, particularly from Meta and Mistral, have been highly influential in fostering a diverse ecosystem and providing flexibility for innovation [00:42:11].
  • Exciting AI Startup: Baris is most excited about model startups like Mistral, noting their small, capable teams, rapid development, and effective awareness creation [00:42:49].
  • Personal Application Interest: Baris would build an AI application in the assistance space, focusing on “many small, specific agents” that act on users’ behalf for unique tasks, rather than broad, general-purpose assistants [00:38:29]. This requires nailing a few deep use cases rather than many small ones [00:43:54].
  • Interesting Platform: Platform companies generally present compelling opportunities in AI, as current capabilities are awaiting true platformization [00:44:51].