From: redpointai
Baris Gultekin, Snowflake’s Head of AI, leads the company’s AI initiatives, including products like Cortex AI, the Snowflake data cloud, and the company’s own Large Language Model (LLM), Arctic [00:00:01]. Gultekin’s work focuses on the infrastructure supporting AI tools in production, offering insights into common use cases and the effectiveness of techniques like fine-tuning and Retrieval-Augmented Generation (RAG) [00:14:52].

Arctic LLM: Tailored for Enterprise BI

Snowflake decided to build its own LLM, Arctic, primarily because many customers sought to develop Business Intelligence (BI) type experiences using AI [01:03:00]. A key need was for text-to-SQL capabilities, converting natural language into SQL queries to interact with data effectively [01:12:00]. While existing models were good, they often struggled with the complexities of SQL [01:27:00].
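
To make the text-to-SQL task concrete, here is a minimal sketch; the schema, question, and prompt format are invented for illustration and are not Snowflake’s implementation:

```python
# A toy schema and question, invented for illustration.
schema = """CREATE TABLE orders (
    order_id INT,
    customer_id INT,
    order_total DECIMAL(10, 2),
    order_date DATE
);"""

question = "What was total revenue last month?"

# A text-to-SQL model is prompted with the schema plus the question...
prompt = f"Given this schema:\n{schema}\n\nWrite a SQL query to answer: {question}"

# ...and is expected to emit an executable query such as:
expected_sql = """SELECT SUM(order_total)
FROM orders
WHERE order_date >= DATE_TRUNC('month', DATEADD('month', -1, CURRENT_DATE))
  AND order_date <  DATE_TRUNC('month', CURRENT_DATE);"""
```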

Arctic was developed in roughly three to four months by a relatively small team that included researchers from the DeepSpeed and vLLM projects [01:44:00]. It was designed specifically for enterprise needs, excelling at SQL copilot functionality and high-quality chatbots rather than general tasks like composing poetry [02:01:00]. This focus was achieved through an efficient architecture for both training and inference, which let the model be built at one-eighth the cost of similar models, and through a meticulously crafted “data recipe” [02:25:00].

Snowflake Cortex AI and Key Use Cases

Snowflake Cortex is a managed service that hosts various LLMs, including Arctic, as well as models from Mistral and Meta [03:09:00]. It supports three primary use cases (a usage sketch follows the list):

  • AI for BI: Generating SQL queries from natural language [03:27:00].
  • Chatbots for Unstructured Data: Building chatbots to interact with documents and PDFs [03:32:00].
  • Data Extraction and NLP: Using natural language to extract and process data from semi-structured text like sales call logs or customer support tickets [03:48:00].
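
As a usage sketch, Cortex exposes these models as SQL functions callable from any Snowflake client. The example below uses the snowflake-connector-python package and the SNOWFLAKE.CORTEX.COMPLETE function; the connection parameters and the support_tickets table are placeholders:

```python
import snowflake.connector

# Placeholder credentials; substitute your own account details.
conn = snowflake.connector.connect(
    account="<account>", user="<user>", password="<password>",
    warehouse="<warehouse>", database="<database>", schema="<schema>",
)

# COMPLETE(model, prompt) runs a hosted LLM as a SQL function, so inference
# happens inside Snowflake, next to the data, not in an external service.
cur = conn.cursor()
cur.execute(
    "SELECT SNOWFLAKE.CORTEX.COMPLETE("
    "'snowflake-arctic', "
    "'Summarize this support ticket: ' || ticket_text) "
    "FROM support_tickets LIMIT 5"
)
for (summary,) in cur.fetchall():
    print(summary)
```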

Challenges in SQL Generation and BI Applications

Generating SQL from natural language is particularly challenging, described as an “Iceberg problem” where demos are easy but real-world implementation is messy [04:27:00]. Key challenges include:

  • Complex Data Environments: Enterprise data often involves tens of thousands of tables and hundreds of thousands of columns with inconsistent or “weird names” (e.g., “rev one,” “rev two” for revenue) [04:47:00].
  • Accuracy of Complex Joins: LLMs struggle to correctly identify and implement complex joins, which are unique to each company’s data model [06:37:00]. Semantic understanding of these unique data spaces is a significant hurdle [06:52:00] (a mitigation sketch follows this list).
  • Hallucinations and Errors: Models may hallucinate column names, get joins wrong, or generate non-executable queries [07:03:00]. LLMs are prone to providing answers even when they shouldn’t, leading to potential inaccuracies [07:40:00].
  • Trust in Critical Applications: For BI use cases, especially financial reporting, a 90-95% accuracy rate may not be sufficient for a CFO to fully trust the results [05:43:00].
  • Defining the Problem: The classic machine learning challenge of defining the problem and what “good” looks like remains critical in the LLM world [04:48:00].
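
One common mitigation for cryptic schemas and implicit joins is to hand the model a curated semantic layer (Cortex Analyst takes this approach via a semantic model; the structure below is a simplified invention, not its actual format):

```python
# A simplified, invented semantic-model structure: it maps cryptic physical
# columns to business meanings so the LLM doesn't have to guess what "rev_1"
# is, and spells out joins instead of leaving the model to infer them.
semantic_model = {
    "table": "finance.quarterly_results",
    "columns": {
        "rev_1": "gross revenue, USD",
        "rev_2": "net revenue after returns, USD",
    },
    "joins": [
        {"left": "finance.quarterly_results.cust_id",
         "right": "sales.customers.id"},
    ],
}

def schema_context(model: dict) -> str:
    """Render the semantic model as prompt context for SQL generation."""
    lines = [f"Table: {model['table']}"]
    lines += [f"  {col}: {meaning}" for col, meaning in model["columns"].items()]
    lines += [f"  JOIN {j['left']} = {j['right']}" for j in model["joins"]]
    return "\n".join(lines)

print(schema_context(semantic_model))
```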

Innovations and Solutions in BI

To address these challenges, Snowflake has developed several innovations within Cortex:

  • Cortex Analyst: A system that orchestrates multiple LLMs (three to four) to answer BI questions [05:12:00].
  • Self-Healing Systems: The system generates SQL, checks its validity, and knows when to ask for clarification or abstain from questions it cannot confidently resolve [05:19:00] (a minimal sketch follows this list).
  • Human-in-the-Loop and Verified Queries: To ensure high quality, particularly for critical BI, systems incorporate human oversight. Customers can create “verified queries” which the system can then prioritize, increasing confidence in the results [05:51:00].
  • Precision Over Recall: Snowflake consciously prioritizes precision in its BI applications, choosing not to answer all questions but instead focusing on accuracy and knowing when to seek clarification or decline to answer [07:55:00]. This approach tailors the experience for business users who require high quality, unlike analysts who might accept less precise results for exploration [08:20:00].
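
A minimal sketch of the self-healing pattern, not Cortex Analyst’s actual code: generate_sql is a hypothetical stand-in for the LLM call, and validity is checked by asking Snowflake to compile the query with EXPLAIN before anything runs:

```python
import snowflake.connector

MAX_ATTEMPTS = 3

def generate_sql(question: str, feedback: str | None = None) -> str:
    """Hypothetical stand-in for the LLM call that drafts a SQL query."""
    raise NotImplementedError  # e.g. a call to SNOWFLAKE.CORTEX.COMPLETE

def answer(conn, question: str) -> str | None:
    """Generate SQL, verify it compiles, retry with feedback, else abstain."""
    feedback = None
    for _ in range(MAX_ATTEMPTS):
        sql = generate_sql(question, feedback)
        try:
            # EXPLAIN compiles the query without executing it, catching
            # hallucinated columns and broken joins cheaply.
            conn.cursor().execute(f"EXPLAIN {sql}")
            return sql
        except snowflake.connector.errors.ProgrammingError as err:
            feedback = str(err)  # feed the compiler error back to the model
    return None  # abstain: no answer beats a confidently wrong one
```

Returning None rather than a best guess is exactly the precision-over-recall choice described in the last bullet.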

For data extraction, results are already “pretty good” [08:48:00]. One example is summarizing open-ended employee survey responses, categorizing them, and pulling example quotes, a task that previously took hours [09:09:00]. Snowflake aims to make these processes simple and easy to use, building task-specific functions that minimize the need for complex prompting or data-pipeline work by customers [09:39:00].
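
Task-specific Cortex functions such as SNOWFLAKE.CORTEX.SUMMARIZE and SNOWFLAKE.CORTEX.SENTIMENT package that prompting into single SQL calls; the table and column names below are invented:

```python
# One SQL call per task: no prompt engineering or pipeline code on the
# customer side. Run with any Snowflake client, e.g. cursor.execute(sql).
sql = """
SELECT
    SNOWFLAKE.CORTEX.SUMMARIZE(response_text) AS summary,   -- free-text recap
    SNOWFLAKE.CORTEX.SENTIMENT(response_text) AS sentiment  -- score in [-1, 1]
FROM survey_responses
"""
```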

Data Governance and AI in Snowflake

A significant advantage for Snowflake is its inherent data governance capabilities [13:09:00]. For enterprises, the first step in AI adoption is often ensuring comfort with the platform’s security and data governance policies [13:10:00]. Snowflake’s models run directly within the platform, right next to the customer’s data, offering a strong value proposition [16:37:00].

Snowflake’s granular access controls, built from the ground up, are crucial [17:02:00]. For example, in an HR chatbot, different users receive different answers based on their access permissions, preventing data leakage and ensuring accuracy [17:11:00]. This pre-existing data governance infrastructure is a huge benefit for customers building AI applications [18:50:00].
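
Concretely, per-user answers can ride on standard Snowflake primitives such as row access policies, so any retrieval a chatbot performs only sees rows its caller is entitled to. The DDL below uses real row-access-policy syntax, but the table, policy, and role names are invented:

```python
# Invented names throughout; the DDL shape itself is standard Snowflake.
CREATE_POLICY = """
CREATE OR REPLACE ROW ACCESS POLICY hr_rows AS (employee_id VARCHAR)
RETURNS BOOLEAN ->
    CURRENT_ROLE() = 'HR_ADMIN'        -- HR sees every row
    OR employee_id = CURRENT_USER()    -- everyone else sees only their own
"""

ATTACH_POLICY = """
ALTER TABLE hr_records ADD ROW ACCESS POLICY hr_rows ON (employee_id)
"""
# Once attached, an HR chatbot querying hr_records is filtered per caller,
# so two users asking the same question can legitimately get different answers.
```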

AI Evaluation and Production Readiness

Transitioning from Proofs of Concept (POCs) to production systems is a current focus for enterprises [15:00:00]. Snowflake acquired TruEra, whose open-source TruLens platform provides LLM evaluation and observability, to help customers assess and improve their AI systems at scale [14:06:00].
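
At its simplest, this kind of evaluation wraps each application response and scores it with feedback functions. The harness below is a generic sketch in the spirit of TruLens, not its actual API, and the two metrics are invented placeholders:

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class EvalRecord:
    question: str
    answer: str
    scores: dict[str, float] = field(default_factory=dict)

def evaluate(records: list[EvalRecord],
             feedbacks: dict[str, Callable[[str, str], float]]) -> None:
    """Score every (question, answer) pair with each feedback function."""
    for rec in records:
        for name, fn in feedbacks.items():
            rec.scores[name] = fn(rec.question, rec.answer)

# Invented placeholder metrics; production systems use model-graded feedback
# (groundedness, answer relevance) rather than string heuristics like these.
feedbacks = {
    "non_empty": lambda q, a: 1.0 if a.strip() else 0.0,
    "cites_source": lambda q, a: 1.0 if "[source:" in a else 0.0,
}
```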

Internal productivity use cases are transitioning to production, if slowly; external-facing applications require still more confidence because of risks like hallucination, especially in regulated industries [15:09:00]. Future product innovations are needed to make end users comfortable with AI answers that might not always be right, and to provide mechanisms for checking their validity [15:49:00].

Regarding costs, while everyone is aware that LLMs are expensive, cost has not yet been a major blocker for internal use cases: volumes aren’t massive and prices are decreasing rapidly [25:20:00].

Future of AI at Snowflake

Snowflake’s goal with Arctic is not to build a general-purpose model like GPT-5, but to keep focusing on the specific needs of Snowflake customers, such as SQL generation and RAG quality [26:17:00]. Snowflake also offers flexible options for customers to fine-tune existing models or even train their own custom models, particularly for regulated industries with unique datasets [20:07:00].

A future area of excitement is enabling agents to plug into the system, leveraging Snowflake’s data platform to allow customers, like asset managers, to easily build chatbots that connect to vast financial ecosystems (e.g., S&P, FactSet data) [34:41:00]. This emphasizes the value of having data already consolidated within Snowflake, complete with governance and security guarantees [35:36:00].