From: aidotengineer
Retriever.com aims to be as transformative to the browser as Netscape was at its creation [00:00:30]. The browser often acts as a bottleneck for many workflows [00:00:37]. Many individuals spend hours daily manually copying and pasting information between websites, Google Sheets, or CRMs [00:00:42]. Other common challenges include:
- Offshoring data scraping to third parties, which is expensive and unreliable [00:00:50].
- Setting up RPA Bots that frequently break when website layouts change [00:00:58].
- Data silos, where some data is available only on websites and other data only via APIs, making it a hassle for users to combine the two [00:01:03].
These issues leave much of the data's potential untapped [00:01:14].
Retriever.com: An AI Web Agent Chrome Extension
Retriever addresses these problems with a Chrome extension that acts as an AI web agent [00:01:20]. Users can open a side panel and give the agent tasks to perform autonomously across web pages, including extracting structured data to sheets [00:01:27].
Key Capabilities and Use Cases
Retriever supports a range of workflow automation and augmentation tasks:
Autonomous Task Execution:
- Users can provide natural language prompts, such as “find and follow the latent space podcast page,” and the AI web agent will interact with the page to complete the task autonomously, including filling search fields and interacting with page elements [00:02:02].
Structured Data Extraction:
- A built-in feature allows users to extract data to Google Sheets with prompts like “for every article on the page extract this data and click this export button” [00:02:30]. This approach is highly cost-effective, potentially costing less than a penny per page extraction [00:03:00].
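The talk does not describe how Retriever turns a page into rows, so the following is only a minimal sketch of the general idea: read the page markup, collect one record per article, and export the records as a table. The `ArticleExtractor` class, the `<article>`/`<h2>` page layout, and the CSV output (standing in for a Google Sheets export) are all assumptions made for illustration; in the real product an LLM interprets the prompt rather than a fixed parser.

```python
import csv
import io
from html.parser import HTMLParser


class ArticleExtractor(HTMLParser):
    """Collects one row per <article>: the <h2> title and the first link."""

    def __init__(self):
        super().__init__()
        self.rows = []          # finished rows
        self._current = None    # row being built while inside <article>
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "article":
            self._current = {"title": "", "url": ""}
        elif self._current is not None:
            if tag == "h2":
                self._in_title = True
            elif tag == "a" and not self._current["url"]:
                self._current["url"] = dict(attrs).get("href", "")

    def handle_data(self, data):
        if self._in_title and self._current is not None:
            self._current["title"] += data.strip()

    def handle_endtag(self, tag):
        if tag == "h2":
            self._in_title = False
        elif tag == "article" and self._current is not None:
            self.rows.append(self._current)
            self._current = None


def extract_articles(html: str) -> list[dict]:
    parser = ArticleExtractor()
    parser.feed(html)
    return parser.rows


def rows_to_csv(rows: list[dict]) -> str:
    """Serialize rows as CSV text (a stand-in for a sheet export)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["title", "url"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical page content, used only to exercise the sketch.
SAMPLE_PAGE = """
<article><h2><a href="/posts/1">First post</a></h2><p>Intro</p></article>
<article><h2><a href="/posts/2">Second post</a></h2><p>Intro</p></article>
"""

rows = extract_articles(SAMPLE_PAGE)
print(rows_to_csv(rows))
```

Because the input is page text rather than screenshots, the per-page work is small, which is consistent with the low per-page cost mentioned in the talk.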
Multi-Tab Actions and Extractions:
- Retriever can perform actions across multiple tabs [00:03:10]. It can take a Google Sheet column of URLs, open them, interact with them, and extract data [00:03:15].
- Complex use cases include extracting specific fields from the first five PDFs on an archive search page: the agent breaks the task into subtasks, opens each PDF in a new tab, and processes the tabs simultaneously [00:03:23].
- When comparing product pages (e.g., Amazon products), the agent can automatically determine relevant fields to extract without a specific prompt, allowing for effective product comparison [00:03:56].
- It can even extract URLs, such as image source URLs for visual comparison [00:04:29].
- Actions can be performed on tabs before extraction, such as changing a review sort order from “top reviews” to “most recent” across multiple pages simultaneously, and then extracting the details [00:04:41].
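The multi-tab pattern described above (take a column of URLs, handle each one as its own subtask, then gather the results) can be sketched as a simple fan-out. This is an illustration under stated assumptions, not Retriever's implementation: `FAKE_PAGES`, `process_tab`, and `process_column` are hypothetical names, the "tabs" are dictionary lookups rather than real browser tabs, and a thread pool stands in for whatever scheduling the extension actually uses.

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in for pages that would be open in browser tabs (hypothetical data).
FAKE_PAGES = {
    "https://example.com/a": "Widget A | $10",
    "https://example.com/b": "Widget B | $12",
    "https://example.com/c": "Widget C | $9",
}


def process_tab(url: str) -> dict:
    """One subtask: 'open' a URL and extract fields from its text."""
    text = FAKE_PAGES[url]  # a real agent would read the tab's content here
    name, price = (part.strip() for part in text.split("|"))
    return {"url": url, "name": name, "price": price}


def process_column(urls: list[str], max_tabs: int = 3) -> list[dict]:
    """Run one subtask per URL concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_tabs) as pool:
        return list(pool.map(process_tab, urls))


results = process_column(list(FAKE_PAGES))
for row in results:
    print(row)
```

Keeping each URL as an independent subtask means one failing page does not have to derail the others, which is the same motivation the talk gives for distributing long tasks across tabs.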
Advanced Research and Summarization:
- Users can select multiple design documents and prompt Retriever to “extract key points and summary”; Retriever then processes them and provides a concise summary for each document [00:06:21]. This works across Google Docs, PDFs, and Google Sheets [00:06:51].
- For market research, users can ask Retriever to visit company homepages (e.g., Retriever, Browse AI), navigate to pricing pages, and extract data like company strategy, features, and pricing into a sheet [00:07:13]. This “deep search” feature allows the agent to explore and navigate multiple pages to find requested data [00:07:47]. Retriever is noted as one of the first to implement this deep search [00:07:56].
- More complex financial market research can start from a Google Sheet of stock information: users can ask for data such as the P/E ratio from Yahoo Finance and revenue for the last two years, and have the agent compute new fields such as revenue growth [00:08:31]. The agent can also recover from user errors, for example navigating to Yahoo Finance even when a company’s website URL was entered incorrectly [00:09:15].
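The talk does not spell out how the derived "revenue growth" field is computed. A common definition, assumed here, is year-over-year growth: (current - previous) / previous. The sketch below applies that formula to made-up rows shaped like a sheet with one row per ticker; the tickers, figures, and column names are hypothetical.

```python
def revenue_growth(previous: float, current: float) -> float:
    """Year-over-year growth as a fraction: (current - previous) / previous."""
    if previous == 0:
        raise ValueError("growth is undefined when previous revenue is zero")
    return (current - previous) / previous


# Hypothetical rows, shaped like a sheet with one row per ticker.
rows = [
    {"ticker": "AAA", "revenue_prev": 200.0, "revenue_curr": 250.0},
    {"ticker": "BBB", "revenue_prev": 80.0, "revenue_curr": 60.0},
]

for row in rows:
    # Add the computed column alongside the extracted ones.
    row["revenue_growth"] = revenue_growth(row["revenue_prev"], row["revenue_curr"])
    print(f'{row["ticker"]}: {row["revenue_growth"]:.1%}')
```

For the first row this gives (250 - 200) / 200 = 0.25, i.e. 25% growth; for the second, (60 - 80) / 80 = -0.25, a 25% decline.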
Dynamic Function Calling and Third-Party Integrations:
- A key feature is dynamic function calling, allowing users to define and integrate any API or third-party tool [00:09:39].
- An example given is WhatsApp integration to send messages to customer phone numbers listed in a Google Sheet, enabling automated social communications across platforms like Instagram, Facebook, and WhatsApp [00:10:10]. This shows how an AI web agent can be combined with third-party tools in a single workflow.
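Retriever's function-calling interface is not shown in the talk, so the following is a generic sketch of the pattern rather than its API: user-defined tools are registered by name, the model emits a call as JSON, and a dispatcher looks up the tool and runs it. The names `register_tool`, `dispatch`, and `send_message`, the JSON shape, and the phone number are all hypothetical; the messaging function only returns a string and does not contact any service.

```python
import json
from typing import Callable

# Registry of user-defined tools: name -> (description, callable).
TOOLS: dict[str, tuple[str, Callable[..., str]]] = {}


def register_tool(name: str, description: str):
    """Decorator that adds a function to the registry under `name`."""
    def decorator(func):
        TOOLS[name] = (description, func)
        return func
    return decorator


@register_tool("send_message", "Send a text message to a phone number.")
def send_message(phone: str, text: str) -> str:
    # Hypothetical stand-in: a real tool would call a messaging API here.
    return f"queued message to {phone}: {text}"


def dispatch(call_json: str) -> str:
    """Execute a call of the form {"name": ..., "arguments": {...}}."""
    call = json.loads(call_json)
    name = call["name"]
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    _description, func = TOOLS[name]
    return func(**call["arguments"])


# A call like the one a model might emit for one sheet row (made-up number).
model_output = json.dumps(
    {"name": "send_message",
     "arguments": {"phone": "+15550100", "text": "Your order has shipped."}}
)
print(dispatch(model_output))
```

Because tools are looked up by name at call time, adding a new integration only requires registering another function, which is the extensibility argument made for user-defined function calls.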
Graph Generation:
- Retriever includes a “graph bot” tool, a mini-agent within Retriever Agent Studio, that can generate dynamic data analysis graphs from existing data on the fly [00:11:20]. This leverages LLMs’ capability to represent data in various formats [00:11:42].
Retriever’s Advantages and Approach
Retriever’s approach offers several advantages over other agents in the landscape (e.g., OpenAI Operator, Anthropic Claude, Google Mariner):
- Text-Based Approach: Unlike many agents that use vision-based approaches (taking screenshots), Retriever uses a text-based approach by leveraging the web page’s underlying structure [00:12:15]. This results in less hallucination and is less expensive than taking multiple screenshots for every action [00:12:33].
- Client-Side Chrome Extension: Most competitors use browsers on the cloud, which leads to non-personalized results and requires expensive proxies to funnel network requests [00:12:51]. Retriever’s client-side extension is cheaper infrastructurally and allows access to local or login-protected sites [00:13:32].
- Enhanced Security: With Retriever, users do not need to share or store passwords, as the agent sees exactly what the user is logged into [00:13:47]. Cloud-hosted browsers often require password storage, posing security risks, and may struggle with paywalls or Cloudflare protections [00:13:59].
- Multi-Tab Parallel Processing: The text-based approach enables processing active and background tabs, or multiple tabs at once, which is not possible with vision-based methods as background tabs are not rendered [00:13:38]. This significantly speeds up performance [00:15:00].
- Focus on Productivity and Automation: Retriever prioritizes productivity and automation use cases, recognizing that AI excels at automating manual and repetitive tasks [00:15:18].
- Distributed Subtasks: For long-horizon tasks, Retriever distributes subtasks as new tabs, reducing failure rates compared to competitors that attempt one long action on a single tab [00:15:55].
- Extensible Function Calling: Users can define and share function calls, making it much more extensible and scalable than custom third-party integrations [00:16:13]. This lets the agent plug into existing infrastructure rather than replace it.
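To make the text-based approach more concrete, here is a minimal sketch of how a page's markup could be reduced to a short, numbered list of actionable elements that a language model can read instead of a screenshot. The talk only says that Retriever uses the page's underlying structure; the `InteractiveElementLister` class, the choice of tags, and the output format are assumptions for illustration.

```python
from html.parser import HTMLParser

INTERACTIVE_TAGS = {"a", "button", "input", "select", "textarea"}


class InteractiveElementLister(HTMLParser):
    """Builds a numbered, text-only list of elements an agent could act on."""

    def __init__(self):
        super().__init__()
        self.elements = []   # one dict per interactive element
        self._open = None    # element currently collecting inner text

    def handle_starttag(self, tag, attrs):
        if tag not in INTERACTIVE_TAGS:
            return
        attr_map = dict(attrs)
        element = {
            "id": len(self.elements),
            "tag": tag,
            "label": attr_map.get("placeholder") or attr_map.get("aria-label") or "",
        }
        self.elements.append(element)
        # <input> is a void element with no inner text to collect.
        self._open = None if tag == "input" else element

    def handle_data(self, data):
        if self._open is not None and data.strip():
            self._open["label"] = (self._open["label"] + " " + data.strip()).strip()

    def handle_endtag(self, tag):
        if self._open is not None and tag == self._open["tag"]:
            self._open = None


def page_to_text(html: str) -> str:
    lister = InteractiveElementLister()
    lister.feed(html)
    return "\n".join(f'[{e["id"]}] <{e["tag"]}> {e["label"]}' for e in lister.elements)


# Hypothetical page fragment, used only to exercise the sketch.
SAMPLE = """
<div><input placeholder="Search podcasts"><button>Search</button>
<a href="/latent-space">Latent Space</a><p>Unrelated paragraph text</p></div>
"""
print(page_to_text(SAMPLE))
```

A representation like this needs only the markup, not a rendered image, which is why a text-based agent can in principle work on background tabs and on several tabs at once.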
Mission and Future Vision
Retriever’s mission is to revolutionize data extraction with transparent and efficient AI [00:16:34]. The long-term goal is to let people collaboratively build cost-efficient datasets, with volunteers using the extension to extract data from potentially hundreds or thousands of websites (e.g., local government events) [00:16:41]. This collaborative dataset construction is presented as an exciting direction for evaluating and optimizing AI agents and for integrating AI into everyday workflows [00:17:14].