From: aidotengineer

Retriever.com envisions a future where its browser extension can revolutionize data extraction and enable collaborative data set construction [00:16:34].

The Vision: Collaborative Data Sets

Retriever’s long-term goal is to let individuals collaborate from their own laptops to construct data sets collectively [00:16:41]. The aim is to make such data sets far cheaper to build [00:16:50].

Addressing Current Challenges

Currently, tasks such as extracting data from hundreds or thousands of websites (e.g., for local government events in the SF Bay Area) are neither feasible nor cost-effective [00:16:55]. The work is often manual, involving copying and pasting data, or it relies on expensive and unreliable third-party scraping services or on fragile RPA bots that break when websites change [00:00:44]. Data silos, where information is available on websites but not through APIs, also make it a hassle for users to combine data [00:01:03].

How Retriever Enables Collaboration

With Retriever’s Chrome extension, users can volunteer to construct these data sets collaboratively [00:17:09]. Retriever presents this as an exciting new use case for AI agents in the browser [00:17:14].
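
One way to picture collaborative construction is as a merge of rows contributed from several laptops. The sketch below is only an illustration under stated assumptions: the `EventRecord` fields, the `normalize_key` rule, and the volunteer names are hypothetical and do not come from Retriever. It shows how rows that differ only in letter case or a trailing slash could be collapsed into one entry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EventRecord:
    """One extracted row; the fields are illustrative, not Retriever's schema."""
    source_url: str
    title: str
    date: str  # ISO 8601 date, e.g. "2025-03-04"


def normalize_key(record: EventRecord) -> tuple[str, str, str]:
    """Build a comparison key that ignores case, whitespace, and a trailing slash."""
    return (
        record.source_url.strip().lower().rstrip("/"),
        record.title.strip().lower(),
        record.date.strip(),
    )


def merge_contributions(contributions: dict[str, list[EventRecord]]) -> list[EventRecord]:
    """Combine per-volunteer rows, keeping the first copy of each duplicate."""
    merged: dict[tuple[str, str, str], EventRecord] = {}
    for volunteer in sorted(contributions):  # fixed order keeps the result deterministic
        for record in contributions[volunteer]:
            merged.setdefault(normalize_key(record), record)
    return list(merged.values())


contributions = {
    "alice": [
        EventRecord("https://example.gov/events", "Council Meeting", "2025-03-04"),
        EventRecord("https://example.gov/events", "Budget Hearing", "2025-03-11"),
    ],
    "bob": [
        # Same event as Alice's first row, differing only in case and a trailing slash.
        EventRecord("https://example.gov/events/", "council meeting", "2025-03-04"),
        EventRecord("https://example.org/calendar", "Park Cleanup", "2025-03-15"),
    ],
}

dataset = merge_contributions(contributions)
for row in dataset:
    print(row.date, row.title)
```

Keeping the first copy of a duplicate is the simplest policy. A real shared data set would probably also need provenance and conflict handling, which this sketch leaves out.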

Retriever’s Approach

Retriever is an AI web agent that runs inside a Chrome extension; it performs tasks autonomously across pages on the user's behalf and extracts structured data to sheets [00:01:20]. Its text-based approach minimizes hallucination and lets it act on multiple tabs, including background tabs that vision-based approaches cannot access [00:14:31]. This enables parallel processing and speeds up performance [00:15:00]. Because it runs client-side, it is cheaper in infrastructure terms and can access content behind login walls or paywalls, which cloud-hosted browsers cannot do without security risks such as password sharing [00:13:32].
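
The parallel, text-based extraction described above can be sketched roughly as follows. This is not Retriever's code: the `TABS` dictionary simulates page text that an extension would read from each tab, and a regular expression stands in for whatever extraction the agent actually performs. The point of the sketch is that tabs are processed concurrently and that a failure in one tab does not stop the others.

```python
import asyncio
import re

# Simulated tabs: tab id -> page text. A real extension would read each tab's text.
TABS = {
    1: "Council Meeting on 2025-03-04 at City Hall",
    2: "Budget Hearing on 2025-03-11 at Room 201",
    3: "Page failed to load",
}

# Hypothetical page layout: "<title> on <YYYY-MM-DD> at <place>".
EVENT_PATTERN = re.compile(
    r"(?P<title>.+?) on (?P<date>\d{4}-\d{2}-\d{2}) at (?P<place>.+)"
)


async def extract_from_tab(tab_id: int) -> dict[str, str]:
    """Parse one tab's text into a structured row, or raise if it does not match."""
    await asyncio.sleep(0.01)  # stands in for waiting on the page
    match = EVENT_PATTERN.fullmatch(TABS[tab_id])
    if match is None:
        raise ValueError(f"tab {tab_id}: no event found")
    return {"tab": str(tab_id), **match.groupdict()}


async def extract_all(tab_ids: list[int]) -> tuple[list[dict[str, str]], list[str]]:
    """Run all tabs concurrently; a failure in one tab does not cancel the others."""
    results = await asyncio.gather(
        *(extract_from_tab(t) for t in tab_ids), return_exceptions=True
    )
    rows = [r for r in results if isinstance(r, dict)]
    errors = [str(r) for r in results if isinstance(r, Exception)]
    return rows, errors


rows, errors = asyncio.run(extract_all([1, 2, 3]))
print(rows)
print(errors)
```

Passing `return_exceptions=True` to `asyncio.gather` is what keeps one broken page from discarding the rows already extracted from the other tabs.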

Retriever splits complex tasks into subtasks and runs them in new tabs, significantly reducing failure rates compared with competitors that attempt long-horizon tasks in a single tab [00:16:01]. It also supports user-defined function calling for third-party tool integrations, making it more extensible and scalable than custom, pre-set connectors [00:16:13].
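
User-defined function calling is commonly built as a registry that maps names to functions, with the agent emitting a structured call that is looked up and executed. The sketch below assumes that general pattern; the `tool` decorator, the `append_row` function, and the JSON call shape are all hypothetical and are not taken from Retriever.

```python
import json
from typing import Any, Callable

# Registry of user-defined functions, keyed by function name.
TOOLS: dict[str, Callable[..., Any]] = {}


def tool(func: Callable[..., Any]) -> Callable[..., Any]:
    """Register a user-defined function under its own name."""
    TOOLS[func.__name__] = func
    return func


@tool
def append_row(sheet: str, values: list[str]) -> str:
    """Hypothetical integration: pretend to append a row to a spreadsheet."""
    return f"appended {len(values)} cells to {sheet}"


def dispatch(call_json: str) -> Any:
    """Run a call shaped like {"name": ..., "arguments": {...}}."""
    call = json.loads(call_json)
    name = call["name"]
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**call["arguments"])


result = dispatch(
    '{"name": "append_row", '
    '"arguments": {"sheet": "events", "values": ["Council Meeting", "2025-03-04"]}}'
)
print(result)
```

Under this pattern, adding an integration means registering one more function, with no new connector to build.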