From: aidotengineer
Retriever.com envisions a future where its browser extension can revolutionize data extraction and enable collaborative data set construction [00:16:34].
The Vision: Collaborative Data Sets
Retriever’s long-term goal is to let individuals collaborate from their own laptops to build data sets collectively [00:16:41]. The aim is to make such data sets far cheaper to produce [00:16:50].
Addressing Current Challenges
Today, extracting data from hundreds or thousands of websites (e.g., local government events across the SF Bay Area) is neither feasible nor cost-effective [00:16:55]. The work is often manual copying and pasting, or it depends on expensive, unreliable third-party scraping services or on fragile RPA bots that break when websites change [00:00:44]. Data silos, where information is available on websites but not through APIs, add further friction for users trying to combine data [00:01:03].
How Retriever Enables Collaboration
By leveraging Retriever’s Chrome extension, users can volunteer and collaboratively construct these data sets [00:17:09]. This is seen as an exciting new future and use case for AI agents in the browser [00:17:14].
Retriever’s Approach
Retriever is an AI web agent that runs inside a Chrome extension. It performs tasks autonomously across pages and extracts structured data into sheets [00:01:20]. Its text-based approach minimizes hallucination and lets it act on multiple tabs, including background tabs that vision-based approaches cannot access [00:14:31]. This enables parallel processing, which speeds up tasks [00:15:00]. Because it runs client-side, it is cheaper in infrastructure terms, and it can reach content behind login walls or paywalls, which cloud-hosted browsers cannot do without security risks such as password sharing [00:13:32].
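To make the parallel, multi-tab idea concrete, here is a minimal TypeScript sketch. It is not Retriever's implementation: `ExtractPage`, `TabResult`, `extractAcrossTabs`, and `toSheet` are hypothetical names, and the step of opening a background tab and reading its text is assumed to sit behind a single injected function. The sketch only shows the fan-out pattern: one job per page, all started together, with results flattened into sheet-like rows.

```typescript
// Hypothetical types for illustration; none of these names come from Retriever.
interface Row { [column: string]: string }

// Assumed stand-in for "open a background tab, read its text, return rows".
type ExtractPage = (url: string) => Promise<Row[]>;

interface TabResult {
  url: string;
  rows: Row[];
  error?: string;
}

// Start one extraction job per URL at once; each job handles its own failure.
async function extractAcrossTabs(
  urls: string[],
  extractPage: ExtractPage
): Promise<TabResult[]> {
  const jobs = urls.map(async (url): Promise<TabResult> => {
    try {
      return { url, rows: await extractPage(url) };
    } catch (err) {
      // A failed tab is recorded, not rethrown, so the other tabs still finish.
      const message = err instanceof Error ? err.message : String(err);
      return { url, rows: [], error: message };
    }
  });
  return Promise.all(jobs);
}

// Flatten per-tab results into sheet-like rows, tagging each with its source URL.
function toSheet(results: TabResult[]): Row[] {
  const sheet: Row[] = [];
  for (const result of results) {
    for (const row of result.rows) {
      sheet.push({ source: result.url, ...row });
    }
  }
  return sheet;
}
```

In a real extension the injected function would presumably wrap the browser's tab and scripting APIs. Keeping it as a parameter leaves the fan-out logic testable without a browser.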
Retriever splits complex tasks into subtasks that run in new tabs, which significantly reduces failure rates compared with competitors that attempt long-horizon tasks in a single tab [00:16:01]. It also supports user-defined function calling for third-party tool integrations, making it more extensible and scalable than custom, pre-set connectors [00:16:13].
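User-defined function calling can be pictured as a small registry that users add tools to and the agent dispatches against. The TypeScript sketch below illustrates that general pattern only. The source does not describe Retriever's actual interface, so `ToolSpec`, `ToolCall`, `ToolRegistry`, and the example `append_row` tool are all assumptions made for illustration.

```typescript
// Hypothetical shapes; the source does not describe Retriever's real API.
interface ToolSpec {
  name: string;
  description: string;
  parameters: string[]; // names of required string arguments
  handler: (args: Record<string, string>) => string;
}

interface ToolCall {
  name: string;
  args: Record<string, string>;
}

class ToolRegistry {
  private tools: Record<string, ToolSpec> = {};

  // Users add their own integrations here instead of waiting for a built-in connector.
  register(spec: ToolSpec): void {
    if (Object.prototype.hasOwnProperty.call(this.tools, spec.name)) {
      throw new Error(`Tool already registered: ${spec.name}`);
    }
    this.tools[spec.name] = spec;
  }

  // The summary an agent could hand to the model so it can choose a tool.
  describe(): { name: string; description: string; parameters: string[] }[] {
    return Object.keys(this.tools).map((key) => {
      const { name, description, parameters } = this.tools[key];
      return { name, description, parameters };
    });
  }

  // Validate a call requested by the model, then run the user's handler.
  dispatch(call: ToolCall): string {
    if (!Object.prototype.hasOwnProperty.call(this.tools, call.name)) {
      throw new Error(`Unknown tool: ${call.name}`);
    }
    const spec = this.tools[call.name];
    const missing = spec.parameters.filter((p) => !(p in call.args));
    if (missing.length > 0) {
      throw new Error(`Missing arguments: ${missing.join(", ")}`);
    }
    return spec.handler(call.args);
  }
}

// Example: a made-up tool that formats an extracted event as a sheet row.
const registry = new ToolRegistry();
registry.register({
  name: "append_row",
  description: "Format an event as a comma-separated sheet row",
  parameters: ["title", "date"],
  handler: (args) => `${args["title"]},${args["date"]}`,
});
```

The design point is that adding an integration means registering one more entry, not shipping a new connector, which is one plausible reading of the extensibility claim above.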