Minimax M2.7 was challenged to navigate to a large e-commerce site, curate a selection of 5 different products, and add them all to the cart, with reasoning behind each choice. (Or try it yourself: open source, MIT license, and BYOK!)
npm i @quanta-intellect/vessel-browser
Vessel is a browser I've been working on, designed specifically for agents with human-in-the-loop visibility. It comes with a local MCP server, so any harness that supports custom MCP servers can control the browser. You can also BYOK with 8+ different providers (including custom OpenAI-compatible endpoints and local models).
One of my favorite features of the browser is persistent, bi-directional highlighting - meaning that both you AND the agent can highlight anything on the screen and the agent receives the context.
Vessel Browser is unique in that it surfaces available tools contextually to the agent, so the agent doesn't have to choose among 80+ tools at any given time and can instead focus on the subset most applicable to the current state.
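To make the contextual-tools idea concrete, here's a minimal sketch of how a tool list could be filtered by page state. This is not Vessel's actual implementation or API: `PageState`, `Tool`, `surfaceTools`, and the tool names are all hypothetical, made up for illustration.

```typescript
// Hypothetical types and names; none of this is taken from Vessel's real API.
type PageState = {
  hasForm: boolean;      // a fillable form is on the page
  hasCart: boolean;      // an add-to-cart control is on the page
  hasSelection: boolean; // something is currently highlighted
};

type Tool = {
  name: string;
  // Returns true when the tool is relevant to the current page state.
  appliesTo: (state: PageState) => boolean;
};

// A tiny stand-in for a much larger tool registry.
const allTools: Tool[] = [
  { name: "navigate", appliesTo: () => true },
  { name: "click", appliesTo: () => true },
  { name: "fill_form", appliesTo: (s) => s.hasForm },
  { name: "add_to_cart", appliesTo: (s) => s.hasCart },
  { name: "read_highlight", appliesTo: (s) => s.hasSelection },
];

// Only the tools whose predicate matches the current state are exposed.
function surfaceTools(tools: Tool[], state: PageState): string[] {
  return tools.filter((t) => t.appliesTo(state)).map((t) => t.name);
}

const productPage: PageState = { hasForm: false, hasCart: true, hasSelection: false };
console.log(surfaceTools(allTools, productPage));
// prints [ 'navigate', 'click', 'add_to_cart' ]
```

The point of the predicate-per-tool shape is that the agent only ever sees the short list for the current state, not the full registry.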
We should really have a release date range slider on the /models page. Tired of "trending/most downloaded" being the best way to sort and still seeing models from 2023 on the first page just because they're embedded in enterprise pipelines and get downloaded repeatedly. "Recently Created/Recently Updated" don't solve the discovery problem given how much noise there is to sift through.
Slight caveat: Trending actually does have some recency bias, but it's not strong/precise enough.
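Roughly what I have in mind, as a sketch: restrict to a release date window first, then sort by downloads inside that window. The `ModelEntry` shape, the `filterByReleaseRange` function, and the sample entries are all made up for illustration and don't reflect the site's real schema or data.

```typescript
// Hypothetical record shape; not the site's real schema.
type ModelEntry = {
  id: string;
  createdAt: string; // ISO 8601 date string
  downloads: number;
};

// Keep models released inside [from, to], then rank by downloads.
function filterByReleaseRange(models: ModelEntry[], from: Date, to: Date): ModelEntry[] {
  return models
    .filter((m) => {
      const t = new Date(m.createdAt).getTime();
      return t >= from.getTime() && t <= to.getTime();
    })
    .sort((a, b) => b.downloads - a.downloads);
}

// Made-up sample data.
const sample: ModelEntry[] = [
  { id: "org/old-pipeline-model", createdAt: "2023-03-01", downloads: 9_000_000 },
  { id: "org/new-small-model", createdAt: "2025-06-15", downloads: 40_000 },
  { id: "org/new-big-model", createdAt: "2025-09-02", downloads: 120_000 },
];

const recent = filterByReleaseRange(sample, new Date("2025-01-01"), new Date("2025-12-31"));
console.log(recent.map((m) => m.id));
// prints [ 'org/new-big-model', 'org/new-small-model' ]
```

The 2023 model drops out despite its download count, which is exactly the behavior the slider would give you.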