Hacker News
Linked pages
- GitHub - apple/ml-ferret https://github.com/apple/ml-ferret 428 comments
- GitHub - lavague-ai/LaVague: Automate automation with Large Action Model framework https://github.com/lavague-ai/LaVague 95 comments
- GitHub - sindresorhus/awesome: 😎 Awesome lists about all kinds of interesting topics https://github.com/sindresorhus/awesome 69 comments
- GitHub - microsoft/UFO: A UI-Focused Agent for Windows OS Interaction. https://github.com/microsoft/UFO 62 comments
- OS-Copilot: Towards Generalist Computer Agents with Self-Improvement https://os-copilot.github.io/ 40 comments
- OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments https://os-world.github.io/ 39 comments
- Cradle: Empowering Foundation Agents Towards General Computer Control https://baai-agents.github.io/Cradle/ 26 comments
- GitHub - Significant-Gravitas/Auto-GPT: An experimental open-source attempt to make GPT-4 fully autonomous. https://github.com/Significant-Gravitas/Auto-GPT 22 comments
- GitHub - ddupont808/GPT-4V-Act: AI agent using GPT-4V(ision) capable of using a mouse/keyboard to interact with web UI https://github.com/ddupont808/GPT-4V-Act 22 comments
- The Open Interpreter Project https://openinterpreter.com/ 21 comments
- [2404.01744] Octopus v2: On-device language model for super agent https://arxiv.org/abs/2404.01744 17 comments
- WebArena: A Realistic Web Environment for Building Autonomous Agents https://webarena.dev/ 9 comments
- GitHub - onuratakan/gpt-computer-assistant: gpt-4o for windows, macos and ubuntu https://github.com/onuratakan/gpt-computer-assistant 8 comments
- MULTI·ON https://multion.ai 7 comments
- [2404.05719] Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs https://arxiv.org/abs/2404.05719 7 comments
- [2202.08137] A data-driven approach for learning to control computers https://arxiv.org/abs/2202.08137 6 comments
- GitHub - wandb/openui: OpenUI lets you describe UI using your imagination, then see it rendered live. https://github.com/wandb/openui 5 comments
- OpenUI by W&B https://openui.fly.dev 3 comments
- GitHub - nat/natbot: Drive a browser with GPT-3 https://github.com/nat/natbot 1 comment
- Mind2Web https://osu-nlp-group.github.io/Mind2Web/ 1 comment