Hacker News Show HN: Drive any macOS app in the background without stealing the cursor

A solution for macOS UI automation that allows AI agents to interact with native apps (click, type, scroll, read) in the background without stealing the cursor, focus, or disrupting the human user's session.

Traction Score: 66
Discussions: 25
Launch Date: Apr 29, 2026

Product Positioning & Context

AI Executive Synthesis
Cua Driver addresses a critical limitation in desktop UI automation for AI agents: the disruptive nature of foreground interaction. By enabling background operation on macOS without cursor hijacking or focus stealing, it unlocks new possibilities for agent-driven workflows. This solution directly tackles the developer pain point of needing VMs or containers for concurrent automation, offering a more integrated and efficient approach. The technical breakthrough, leveraging `SLEventPostToPid` and `yabai` patterns, demonstrates deep platform-level expertise. Market implications are significant for QA automation, product demo generation, personal assistants, and data extraction, allowing agents to operate continuously without human interruption. This innovation is crucial for scaling agent-based solutions in enterprise environments, where seamless background execution is paramount for productivity and user experience.
Hi HN, Francesco from Cua here.
I hacked this project together last weekend, inspired by the Codex Computer-Use release and lessons learned from deploying GUI-operating agents for our customers.

The main problem: when a UI automation process controls a desktop app today, it usually takes over the human’s session. Your cursor moves, keyboard focus gets stolen, windows jump to the front, and you have to stop working until the agent is done. That is why we have historically avoided encouraging users to run these processes directly on their host machine, instead relying on VMs or GUI containers for concurrency and background execution.

But computer-use - the tools we give agents to operate computers like humans - does not scale cleanly that way. As models get smarter, agents need to share hosts safely, run in the background, and avoid collisions with the human or other agents using the same machine.

We realized macOS has no first-class API for "drive this app without touching the cursor". CGEventPost routes through the hardware input stream, so it moves your cursor. CGEvent.postToPid avoids the cursor warp, but Chromium treats those events as untrusted and silently drops clicks at the renderer boundary. Activating the target app first raises the window and pulls focus, defeating the point of background execution.

Cua Driver is our attempt at a real fix: a background computer-use driver for macOS that lets an agent click, type, scroll, and read native apps while your cursor, frontmost app, and Space stay where they are. The default interface is a CLI, so it is easy to script or call from any coding agent shell.

Try it on macOS 14+:

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/trycua/cua/main/libs/cua-d...)"

The first internal use case was delegated demo recording. We ask Claude Code to drive an app while 'cua-driver recording start' captures the trajectory, screenshots, actions, and click markers. The result is an agent-generated product demo, Screen Studio inspired.

Other things we have used it for:
- Replacing Vercel’s agent-browser and other browser-use CLIs. With Claude Code and Cua Driver, you do not need Chrome DevTools Protocol at all.
- A dev-loop QA agent that reproduces a visual bug, edits code, rebuilds, and verifies the UI while my editor stays frontmost.
- Personal-assistant flows that use iMessage from Claude Code, Hermes, or other general-purpose agent CLIs.
- Pulling visual context from Chrome, Figma, Preview, or YouTube windows I am not looking at, without relying on their APIs.

What made this harder than expected:
- CGEventPost warps the cursor because it goes through the HID stream.
- CGEvent.postToPid does not warp the cursor, but Chromium drops it at the renderer IPC boundary.
- Activating the target first raises the window and can drag you across Spaces.
- Without a private remote-aware SPI, Electron apps stop keeping useful AX trees alive when their windows are occluded.

The unlock was SkyLight. SLEventPostToPid is a sibling of the public per-PID call, but it travels through a WindowServer channel Chromium accepts as trusted. Pair it with yabai’s focus-without-raise pattern, plus an off-screen primer click at (-1, -1), and the click lands without the window ever raising.

One thing we learned: the right addressing mode depends on the app. Native macOS apps usually have rich AX trees, Chromium-family apps often need a hybrid of AX and screenshots, and apps like Blender or CAD tools may expose almost no useful AX surface. The mistake is defaulting to pixels everywhere - or defaulting to AX everywhere.

Long technical writeup: https://github.com/trycua/cua/blob/main/blog/inside-macos-wi...

I would like feedback from people building Mac automation, agent harnesses, or accessibility tooling. If it breaks on a macOS app you care about, that is useful data for us.
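
The post's point about addressing modes (AX for native apps, a hybrid for Chromium-family apps, pixels when almost no AX surface exists) can be illustrated with a small decision function. This is only a sketch and not Cua Driver's actual logic: `AppProbe`, `choose_addressing_mode`, and every threshold below are hypothetical names and values invented for illustration, and a real agent would have to collect the probe data from the accessibility tree itself.

```python
from dataclasses import dataclass
from enum import Enum


class AddressingMode(Enum):
    AX = "ax"          # address elements through the accessibility (AX) tree
    HYBRID = "hybrid"  # combine AX elements with screenshots
    PIXELS = "pixels"  # fall back to screenshot coordinates only


@dataclass
class AppProbe:
    """Hypothetical summary of what an agent observed about a target app."""
    name: str
    ax_node_count: int       # number of AX elements discovered
    labeled_fraction: float  # share of elements with a usable role or label


def choose_addressing_mode(
    probe: AppProbe,
    min_nodes: int = 20,          # assumed cutoff, not from the post
    rich_labeled: float = 0.8,    # assumed cutoff, not from the post
    sparse_labeled: float = 0.2,  # assumed cutoff, not from the post
) -> AddressingMode:
    # Almost no useful AX surface (the post's Blender/CAD case): use pixels.
    if probe.ax_node_count < min_nodes or probe.labeled_fraction < sparse_labeled:
        return AddressingMode.PIXELS
    # Rich, well-labeled tree (typical native app): AX alone is enough.
    if probe.labeled_fraction >= rich_labeled:
        return AddressingMode.AX
    # Partial tree (often Chromium-family apps): mix AX with screenshots.
    return AddressingMode.HYBRID


if __name__ == "__main__":
    # Made-up probe values, chosen only to exercise each branch.
    for probe in (
        AppProbe("native-app", 400, 0.95),
        AppProbe("chromium-app", 150, 0.50),
        AppProbe("3d-tool", 5, 0.10),
    ):
        print(probe.name, choose_addressing_mode(probe).value)
```

Deciding per app, rather than fixing one mode globally, avoids both mistakes the post warns about: pixels everywhere and AX everywhere.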
background computer-use driver, macOS app automation, UI automation process, AI agents, CGEventPost, CGEvent.postToPid, Chromium renderer boundary

Deep-Dive FAQs

What is Drive any macOS app in the background without stealing the cursor?
Our AI analysis describes Drive any macOS app in the background without stealing the cursor as a solution for macOS UI automation that allows AI agents to interact with native apps (click, type, scroll, read) in the background without stealing the cursor, focus, or disrupting the human user's session. Cua Driver addresses a critical limitation in desktop UI automation for AI agents: the disruptive nature of foreground interaction. By enabling bac...
Where did Drive any macOS app in the background without stealing the cursor originate?
Data for Drive any macOS app in the background without stealing the cursor was aggregated directly from the Hacker News community ecosystem, representing raw developer and early-adopter sentiment.
When was Drive any macOS app in the background without stealing the cursor publicly launched?
The initial public indexing or launch date for Drive any macOS app in the background without stealing the cursor within our tracked developer communities was recorded on April 29, 2026.
How popular is Drive any macOS app in the background without stealing the cursor?
Drive any macOS app in the background without stealing the cursor has a traction score of 66 and 25 recorded discussions.
Which technical categories define Drive any macOS app in the background without stealing the cursor?
Based on metadata extraction, Drive any macOS app in the background without stealing the cursor is categorized under topics such as: background computer-use driver, macOS app automation, UI automation process, AI agents.
What are some commercial alternatives to Drive any macOS app in the background without stealing the cursor?
Our semantic intelligence engine identifies potential commercial alternatives in the SaaS space, such as Ferrari Luce, which offers overlapping value propositions.
How does the creator describe Drive any macOS app in the background without stealing the cursor?
The original author or development team describes the product as follows: "Hi HN, Francesco from Cua here. I hacked this project together last weekend, inspired by the Codex Computer-Use release and lessons learned from deploying GUI-operating agents for our customers. The..."

Community Voice & Feedback

BenFranklin100 • Apr 29, 2026
Being new to the idea of using agents to run programs on one’s computer, could someone provide several use cases?
j-conn • Apr 28, 2026
Incredible! I’m interested in doing something similar on windows, have you looked into that at all? Apparently codex computer use plans to support this on windows in the future. Were you able to see how codex was doing it, or the inspiration was just “they’ve shown it’s possible”?
pimlottc • Apr 28, 2026
What is specific about this for using with agents? As opposed to offering it as a general automation library for any use?
dtran • Apr 28, 2026
This is one of the coolest hacks I've seen recently. Having done some much less involved MacOS hacking, I can't help but wonder if we may finally see momentum behind some flavor of agent-friendly Linux/Android if Apple doesn't give us more ways to let agents interact with our machines.
alsetmusic • Apr 28, 2026
I tried out their Loom vm software a couple of months back. Worked well, fwiw. I'm not using it anymore because I decided to just give agents direct (supervised) access to my devices.
krackers • Apr 28, 2026
Nice! Thanks for the technical writeup, ~2 weeks from me wondering how it's implemented [1] to being able to play with a replicated version!
[1] https://news.ycombinator.com/item?id=47799128
davey2wavey • Apr 28, 2026
It's looking great. The audit trail question is interesting and I haven't seen it come up much. When an agent clicks through an ERP or edits a file, you've got logs, but how do you explain the "why" behind each decision to, say, a compliance team? Curious if that's something you're thinking about or if it's too early.
LatencyKills • Apr 28, 2026
Ex-Apple engineer here. I really like your implementation. A few years ago I built a similar tool to help me automate the testing of some of my native macOS apps. Being able to run multiple UI automation tests simultaneously was the big win in my case. My only criticism is enabling telemetry by default. I'm a fan of having people opt-in.

Discovery Source

Hacker News

Aggregated via automated community intelligence tracking.
