Developer Tool · Android

Give your AI agent eyes

Koder Eye turns any Android device into a live UI sensor. It exposes the full accessibility tree, real-time events, screenshots, OCR, and gesture injection as a local REST API — securely over ADB, with no cloud calls.

keye · Python 3
$ keye setup
# adb forward tcp:9876 tcp:9876
✓ port 9876 forwarded
✓ connection OK
$ keye text
Sign in
Forgot password?
Create account
$ keye buttons
text      view_id                      bounds
Sign in   dev.example:id/btn_signin    (120,800,200,56)
$ keye tap --id dev.example:id/btn_signin
✓ tap dispatched at (220.0, 828.0)
$ keye watch
[42] TYPE_VIEW_CLICKED dev.example btn_signin
[43] TYPE_WINDOW_STATE_CHANGED dev.example.HomeActivity
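The tap coordinates in the session above are consistent with the button's bounds being read as (left, top, width, height) and the tap landing on the rectangle's centre. That reading is an assumption inferred from the sample output, and `tap_center` is a hypothetical helper, not part of keye:

```python
def tap_center(bounds):
    """Return the centre point of a (left, top, width, height) rectangle."""
    left, top, width, height = bounds
    return (left + width / 2, top + height / 2)


# Bounds of the "Sign in" button from the sample session
print(tap_center((120, 800, 200, 56)))  # (220.0, 828.0)
```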

Everything an agent needs to reason about your screen

Six capabilities in a single lightweight Android app. No root, no USB shell during a session — just ADB port forwarding.

🌲

Full UI Tree

Complete accessibility tree via GET /screen, flat text via /screen/text, clickables via /screen/buttons. Labels of Jetpack Compose buttons are resolved with a depth-first search (DFS) fallback.

📸

Screenshot + OCR

On-device screenshot via POST /capture and ML Kit Latin-script recognition via POST /ocr. Region crop supported. No cloud calls, no internet.

Real-time Events

Accessibility events delivered via SSE stream (GET /events/stream) or long-poll (GET /events?wait=5000). No busy-polling needed.
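The stream can be consumed with standard-library Python alone. The sketch below parses generic Server-Sent Events framing (`data:` lines ended by a blank line). The payload text and the `injected` event name in the sample are assumptions for illustration, not the documented Eye wire format:

```python
def iter_sse(lines):
    """Yield (event_name, data) pairs from an iterable of SSE text lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:                    # a blank line terminates one event
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):      # comment line, often used as keep-alive
            continue
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):   # one leading space is not part of the value
                value = value[1:]
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)


# Hypothetical stream contents, for illustration only
sample = [
    ": keep-alive\n",
    "data: seq=44 TYPE_VIEW_CLICKED\n",
    "\n",
    "event: injected\n",
    "data: seq=46 EYE-INJECTED-TAP\n",
    "\n",
]
events = list(iter_sse(sample))
print(events)
```

Against a live device, the same generator could be fed the decoded lines of an HTTP response opened on GET /events/stream with the Authorization header set.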

🤖

Gesture Injection

Tap, scroll, type, back, home, recents via POST endpoints. Gated by an in-app toggle — OFF by default. Every action is logged for audit.

🔒

Secure by Design

Binds to 127.0.0.1:9876 only. Bearer token required (32-byte random value, stored in EncryptedSharedPreferences backed by Android Keystore AES-256-GCM). Rate-limited to 20 RPS (5 RPS for heavy endpoints).

🖥️

Host CLI

Zero-dependency Python 3 CLI (keye). 18 commands including keye fanout text for multi-device aggregation and keye text --since-last for diff-only output.
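How keye fanout maps devices to ports is not described here. One plausible arrangement, sketched purely as an assumption, gives each device its own local port through `adb -s <serial> forward`. The serials, the port numbering, and the `forward_commands` helper are all hypothetical:

```python
def forward_commands(serials, base_port=9876, device_port=9876):
    """Build one `adb forward` command per device, each on its own local port."""
    return [
        ["adb", "-s", serial, "forward", f"tcp:{base_port + i}", f"tcp:{device_port}"]
        for i, serial in enumerate(serials)
    ]


# Hypothetical device serials, as `adb devices` might list them
for cmd in forward_commands(["emulator-5554", "R58M12ABCDE"]):
    print(" ".join(cmd))
```

Each command list can be passed to `subprocess.run`, after which every device would be reachable on its own 127.0.0.1 port.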

How It Works

Three steps to give your agent eyes: install, forward, call.

1. Install & enable

Install the APK, grant the Accessibility permission, and start the foreground service. No root required.

  • Download from the Koder Store
  • Open Eye → Enable Accessibility → grant in Settings
  • Press Start Server — the app binds on 127.0.0.1:9876
  • Copy the Bearer token shown in the app
setup
$ keye setup
adb forward tcp:9876 tcp:9876
✓ port 9876 forwarded
✓ connection OK
$ keye token --set a3b9c2...
$ keye healthz
ok version 0.2.0 accessibility
agent usage · Python
import pathlib
import requests

# Load the Bearer token from the local config file
token = pathlib.Path('~/.config/koder-eye/token').expanduser().read_text().strip()
h = {'Authorization': f'Bearer {token}'}
base = 'http://127.0.0.1:9876'

# What's on screen?
text = requests.get(base + '/screen/text', headers=h).json()
# → ['Sign in', 'Forgot password?', ...]

# Tap the Sign In button
requests.post(base + '/tap', headers=h, json={'view_id': 'dev.example:id/btn_signin'})

2. Query the UI tree

Every HTTP call returns structured data your agent can parse and reason about — no image captioning required.

  • Text, bounds, enabled state, resource IDs
  • Diff endpoint for change detection (/screen/diff?since=)
  • Window list for dialogs and overlays
  • OCR for text rendered on Canvas or WebView
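Because the data is structured, locating a control becomes a plain lookup. The sketch below assumes /screen/buttons returns a JSON list of objects with `text`, `view_id`, and `bounds` keys, mirroring the columns that keye buttons prints. That schema is an assumption, and `find_button` is a hypothetical helper:

```python
def find_button(buttons, label):
    """Return the first button whose text equals label (case-insensitive), else None."""
    wanted = label.strip().lower()
    for button in buttons:
        if (button.get("text") or "").strip().lower() == wanted:
            return button
    return None


# Assumed response shape, modelled on the `keye buttons` columns
buttons = [
    {"text": "Sign in", "view_id": "dev.example:id/btn_signin", "bounds": [120, 800, 200, 56]},
]
match = find_button(buttons, "sign in")
print(match["view_id"] if match else "not found")
```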

3. Act & observe

Close the loop: inject a gesture, observe the resulting events, and verify the new screen state, with every step logged for audit.

  • Injection toggle is OFF by default — enable per session
  • Every injected action emits an EYE-INJECTED-* event
  • SSE stream for continuous observation without polling
  • Rate limiting protects the device from runaway loops
observe · SSE stream
$ keye watch
# Streams live events from the device
seq=44 TYPE_VIEW_CLICKED pkg=dev.example cls=Button text=Sign in
seq=45 TYPE_WINDOW_STATE_CHANGED pkg=dev.example cls=HomeActivity
seq=46 EYE-INJECTED-TAP tap dispatched at (220.0, 828.0)
# Ctrl+C to stop
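An agent that shells out to keye watch can pick the audit entries out of this output. The sketch below assumes the line layout shown in the sample (`seq=<n> <TYPE> <detail>`), which may not be a stable format. `parse_watch_line` and `injected_events` are hypothetical helpers:

```python
import re

# seq number, event type, then free-form detail text
WATCH_LINE = re.compile(r"^seq=(\d+)\s+(\S+)\s*(.*)$")


def parse_watch_line(line):
    """Parse one output line into a dict, or return None for non-event lines."""
    match = WATCH_LINE.match(line.strip())
    if match is None:
        return None
    return {"seq": int(match.group(1)), "type": match.group(2), "detail": match.group(3)}


def injected_events(lines):
    """Keep only the events emitted for injected actions (EYE-INJECTED-*)."""
    parsed = (parse_watch_line(line) for line in lines)
    return [e for e in parsed if e and e["type"].startswith("EYE-INJECTED-")]


sample = [
    "seq=44 TYPE_VIEW_CLICKED pkg=dev.example cls=Button text=Sign in",
    "seq=45 TYPE_WINDOW_STATE_CHANGED pkg=dev.example cls=HomeActivity",
    "seq=46 EYE-INJECTED-TAP tap dispatched at (220.0, 828.0)",
    "# Ctrl+C to stop",
]
print(injected_events(sample))
```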

How It Compares

Among the tools compared below, Koder Eye is the only one designed to be consumed by a long-running AI agent over a persistent HTTP connection.

Feature Koder Eye uiautomator2 Maestro Manual ADB
Local HTTP API (no USB shell per call)
Full accessibility tree
Real-time event stream (SSE)
Long-poll (no busy-loop needed)
On-device OCR (no cloud)
Screen diff / change detection
Bearer token auth
Multi-device fanout
No test framework required

FAQ

Is root required?

No. Koder Eye uses the standard AccessibilityService API, which only requires the user to enable the service in Settings. No root, no special Android build.

Is my screen data sent to the cloud?

Never. The Eye server binds exclusively to 127.0.0.1:9876 — unreachable from the LAN. Data flows only between the Android device and your laptop over the ADB tunnel. OCR runs fully on-device via ML Kit. There is no outbound HTTP code in the app.

Can another app on the device read the UI tree?

No. A valid Bearer token is required on every request except /healthz. The token is a 32-byte random value stored in EncryptedSharedPreferences (Android Keystore AES-256-GCM) and only delivered to your laptop via the ADB tunnel.
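To make the token scheme concrete, here is a host-side Python illustration of generating a 32-byte random token and checking a Bearer header in constant time. It is a sketch of the general technique, not the app's implementation (which runs on Android). The hex encoding and both function names are assumptions:

```python
import hmac
import secrets


def new_token():
    """Generate 32 random bytes, hex-encoded (64 characters)."""
    return secrets.token_hex(32)


def is_authorized(header_value, expected_token):
    """Check an Authorization header value against the expected Bearer token."""
    scheme, _, presented = (header_value or "").partition(" ")
    # compare_digest avoids leaking how many leading characters matched
    return scheme == "Bearer" and hmac.compare_digest(
        presented.encode(), expected_token.encode()
    )


token = new_token()
print(len(token), is_authorized(f"Bearer {token}", token))
```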

What Android version is required?

Android 9.0 (API 28) or higher. Screenshot capture (POST /capture) requires API 30+. The app targets API 36 (Android 16).

What happens if my agent polls too fast?

The built-in rate limiter returns 429 Too Many Requests with a Retry-After header. Default limits: 20 RPS for lightweight endpoints, 5 RPS for /screen, /capture, and /ocr. Use GET /events?wait=5000 instead of polling /screen repeatedly.
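A client can honour the Retry-After header rather than hammering the server. The sketch below assumes the header carries a delay in seconds and falls back to one second otherwise. `call_with_retry` and the `send` callable, which returns `(status, headers, body)`, are hypothetical names for illustration:

```python
import time


def call_with_retry(send, max_attempts=5, sleep=time.sleep):
    """Call send() until it stops returning 429, waiting as Retry-After asks."""
    for _ in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, body
        try:
            delay = float(headers.get("Retry-After", 1))
        except ValueError:  # e.g. an HTTP-date instead of seconds
            delay = 1.0
        sleep(delay)
    raise RuntimeError(f"still rate-limited after {max_attempts} attempts")


# Simulated server: one 429, then success (no real network traffic)
responses = iter([(429, {"Retry-After": "2"}, None), (200, {}, ["Sign in"])])
slept = []
result = call_with_retry(lambda: next(responses), sleep=slept.append)
print(result, slept)
```

Injecting `sleep` keeps the helper testable. In real use the default `time.sleep` applies, and `send` would wrap the actual HTTP call.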

Is injection enabled by default?

No. The "Allow injection" toggle in the app is OFF by default. The agent cannot tap, scroll, or type without the developer explicitly enabling it. Every injection emits an EYE-INJECTED-* event in the log.

Ready to give your agent a window into Android?

Download Koder Eye, run keye setup, and your agent has eyes in seconds.

Download APK (v0.2.1)
Read the docs