Oppo Open-Sources X-OmniClaw, an On-Device AI Agent Framework for Android
Oppo's AI team has released X-OmniClaw, an open-source AI agent framework for Android that runs core perception and action logic on-device, using the cloud only for heavy reasoning. Unlike cloud-based mobile AI systems that operate on virtual Android copies, X-OmniClaw directly accesses the phone's camera, screen, and local files, enabling real-time context awareness. The framework is built on three pillars: Omni Perception (combining camera, screen, and voice input via a vision-language model), Omni Memory (maintaining long-term semantic memory from photo gallery and session history for continuous assistance), and Omni Action (executing tasks via XML parsing, OCR, and on-device vision, plus behavior cloning for shortcut replay). Demonstrated capabilities include identifying products via camera and searching Taobao, acting as a math tutor on screen, and assembling highlight videos from gallery photos. X-OmniClaw builds on the HermesApp codebase and is inspired by OpenClaw, extending agentic AI to smartphones. The code is available on GitHub, with Oppo pledging ongoing updates.
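To make the XML-parsing route of Omni Action more concrete, here is a minimal sketch of how an agent might turn a screen's view hierarchy into a tap coordinate. It is not taken from the X-OmniClaw code. It assumes a dump in the style of Android's `uiautomator dump` output, and the sample XML, the `find_tap_target` helper, and the exact-text matching rule are all hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical screen dump, modelled on the `uiautomator dump` layout
# (nested <node> elements with text, clickable and bounds attributes).
SAMPLE_DUMP = """<hierarchy rotation="0">
  <node class="android.widget.FrameLayout" text="" clickable="false" bounds="[0,0][1080,2400]">
    <node class="android.widget.EditText" text="Search" clickable="true" bounds="[40,120][880,220]" />
    <node class="android.widget.Button" text="Go" clickable="true" bounds="[900,120][1040,220]" />
  </node>
</hierarchy>"""

# Bounds are encoded as "[left,top][right,bottom]" in pixels.
BOUNDS_RE = re.compile(r"\[(\d+),(\d+)\]\[(\d+),(\d+)\]")


def find_tap_target(xml_dump, label):
    """Return the (x, y) centre of the first clickable node whose text
    equals `label`, or None if no such node exists."""
    root = ET.fromstring(xml_dump)
    for node in root.iter("node"):
        if node.get("clickable") != "true" or node.get("text") != label:
            continue
        match = BOUNDS_RE.fullmatch(node.get("bounds", ""))
        if match is None:
            continue  # skip nodes with malformed bounds
        left, top, right, bottom = map(int, match.groups())
        return ((left + right) // 2, (top + bottom) // 2)
    return None


if __name__ == "__main__":
    print(find_tap_target(SAMPLE_DUMP, "Go"))
```

A `None` result is the case where the view hierarchy does not expose the element. Presumably that is where the OCR and on-device vision routes mentioned above would come in, though the announcement does not spell out how the three are combined.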
Key facts
- Runs core AI logic on-device, using cloud only for heavy reasoning.
- Uses phone camera, screen, and microphone for real-time perception.
- Builds long-term semantic memory from photo gallery and session history.
- Behavior cloning lets users record and replay navigation shortcuts via Android deeplinks.
- Unlike cloud-based agents that operate on virtual Android copies, it works with the actual device hardware and local files.
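
The shortcut replay mentioned in the key facts can be pictured with a small sketch. This is an illustration under stated assumptions, not X-OmniClaw's implementation: the `ShortcutStore` class and the `exampleapp://` deeplink are hypothetical, and the replay step simply builds the standard `adb shell am start -a android.intent.action.VIEW -d <uri>` command that opens a deeplink on a connected device.

```python
import shlex


class ShortcutStore:
    """Hypothetical in-memory store mapping a shortcut name to a deeplink URI."""

    def __init__(self):
        self._shortcuts = {}

    def record(self, name, deeplink):
        """Remember a deeplink under a user-chosen name."""
        if "://" not in deeplink:
            raise ValueError(f"not a deeplink URI: {deeplink!r}")
        self._shortcuts[name] = deeplink

    def replay_command(self, name):
        """Build the adb argument list that would open the recorded deeplink.

        The URI is shell-quoted because `adb shell` hands its arguments to
        the device shell, where characters such as `?` and `&` are special.
        """
        uri = self._shortcuts[name]  # raises KeyError for unknown shortcuts
        return [
            "adb", "shell", "am", "start",
            "-a", "android.intent.action.VIEW",
            "-d", shlex.quote(uri),
        ]


if __name__ == "__main__":
    store = ShortcutStore()
    store.record("shoe search", "exampleapp://search?q=shoes")
    print(" ".join(store.replay_command("shoe search")))
```

The sketch only constructs the command. Running it would require a device attached over adb and an app that registers the deeplink scheme. An on-device agent would more likely fire the equivalent intent directly rather than go through adb.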