Achieved 99%+ stability, blocked critical bugs, saved thousands of developer hours.
Technology | Engineering | Generative AI
Uber developed DragonCrawl, an AI-powered mobile testing system to automate app testing across 85 cities and 50+ languages with zero manual maintenance.
Uber created an AI system that tests its mobile app automatically by reading the screen as a human would and deciding which buttons to tap or forms to fill out. Instead of relying on rigid test scripts that break when the app changes, the AI understands what it is looking at and adapts to new layouts or features. Engineers spend less time fixing broken tests and more time building new features, and bugs are caught before customers see them.
Uber's DragonCrawl transforms mobile testing from brittle script-based automation into adaptive, language-driven testing. The system uses MPNet, a 110-million-parameter transformer model, to interpret mobile app screens as structured text and determine appropriate testing actions. Rather than relying on hardcoded UI selectors that break with interface changes, DragonCrawl reads screen content and test goals in natural language, then selects the most relevant action through a retrieval-based approach.
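To make the retrieval-based idea concrete, here is a minimal sketch of ranking candidate actions against a test goal and the current screen text. It is not Uber's implementation: the bag-of-words `embed` function is a toy stand-in for MPNet embeddings, and the names `embed`, `cosine`, and `select_action` are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for a sentence embedding: a bag-of-words count vector.

    Assumption: a real system would call an embedding model such as MPNet here.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_action(goal: str, screen_text: str, candidates: list[str]) -> str:
    """Return the candidate action most similar to the goal plus screen context."""
    query = embed(f"{goal} {screen_text}")
    return max(candidates, key=lambda action: cosine(query, embed(action)))


# Hypothetical goal, screen text, and candidate actions for illustration.
goal = "log in to the rider account"
screen = "Welcome back. Email field. Password field. Log in button. Forgot password link."
actions = ["tap log in button", "tap forgot password link", "scroll down"]
chosen = select_action(goal, screen, actions)
```

Because selection is a ranking over candidates rather than free-form generation, the output is always one of the actions actually available on the screen.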
The technical architecture extracts UI elements using accessibility services, converts them into structured representations, and feeds this context along with test objectives to the language model. The model returns actions like "tap login button" which are dynamically mapped to actual UI elements at runtime. This eliminates the maintenance overhead of traditional test scripts while enabling human-like adaptability to UI changes, popup handling, and error recovery.
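The pipeline above can be sketched in a few lines: serialize the extracted elements to text for the model, then resolve the returned action string back to a concrete element at runtime. This is an illustrative sketch only. The `UIElement` fields, the text format, and the token-overlap matching are assumptions, not details from the source.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class UIElement:
    element_id: str  # hypothetical runtime handle for the element
    role: str        # e.g. "button" or "text_field"
    label: str       # accessibility label shown to the user


def serialize_screen(elements: list[UIElement]) -> str:
    """Render the screen as structured text, one element per line."""
    return "\n".join(f"[{e.role}] {e.label}" for e in elements)


def _tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def resolve_action(action: str, elements: list[UIElement]) -> Optional[tuple[str, UIElement]]:
    """Map an action such as "tap login button" to (verb, element).

    The first word is treated as the verb; the rest is matched against each
    element's label and role by token overlap. Returns None if nothing matches.
    """
    verb, _, target = action.partition(" ")
    target_tokens = _tokens(target)
    best, best_score = None, 0
    for element in elements:
        score = len(target_tokens & _tokens(f"{element.label} {element.role}"))
        if score > best_score:
            best, best_score = element, score
    return (verb, best) if best is not None else None


# Hypothetical login screen for illustration.
screen = [
    UIElement("e1", "text_field", "Email"),
    UIElement("e2", "button", "Login"),
    UIElement("e3", "button", "Sign up"),
]
resolved = resolve_action("tap login button", screen)
```

Since the action is resolved against whatever elements are on screen at that moment, a moved or restyled button still matches as long as its label and role are recognizable.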
Operating across Uber's global platform, DragonCrawl successfully tests in 85 of 89 cities, supporting over 50 languages with 99%+ production stability. The system includes guardrails, fallback strategies, and loop detection to handle edge cases. By reframing testing as a language problem rather than a scripting challenge, Uber achieved zero-maintenance automation that scales globally while blocking critical bugs before they reach users.
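The source does not describe how loop detection or fallback works, so the following is only one plausible shape for it: count how often the same screen and action pair recurs within a sliding window, and fall through to the next-ranked action once a threshold is hit. The class name, window size, threshold, and the final "press back" fallback are all assumptions.

```python
from collections import Counter, deque


class LoopDetector:
    """Flags a (screen, action) pair that recurs too often in recent history."""

    def __init__(self, window: int = 6, max_repeats: int = 3):
        self.history = deque(maxlen=window)  # oldest steps drop off automatically
        self.max_repeats = max_repeats

    def record(self, screen_signature: str, action: str) -> bool:
        """Record a step; return True if this pair now looks like a loop."""
        step = (screen_signature, action)
        self.history.append(step)
        return Counter(self.history)[step] >= self.max_repeats


def choose_with_fallback(ranked_actions: list[str], screen_signature: str,
                         detector: LoopDetector) -> str:
    """Take the best-ranked action that is not looping.

    Rejected actions are still recorded, so repeated attempts count against them.
    If every candidate loops, return a generic escape action (an assumption).
    """
    for action in ranked_actions:
        if not detector.record(screen_signature, action):
            return action
    return "press back"


# Hypothetical scenario: the test is stuck on an error screen.
detector = LoopDetector()
ranked = ["tap retry", "tap cancel"]
choices = [choose_with_fallback(ranked, "error_screen", detector) for _ in range(5)]
```

In this scenario the top-ranked action is tried twice, the third attempt trips the detector and falls through to the second-ranked action, and once both candidates are exhausted the sketch escapes with the generic fallback.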
Think of DragonCrawl like having a very patient, multilingual intern who never gets tired of testing your app. While traditional automated testing is like following a recipe that fails if you move the salt shaker, this AI intern can look at your kitchen, understand what you're trying to cook, and adapt when ingredients are in different places. It speaks 50+ languages, works in 85 different kitchens simultaneously, and never needs coffee breaks.
5/5
A pioneering application of generative AI to replace brittle script-based mobile testing with adaptive, multilingual, zero-maintenance automation at global scale.
Timeline: 24 months
Cost: 4,500,000
Headcount: 15