DeclarUI: Bridging Design and Development with Automated Declarative UI Code Generation
Problem Statement
Translating UI designs into functional declarative code is a labor-intensive bottleneck in mobile development, and while MLLMs show promise, they struggle with accurate UI component recognition and capturing multi-page interaction logic. Existing MLLM-based approaches produce code with poor visual fidelity, incomplete interaction flows, and frequent compilation failures. A structured, hybrid approach is needed to bridge the gap between design artifacts and production-ready code.
Key Novelty
- Page Transition Graphs (PTGs): A structured representation that models complex inter-page navigation and interaction logic, enabling MLLMs to generate functionally complete multi-screen applications rather than isolated screens.
- Synergistic CV + MLLM pipeline: Precise computer vision-based component segmentation is used as a preprocessing step to improve MLLM input quality, significantly boosting visual fidelity over end-to-end MLLM baselines.
- Iterative compiler-driven optimization: Generated code is automatically compiled, and compiler errors are fed back to the MLLM in an iterative refinement loop, achieving a 98% compilation success rate.
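The paper does not publish the PTG schema; a minimal sketch of what such a graph might look like, in Python, is below. All names (`Transition`, `PageTransitionGraph`, the trigger strings) are hypothetical, not the paper's API.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Transition:
    """An edge in the Page Transition Graph: a UI trigger that navigates between pages."""
    source: str   # page the interaction starts on
    trigger: str  # component/event that fires the navigation (e.g. a button's onPress)
    target: str   # destination page

@dataclass
class PageTransitionGraph:
    """Pages as nodes, navigations as directed, trigger-labeled edges."""
    pages: set = field(default_factory=set)
    transitions: list = field(default_factory=list)

    def add_transition(self, source: str, trigger: str, target: str) -> None:
        self.pages.update((source, target))
        self.transitions.append(Transition(source, trigger, target))

    def successors(self, page: str) -> list:
        """All (trigger, target) pairs reachable in one step from `page`."""
        return [(t.trigger, t.target) for t in self.transitions if t.source == page]

# Example: a three-screen login flow
ptg = PageTransitionGraph()
ptg.add_transition("Login", "onPress:SignInButton", "Home")
ptg.add_transition("Login", "onPress:SignUpLink", "Register")
ptg.add_transition("Register", "onPress:SubmitButton", "Home")
```

Serializing such a structure into the prompt is what lets the MLLM emit navigation code (e.g. React Navigation routes) for every designed transition, not just the screen currently in view.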
Evaluation Highlights
- DeclarUI achieves a 96.8% PTG coverage rate and 98% compilation success rate on React Native, representing a 123% improvement in PTG coverage and a 29% boost in compilation success over state-of-the-art MLLMs.
- Visual similarity scores improve by up to 55% over SOTA MLLM baselines; user studies with professional developers confirm the generated code meets industrial-grade standards in code availability, modification time, readability, and maintainability.
Methodology
- Step 1 - Component Segmentation: Apply computer vision techniques to precisely detect and segment UI components from design screenshots/files, producing structured, component-level inputs rather than raw images for the MLLM.
- Step 2 - PTG Construction & MLLM Code Generation: Build Page Transition Graphs from multi-screen design artifacts to capture navigation logic, then prompt MLLMs with both the segmented component data and PTG structure to generate declarative UI code (React Native/Flutter/ArkUI).
- Step 3 - Iterative Compiler-Driven Optimization: Compile the generated code, capture compiler errors and warnings, and feed them back as prompts to the MLLM for iterative refinement until compilation succeeds or quality thresholds are met.
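DeclarUI's exact prompts and tooling are not reproduced in this summary; the control flow of Step 3 can be sketched generically as follows, where `compile_fn` and `fix_fn` are hypothetical stand-ins for the compiler wrapper and the MLLM re-prompting call.

```python
def refine_until_compiles(code, compile_fn, fix_fn, max_iters=5):
    """Generic compiler-in-the-loop refinement.

    compile_fn(code) -> (ok, diagnostics)  # e.g. wraps the framework's compiler/bundler
    fix_fn(code, diagnostics) -> code      # e.g. re-prompts the MLLM with the error text
    """
    for _ in range(max_iters):
        ok, diagnostics = compile_fn(code)
        if ok:
            return code, True
        code = fix_fn(code, diagnostics)
    ok, _ = compile_fn(code)
    return code, ok

# Toy demo: the "compiler" rejects code containing BUG; the "fixer" patches it.
def toy_compile(code):
    return ("BUG" not in code, "error: unresolved identifier BUG")

def toy_fix(code, diagnostics):
    return code.replace("BUG", "0", 1)

final_code, ok = refine_until_compiles("const x = BUG;", toy_compile, toy_fix)
```

The loop terminates either on a clean compile or after a fixed iteration budget, which is what bounds the cost of the 98% compilation success rate.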
System Components
- Component Segmentation Module: Detects and isolates individual UI components (buttons, text fields, images, etc.) from design images, providing precise, structured inputs that reduce MLLM visual recognition errors.
- PTG Constructor: Builds a graph of all pages/screens and their navigational relationships, enabling the system to model and generate complete multi-page interaction logic rather than single-screen code.
- MLLM Code Generator: Conditions a multimodal LLM on the segmented component data and PTG structure to generate declarative UI code in the target framework (React Native, Flutter, or ArkUI).
- Compiler Feedback Loop: Compiles generated code, parses error/warning feedback, and re-prompts the MLLM with this feedback to iteratively fix issues, driving the system toward a high compilation success rate.
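To make the segmentation front-end concrete, here is a pure-Python stand-in: extracting bounding boxes of connected foreground regions from a binary mask. This is an illustrative simplification, not the paper's method; a real pipeline would run an object detector or OpenCV contour analysis on the design screenshot.

```python
from collections import deque

def component_boxes(mask):
    """Bounding boxes (left, top, right, bottom) of 4-connected
    foreground regions in a binary mask (list of lists of 0/1)."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not seen[r][c]:
                # BFS flood fill to collect one connected component
                queue = deque([(r, c)])
                seen[r][c] = True
                top, left, bottom, right = r, c, r, c
                while queue:
                    y, x = queue.popleft()
                    top, bottom = min(top, y), max(bottom, y)
                    left, right = min(left, x), max(right, x)
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < rows and 0 <= nx < cols \
                                and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                boxes.append((left, top, right, bottom))
    return boxes
```

Each box, paired with a component label, becomes one structured entry in the MLLM prompt, replacing raw pixels with explicit geometry.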
Results
| Metric/Benchmark | SOTA MLLM Baseline | DeclarUI | Delta |
|---|---|---|---|
| PTG Coverage Rate (React Native) | ~43% (inferred) | 96.8% | +123% relative |
| Compilation Success Rate (React Native) | ~76% (inferred) | 98% | +29% relative |
| Visual Similarity Score | Baseline | Up to 55% higher | +55% relative |
| Flutter/ArkUI Generalization | Not demonstrated | Successful | Cross-framework |
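On one reading of the headline metric (our assumption; the paper's exact formula is not reproduced in this summary), PTG coverage is the fraction of designed transitions that are realized in the generated code:

```python
def ptg_coverage(designed, generated):
    """Fraction of designed (source, trigger, target) transitions that
    appear in the PTG recovered from the generated code.

    `designed` and `generated` are iterables of hashable transition
    triples; an empty design is trivially fully covered."""
    designed, generated = set(designed), set(generated)
    if not designed:
        return 1.0
    return len(designed & generated) / len(designed)

# Example: the generator misses the logout navigation -> 3/4 coverage
designed = {
    ("Login", "signIn", "Home"),
    ("Login", "signUp", "Register"),
    ("Register", "submit", "Home"),
    ("Home", "logout", "Login"),
}
generated = designed - {("Home", "logout", "Login")}
coverage = ptg_coverage(designed, generated)  # 0.75
```

Under this definition, the reported 96.8% means nearly every designed navigation edge survives into runnable code.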
Key Takeaways
- Structured intermediate representations (like PTGs) are critical for MLLM-based code generation tasks with complex state/navigation logic — raw image-to-code prompting leaves significant functional completeness on the table.
- Compiler feedback as an iterative optimization signal is a highly practical and effective strategy for improving LLM-generated code quality without retraining, applicable beyond UI generation to any code synthesis task.
- Hybrid CV + MLLM pipelines consistently outperform end-to-end MLLM approaches for structured visual understanding tasks; investing in a dedicated perception front-end significantly reduces the cognitive load on the generative model.
Abstract
Declarative UI frameworks have gained widespread adoption in mobile app development, offering benefits such as improved code readability and easier maintenance. Despite these advantages, the process of translating UI designs into functional code remains challenging and time-consuming. Recent advancements in multimodal large language models (MLLMs) have shown promise in directly generating mobile app code from user interface (UI) designs. However, the direct application of MLLMs to this task is limited by challenges in accurately recognizing UI components and comprehensively capturing interaction logic. To address these challenges, we propose DeclarUI, an automated approach that synergizes computer vision (CV), MLLMs, and iterative compiler-driven optimization to generate and refine declarative UI code from designs. DeclarUI enhances visual fidelity, functional completeness, and code quality through precise component segmentation, Page Transition Graphs (PTGs) for modeling complex inter-page relationships, and iterative optimization. In our evaluation, DeclarUI outperforms baselines on React Native, a widely adopted declarative UI framework, achieving a 96.8% PTG coverage rate and a 98% compilation success rate. Notably, DeclarUI demonstrates significant improvements over state-of-the-art MLLMs, with a 123% increase in PTG coverage rate, up to 55% enhancement in visual similarity scores, and a 29% boost in compilation success rate. We further demonstrate DeclarUI’s generalizability through successful applications to Flutter and ArkUI frameworks. User studies with professional developers confirm that DeclarUI’s generated code meets industrial-grade standards in code availability, modification time, readability, and maintainability. By streamlining app development, improving efficiency, and fostering designer-developer collaboration, DeclarUI offers a practical solution to the persistent challenges in mobile UI development.