[Paper] [Project Page]
[🤗LeX-Enhancer (Model)] [🤗LeX-Lumina (Model)] [🤗LeX-10K (Data)] [🤗LeX-Bench (Benchmark)]
This is the official repository for LeX-Art: Rethinking Text Generation via Scalable High-Quality Data Synthesis.
- Proposed LeX-Art, a system bridging prompt expressiveness and text rendering fidelity.
- Curated LeX-10K, a dataset of 10K high-resolution (1024×1024) aesthetically refined images.
- Developed LeX-Enhancer for prompt enrichment and trained two text-to-image models, LeX-FLUX and LeX-Lumina.
- Introduced LeX-Bench for evaluating fidelity, aesthetics, and alignment, along with the Pairwise Normalized Edit Distance (PNED) metric for text accuracy.
Generating visually appealing and accurate text within images is challenging because text fidelity, aesthetic integration, and stylistic diversity must all be balanced. To address this, we introduce LeX-Art, a framework that enhances text-to-image generation through LeX-Enhancer, a 14B-parameter prompt optimizer, and LeX-10K, a high-quality dataset. Using these, we train LeX-FLUX (12B) and LeX-Lumina (2B), achieving state-of-the-art performance. We also propose LeX-Bench, a benchmark for fidelity, aesthetics, and alignment, and PNED, a novel metric for text accuracy. Experiments show LeX-Lumina achieving a 79.81% PNED gain on CreateBench, and LeX-FLUX outperforming baselines in color (+3.18%), positional (+4.45%), and font accuracy (+3.81%).
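To give a feel for what an edit-distance-based text accuracy score measures, here is a minimal Python sketch. It is not the PNED definition from the paper: the function names (`levenshtein`, `normalized_edit_distance`, `mean_best_match_distance`) and the greedy best-match averaging are assumptions made for illustration only. Refer to the paper and the evaluation code for the actual metric.

```python
from typing import Sequence


def levenshtein(a: str, b: str) -> int:
    """Classic edit distance (insert, delete, substitute), two-row DP."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string on the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def normalized_edit_distance(a: str, b: str) -> float:
    """Edit distance scaled to [0, 1] by the longer string's length."""
    longest = max(len(a), len(b))
    return 0.0 if longest == 0 else levenshtein(a, b) / longest


def mean_best_match_distance(target_words: Sequence[str],
                             detected_words: Sequence[str]) -> float:
    """Illustrative aggregate (an assumption, not the paper's PNED):
    each target word is scored against its closest detected word, and a
    target word with nothing to match against is penalized with 1.0."""
    if not target_words:
        return 0.0
    scores = [
        min((normalized_edit_distance(t, d) for d in detected_words),
            default=1.0)
        for t in target_words
    ]
    return sum(scores) / len(scores)
```

Lower values mean the detected text is closer to the requested text. For example, `mean_best_match_distance(["hello", "world"], ["world", "hallo"])` averages one exact match with one single-character substitution.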
- ✅ March 27, 2025. 💥 We release LeX-Art, including:
- Checkpoints, inference code, and evaluation code.
- Website.
git clone https://github.com/zhaoshitian/LeX-Art.git
cd LeX-Art
conda create -n lex python=3.10
conda activate lex
# For CUDA 12.1:
pip install torch==2.4.0 torchvision==0.19.0 --index-url https://download.pytorch.org/whl/cu121
pip install git+https://github.com/huggingface/diffusers.git
pip install transformers
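After installation, a quick check can confirm that the packages above are visible in the active environment. The following is a small sketch using only the Python standard library; the helper name `installed_versions` is hypothetical and not part of this repository.

```python
from importlib import metadata

# Packages installed by the commands above.
REQUIRED = ("torch", "torchvision", "diffusers", "transformers")


def installed_versions(packages=REQUIRED):
    """Map each package name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

If any entry is reported as not installed, make sure the `lex` environment is active before re-running the `pip install` commands.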
We provide multiple tools for inference, tailored to different tasks:
- LeX-Enhancer: A tool designed to enhance prompts for improved text-to-image generation.
- LeX-Lumina: A text-to-image (T2I) model further trained on Lumina-Image-2.0, capable of generating high-quality images with precise text rendering from prompts.
- LeX-FLUX: A text-to-image (T2I) model further trained on FLUX.1, capable of generating high-quality images with precise text rendering from prompts.
Click on the links above for detailed instructions on how to use each tool.
For detailed instructions on model evaluation, please refer to the Evaluation README.
- Release the inference code.
- Release the evaluation code.
- Release the data and checkpoints for LeX Series.
- Release the training code for LeX-Lumina.
- Release the training code for LeX-FLUX.
If you find LeX-Art useful for your research and applications, please cite using this BibTeX:
@article{zhao2025lexart,
title={LeX-Art: Rethinking Text Generation via Scalable High-Quality Data Synthesis},
author={Zhao, Shitian and Wu, Qilong and Li, Xinyue and Zhang, Bo and Li, Ming and Qin, Qi and Liu, Dongyang and Zhang, Kaipeng and Li, Hongsheng and Qiao, Yu and Gao, Peng and Fu, Bin and Li, Zhen},
journal={arXiv preprint arXiv:2503.21749},
year={2025}
}
Our work is primarily built upon FLUX, Lumina-Image-2.0, Qwen, DeepSeek, sd-scripts, etc. We extend our gratitude to all these authors for their significant contributions to the community.