Add files using upload-large-folder tool

Files changed (2)

README.md +15 -13
assets/model_struct.jpg +2 -2

README.md CHANGED

@@ -13,22 +13,24 @@ library_name: transformers
 <hr>
 <div align="center" style="line-height: 1;">
-[//]: # (  <a href='https://meituan-longcat.github.io/LongCat-Image/'><img src='https://img.shields.io/badge/Project-Page-green'></a>)
-  <a href='https://arxiv.org/abs/'><img src='https://img.shields.io/badge/Technical-Report-red'></a>
-  <a href='https://huggingface.co/meituan-longcat/LongCat-Image'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Model-blue'></a>
-  <a href='https://huggingface.co/meituan-longcat/LongCat-Image-Dev'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Dev-blue'></a>
-  <a href='https://huggingface.co/meituan-longcat/LongCat-Image-Edit'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Edit-blue'></a>
- <a href='https://github.com/meituan-longcat/LongCat-Image'><img src='https://img.shields.io/badge/GitHub-Code-black'></a>
 </div>
 <div align="center" style="line-height: 1;">
-  <a href='https://github.com/meituan-longcat/LongCat-Flash-Chat/blob/main/figures/wechat_official_accounts.png'><img src='https://img.shields.io/badge/WeChat-LongCat-brightgreen?logo=wechat&logoColor=white'></a>
-  <a href='https://x.com/Meituan_LongCat'><img src='https://img.shields.io/badge/Twitter-LongCat-white?logo=x&logoColor=white'></a>
 </div>
 ## Introduction
-We introduce **LongCat-Image**, a pioneering open-source, bilingual (Chinese-English) foundation model for image generation with **6B parameters**, designed to address core challenges in multilingual text rendering, photorealism, deployment efficiency, and developer accessibility. LongCat-Image delivers exceptional generation quality and remarkable model efficiency, achieving state-of-the-art performance while significantly lowering deployment barriers.
 <div align="center">
   <img src="assets/model_struct.jpg" width="90%" alt="LongCat-Image Generation Examples" />
 </div>
@@ -36,8 +38,8 @@ We introduce **LongCat-Image**, a pioneering open-source, bilingual (Chinese-Eng
 ### Key Features
 - 🌟 **Exceptional Efficiency and Performance**: With only **6B parameters**, LongCat-Image surpasses numerous open-source models that are several times larger across multiple benchmarks, demonstrating the immense potential of efficient model design.
-- 🌟 **Powerful Chinese Text Rendering**: The model demonstrates superior accuracy and stability in rendering common Chinese characters compared to existing SOTA open-source models and achieves industry-leading coverage of the Chinese dictionary.
-- 🌟 **Remarkable Photorealism**: Through an innovative data strategy and training framework, our model achieves remarkable photorealism in generated images.
 [//]: # (For more details, please refer to the comprehensive [***LongCat-Image Technical Report***]&#40;https://arxiv.org/abs/2412.11963&#41;.)
@@ -71,7 +73,7 @@ python setup.py develop
 ```
 ### Run Text-to-Image Generation
 ```shell
 import torch
 from transformers import AutoProcessor

 <hr>
 <div align="center" style="line-height: 1;">
+    <a href='https://arxiv.org/abs/'><img src='https://img.shields.io/badge/Technical-Report-red'></a>
+    <a href='https://github.com/meituan-longcat/LongCat-Image'><img src='https://img.shields.io/badge/GitHub-Code-black'></a>
+    <a href='https://github.com/meituan-longcat/LongCat-Flash-Chat/blob/main/figures/wechat_official_accounts.png'><img src='https://img.shields.io/badge/WeChat-LongCat-brightgreen?logo=wechat&logoColor=white'></a>
+    <a href='https://x.com/Meituan_LongCat'><img src='https://img.shields.io/badge/Twitter-LongCat-white?logo=x&logoColor=white'></a>
 </div>
 <div align="center" style="line-height: 1;">
+[//]: # (  <a href='https://meituan-longcat.github.io/LongCat-Image/'><img src='https://img.shields.io/badge/Project-Page-green'></a>)
+  <a href='https://huggingface.co/meituan-longcat/LongCat-Image'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-LongCat--Image-blue'></a>
+  <a href='https://huggingface.co/meituan-longcat/LongCat-Image-Dev'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-LongCat--Image--Dev-blue'></a>
+  <a href='https://huggingface.co/meituan-longcat/LongCat-Image-Edit'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-LongCat--Image--Edit-blue'></a>
 </div>
 ## Introduction
+We introduce **LongCat-Image**, a pioneering open-source and bilingual (Chinese-English) foundation model for image generation, designed to address core challenges in multilingual text rendering, photorealism, deployment efficiency, and developer accessibility prevalent in current leading models.
 <div align="center">
   <img src="assets/model_struct.jpg" width="90%" alt="LongCat-Image Generation Examples" />
 </div>
 ### Key Features
 - 🌟 **Exceptional Efficiency and Performance**: With only **6B parameters**, LongCat-Image surpasses numerous open-source models that are several times larger across multiple benchmarks, demonstrating the immense potential of efficient model design.
+- 🌟 **Powerful Chinese Text Rendering**: LongCat-Image demonstrates superior accuracy and stability in rendering common Chinese characters compared to existing SOTA open-source models and achieves industry-leading coverage of the Chinese dictionary.
+- 🌟 **Remarkable Photorealism**: Through an innovative data strategy and training framework, LongCat-Image achieves remarkable photorealism in generated images.
 [//]: # (For more details, please refer to the comprehensive [***LongCat-Image Technical Report***]&#40;https://arxiv.org/abs/2412.11963&#41;.)
 ```
 ### Run Text-to-Image Generation
+**💡 Tip**: Using a stronger LLM for prompt engineering can further improve image generation quality. See [inference_t2i.py](https://github.com/meituan-longcat/LongCat-Image/blob/main/scripts/inference_t2i.py#L28) for detailed usage.
 ```shell
 import torch
 from transformers import AutoProcessor

assets/model_struct.jpg CHANGED

Git LFS Details

SHA256: da7ab070461885767ddef82ec79aebd3984c7ca0e59db2f1e456438f1442fa8a
Pointer size: 132 Bytes
Size of remote file: 3.43 MB

Git LFS Details

SHA256: aadef7db22c66c8060e3f7df5657ae6b77c728429ef28cc57c61638343768bc1
Pointer size: 132 Bytes
Size of remote file: 3.63 MB
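
To check that a locally downloaded copy of `assets/model_struct.jpg` corresponds to one of the two LFS objects listed above, its SHA256 can be recomputed and compared against the listed values. The following is a minimal sketch, not part of the repository: the helper names `sha256_of_file` and `matches_listed_hash` are hypothetical, and the local path in the usage note is an assumption about where the file was saved.

```python
import hashlib

# SHA256 values copied from the two "Git LFS Details" panels above.
LISTED_SHA256 = (
    "da7ab070461885767ddef82ec79aebd3984c7ca0e59db2f1e456438f1442fa8a",
    "aadef7db22c66c8060e3f7df5657ae6b77c728429ef28cc57c61638343768bc1",
)


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 of a file, reading it in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_hash(path, listed=LISTED_SHA256):
    """Hypothetical helper: True if the file's SHA256 equals any listed value."""
    return sha256_of_file(path) in listed
```

For example, `matches_listed_hash("assets/model_struct.jpg")` would return `True` only if the local file is byte-identical to one of the two listed objects.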