hujiefrank committed
Commit 3bb0869 · verified · 1 Parent(s): ea8a082

Add files using upload-large-folder tool

Files changed (2)
  1. README.md +15 -13
  2. assets/model_struct.jpg +2 -2
README.md CHANGED
@@ -13,22 +13,24 @@ library_name: transformers
  <hr>
 
  <div align="center" style="line-height: 1;">
-
- [//]: # ( <a href='https://meituan-longcat.github.io/LongCat-Image/'><img src='https://img.shields.io/badge/Project-Page-green'></a>)
- <a href='https://arxiv.org/abs/'><img src='https://img.shields.io/badge/Technical-Report-red'></a>
- <a href='https://huggingface.co/meituan-longcat/LongCat-Image'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Model-blue'></a>
- <a href='https://huggingface.co/meituan-longcat/LongCat-Image-Dev'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Dev-blue'></a>
- <a href='https://huggingface.co/meituan-longcat/LongCat-Image-Edit'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Edit-blue'></a>
- <a href='https://github.com/meituan-longcat/LongCat-Image'><img src='https://img.shields.io/badge/GitHub-Code-black'></a>
  </div>
 
  <div align="center" style="line-height: 1;">
- <a href='https://github.com/meituan-longcat/LongCat-Flash-Chat/blob/main/figures/wechat_official_accounts.png'><img src='https://img.shields.io/badge/WeChat-LongCat-brightgreen?logo=wechat&logoColor=white'></a>
- <a href='https://x.com/Meituan_LongCat'><img src='https://img.shields.io/badge/Twitter-LongCat-white?logo=x&logoColor=white'></a>
  </div>
 
  ## Introduction
- We introduce **LongCat-Image**, a pioneering open-source, bilingual (Chinese-English) foundation model for image generation with **6B parameters**, designed to address core challenges in multilingual text rendering, photorealism, deployment efficiency, and developer accessibility. LongCat-Image delivers exceptional generation quality and remarkable model efficiency, achieving state-of-the-art performance while significantly lowering deployment barriers.
  <div align="center">
  <img src="assets/model_struct.jpg" width="90%" alt="LongCat-Image Generation Examples" />
  </div>
@@ -36,8 +38,8 @@ We introduce **LongCat-Image**, a pioneering open-source, bilingual (Chinese-Eng
 
  ### Key Features
  - 🌟 **Exceptional Efficiency and Performance**: With only **6B parameters**, LongCat-Image surpasses numerous open-source models that are several times larger across multiple benchmarks, demonstrating the immense potential of efficient model design.
- - 🌟 **Powerful Chinese Text Rendering**: The model demonstrates superior accuracy and stability in rendering common Chinese characters compared to existing SOTA open-source models and achieves industry-leading coverage of the Chinese dictionary.
- - 🌟 **Remarkable Photorealism**: Through an innovative data strategy and training framework, our model achieves remarkable photorealism in generated images.
 
  [//]: # (For more details, please refer to the comprehensive [***LongCat-Image Technical Report***]&#40;https://arxiv.org/abs/2412.11963&#41;.)
 
@@ -71,7 +73,7 @@ python setup.py develop
  ```
 
  ### Run Text-to-Image Generation
-
  ```shell
  import torch
  from transformers import AutoProcessor
 
  <hr>
 
  <div align="center" style="line-height: 1;">
+ <a href='https://arxiv.org/abs/'><img src='https://img.shields.io/badge/Technical-Report-red'></a>
+ <a href='https://github.com/meituan-longcat/LongCat-Image'><img src='https://img.shields.io/badge/GitHub-Code-black'></a>
+ <a href='https://github.com/meituan-longcat/LongCat-Flash-Chat/blob/main/figures/wechat_official_accounts.png'><img src='https://img.shields.io/badge/WeChat-LongCat-brightgreen?logo=wechat&logoColor=white'></a>
+ <a href='https://x.com/Meituan_LongCat'><img src='https://img.shields.io/badge/Twitter-LongCat-white?logo=x&logoColor=white'></a>
  </div>
 
  <div align="center" style="line-height: 1;">
+
+ [//]: # ( <a href='https://meituan-longcat.github.io/LongCat-Image/'><img src='https://img.shields.io/badge/Project-Page-green'></a>)
+ <a href='https://huggingface.co/meituan-longcat/LongCat-Image'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-LongCat--Image-blue'></a>
+ <a href='https://huggingface.co/meituan-longcat/LongCat-Image-Dev'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-LongCat--Image--Dev-blue'></a>
+ <a href='https://huggingface.co/meituan-longcat/LongCat-Image-Edit'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-LongCat--Image--Edit-blue'></a>
  </div>
 
+
+
  ## Introduction
+ We introduce **LongCat-Image**, a pioneering open-source and bilingual (Chinese-English) foundation model for image generation, designed to address core challenges in multilingual text rendering, photorealism, deployment efficiency, and developer accessibility prevalent in current leading models.
  <div align="center">
  <img src="assets/model_struct.jpg" width="90%" alt="LongCat-Image Generation Examples" />
  </div>
 
 
  ### Key Features
  - 🌟 **Exceptional Efficiency and Performance**: With only **6B parameters**, LongCat-Image surpasses numerous open-source models that are several times larger across multiple benchmarks, demonstrating the immense potential of efficient model design.
+ - 🌟 **Powerful Chinese Text Rendering**: LongCat-Image demonstrates superior accuracy and stability in rendering common Chinese characters compared to existing SOTA open-source models and achieves industry-leading coverage of the Chinese dictionary.
+ - 🌟 **Remarkable Photorealism**: Through an innovative data strategy and training framework, LongCat-Image achieves remarkable photorealism in generated images.
 
  [//]: # (For more details, please refer to the comprehensive [***LongCat-Image Technical Report***]&#40;https://arxiv.org/abs/2412.11963&#41;.)
 
 
  ```
 
  ### Run Text-to-Image Generation
+ **💡 Tip**: Using a stronger LLM model for prompt engineering can further improve image generation quality. Please refer to [inference_t2i.py](https://github.com/meituan-longcat/LongCat-Image/blob/main/scripts/inference_t2i.py#L28) for detailed usage.
  ```shell
  import torch
  from transformers import AutoProcessor
assets/model_struct.jpg CHANGED

Git LFS Details

  • SHA256: da7ab070461885767ddef82ec79aebd3984c7ca0e59db2f1e456438f1442fa8a
  • Pointer size: 132 Bytes
  • Size of remote file: 3.43 MB

Git LFS Details

  • SHA256: aadef7db22c66c8060e3f7df5657ae6b77c728429ef28cc57c61638343768bc1
  • Pointer size: 132 Bytes
  • Size of remote file: 3.63 MB
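
The LFS details above can be sanity-checked locally. The following is a minimal sketch, assuming the standard Git LFS pointer layout (a `version` line, an `oid sha256:` line, and a `size` line). The helper names `build_lfs_pointer` and `sha256_of_file` are hypothetical, and the byte size is a placeholder because the page only reports a rounded size. It shows why a pointer for a file with a seven-digit byte size comes to 132 bytes, and how a downloaded `assets/model_struct.jpg` could be hashed for comparison with the listed SHA256.

```python
import hashlib

# Spec URL that opens every Git LFS pointer file (version 1 of the format).
LFS_SPEC_URL = "https://git-lfs.github.com/spec/v1"


def build_lfs_pointer(oid_hex: str, size_bytes: int) -> str:
    """Return the text of a Git LFS pointer for the given SHA256 and size."""
    return f"version {LFS_SPEC_URL}\noid sha256:{oid_hex}\nsize {size_bytes}\n"


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large assets need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# SHA256 copied from the second "Git LFS Details" block above.
LISTED_OID = "aadef7db22c66c8060e3f7df5657ae6b77c728429ef28cc57c61638343768bc1"
# Placeholder: the page shows only a rounded size (3.63 MB), not the exact byte count.
HYPOTHETICAL_SIZE = 3_630_000

pointer = build_lfs_pointer(LISTED_OID, HYPOTHETICAL_SIZE)
print(len(pointer.encode("utf-8")))  # 132 for any seven-digit size
```

After fetching the real file, `sha256_of_file("assets/model_struct.jpg") == LISTED_OID` would confirm that the download matches that entry.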