Add files using upload-large-folder tool
- README.md +15 -13
- assets/model_struct.jpg +2 -2
README.md
CHANGED
|
@@ -13,22 +13,24 @@ library_name: transformers
|
|
| 13 |
<hr>
|
| 14 |
|
| 15 |
<div align="center" style="line-height: 1;">
|
| 16 |
-
|
| 17 |
-
|
| 18 |
-
|
| 19 |
-
|
| 20 |
-
<a href='https://huggingface.co/meituan-longcat/LongCat-Image-Dev'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Dev-blue'></a>
|
| 21 |
-
<a href='https://huggingface.co/meituan-longcat/LongCat-Image-Edit'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Edit-blue'></a>
|
| 22 |
-
<a href='https://github.com/meituan-longcat/LongCat-Image'><img src='https://img.shields.io/badge/GitHub-Code-black'></a>
|
| 23 |
</div>
|
| 24 |
|
| 25 |
<div align="center" style="line-height: 1;">
|
| 26 |
-
|
| 27 |
-
<a href='https://
|
| 28 |
</div>
|
| 29 |
|
| 30 |
## Introduction
|
| 31 |
-
We introduce **LongCat-Image**, a pioneering open-source
|
| 32 |
<div align="center">
|
| 33 |
<img src="assets/model_struct.jpg" width="90%" alt="LongCat-Image Generation Examples" />
|
| 34 |
</div>
|
|
@@ -36,8 +38,8 @@ We introduce **LongCat-Image**, a pioneering open-source, bilingual (Chinese-Eng
|
|
| 36 |
|
| 37 |
### Key Features
|
| 38 |
- **Exceptional Efficiency and Performance**: With only **6B parameters**, LongCat-Image surpasses numerous open-source models that are several times larger across multiple benchmarks, demonstrating the immense potential of efficient model design.
|
| 39 |
-
- **Powerful Chinese Text Rendering**:
|
| 40 |
-
- **Remarkable Photorealism**: Through an innovative data strategy and training framework,
|
| 41 |
|
| 42 |
[//]: # (For more details, please refer to the comprehensive [***LongCat-Image Technical Report***](https://arxiv.org/abs/2412.11963).)
|
| 43 |
|
|
@@ -71,7 +73,7 @@ python setup.py develop
|
|
| 71 |
```
|
| 72 |
|
| 73 |
### Run Text-to-Image Generation
|
| 74 |
-
|
| 75 |
```python
|
| 76 |
import torch
|
| 77 |
from transformers import AutoProcessor
|
|
|
|
| 13 |
<hr>
|
| 14 |
|
| 15 |
<div align="center" style="line-height: 1;">
|
| 16 |
+
<a href='https://arxiv.org/abs/'><img src='https://img.shields.io/badge/Technical-Report-red'></a>
|
| 17 |
+
<a href='https://github.com/meituan-longcat/LongCat-Image'><img src='https://img.shields.io/badge/GitHub-Code-black'></a>
|
| 18 |
+
<a href='https://github.com/meituan-longcat/LongCat-Flash-Chat/blob/main/figures/wechat_official_accounts.png'><img src='https://img.shields.io/badge/WeChat-LongCat-brightgreen?logo=wechat&logoColor=white'></a>
|
| 19 |
+
<a href='https://x.com/Meituan_LongCat'><img src='https://img.shields.io/badge/Twitter-LongCat-white?logo=x&logoColor=white'></a>
|
| 20 |
</div>
|
| 21 |
|
| 22 |
<div align="center" style="line-height: 1;">
|
| 23 |
+
|
| 24 |
+
[//]: # ( <a href='https://meituan-longcat.github.io/LongCat-Image/'><img src='https://img.shields.io/badge/Project-Page-green'></a>)
|
| 25 |
+
<a href='https://huggingface.co/meituan-longcat/LongCat-Image'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-LongCat--Image-blue'></a>
|
| 26 |
+
<a href='https://huggingface.co/meituan-longcat/LongCat-Image-Dev'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-LongCat--Image--Dev-blue'></a>
|
| 27 |
+
<a href='https://huggingface.co/meituan-longcat/LongCat-Image-Edit'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-LongCat--Image--Edit-blue'></a>
|
| 28 |
</div>
|
| 29 |
|
| 30 |
+
|
| 31 |
+
|
| 32 |
## Introduction
|
| 33 |
+
We introduce **LongCat-Image**, a pioneering open-source, bilingual (Chinese-English) foundation model for image generation. It is designed to address core challenges that persist in current leading models: multilingual text rendering, photorealism, deployment efficiency, and developer accessibility.
|
| 34 |
<div align="center">
|
| 35 |
<img src="assets/model_struct.jpg" width="90%" alt="LongCat-Image Generation Examples" />
|
| 36 |
</div>
|
|
|
|
| 38 |
|
| 39 |
### Key Features
|
| 40 |
- **Exceptional Efficiency and Performance**: With only **6B parameters**, LongCat-Image surpasses numerous open-source models that are several times larger across multiple benchmarks, demonstrating the immense potential of efficient model design.
|
| 41 |
+
- **Powerful Chinese Text Rendering**: LongCat-Image renders common Chinese characters with greater accuracy and stability than existing SOTA open-source models, and achieves industry-leading coverage of the Chinese dictionary.
|
| 42 |
+
- **Remarkable Photorealism**: Through an innovative data strategy and training framework, LongCat-Image achieves remarkable photorealism in generated images.
|
| 43 |
|
| 44 |
[//]: # (For more details, please refer to the comprehensive [***LongCat-Image Technical Report***](https://arxiv.org/abs/2412.11963).)
|
| 45 |
|
|
|
|
| 73 |
```
|
| 74 |
|
| 75 |
### Run Text-to-Image Generation
|
| 76 |
+
**Tip**: Using a stronger LLM for prompt engineering can further improve image generation quality. See [inference_t2i.py](https://github.com/meituan-longcat/LongCat-Image/blob/main/scripts/inference_t2i.py#L28) for detailed usage.
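
The prompt-engineering idea in the tip can be sketched roughly as follows. This is a minimal illustration, not the logic in `inference_t2i.py`: the names `REWRITE_INSTRUCTION`, `rewrite_prompt`, and `echo_llm` are hypothetical, and the `llm` argument stands for any function that sends a message to a language model and returns its reply.

```python
from typing import Callable

# Hypothetical instruction template; it is NOT taken from the LongCat-Image repository.
REWRITE_INSTRUCTION = (
    "Rewrite the following text-to-image prompt so that it is more detailed. "
    "Keep any quoted text that should be rendered in the image unchanged.\n"
    "Prompt: {prompt}"
)


def rewrite_prompt(prompt: str, llm: Callable[[str], str]) -> str:
    """Return an LLM-expanded prompt, or the original prompt if the reply is empty."""
    cleaned = prompt.strip()
    if not cleaned:
        raise ValueError("prompt must not be empty")
    reply = llm(REWRITE_INSTRUCTION.format(prompt=cleaned)).strip()
    # Fall back to the user's prompt so generation can still proceed.
    return reply or cleaned


def echo_llm(message: str) -> str:
    """Stand-in for a real LLM call: echoes the prompt part with a fixed suffix."""
    original = message.rsplit("Prompt: ", 1)[-1]
    return original + ", soft natural lighting, high detail"


if __name__ == "__main__":
    print(rewrite_prompt("a cat on a sofa", echo_llm))
```

In practice `echo_llm` would be replaced by a call to whichever LLM is available, and the rewritten string would be passed to the generation pipeline as the prompt.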
|
| 77 |
```python
|
| 78 |
import torch
|
| 79 |
from transformers import AutoProcessor
|
assets/model_struct.jpg
CHANGED