https://huggingface.co/zai-org/GLM-4.6V-Flash
This is a tiny member of the GLM family, weighing in at about 9B parameters.
We need to wait for https://github.com/ggml-org/llama.cpp/pull/16600 to be merged first. Unfortunately that PR seems somewhat stale, as the last commit was a month ago.
I could actually try this one thanks to https://github.com/ggml-org/llama.cpp/pull/14823, but it will lack any vision capabilities.
I am not sure if my "text only" support is valid for this one, but you can try.
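In case it helps anyone who wants to try: a minimal sketch, assuming llama-cpp-python is installed and a text-only GGUF (e.g. the Q4_K_S quant mentioned below) has already been downloaded. The filename and prompt are placeholders, not something from this thread.

```python
# Minimal sketch: load a text-only GGUF of GLM-4.6V-Flash and run a short
# completion to sanity-check that the text path works.
# Assumes: pip install llama-cpp-python; the model path is a placeholder.
from llama_cpp import Llama

llm = Llama(
    model_path="GLM-4.6V-Flash.Q4_K_S.gguf",  # placeholder filename
    n_ctx=4096,        # context window; adjust to taste
    n_gpu_layers=-1,   # offload all layers if built with GPU support
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Briefly explain what GGUF is."}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```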
@jacek2024
It is. You are amazing! Thank you so much for adding text-only support for this one.
Maybe something similar could be done for Glm4vMoeForConditionalGeneration, but assuming https://github.com/ggml-org/llama.cpp/pull/16600 is not abandoned forever, waiting for it makes more sense.
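For reference, a quick way to check which architecture class a repo declares is to read its config.json. A minimal sketch using huggingface_hub; the printed value is whatever the repo actually declares, not something asserted here:

```python
# Minimal sketch: read a repo's config.json to see which architecture class
# it declares (e.g. Glm4vForConditionalGeneration vs.
# Glm4vMoeForConditionalGeneration). Assumes: pip install huggingface_hub.
import json
from huggingface_hub import hf_hub_download

path = hf_hub_download(repo_id="zai-org/GLM-4.6V-Flash", filename="config.json")
with open(path) as f:
    config = json.load(f)

# Prints the declared architecture list, whatever it happens to be.
print(config.get("architectures"))
```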
-2000 19 GLM-4.6V-Flash run/imatrix (GPU-2d) 46/40 0.91s/c 0.3/4.8m(?-8.1) [11/315] 5.3983
-2000 19 si GLM-4.6V-Flash run/static 2/12,Q4_K_S [27/523] (hfu f16)
It's queued!
You can check for progress at http://hf.tst.eu/status.html or regularly check the model
summary page at https://hf.tst.eu/model#GLM-4.6V-Flash-GGUF for text-only quants to appear.
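If you prefer to poll programmatically instead of refreshing the page, here is a minimal sketch using huggingface_hub; the repo id is an assumption (replace OWNER with the account that actually publishes the quants):

```python
# Minimal sketch: list the .gguf files currently present in the quant repo.
# The repo id is an assumption based on the summary-page name above; replace
# OWNER with the account that publishes the quants.
from huggingface_hub import list_repo_files

REPO_ID = "OWNER/GLM-4.6V-Flash-GGUF"  # assumed repo id

files = list_repo_files(REPO_ID)
for name in sorted(f for f in files if f.endswith(".gguf")):
    print(name)
```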
Great! If Air isn't released, we may try doing the same for 4.6V.