The original AutoGLM-Phone-9B model supports multimodality, but this GGUF model does not.

Opened by sean2342

The AutoGLM-Phone-9B model needs image processing support so that Open-AutoGLM can call it to automate mobile phone operations. Currently, calling this GGUF model returns a message that image input is not supported. I am looking forward to a multimodal GGUF version that can be conveniently loaded in Ollama or LM Studio for local use (see the sketch below for the kind of call this would enable).
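A minimal sketch of what such a local call would look like once a vision-capable GGUF is available, using Ollama's chat API with a base64-encoded screenshot. The model tag `autoglm-phone-9b` is a placeholder, not a published name.

```python
# Sketch: sending a phone screenshot to a local multimodal GGUF via Ollama.
# Assumes Ollama is running on the default port and that a vision-capable
# build of the model has been pulled under the placeholder tag below.
import base64
import json
import urllib.request


def chat_with_image(prompt: str, image_path: str, model: str = "autoglm-phone-9b") -> str:
    # Ollama expects images as base64-encoded strings attached to the message.
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("utf-8")

    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt, "images": [image_b64]}],
        "stream": False,
    }
    req = urllib.request.Request(
        "http://localhost:11434/api/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["message"]["content"]


# Example: describe a screenshot as one step of a phone-automation loop.
# print(chat_with_image("Describe the UI elements on this screen.", "screenshot.png"))
```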

Unfortunately, llama.cpp does not currently support vision for the Glm4vForConditionalGeneration architecture. We will add vision support once it is implemented in llama.cpp.
