πŸš€ Marketing Product Caption Generator (Fine-Tuned with Unsloth)

This is a fine-tuned vision-language model capable of generating creative, on-brand marketing copy from product images and user prompts. It's based on unsloth/Llama-3.2-11B-Vision-Instruct and has been fine-tuned to understand product visuals and generate compelling titles, descriptions, and taglines.

🎯 Use Case: Generate high-quality marketing content for e-commerce, social media, or product listings β€” just provide an image and a short prompt!


πŸ“Έ Example Input

  • Image: A photo of a minimalist water bottle
  • User Prompt: "Eco-friendly design for active millennials. Tone: fresh and sustainable."

✍️ Example Output

Title: PureFlow Eco Bottle
Description: Sleek and lightweight, this BPA-free bottle combines modern design with planet-friendly materials. Ideal for workouts, commutes, or everyday hydration.
Tagline: Stay refreshed. Stay responsible.


🧠 Model Details

Attribute Value
Base Model unsloth/Llama-3.2-11B-Vision-Instruct
Training Framework Unsloth (LoRA + 4bit fine-tuning)
Merged & Saved Yes (16-bit merged full model)
Modality Vision-to-Text (Multimodal)
Use Case Marketing copy, product descriptions, social media captions
License Apache 2.0

Uploaded finetuned model

  • Developed by: sirineddd
  • License: apache-2.0
  • Finetuned from model : unsloth/llama-3.2-11b-vision-instruct-unsloth-bnb-4bit

This mllama model was trained 2x faster with Unsloth and Huggingface's TRL library.

Downloads last month
27
Safetensors
Model size
11B params
Tensor type
BF16
Β·
F16
Β·
Inference Providers NEW
This model isn't deployed by any Inference Provider. πŸ™‹ Ask for provider support

Dataset used to train sirineddd/Llama-3.2-11B-Vision-Instruct_finetune