# Finetuned FunctionGemma Model: emredeveloper/functiongemma-tools
This repository hosts a finetuned version of Google's functiongemma-270m-it model, adapted for tool calling with integrated reasoning. The model was trained on the LLM360/TxT360-3efforts dataset, which provides detailed thinking traces alongside tool calls.
## Model Overview
- Base Model: unsloth/functiongemma-270m-it
- Finetuning Method: LoRA (Low-Rank Adaptation) via the Unsloth library
- Dataset: LLM360/TxT360-3efforts (agent split, 50,000 examples streamed; see the loading sketch below)
- Key Enhancement: improved tool calling with explicit `<think>...</think>` blocks for internal reasoning, crucial for complex multi-step tasks
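Since the dataset entry above notes streaming, here is a minimal sketch of how such a 50,000-example subset can be drawn with the Hugging Face `datasets` library. The split name follows the list above; the per-example field layout is not documented here:

```python
from datasets import load_dataset

# Stream the agent split and take the first 50,000 examples without
# downloading the whole dataset (split name as stated above).
stream = load_dataset("LLM360/TxT360-3efforts", split="agent", streaming=True)
subset = stream.take(50_000)
```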
## Finetuning Details
This model was finetuned in Google Colab using the Unsloth library, which significantly speeds up the finetuning process and reduces VRAM usage.
Training Configuration:
- Max Sequence Length: 4096
- LoRA Rank (r): 128
- LoRA Alpha: 256
- LoRA Dropout: 0
- Gradient Checkpointing: Enabled with "unsloth" optimization
- Batch Size: 4 (per device), with gradient accumulation steps = 2
- Learning Rate: 2e-4
- Optimizer: adamw_8bit
- Training Steps: 100
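For orientation, here is a minimal sketch of how these hyperparameters map onto Unsloth's LoRA setup and TRL's `SFTConfig`. This is not the exact training script; in particular, the `target_modules` list is an assumption (Unsloth's usual attention and MLP projections):

```python
from unsloth import FastLanguageModel
from trl import SFTConfig

# LoRA adapter with the ranks listed above; target_modules is an assumed
# default (attention + MLP projections), not taken from the training run.
model = FastLanguageModel.get_peft_model(
    model,
    r = 128,
    lora_alpha = 256,
    lora_dropout = 0,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
    use_gradient_checkpointing = "unsloth",  # Unsloth's memory-saving checkpointing
)

# Trainer arguments matching the listed batch size, LR, optimizer, and steps.
training_args = SFTConfig(
    per_device_train_batch_size = 4,
    gradient_accumulation_steps = 2,
    learning_rate = 2e-4,
    max_steps = 100,
    optim = "adamw_8bit",
)
```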
Special attention was given to a custom chat template that inserts `<think>` tags for explicit reasoning and matches FunctionGemma's tool-calling format. Loss was computed on responses only, with the instruction portion masked out, to improve generation quality, as sketched below.
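In Unsloth, that response-only loss is typically set up with the `train_on_responses_only` helper. A minimal sketch, assuming Gemma-style turn markers (the exact marker strings used for this run are not documented here):

```python
from unsloth.chat_templates import train_on_responses_only

# Mask the instruction part out of the loss so only the model's turns are
# supervised (assumed Gemma-style markers; adjust to the actual template).
trainer = train_on_responses_only(
    trainer,
    instruction_part = "<start_of_turn>user\n",
    response_part = "<start_of_turn>model\n",
)
```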
## How to Use
This model can be loaded and used for inference with the Hugging Face transformers library, especially when combined with Unsloth for optimized performance.
### Installation
First, ensure you have the necessary libraries installed:
```bash
pip install unsloth transformers torch
```
### Loading the Model
```python
from unsloth import FastLanguageModel
import torch
from transformers import TextStreamer

max_seq_length = 4096

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "emredeveloper/functiongemma-tools",  # the finetuned model on the Hugging Face Hub
    max_seq_length = max_seq_length,
    dtype = None,          # auto-selects 16-bit (bfloat16/float16) on supported GPUs
    load_in_4bit = False,  # set True if you saved a merged_4bit checkpoint
    load_in_8bit = False,
)

FastLanguageModel.for_inference(model)  # enable Unsloth's faster inference path
```
### Inference Example
To perform inference, construct your messages and tools according to the FunctionGemma chat template. Here is an example demonstrating basic tool calling with thinking:
```python
tools_example = [
    {
        "type": "function",
        "function": {
            "name": "get_amazon_product_details",
            "description": (
                "Retrieves comprehensive product information from Amazon, "
                "including title, price, description, specifications, and availability."
            ),
            "parameters": {
                "type": "object",
                "properties": {
                    "asin": {
                        "type": "string",
                        "description": "The Amazon ASIN of the product.",
                    }
                },
                "required": ["asin"],
            },
        },
    }
]

messages_example = [
    {
        "role": "system",
        "content": (
            "You are a shopping assistant. Use tools when you need detailed "
            "Amazon product data such as price and specifications."
        ),
    },
    {
        "role": "user",
        "content": "Is the espresso machine with ASIN B0XYZ12345 any good for home use?",
    },
]

# Apply the chat template, passing the tool schemas, for generation
text = tokenizer.apply_chat_template(
    messages_example,
    tools = tools_example,
    tokenize = False,
    add_generation_prompt = True,
).removeprefix("<bos>")  # drop the leading <bos>; the tokenizer re-adds it during encoding below

# Generate a streamed response
_ = model.generate(
    **tokenizer(text, return_tensors = "pt").to("cuda"),
    max_new_tokens = 1024,
    streamer = TextStreamer(tokenizer, skip_prompt = False),
    top_p = 0.95, top_k = 64, temperature = 1.0,  # Gemma-recommended sampling settings
)
```
This will produce an output similar to:
```
<start_of_turn>model
<think>User is asking for an opinion, but I need factual product details first such as price, features, and reviews. I should call the Amazon product details tool with the provided ASIN.</think><start_function_call>call:get_amazon_product_details{asin:<escape>B0XYZ12345<escape>}<end_function_call>
```
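To act on the tool call, the generated text has to be parsed back into a function name and arguments. A hypothetical post-processing sketch that handles the single-call, string-argument shape shown above (the full FunctionGemma grammar may allow more):

```python
import re

def parse_function_call(output: str):
    """Extract (name, args) from a <start_function_call>...<end_function_call> span."""
    match = re.search(
        r"<start_function_call>call:(\w+)\{(.*?)\}<end_function_call>",
        output,
        re.DOTALL,
    )
    if match is None:
        return None  # the model answered directly without calling a tool
    name, body = match.groups()
    # Arguments appear as key:<escape>value<escape> pairs, per the example above.
    args = dict(re.findall(r"(\w+):<escape>(.*?)<escape>", body))
    return name, args

call = parse_function_call(
    "<start_function_call>call:get_amazon_product_details"
    "{asin:<escape>B0XYZ12345<escape>}<end_function_call>"
)
print(call)  # ('get_amazon_product_details', {'asin': 'B0XYZ12345'})
```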
## Applications
This finetuned functiongemma model is ideal for:
- Advanced AI Assistants: Building intelligent agents that can reason about complex tasks and use external tools effectively.
- Tool-Augmented LLMs: Enhancing LLMs with the ability to dynamically call functions and interpret their results.
- Complex Workflow Automation: Automating multi-step processes that require logical reasoning and interaction with external systems.
- Research in Tool Learning: Studying and developing more sophisticated tool-learning mechanisms for LLMs.
## Feedback and Issues
For any questions, issues, or contributions, please refer to the Unsloth Discord channel or open an issue on the Hugging Face repository.