γ-MoD: Exploring Mixture-of-Depth Adaptation for Multimodal Large Language Models
Paper • arXiv:2410.13859 • Published 2024
$\gamma$-MoD is a novel approach for improving the computational efficiency of Multimodal Large Language Models (MLLMs) by incorporating Mixture-of-Depth (MoD) layers. This plug-and-play strategy seamlessly replaces redundant dense layers, significantly reducing computational costs while maintaining performance.
Instead of pruning the model or its inputs, $\gamma$-MoD introduces a new paradigm that reduces the number of activated tokens, offering superior efficiency compared to existing methods. Guided by this notion of activated tokens, the method transforms dense MLLM layers into sparse MoD layers in which each layer processes only a routed subset of tokens, ultimately making MLLMs more accessible and practical in resource-constrained environments.
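The core mechanism behind MoD layers is token-level routing: a lightweight router scores each token, and only a top fraction of tokens is processed by the layer while the rest take an identity skip path. The sketch below is a minimal, hypothetical NumPy illustration of that routing idea (function and parameter names such as `mod_layer`, `router_w`, and `capacity` are assumptions for illustration, not the paper's implementation):

```python
import numpy as np

def mod_layer(tokens, router_w, layer_fn, capacity=0.5):
    """Illustrative Mixture-of-Depth routing (sketch, not the paper's code).

    tokens:   (n, d) array of token embeddings
    router_w: (d,) router weight vector producing one score per token
    layer_fn: the expensive layer computation (e.g. attention + MLP)
    capacity: fraction of tokens the layer actually processes
    """
    scores = tokens @ router_w                  # one scalar score per token
    k = max(1, int(len(tokens) * capacity))     # number of activated tokens
    top = np.argsort(scores)[-k:]               # indices of top-k tokens
    out = tokens.copy()                         # skipped tokens pass through
    out[top] = layer_fn(tokens[top])            # compute only on top-k tokens
    return out, top
```

With `capacity=0.5`, each sparse layer computes on half the tokens, which is where the reduction in activated tokens (and hence FLOPs) comes from; skipped tokens flow to the next layer unchanged.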
BibTeX:
@misc{luo2024gammamodexploringmixtureofdepthadaptation,
title={$\gamma$-MoD: Exploring Mixture-of-Depth Adaptation for Multimodal Large Language Models},
author={Yaxin Luo and Gen Luo and Jiayi Ji and Yiyi Zhou and Xiaoshuai Sun and Zhiqiang Shen and Rongrong Ji},
year={2024},
eprint={2410.13859},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2410.13859},
}