stereoplegic's Collections: Context compression
• In-Context Learning Creates Task Vectors (arXiv:2310.15916)
• When can transformers reason with abstract symbols? (arXiv:2310.09753)
• Improving Length-Generalization in Transformers via Task Hinting (arXiv:2310.00726)
• In-context Autoencoder for Context Compression in a Large Language Model (arXiv:2307.06945)
• Adapting Language Models to Compress Contexts (arXiv:2305.14788)
• Context Compression for Auto-regressive Transformers with Sentinel Tokens (arXiv:2310.08152)
• Learning to Compress Prompts with Gist Tokens (arXiv:2304.08467)
• Dynamic Context Pruning for Efficient and Interpretable Autoregressive Transformers (arXiv:2305.15805)
• Compress, Then Prompt: Improving Accuracy-Efficiency Trade-off of LLM Inference with Transferable Prompt (arXiv:2305.11186)
• Self-slimmed Vision Transformer (arXiv:2111.12624)
• Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time (arXiv:2310.17157)
• RECOMP: Improving Retrieval-Augmented LMs with Compression and Selective Augmentation (arXiv:2310.04408)
• Dynamic Token Pruning in Plain Vision Transformers for Semantic Segmentation (arXiv:2308.01045)
• Adaptive Token Sampling For Efficient Vision Transformers (arXiv:2111.15667)
• Dynamic Token-Pass Transformers for Semantic Segmentation (arXiv:2308.01944)
• Multi-Scale And Token Mergence: Make Your ViT More Efficient (arXiv:2306.04897)
• Sparsifiner: Learning Sparse Instance-Dependent Attention for Efficient Vision Transformers (arXiv:2303.13755)
• Nugget 2D: Dynamic Contextual Compression for Scaling Decoder-only Language Models (arXiv:2310.02409)
• LM-CPPF: Paraphrasing-Guided Data Augmentation for Contrastive Prompt-Based Few-Shot Fine-Tuning (arXiv:2305.18169)
• ComputeGPT: A computational chat model for numerical problems (arXiv:2305.06223)
• XPrompt: Exploring the Extreme of Prompt Tuning (arXiv:2210.04457)
• Reducing Sequence Length by Predicting Edit Operations with Large Language Models (arXiv:2305.11862)
• Diet Code Is Healthy: Simplifying Programs for Pre-trained Models of Code (arXiv:2206.14390)
• Split, Encode and Aggregate for Long Code Search (arXiv:2208.11271)
• Recursively Summarizing Enables Long-Term Dialogue Memory in Large Language Models (arXiv:2308.15022)
• Vcc: Scaling Transformers to 128K Tokens or More by Prioritizing Important Tokens (arXiv:2305.04241)
• Latency Adjustable Transformer Encoder for Language Understanding (arXiv:2201.03327)
• Block-Skim: Efficient Question Answering for Transformer (arXiv:2112.08560)
• Learned Token Pruning for Transformers (arXiv:2107.00910)
• Zero-TPrune: Zero-Shot Token Pruning through Leveraging of the Attention Graph in Pre-Trained Transformers (arXiv:2305.17328)
• Learned Thresholds Token Merging and Pruning for Vision Transformers (arXiv:2307.10780)
• Can the Inference Logic of Large Language Models be Disentangled into Symbolic Concepts? (arXiv:2304.01083)
• System 2 Attention (is something you might need too) (arXiv:2311.11829)
• CoLT5: Faster Long-Range Transformers with Conditional Computation (arXiv:2303.09752)
• Random-LTD: Random and Layerwise Token Dropping Brings Efficient Training for Large-scale Transformers (arXiv:2211.11586)
• TCRA-LLM: Token Compression Retrieval Augmented Large Language Model for Inference Cost Reduction (arXiv:2310.15556)
• Extending Context Window of Large Language Models via Semantic Compression (arXiv:2312.09571)
• LLoCO: Learning Long Contexts Offline (arXiv:2404.07979)
• SelfCP: Compressing Long Prompt to 1/12 Using the Frozen Large Language Model Itself (arXiv:2405.17052)
• Equipping Transformer with Random-Access Reading for Long-Context Understanding (arXiv:2405.13216)