Scaling Intelligence

university

https://scalingintelligence.stanford.edu/

ScalingIntelligence

Bradley

updated a dataset 3 months ago

ScalingIntelligence/monkey_business

Viewer • Updated Oct 8, 2025 • 2.88k • 395 • 19

ekellbuch

authored 2 papers 4 months ago

Pathologies of Predictive Diversity in Deep Ensembles

Paper • 2302.00704 • Published Feb 1, 2023 • 1

Brain-to-Text Benchmark '24: Lessons Learned

Paper • 2412.17227 • Published Dec 23, 2024 • 1

simarora

authored a paper 5 months ago

Cartridges: Lightweight and general-purpose long context representations via self-study

Paper • 2506.06266 • Published Jun 6, 2025 • 7

ekellbuch

authored 2 papers 5 months ago

Archon: An Architecture Search Framework for Inference-Time Techniques

Paper • 2409.15254 • Published Sep 23, 2024 • 1

Shrinking the Generation-Verification Gap with Weak Verifiers

Paper • 2506.18203 • Published Jun 22, 2025 • 1

anneouyang

updated a dataset 5 months ago

ScalingIntelligence/KernelBench

Viewer • Updated Jul 21, 2025 • 270 • 4.35k • 35

a1zhang

authored 3 papers 7 months ago

SWE-bench Multimodal: Do AI Systems Generalize to Visual Software Domains?

Paper • 2410.03859 • Published Oct 4, 2024 • 1

KernelBench: Can LLMs Write Efficient GPU Kernels?

Paper • 2502.10517 • Published Feb 14, 2025 • 3

VideoGameBench: Can Vision-Language Models complete popular video games?

Paper • 2505.18134 • Published May 23, 2025 • 6

simonguozirui

authored 2 papers 10 months ago

BAM! Just Like That: Simple and Efficient Parameter Upcycling for Mixture of Experts

Paper • 2408.08274 • Published Aug 15, 2024 • 13

KernelBench: Can LLMs Write Efficient GPU Kernels?

Paper • 2502.10517 • Published Feb 14, 2025 • 3

simarora

authored 5 papers about 1 year ago

DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models

Paper • 2306.11698 • Published Jun 20, 2023 • 12

Benchmarking and Building Long-Context Retrieval Models with LoCo and M2-BERT

Paper • 2402.07440 • Published Feb 12, 2024 • 1

Simple linear attention language models balance the recall-throughput tradeoff

Paper • 2402.18668 • Published Feb 28, 2024 • 20

Just read twice: closing the recall gap for recurrent language models

Paper • 2407.05483 • Published Jul 7, 2024

LoLCATs: On Low-Rank Linearizing of Large Language Models

Paper • 2410.10254 • Published Oct 14, 2024

Bradley

authored a paper over 1 year ago

Large Language Monkeys: Scaling Inference Compute with Repeated Sampling

Paper • 2407.21787 • Published Jul 31, 2024 • 13

danbider

authored a paper over 1 year ago

LoRA Learns Less and Forgets Less

Paper • 2405.09673 • Published May 15, 2024 • 90

ekellbuch

authored a paper over 1 year ago

Deep Ensembles Work, But Are They Necessary?

Paper • 2202.06985 • Published Feb 14, 2022