efe
AI & ML interests
NLP, LLM
Recent Activity
reacted
to
danielhanchen's
post
with โค๏ธ
about 4 hours ago
You can now do reinforcement learning training with 7ร longer context and no accuracy loss, via our new batching algorithms.
Long reasoning chains in RL are costly, but now we enable you to train gpt-oss with GRPO & reach 380K context on a 192GB GPU.
Blog: https://unsloth.ai/docs/new/grpo-long-context
reacted
to
danielhanchen's
post
with โค๏ธ
about 4 hours ago
You can now do reinforcement learning training with 7ร longer context and no accuracy loss, via our new batching algorithms.
Long reasoning chains in RL are costly, but now we enable you to train gpt-oss with GRPO & reach 380K context on a 192GB GPU.
Blog: https://unsloth.ai/docs/new/grpo-long-context