AI & ML interests: None yet
Models
alexshengzhili/testing_lora • Updated
alexshengzhili/qwen2_5_vl_7b_repcount • 8B • Updated • 4
alexshengzhili/Qwen2.5-3B-Open-R1-Code-GRPO-r2 • Text Generation • 3B • Updated • 9
alexshengzhili/Qwen2.5-1.5B-Open-R1-Code-GRPO-r2 • 2B • Updated • 8
alexshengzhili/Qwen-2.5-7B-Simple-RL • Updated
alexshengzhili/Qwen2.5-7B-Open-R1-Code-GRPO-r2 • Updated
alexshengzhili/Qwen2.5-1.5B-Open-R1-Code-GRPO • Updated
alexshengzhili/llama3_8b_dpo_0908_preference_4_conference_shuffled_2023 • Text Generation • 8B • Updated • 6
alexshengzhili/mistral_3_0908_preference_4_conference_shuffled_2023_sft • Text Generation • 7B • Updated • 4
alexshengzhili/dpo_0908_preference_4_conference_shuffled_2023_checkpoint_30 • Text Generation • 7B • Updated • 6
alexshengzhili/phi3-dpo_0908_preference_4_conference_shuffled_2023 • Text Generation • 4B • Updated • 6
alexshengzhili/phi3-dpo_0907_preference_iclr2023 • Text Generation • 4B • Updated • 5
alexshengzhili/llama3.1-8b-lora_dpo_0907_preference_iclr2023 • Text Generation • 8B • Updated • 6
alexshengzhili/llama3.1-8b-0806_iclr2023_cleaned • Text Generation • 8B • Updated • 7
alexshengzhili/phi3-0806_iclr2023_cleaned • Text Generation • 4B • Updated • 8
alexshengzhili/phi3-0608_all_of_train-dpo-merged • Text Generation • 4B • Updated • 7
alexshengzhili/ph3-0607-lora-dpo-beta-0dot1-merged • Text Generation • 4B • Updated • 5
alexshengzhili/ph3-0606-lora-dpo-beta-0dot2-merged • Text Generation • 4B • Updated • 5
alexshengzhili/ph3-0606-lora-dpo-merged • Text Generation • 4B • Updated • 7
alexshengzhili/ph3-0606-lora-sft-merged • Text Generation • 4B • Updated • 8
alexshengzhili/llava-v1.5-13b-dpo • Text Generation • Updated • 8 • 5
alexshengzhili/llava-dpo-13b • Text Generation • Updated • 7
alexshengzhili/llava-dpo-13b-lora
alexshengzhili/llava-lora-dpo-1227lrvtail2000_from_sft-self-sampled-beta-0.5-lr-5e-5-avg-False-epoch-3 • Updated
alexshengzhili/llava-v1.5-13b-lora-coh-interleaf-lrv1500llava2000 • Updated
alexshengzhili/llava-v1.5-13b-lora-1227-COH-lrv0-3230llava0-5879_interleaved.json
alexshengzhili/llava-lora-dpo-1227lrvtail2000_sft-self-sampled-beta-0.5-lr-5e-6-avg-False-epoch-3
alexshengzhili/llava-lora-dpo-1227lrvtail2000_sft-self-sampled-beta-0.5-lr-5e-6-avg-False-epoch-2
alexshengzhili/llava-lora-dpo-1227lrvtail2000_sft-self-sampled-beta-0.5-lr-5e-5-avg-False-epoch-3
alexshengzhili/llava-lora-dpo-1227lrvtail2000_sft-self-sampled-beta-0.5-lr-5e-5-avg-False-epoch-2 • Updated
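The repos tagged "Text Generation" above can generally be pulled straight from the Hub. Below is a minimal sketch of loading and sampling from one of them with the transformers library, assuming the repo contains full, Transformers-format weights (several entries above are LoRA adapters and would need peft instead); the repo ID is taken from the list above and the prompt string is arbitrary.

```python
# Minimal sketch: load one of the text-generation checkpoints listed above and
# sample a short completion. Assumes standard Transformers-format weights
# (not a LoRA adapter) and that `accelerate` is installed for device_map="auto".
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "alexshengzhili/Qwen2.5-3B-Open-R1-Code-GRPO-r2"  # repo from the list above

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, device_map="auto")

# Arbitrary code-style prompt to sanity-check the checkpoint.
inputs = tokenizer("def fibonacci(n):", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```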