
Zechen Zhang

@zechenzhangAGI

Building the future of AI-human collaborations

82 skills · 140,384 total stars

find ~/zechenzhangAGI/ -name "*.skill"

Parameter-efficient fine-tuning for LLMs using LoRA, QLoRA, and 25+ methods. Use when fine-tuning large models...
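As a quick illustration of the LoRA workflow this skill covers, here is a minimal sketch using Hugging Face PEFT. The base model name and hyperparameters are placeholder choices, not values prescribed by the skill.

```python
# Minimal LoRA setup with Hugging Face PEFT (illustrative values).
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "meta-llama/Llama-3.2-1B"  # placeholder base model
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

# Wrap the frozen base model with low-rank adapters on the attention projections.
lora_config = LoraConfig(
    r=16,                # rank of the low-rank update matrices
    lora_alpha=32,       # scaling factor applied to the update
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of total parameters
```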

High-performance RLHF framework with Ray+vLLM acceleration. Use for PPO, GRPO, RLOO, DPO training of large models...

RNN+Transformer hybrid with O(n) inference. Linear time, infinite context, no KV cache. Train like GPT (parallel),...

Expert guidance for fine-tuning LLMs with Axolotl - YAML configs, 100+ models, LoRA/QLoRA, DPO/KTO/ORPO/GRPO,...

Provides guidance for LLM post-training with RL using slime, a Megatron+SGLang framework. Use when training GLM...

Expert guidance for GRPO/RL fine-tuning with TRL for reasoning and task-specific model training
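A minimal sketch of what GRPO training with TRL looks like, assuming a recent TRL release with `GRPOTrainer`; the model name, dataset, and reward function are toy placeholders, not the skill's recommendations.

```python
# Group-relative policy optimization driven by a programmatic reward (toy example).
from datasets import Dataset
from trl import GRPOConfig, GRPOTrainer

dataset = Dataset.from_dict({"prompt": ["Write a haiku about GPUs.", "Explain KV caching."]})

def reward_brevity(completions, **kwargs):
    # Toy reward: prefer completions close to 50 characters.
    return [-abs(50 - len(c)) / 50 for c in completions]

args = GRPOConfig(output_dir="grpo-demo", num_generations=4, max_completion_length=64)
trainer = GRPOTrainer(
    model="Qwen/Qwen2-0.5B-Instruct",  # placeholder policy model
    reward_funcs=reward_brevity,
    args=args,
    train_dataset=dataset,
)
trainer.train()
```

In practice the reward function encodes the reasoning or task-specific check (verifiable answers, format constraints, unit tests) rather than a length heuristic.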

Expert guidance for fast fine-tuning with Unsloth - 2-5x faster training, 50-80% less memory, LoRA/QLoRA optimization
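A sketch of Unsloth's fast QLoRA path, assuming the `FastLanguageModel` API: 4-bit base weights plus LoRA adapters. The checkpoint name and hyperparameters are illustrative.

```python
# Unsloth QLoRA setup sketch (placeholder checkpoint and hyperparameters).
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/llama-3-8b-bnb-4bit",  # placeholder 4-bit checkpoint
    max_seq_length=2048,
    load_in_4bit=True,
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    use_gradient_checkpointing="unsloth",  # memory-saving recompute path
)
# The wrapped model can then be handed to a standard trainer such as TRL's SFTTrainer.
```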

Implements and trains LLMs using Lightning AI's LitGPT with 20+ pretrained architectures (Llama, Gemma, Phi, Qwen,...

Simple Preference Optimization for LLM alignment. Reference-free alternative to DPO with better performance (+6.4...
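To make the reference-free idea concrete, here is a toy sketch of the SimPO objective as described in the paper: length-normalized log-probabilities act as the implicit reward, with a target margin and no reference model. The beta and gamma values are illustrative.

```python
# Toy SimPO loss: average log-prob reward, target margin gamma, no reference model.
import torch
import torch.nn.functional as F

def simpo_loss(chosen_logps, rejected_logps, chosen_lens, rejected_lens,
               beta=2.0, gamma=0.5):
    """chosen_logps / rejected_logps: summed token log-probs per sequence."""
    # Length-normalized log-probability is the implicit reward.
    r_chosen = beta * chosen_logps / chosen_lens
    r_rejected = beta * rejected_logps / rejected_lens
    # Push the chosen reward above the rejected one by at least gamma.
    return -F.logsigmoid(r_chosen - r_rejected - gamma).mean()
```

Dropping the reference model removes a second forward pass per example, which is where the memory and speed advantage over DPO comes from.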

Provides guidance for training LLMs with reinforcement learning using verl (Volcano Engine RL). Use when...

Provides guidance for PyTorch-native agentic RL using torchforge, Meta's library separating infra from algorithms....

Language-independent tokenizer treating text as raw Unicode. Supports BPE and Unigram algorithms. Fast (50k...
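A minimal SentencePiece round trip for reference: train a BPE model on raw text, then encode. The file names and vocabulary size are placeholders.

```python
# Train a BPE SentencePiece model and encode a sentence (placeholder paths/sizes).
import sentencepiece as spm

spm.SentencePieceTrainer.train(
    input="corpus.txt",        # one sentence per line, raw Unicode text
    model_prefix="example_bpe",
    vocab_size=8000,
    model_type="bpe",          # "unigram" is the other supported algorithm
)

sp = spm.SentencePieceProcessor(model_file="example_bpe.model")
pieces = sp.encode("Language-independent tokenization.", out_type=str)
ids = sp.encode("Language-independent tokenization.", out_type=int)
print(pieces, ids)
```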

Expert guidance for fine-tuning LLMs with LLaMA-Factory - WebUI no-code, 100+ models, 2/3/4/5/6/8-bit QLoRA,...

Educational GPT implementation in ~300 lines. Reproduces GPT-2 (124M) on OpenWebText. Clean, hackable code for...
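For flavor, here is a condensed causal self-attention block in the spirit of nanoGPT; this is an illustrative PyTorch sketch, not the repository's exact code.

```python
# Causal self-attention with a fused qkv projection (nanoGPT-style sketch).
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalSelfAttention(nn.Module):
    def __init__(self, n_embd=768, n_head=12):
        super().__init__()
        self.n_head, self.n_embd = n_head, n_embd
        self.c_attn = nn.Linear(n_embd, 3 * n_embd)  # fused q, k, v projection
        self.c_proj = nn.Linear(n_embd, n_embd)

    def forward(self, x):
        B, T, C = x.shape
        q, k, v = self.c_attn(x).split(self.n_embd, dim=2)
        # reshape to (B, n_head, T, head_dim)
        q, k, v = (t.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
                   for t in (q, k, v))
        # masked (causal) attention; flash kernels are used when available
        y = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        return self.c_proj(y.transpose(1, 2).contiguous().view(B, T, C))
```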

Provides guidance for enterprise-grade RL training using miles, a production-ready fork of slime. Use when training...

Write publication-ready ML/AI papers for NeurIPS, ICML, ICLR, ACL, AAAI, COLM. Use when drafting papers from...

State-space model with O(n) complexity vs Transformers' O(n²). 5× faster inference, million-token sequences, no KV...
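The O(n) claim comes from carrying a fixed-size recurrent state instead of a growing KV cache. Below is a toy, non-selective linear state-space scan to show the shape of the recurrence; Mamba's selective, hardware-aware scan is considerably more involved.

```python
# Toy linear SSM scan: h_t = A h_{t-1} + B x_t, y_t = C h_t (illustrative only).
import torch

def ssm_scan(x, A, B, C):
    """x: (seq_len, d_in); A: (d_state, d_state); B: (d_state, d_in); C: (d_out, d_state)."""
    h = torch.zeros(A.shape[0])
    ys = []
    for x_t in x:              # one pass over the sequence: O(n) time
        h = A @ h + B @ x_t    # constant-size recurrent state, no KV cache
        ys.append(C @ h)
    return torch.stack(ys)
```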

Fast tokenizers optimized for research and production. Rust-based implementation tokenizes 1GB in <20 seconds....
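A minimal BPE training example with the Rust-backed tokenizers library; the corpus file and vocabulary size are placeholders.

```python
# Train a BPE tokenizer and encode a sentence (placeholder corpus and vocab size).
from tokenizers import Tokenizer, models, pre_tokenizers, trainers

tokenizer = Tokenizer(models.BPE(unk_token="[UNK]"))
tokenizer.pre_tokenizer = pre_tokenizers.Whitespace()

trainer = trainers.BpeTrainer(vocab_size=16000, special_tokens=["[UNK]", "[PAD]"])
tokenizer.train(files=["corpus.txt"], trainer=trainer)

encoding = tokenizer.encode("Fast tokenizers for research and production.")
print(encoding.tokens, encoding.ids)
```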

Fine-tune LLMs using reinforcement learning with TRL - SFT for instruction tuning, DPO for preference alignment,...
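A sketch of the two TRL stages named above, assuming a recent TRL release: SFT on instruction data, then DPO on chosen/rejected preference pairs. The model and dataset names are placeholders taken for illustration.

```python
# Two-stage TRL sketch: SFT, then DPO on preference pairs (placeholder names).
from datasets import load_dataset
from transformers import AutoTokenizer
from trl import SFTConfig, SFTTrainer, DPOConfig, DPOTrainer

model_id = "Qwen/Qwen2-0.5B-Instruct"  # placeholder model

# Stage 1: supervised fine-tuning on instruction-following conversations.
sft = SFTTrainer(
    model=model_id,
    args=SFTConfig(output_dir="sft-demo"),
    train_dataset=load_dataset("trl-lib/Capybara", split="train"),  # placeholder dataset
)
sft.train()

# Stage 2: preference alignment with DPO on (prompt, chosen, rejected) pairs,
# starting from the model produced by the SFT stage.
tokenizer = AutoTokenizer.from_pretrained(model_id)
dpo = DPOTrainer(
    model=sft.model,
    args=DPOConfig(output_dir="dpo-demo", beta=0.1),
    train_dataset=load_dataset("trl-lib/ultrafeedback_binarized", split="train"),
    processing_class=tokenizer,
)
dpo.train()
```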