Zechen Zhang

@zechenzhangAGI

Building the future of AI-human collaborations

82 skills · 140,384 total stars

find ~/zechenzhangAGI/ -name "*.skill"

Simple Preference Optimization for LLM alignment. Reference-free alternative to DPO with better performance (+6.4...
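The core idea behind SimPO is that the implicit reward is the policy's own length-normalized log-likelihood, so no frozen reference model is needed. A minimal sketch of the per-pair loss, assuming the formulation from the SimPO paper (length normalization, target reward margin `gamma`); the log-probabilities and lengths here are toy values, not real model outputs:

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def simpo_loss(logp_chosen: float, len_chosen: int,
               logp_rejected: float, len_rejected: int,
               beta: float = 2.0, gamma: float = 0.5) -> float:
    """SimPO loss for one preference pair (sketch).

    Reference-free: rewards are the policy's own length-normalized
    sequence log-likelihoods, so no reference model is required
    (unlike DPO, which needs log-ratios against a frozen reference).
    """
    # Length-normalized implicit rewards
    r_chosen = beta * logp_chosen / len_chosen
    r_rejected = beta * logp_rejected / len_rejected
    # Bradley-Terry style loss with a target reward margin gamma
    return -math.log(sigmoid(r_chosen - r_rejected - gamma))

# Toy pair: chosen response is much more likely per token -> small loss
loss = simpo_loss(logp_chosen=-10.0, len_chosen=10,
                  logp_rejected=-40.0, len_rejected=10)
```

Swapping the chosen and rejected log-probabilities drives the margin negative and the loss up, which is the gradient signal the trainer exploits.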

Scalable data processing for ML workloads. Streaming execution across CPU/GPU, supports Parquet/CSV/JSON/images....

Meta's 7-8B specialized moderation model for LLM input/output filtering. 6 safety categories - violence/hate, sexual...

Provides guidance for PyTorch-native agentic RL using torchforge, Meta's library separating infra from algorithms....

Trains large language models (2B-462B parameters) using NVIDIA Megatron-Core with advanced parallelism strategies....

Simplest distributed training API. 4 lines to add distributed support to any PyTorch script. Unified API for...

Provides guidance for training LLMs with reinforcement learning using verl (Volcano Engine RL). Use when...

High-level PyTorch framework with Trainer class, automatic distributed training (DDP/FSDP/DeepSpeed), callbacks...

Language-independent tokenizer treating text as raw Unicode. Supports BPE and Unigram algorithms. Fast (50k...
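To make the BPE half of that concrete, here is one training step of byte-pair encoding in plain Python: count adjacent symbol pairs over a word-frequency map, then merge the most frequent pair everywhere. This is a textbook sketch, not SentencePiece's actual implementation (which operates on raw Unicode and also supports Unigram); the toy vocabulary is made up:

```python
from collections import Counter

def most_frequent_pair(words: dict) -> tuple:
    """Count adjacent symbol pairs across a word-frequency map.

    Each key is a space-separated sequence of current symbols, e.g. 'l o w'.
    Returns the pair that one BPE training step would merge.
    """
    pairs = Counter()
    for word, freq in words.items():
        symbols = word.split()
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return max(pairs, key=pairs.get)

def merge_pair(words: dict, pair: tuple) -> dict:
    """Apply one merge: fuse every occurrence of the pair into one symbol."""
    old = " ".join(pair)   # e.g. 'w e'
    new = "".join(pair)    # e.g. 'we'
    return {word.replace(old, new): freq for word, freq in words.items()}

vocab = {"l o w": 5, "l o w e r": 2, "n e w e s t": 6}
best = most_frequent_pair(vocab)   # ('w', 'e') wins with count 8
vocab = merge_pair(vocab, best)    # 'n e w e s t' becomes 'n e we s t'
```

Real trainers repeat this loop until the vocabulary reaches a target size, recording each merge so encoding can replay them in order.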

Reserved and on-demand GPU cloud instances for ML training and inference. Use when you need dedicated GPU instances...

Expert guidance for fine-tuning LLMs with LLaMA-Factory - WebUI no-code, 100+ models, 2/3/4/5/6/8-bit QLoRA,...

Educational GPT implementation in ~300 lines. Reproduces GPT-2 (124M) on OpenWebText. Clean, hackable code for...

Expert guidance for fast fine-tuning with Unsloth - 2-5x faster training, 50-80% less memory, LoRA/QLoRA optimization

Write publication-ready ML/AI papers for NeurIPS, ICML, ICLR, ACL, AAAI, COLM. Use when drafting papers from...

State-space model with O(n) complexity vs Transformers' O(n²). 5× faster inference, million-token sequences, no KV...
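The O(n) claim comes from the recurrent view of a state-space model: each step updates a fixed-size state, so there is nothing like a growing KV cache. A deliberately tiny 1-D sketch (real Mamba uses learned, input-dependent parameters and a hardware-aware parallel scan; `a`, `b`, `c` here are arbitrary constants):

```python
def ssm_scan(xs: list, a: float = 0.9, b: float = 0.5, c: float = 1.0) -> list:
    """Linear-time recurrent scan of a scalar state-space model (sketch).

    h_t = a * h_{t-1} + b * x_t
    y_t = c * h_t

    Each step touches only the running state h, so a length-n sequence
    costs O(n) time and O(1) memory -- no per-token cache to store,
    unlike attention's O(n^2) pairwise interactions.
    """
    h = 0.0
    ys = []
    for x in xs:
        h = a * h + b * x       # fixed-size state update
        ys.append(c * h)        # readout
    return ys

# An impulse input decays geometrically through the state: 0.5, 0.45, 0.405, ...
out = ssm_scan([1.0, 0.0, 0.0, 0.0])
```

The same recurrence can also be evaluated as a convolution over the whole sequence during training, which is how SSMs keep training parallelizable.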

Fast tokenizers optimized for research and production. Rust-based implementation tokenizes 1GB in <20 seconds....

Fine-tune LLMs using reinforcement learning with TRL - SFT for instruction tuning, DPO for preference alignment,...
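For contrast with the reference-free SimPO entry above, the DPO objective that TRL's `DPOTrainer` optimizes scores each response by its log-ratio against a frozen reference model. A per-pair sketch using the standard DPO formulation; the log-probabilities are toy numbers, not real model outputs:

```python
import math

def dpo_loss(pi_chosen: float, pi_rejected: float,
             ref_chosen: float, ref_rejected: float,
             beta: float = 0.1) -> float:
    """DPO loss for one preference pair (sketch).

    Implicit reward: r(y) = beta * (log pi(y|x) - log pi_ref(y|x)),
    i.e. how much the policy has moved toward y relative to the
    frozen reference model.
    """
    r_chosen = beta * (pi_chosen - ref_chosen)
    r_rejected = beta * (pi_rejected - ref_rejected)
    margin = r_chosen - r_rejected
    # Negative log-sigmoid of the reward margin (Bradley-Terry)
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Policy moved toward the chosen response and away from the rejected one
loss = dpo_loss(pi_chosen=-5.0, pi_rejected=-12.0,
                ref_chosen=-10.0, ref_rejected=-10.0)
```

In practice the trainer batches these pairs and backpropagates only through the policy's log-probabilities; the reference terms are constants.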

Expert guidance for distributed training with DeepSpeed - ZeRO optimization stages, pipeline parallelism,...