Install a specific skill from a multi-skill repository:

```bash
npx skills add jrajasekera/claude-skills --skill "z-ai-api"
```
# SKILL.md

---
name: z-ai-api
description: |
  Z.ai API integration for building applications with GLM models. Use when working with Z.ai/ZhipuAI APIs for: (1) Chat completions with GLM-4.7/4.6/4.5 models, (2) Vision/multimodal tasks with GLM-4.6V, (3) Image generation with GLM-Image or CogView-4, (4) Video generation with CogVideoX-3 or Vidu models, (5) Audio transcription with GLM-ASR-2512, (6) Function calling and tool use, (7) Web search integration, (8) Translation, slide/poster generation agents. Triggers: Z.ai, ZhipuAI, GLM, BigModel, Zhipu, CogVideoX, CogView, Vidu.
---
# Z.ai API Skill

## Quick Reference

- **Base URL:** `https://api.z.ai/api/paas/v4`
- **Coding Plan URL:** `https://api.z.ai/api/coding/paas/v4`
- **Auth:** `Authorization: Bearer YOUR_API_KEY`
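A quick sanity check with raw HTTP (a sketch using Python's `requests`; the endpoint, model, and message shape match the chat examples later in this skill):

```python
import requests

# Sketch: raw HTTP call against the base URL with the bearer auth header.
resp = requests.post(
    "https://api.z.ai/api/paas/v4/chat/completions",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={
        "model": "glm-4.7",
        "messages": [{"role": "user", "content": "Hello!"}],
    },
    timeout=60,
)
print(resp.json()["choices"][0]["message"]["content"])
```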
### Core Endpoints

| Endpoint | Purpose |
|---|---|
| `/chat/completions` | Text/vision chat |
| `/images/generations` | Image generation |
| `/videos/generations` | Video generation (async) |
| `/audio/transcriptions` | Speech-to-text |
| `/web_search` | Web search |
| `/async-result/{id}` | Poll async tasks |
| `/v1/agents` | Translation, slides, effects |
## Model Selection

**Chat (pick by need):**
- `glm-4.7` → Latest flagship, best quality, agentic coding
- `glm-4.7-flash` → Fast, high quality
- `glm-4.6` → Reliable general use
- `glm-4.5-flash` → Fastest, lower cost

**Vision:**
- `glm-4.6v` → Best multimodal (images, video, files)
- `glm-4.6v-flash` → Fast vision

**Media:**
- `glm-image` → High-quality images (HD, ~20s)
- `cogview-4-250304` → Fast images (~5-10s)
- `cogvideox-3` → Video, up to 4K, 5-10s
- `viduq1-text/image` → Vidu video generation
## Implementation Patterns

### Basic Chat

```python
from zai import ZaiClient

client = ZaiClient(api_key="YOUR_KEY")

response = client.chat.completions.create(
    model="glm-4.7",
    messages=[
        {"role": "system", "content": "You are helpful."},
        {"role": "user", "content": "Hello!"}
    ]
)
print(response.choices[0].message.content)
```
### OpenAI SDK Compatibility

```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_ZAI_KEY",
    base_url="https://api.z.ai/api/paas/v4/"
)
# Use exactly like the OpenAI SDK
```
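Continuing with the drop-in client above, a standard OpenAI-style call is expected to work unchanged (a sketch; assumes GLM model names pass straight through as `model`):

```python
# Sketch: OpenAI-style chat call routed to Z.ai via the base_url override.
completion = client.chat.completions.create(
    model="glm-4.7",
    messages=[{"role": "user", "content": "Summarize GLM-4.7 in one sentence."}],
)
print(completion.choices[0].message.content)
```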
### Streaming

```python
response = client.chat.completions.create(
    model="glm-4.7",
    messages=[...],
    stream=True
)
for chunk in response:
    # Some chunks may carry an empty delta, so fall back to ""
    print(chunk.choices[0].delta.content or "", end="")
```
Function Calling
tools = [{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get weather for a city",
"parameters": {
"type": "object",
"properties": {
"city": {"type": "string"}
},
"required": ["city"]
}
}
}]
response = client.chat.completions.create(
model="glm-4.7",
messages=[{"role": "user", "content": "Weather in Tokyo?"}],
tools=tools,
tool_choice="auto"
)
# Handle tool_calls in response.choices[0].message.tool_calls
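A sketch of the full tool-call round trip, continuing the example above. The local `get_weather` implementation is hypothetical, and `message.model_dump()` assumes pydantic-style SDK objects; convert the assistant turn to a plain dict if your SDK differs.

```python
import json

def get_weather(city: str) -> str:
    # Hypothetical local implementation of the declared tool.
    return f"Sunny, 22°C in {city}"

message = response.choices[0].message
if message.tool_calls:
    call = message.tool_calls[0]
    args = json.loads(call.function.arguments)  # arguments arrive as a JSON string
    result = get_weather(**args)
    # Send the tool result back so the model can produce the final answer.
    follow_up = client.chat.completions.create(
        model="glm-4.7",
        messages=[
            {"role": "user", "content": "Weather in Tokyo?"},
            message.model_dump(),  # assistant turn containing the tool call (assumption: pydantic-style object)
            {"role": "tool", "content": result, "tool_call_id": call.id},
        ],
        tools=tools,
    )
    print(follow_up.choices[0].message.content)
```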
### Vision (Images/Video/Files)

```python
response = client.chat.completions.create(
    model="glm-4.6v",
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": "https://..."}},
            {"type": "text", "text": "Describe this image"}
        ]
    }]
)
```
### Image Generation

```python
response = client.images.generate(
    model="glm-image",
    prompt="A serene mountain at sunset",
    size="1280x1280",
    quality="hd"
)
print(response.data[0].url)  # URL expires in 30 days
```
Video Generation (Async)
# Submit
response = client.videos.generate(
model="cogvideox-3",
prompt="A cat playing with yarn",
size="1920x1080",
duration=5
)
task_id = response.id
# Poll for result
import time
while True:
result = client.async_result.get(task_id)
if result.task_status == "SUCCESS":
print(result.video_result[0].url)
break
time.sleep(5)
### Web Search Integration

```python
response = client.chat.completions.create(
    model="glm-4.7",
    messages=[{"role": "user", "content": "Latest AI news?"}],
    tools=[{
        "type": "web_search",
        "web_search": {
            "enable": True,
            "search_result": True
        }
    }]
)
# Access response.web_search for sources
```
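A rough sketch for reading those sources; the `title`/`link` field names are assumptions, so check references/tools-and-functions.md for the documented schema.

```python
# Sketch only: field names below are assumed, not confirmed by this skill.
for source in (response.web_search or []):
    print(getattr(source, "title", ""), getattr(source, "link", ""))
```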
Thinking Mode (Chain-of-Thought)
response = client.chat.completions.create(
model="glm-4.7",
messages=[...],
thinking={"type": "enabled"},
stream=True # Recommended with thinking
)
# Access reasoning_content in response
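When streaming with thinking enabled, the sketch below assumes each delta may carry a `reasoning_content` field alongside `content`, mirroring the field named above; verify against references/chat-completions.md.

```python
# Sketch: separate the chain-of-thought from the final answer while streaming.
for chunk in response:
    if not chunk.choices:
        continue  # some terminal chunks may carry only usage info
    delta = chunk.choices[0].delta
    if getattr(delta, "reasoning_content", None):
        print(delta.reasoning_content, end="")  # model's reasoning (assumed field)
    elif delta.content:
        print(delta.content, end="")            # final answer text
```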
## Key Parameters

| Parameter | Values | Notes |
|---|---|---|
| `temperature` | 0.0-1.0 | Default: 1.0 (GLM-4.7), 0.6 (GLM-4.5) |
| `top_p` | 0.01-1.0 | Default ~0.95 |
| `max_tokens` | varies | Max: 128K (GLM-4.7), 96K (GLM-4.5) |
| `stream` | bool | Enable SSE streaming |
| `response_format` | `{"type": "json_object"}` | Force JSON output |
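For example, pairing `response_format` with a prompt that asks for JSON (a sketch; the reply's schema is whatever your prompt requests):

```python
import json

# Sketch: force JSON output and parse the reply.
response = client.chat.completions.create(
    model="glm-4.7",
    messages=[{"role": "user", "content": "List three GLM models as a JSON object with name and use_case fields."}],
    response_format={"type": "json_object"},
)
data = json.loads(response.choices[0].message.content)
print(data)
```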
## Error Handling

- 429: Rate limited → implement exponential backoff (see the sketch below)
- 401: Bad API key → verify credentials
- `sensitive`: Content filtered → modify input
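A minimal backoff sketch for the 429 case, assuming the SDK raises an exception that exposes an HTTP `status_code` (the exact exception class varies by SDK, so anything that is not a rate limit is re-raised):

```python
import time

def create_with_backoff(client, max_retries=5, **kwargs):
    """Retry chat completions on rate limits with exponential backoff."""
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(**kwargs)
        except Exception as exc:  # exact exception type depends on the SDK
            status = getattr(exc, "status_code", None)
            if status != 429 or attempt == max_retries - 1:
                raise
            time.sleep(2 ** attempt)  # 1s, 2s, 4s, 8s, ...
```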
Check `finish_reason` on the response to decide what to do next:

```python
finish_reason = response.choices[0].finish_reason
if finish_reason == "tool_calls":
    ...  # Execute the function and continue the conversation
elif finish_reason == "length":
    ...  # Increase max_tokens or truncate the input
elif finish_reason == "sensitive":
    ...  # Content was filtered; adjust the prompt
```
## Reference Files

For detailed API specifications, consult:
- `references/chat-completions.md` → Full chat API, parameters, models
- `references/tools-and-functions.md` → Function calling, web search, retrieval
- `references/media-generation.md` → Image, video, audio APIs
- `references/agents.md` → Translation, slides, effects agents
- `references/error-codes.md` → Error handling, rate limits
# Supported AI Coding Agents

This skill is compatible with the SKILL.md standard and works with all major AI coding agents. Learn more about the SKILL.md standard and how to use these skills with your preferred AI coding agent.