Best Practices

Prompt Engineering

Use System Messages

Use a system message to set the model's behavior and tone:

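messages = [
    # The system message fixes the persona and output style for the whole conversation.
    {"role": "system", "content": "You are a senior Python developer. Include code examples in your answers."},
    {"role": "user", "content": "How do I sort a dictionary by value?"},
]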

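Be Specific

Chinese models tend to respond better to detailed, structured prompts that state the language, the input, and the expected output:

❌ Bad: "Write code"
✅ Good: "Write a Python function that takes a list of integers and returns the 3 most frequent values"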
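
To see why the second prompt works better, note that it pins down the language, the input type, and the output precisely enough that there is essentially one reasonable answer. A minimal sketch of what such a prompt describes (the function name is hypothetical, and a model's actual output will vary):

```python
from collections import Counter


def top_three_most_frequent(values: list[int]) -> list[int]:
    """Return up to three values, ordered from most to least frequent."""
    # most_common sorts by count; values with equal counts keep first-seen order.
    return [value for value, _count in Counter(values).most_common(3)]


print(top_three_most_frequent([1, 1, 1, 2, 2, 3, 4, 4, 4, 4]))
```

The vague prompt "Write code" gives the model none of these constraints, so it has to guess all of them.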

Performance Optimization

Set max_tokens

Cap the response length to reduce cost and latency:

response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "Summarize: ..."}],
    max_tokens=200,  # enough for a short summary
)
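
A low max_tokens can cut a reply off mid-sentence. In OpenAI-compatible chat completion responses, a choice that stopped at the token limit reports a finish_reason of "length", so it may be worth checking for that. The sketch below assumes that response shape; was_truncated is a hypothetical helper, and the SimpleNamespace object only stands in for a real response:

```python
from types import SimpleNamespace


def was_truncated(response) -> bool:
    """Return True if the first choice stopped because it hit max_tokens."""
    return response.choices[0].finish_reason == "length"


# Stand-in for a real API response, used only to demonstrate the check.
fake_response = SimpleNamespace(choices=[SimpleNamespace(finish_reason="length")])
if was_truncated(fake_response):
    print("Reply was cut off; consider raising max_tokens")
```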

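Use Streaming for Chat Interfaces

Chat applications should generally stream responses, so users see text as soon as it is generated: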

stream = client.chat.completions.create(
    model="deepseek-chat",
    messages=messages,
    stream=True,
)
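
With stream=True the call returns an iterator of chunks rather than a single response. One possible way to consume it is sketched below, assuming the OpenAI-compatible chunk shape in which each chunk carries a text fragment in choices[0].delta.content. collect_stream and the fake chunks are hypothetical and exist only for illustration:

```python
from types import SimpleNamespace


def collect_stream(stream) -> str:
    """Print each text fragment as it arrives and return the full reply."""
    parts = []
    for chunk in stream:
        if not chunk.choices:  # some chunks may carry no choices at all
            continue
        text = chunk.choices[0].delta.content
        if text:  # role-only and final chunks have no text
            print(text, end="", flush=True)
            parts.append(text)
    print()
    return "".join(parts)


def fake_chunk(text):
    """Build a stand-in chunk with the same attribute layout."""
    return SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=text))])


reply = collect_stream([fake_chunk("Hel"), fake_chunk("lo"), fake_chunk(None)])
```

In real use, pass the stream object returned by client.chat.completions.create(..., stream=True) instead of the fake list.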

Cost Management

Track Token Usage

Log token usage on every request so you can monitor cost:

response = client.chat.completions.create(...)
print(f"Tokens used: {response.usage.total_tokens}")
# usage also exposes prompt_tokens and completion_tokens
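
Token counts only become a cost once they are multiplied by a price. A minimal sketch of that arithmetic follows; estimate_cost is a hypothetical helper and the prices are placeholders, not real rates, so take the actual numbers from your provider's pricing page:

```python
from types import SimpleNamespace


def estimate_cost(usage, input_price_per_million: float, output_price_per_million: float) -> float:
    """Estimate the cost of one request from its usage object.

    Prices are per one million tokens, in whatever currency the provider bills in.
    """
    input_cost = usage.prompt_tokens * input_price_per_million
    output_cost = usage.completion_tokens * output_price_per_million
    return (input_cost + output_cost) / 1_000_000


# Stand-in for response.usage; the prices below are made-up placeholders.
fake_usage = SimpleNamespace(prompt_tokens=1000, completion_tokens=500, total_tokens=1500)
print(f"Estimated cost: {estimate_cost(fake_usage, 2.0, 8.0):.6f}")
```

Input and output tokens are priced separately here because providers commonly bill them at different rates.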

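Choose the Right Model

Task	Best model	Why
Chat / customer support	`deepseek-chat`	Best value for money
Math / logic	`deepseek-reasoner`	Specialized for reasoning
Translation	`qwen-max`	Optimized for multilingual use
Chinese-language content	`glm-4`	Native Chinese model
Tight budget	`qwen-plus`	Cheap and capable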
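
If an application handles several kinds of tasks, the table above can be encoded as a lookup so the model choice lives in one place. This is only a sketch: the task keys, pick_model, and the choice of fallback model are assumptions made for illustration.

```python
# Mirrors the table above; the task keys are hypothetical labels.
MODEL_BY_TASK = {
    "chat": "deepseek-chat",
    "math": "deepseek-reasoner",
    "translation": "qwen-max",
    "chinese_content": "glm-4",
    "budget": "qwen-plus",
}

# Assumed fallback for tasks the table does not cover.
DEFAULT_MODEL = "deepseek-chat"


def pick_model(task: str) -> str:
    """Return the model recommended for a task, or the default if unknown."""
    return MODEL_BY_TASK.get(task, DEFAULT_MODEL)


print(pick_model("translation"))
```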

Error Handling

Always handle API errors gracefully:

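import time

from openai import (
    APIError, RateLimitError, APIConnectionError, AuthenticationError
)

# Catch the specific errors first; APIError is their base class and must come last.
try:
    response = client.chat.completions.create(...)
except AuthenticationError:
    print("Invalid API key")
except RateLimitError:
    print("Too many requests, backing off")
    time.sleep(10)  # wait before the caller retries
except APIConnectionError:
    print("Network problem, retry needed")
except APIError as e:
    print(f"API error: {e}")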
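
The handler above only prints and sleeps; it does not actually retry the request. One common way to add retries is exponential backoff with a little jitter, sketched below. call_with_backoff and its parameters are hypothetical, and the helper is deliberately independent of any SDK so the retryable exception types are passed in by the caller:

```python
import random
import time


def call_with_backoff(call, retryable, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Run call(); on a retryable exception, wait and try again.

    The wait doubles on each attempt (base_delay, 2x, 4x, ...) plus up to
    0.1 s of random jitter so concurrent clients do not retry in lockstep.
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except retryable:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            sleep(base_delay * 2 ** attempt + random.uniform(0, 0.1))
```

With the imports from the example above, usage might look like call_with_backoff(lambda: client.chat.completions.create(...), (RateLimitError, APIConnectionError)). AuthenticationError is left out of the retryable set on purpose, since retrying with an invalid key cannot succeed.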

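Security

Load keys from environment variables; never hardcode them
Rotate keys regularly
Use separate keys for development and production
Never expose keys in client-side code or public repositories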
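
The first rule above can be sketched as follows. The variable name DEEPSEEK_API_KEY and the load_api_key helper are assumptions for illustration; use whatever name your deployment defines. Failing at startup when the variable is missing tends to be easier to debug than an authentication error on the first request.

```python
import os


def load_api_key(env_var: str = "DEEPSEEK_API_KEY") -> str:
    """Read the API key from the environment and fail fast if it is missing."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable before starting the app")
    return key
```

The returned value can then be passed to the client constructor instead of a string literal in the source.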