Guides
Parallelizing Requests
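When using the xAI API, you may need to process hundreds or even thousands of requests. Sending them one after another can be very time-consuming.
To improve throughput, you can use AsyncOpenAI from the openai SDK, which lets you send multiple requests concurrently. The example below is a Python script that shows how to use AsyncOpenAI to batch and process requests asynchronously, significantly reducing total execution time.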
The xAI API does not currently offer a batch API.
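Rate limits
Adjust the max_concurrent parameter to control the maximum number of parallel requests. You cannot parallelize requests beyond the rate limits shown in the API console.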
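To see what max_concurrent does in isolation, the following is a minimal, network-free sketch. It assumes a hypothetical fake_request coroutine that sleeps instead of calling the API, and a hypothetical measure_peak helper that records how many simulated requests are in flight at once; neither name is part of the openai SDK or the xAI API.

```python
import asyncio


async def fake_request(sem: asyncio.Semaphore, state: dict) -> None:
    """Hypothetical stand-in for an API call; sleeps instead of using the network."""
    async with sem:
        # Count this request as in flight and record the highest count seen.
        state["active"] += 1
        state["peak"] = max(state["peak"], state["active"])
        await asyncio.sleep(0.01)
        state["active"] -= 1


async def measure_peak(n_requests: int, max_concurrent: int) -> int:
    """Return the highest number of simulated requests running at the same time."""
    sem = asyncio.Semaphore(max_concurrent)
    state = {"active": 0, "peak": 0}
    await asyncio.gather(*(fake_request(sem, state) for _ in range(n_requests)))
    return state["peak"]


peak = asyncio.run(measure_peak(n_requests=10, max_concurrent=3))
print(peak)  # never exceeds max_concurrent
```

In this sketch the semaphore, not the number of queued tasks, determines how many requests run at once, so raising max_concurrent is what increases parallelism, up to whatever your account's rate limits allow.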
import asyncio
import os
from asyncio import Semaphore
from typing import List

from openai import AsyncOpenAI

client = AsyncOpenAI(
    api_key=os.getenv("XAI_API_KEY"),
    base_url="https://api.x.ai/v1"
)


async def send_request(sem: Semaphore, request: str) -> dict:
    """Send a single request to xAI with semaphore control."""
    # The 'async with sem' ensures only a limited number of requests run at once
    async with sem:
        return await client.chat.completions.create(
            model="grok-2-latest",
            messages=[{"role": "user", "content": request}]
        )


async def process_requests(requests: List[str], max_concurrent: int = 2) -> List[dict]:
    """Process multiple requests with controlled concurrency."""
    # Create a semaphore that limits how many requests can run at the same time
    # Think of it like having only 2 "passes" to make requests simultaneously
    sem = Semaphore(max_concurrent)
    # Create a list of tasks (requests) that will run using the semaphore
    tasks = [send_request(sem, request) for request in requests]
    # asyncio.gather runs all tasks in parallel but respects the semaphore limit
    # It waits for all tasks to complete and returns their results
    return await asyncio.gather(*tasks)


async def main() -> None:
    """Main function to handle requests and display responses."""
    requests = [
        "Tell me a joke",
        "Write a funny haiku",
        "Generate a funny X post",
        "Say something unhinged"
    ]
    # This starts processing all requests in parallel, but only 2 at a time
    # Instead of waiting for each request to finish before starting the next,
    # we can have 2 requests running at once, making it faster overall
    responses = await process_requests(requests)
    # Print each response in order
    for i, response in enumerate(responses):
        print(f"# Response {i}:")
        print(response.choices[0].message.content)


if __name__ == "__main__":
    asyncio.run(main())
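The script prints responses by index, which works because asyncio.gather returns results in the order the awaitables were passed in, not the order they finish. By default, the first exception raised by any task also propagates out of gather. With large batches it may be preferable to collect failures alongside successes. The sketch below illustrates both points using a hypothetical fake_call coroutine in place of the real API call; the delays and the empty-prompt failure are invented purely for illustration.

```python
import asyncio


async def fake_call(prompt: str, delay: float) -> str:
    """Hypothetical stand-in for send_request; no network access."""
    await asyncio.sleep(delay)
    if not prompt:
        raise ValueError("empty prompt")
    return prompt.upper()


async def run_all() -> list:
    # Later jobs finish sooner here, yet results keep the input order.
    jobs = [
        fake_call("first", 0.03),
        fake_call("", 0.02),
        fake_call("third", 0.01),
    ]
    # return_exceptions=True places exceptions in the result list instead of raising.
    return await asyncio.gather(*jobs, return_exceptions=True)


results = asyncio.run(run_all())
for i, result in enumerate(results):
    if isinstance(result, Exception):
        print(f"# Request {i} failed: {result!r}")
    else:
        print(f"# Response {i}: {result}")
```

Whether to retry, log, or skip failed entries depends on the workload; this sketch only shows how to keep one failure from discarding the rest of the batch.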