
Day 15: Hands-On LLM API Development

Learning Objectives

  • Learn how to wrap LLM APIs
  • Implement streaming output
  • Implement Function Calling
  • Build an LLM API wrapper library

Course Content

1. LLM API Basics

1.1 API Call Flow

Basic flow

1. Prepare the input (prompt, messages)
2. Call the API
3. Process the response
4. Return the result

Code example

python
import requests
import json

class LLMAPIClient:
    def __init__(self, api_key, base_url, model):
        self.api_key = api_key
        self.base_url = base_url
        self.model = model
    
    def call(self, prompt, **kwargs):
        """
        Call the LLM API.
        
        Args:
            prompt: input prompt
            **kwargs: extra parameters (temperature, max_tokens, etc.)
        
        Returns:
            response: parsed JSON response from the API
        """
        url = f"{self.base_url}/chat/completions"
        
        headers = {
            "Content-Type": "application/json",
            "Authorization": f"Bearer {self.api_key}"
        }
        
        payload = {
            "model": self.model,
            "messages": [
                {"role": "user", "content": prompt}
            ],
            **kwargs
        }
        
        # json= serializes the payload for us (no manual json.dumps needed)
        response = requests.post(url, headers=headers, json=payload)
        response.raise_for_status()
        
        return response.json()
    
    def get_text(self, prompt, **kwargs):
        """
        Get the text output.
        
        Args:
            prompt: input prompt
            **kwargs: extra parameters
        
        Returns:
            text: output text
        """
        response = self.call(prompt, **kwargs)
        return response["choices"][0]["message"]["content"]

1.2 Multi-Model Support

Unified interface

python
class MultiModelLLMClient:
    def __init__(self):
        self.clients = {}
    
    def add_client(self, name, api_key, base_url, model):
        """
        Add a model client.
        
        Args:
            name: client alias used for lookup
            api_key: API key
            base_url: base URL
            model: model name
        """
        self.clients[name] = LLMAPIClient(api_key, base_url, model)
    
    def call(self, model_name, prompt, **kwargs):
        """
        Call the specified model.
        
        Args:
            model_name: client alias
            prompt: input prompt
            **kwargs: extra parameters
        
        Returns:
            response: API response
        """
        if model_name not in self.clients:
            raise ValueError(f"Model {model_name} not found")
        
        return self.clients[model_name].call(prompt, **kwargs)
    
    def get_text(self, model_name, prompt, **kwargs):
        """
        Get the text output.
        
        Args:
            model_name: client alias
            prompt: input prompt
            **kwargs: extra parameters
        
        Returns:
            text: output text
        """
        if model_name not in self.clients:
            raise ValueError(f"Model {model_name} not found")
        
        return self.clients[model_name].get_text(prompt, **kwargs)

2. Streaming Output

2.1 How Streaming Works

Streaming flow

1. Send the request (stream=True)
2. Receive data chunks
3. Process each chunk
4. Output in real time

2.2 Implementing Streaming Output

OpenAI streaming

python
from openai import OpenAI

class StreamingLLMClient:
    def __init__(self, api_key):
        self.client = OpenAI(api_key=api_key)
    
    def stream(self, model, prompt, **kwargs):
        """
        Stream the output.
        
        Args:
            model: model name
            prompt: input prompt
            **kwargs: extra parameters
        
        Yields:
            chunk: text chunk
        """
        stream = self.client.chat.completions.create(
            model=model,
            messages=[
                {"role": "user", "content": prompt}
            ],
            stream=True,
            **kwargs
        )
        
        for chunk in stream:
            if chunk.choices[0].delta.content:
                yield chunk.choices[0].delta.content
    
    def stream_print(self, model, prompt, **kwargs):
        """
        Print the stream as it arrives.
        
        Args:
            model: model name
            prompt: input prompt
            **kwargs: extra parameters
        """
        for chunk in self.stream(model, prompt, **kwargs):
            print(chunk, end="", flush=True)
        print()  # trailing newline

Generic streaming (OpenAI-compatible APIs)

python
import requests
import json

class GenericStreamingLLMClient:
    def __init__(self, api_key, base_url):
        self.api_key = api_key
        self.base_url = base_url
    
    def stream(self, model, prompt, **kwargs):
        """
        Stream the output.
        
        Args:
            model: model name
            prompt: input prompt
            **kwargs: extra parameters
        
        Yields:
            chunk: text chunk
        """
        url = f"{self.base_url}/chat/completions"
        
        headers = {
            "Content-Type": "application/json",
            "Authorization": f"Bearer {self.api_key}"
        }
        
        payload = {
            "model": model,
            "messages": [
                {"role": "user", "content": prompt}
            ],
            "stream": True,
            **kwargs
        }
        
        response = requests.post(url, headers=headers, json=payload, stream=True)
        response.raise_for_status()
        
        # Parse the Server-Sent Events (SSE) stream line by line
        for line in response.iter_lines():
            if line:
                line = line.decode('utf-8')
                if line.startswith('data: '):
                    data = line[6:]
                    if data == '[DONE]':
                        break
                    
                    try:
                        chunk = json.loads(data)
                        if 'choices' in chunk and len(chunk['choices']) > 0:
                            delta = chunk['choices'][0].get('delta', {})
                            content = delta.get('content', '')
                            if content:
                                yield content
                    except json.JSONDecodeError:
                        pass
    
    def stream_print(self, model, prompt, **kwargs):
        """
        Print the stream as it arrives.
        
        Args:
            model: model name
            prompt: input prompt
            **kwargs: extra parameters
        """
        for chunk in self.stream(model, prompt, **kwargs):
            print(chunk, end="", flush=True)
        print()  # trailing newline

2.3 Streaming to a File

python
class StreamingToFile:
    def __init__(self, client):
        self.client = client
    
    def stream_to_file(self, model, prompt, file_path, **kwargs):
        """
        Stream the output into a file.
        
        Args:
            model: model name
            prompt: input prompt
            file_path: output file path
            **kwargs: extra parameters
        """
        with open(file_path, 'w', encoding='utf-8') as f:
            for chunk in self.client.stream(model, prompt, **kwargs):
                f.write(chunk)
                f.flush()

3. Function Calling

3.1 How Function Calling Works

Function Calling flow

1. Define the functions
2. Call the LLM
3. The LLM returns a function call
4. Execute the function
5. Return the result to the LLM
6. The LLM generates the final answer

3.2 Implementing Function Calling

OpenAI Function Calling

python
from openai import OpenAI
import json

class FunctionCallingClient:
    def __init__(self, api_key):
        self.client = OpenAI(api_key=api_key)
        self.functions = {}
    
    def register_function(self, name, func, description, parameters):
        """
        Register a function.
        
        Args:
            name: function name
            func: the callable itself
            description: function description
            parameters: JSON Schema describing the parameters
        """
        self.functions[name] = {
            'func': func,
            'definition': {
                'name': name,
                'description': description,
                'parameters': parameters
            }
        }
    
    def call(self, model, prompt, **kwargs):
        """
        Call the LLM and resolve any function calls.
        
        Args:
            model: model name
            prompt: input prompt
            **kwargs: extra parameters
        
        Returns:
            response: final text response
        """
        messages = [
            {"role": "user", "content": prompt}
        ]
        
        functions = [f['definition'] for f in self.functions.values()]
        
        # First call (note: functions/function_call is the legacy API;
        # newer SDK versions prefer the tools/tool_calls parameters)
        response = self.client.chat.completions.create(
            model=model,
            messages=messages,
            functions=functions,
            function_call="auto",
            **kwargs
        )
        
        message = response.choices[0].message
        
        # Resolve function calls until the model returns a plain answer
        while message.function_call:
            function_name = message.function_call.name
            function_args = json.loads(message.function_call.arguments)
            
            # Execute the function
            function_result = self.functions[function_name]['func'](**function_args)
            
            # Append the assistant's function call (as a plain dict) and its result
            messages.append({
                "role": "assistant",
                "content": None,
                "function_call": {
                    "name": function_name,
                    "arguments": message.function_call.arguments
                }
            })
            messages.append({
                "role": "function",
                "name": function_name,
                "content": json.dumps(function_result)
            })
            
            # Call again with the function result in context
            response = self.client.chat.completions.create(
                model=model,
                messages=messages,
                functions=functions,
                **kwargs
            )
            
            message = response.choices[0].message
        
        return message.content

Usage example

python
# Create the client
client = FunctionCallingClient(api_key="your-api-key")

# Define the functions
def get_weather(city):
    """
    Get the weather.
    
    Args:
        city: city name
    
    Returns:
        weather: weather information
    """
    # Simulated API call
    weather_data = {
        "Beijing": {"temperature": 25, "weather": "sunny"},
        "Shanghai": {"temperature": 28, "weather": "cloudy"},
        "Guangzhou": {"temperature": 32, "weather": "thunderstorms"}
    }
    return weather_data.get(city, {"temperature": 0, "weather": "unknown"})

def get_stock_price(symbol):
    """
    Get a stock price.
    
    Args:
        symbol: ticker symbol
    
    Returns:
        price: stock price
    """
    # Simulated API call
    stock_data = {
        "AAPL": 150.25,
        "GOOGL": 2800.50,
        "MSFT": 300.75
    }
    return stock_data.get(symbol, 0)

# Register the functions
client.register_function(
    name="get_weather",
    func=get_weather,
    description="Get the weather for a given city",
    parameters={
        "type": "object",
        "properties": {
            "city": {
                "type": "string",
                "description": "City name"
            }
        },
        "required": ["city"]
    }
)

client.register_function(
    name="get_stock_price",
    func=get_stock_price,
    description="Get a stock price",
    parameters={
        "type": "object",
        "properties": {
            "symbol": {
                "type": "string",
                "description": "Ticker symbol"
            }
        },
        "required": ["symbol"]
    }
)

# Call
response = client.call(
    model="gpt-4",
    prompt="What's the weather like in Beijing today?"
)
print(response)

response = client.call(
    model="gpt-4",
    prompt="What is Apple's stock price?"
)
print(response)

3.3 Generic Function Calling

python
import json

class UniversalFunctionCalling:
    def __init__(self, llm_client):
        self.llm_client = llm_client
        self.functions = {}
    
    def register_function(self, name, func, description, parameters):
        """
        Register a function.
        """
        self.functions[name] = {
            'func': func,
            'definition': {
                'name': name,
                'description': description,
                'parameters': parameters
            }
        }
    
    def call(self, model, prompt, **kwargs):
        """
        Generic function-calling loop.
        """
        messages = [
            {"role": "system", "content": "You are an AI assistant that can use tools to answer questions."},
            {"role": "user", "content": prompt}
        ]
        
        functions = [f['definition'] for f in self.functions.values()]
        
        max_iterations = 5
        iteration = 0
        
        while iteration < max_iterations:
            # Call the LLM
            response = self.llm_client.call(
                model=model,
                messages=messages,
                functions=functions,
                **kwargs
            )
            
            message = response['choices'][0]['message']
            
            # Check for a function call
            if 'function_call' not in message:
                return message['content']
            
            # Execute the function
            function_name = message['function_call']['name']
            function_args = json.loads(message['function_call']['arguments'])
            
            function_result = self.functions[function_name]['func'](**function_args)
            
            # Append to the message history
            messages.append(message)
            messages.append({
                "role": "function",
                "name": function_name,
                "content": json.dumps(function_result)
            })
            
            iteration += 1
        
        return "Reached the maximum number of iterations without completing the request"
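The iterate-until-plain-answer loop can be exercised without a live API by swapping in a stub client. The sketch below is illustrative only: `StubLLM`, its response shape, and `run_loop` are stand-ins for a real client and the `call` method above, not part of any SDK.

```python
import json

def get_weather(city):
    """Toy tool: canned weather lookup."""
    return {"Beijing": "sunny"}.get(city, "unknown")

TOOLS = {"get_weather": get_weather}

class StubLLM:
    """Fake LLM: first turn returns a function call, second turn a final answer."""
    def __init__(self):
        self.turn = 0

    def call(self, model, messages, functions, **kwargs):
        self.turn += 1
        if self.turn == 1:
            msg = {"role": "assistant", "content": None,
                   "function_call": {"name": "get_weather",
                                     "arguments": json.dumps({"city": "Beijing"})}}
        else:
            # Read the tool result appended by the loop and answer with it
            result = json.loads(messages[-1]["content"])
            msg = {"role": "assistant", "content": f"The weather is {result}."}
        return {"choices": [{"message": msg}]}

def run_loop(client, prompt, max_iterations=5):
    """Same shape as the loop above: resolve tool calls until a plain answer."""
    messages = [{"role": "user", "content": prompt}]
    for _ in range(max_iterations):
        message = client.call("stub", messages, list(TOOLS))["choices"][0]["message"]
        if "function_call" not in message:
            return message["content"]
        name = message["function_call"]["name"]
        args = json.loads(message["function_call"]["arguments"])
        messages.append(message)
        messages.append({"role": "function", "name": name,
                         "content": json.dumps(TOOLS[name](**args))})
    return None

print(run_loop(StubLLM(), "What's the weather in Beijing?"))  # The weather is sunny.
```

Driving the loop with a stub like this is also a convenient way to unit-test the dispatch logic before wiring in a real provider.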

4. LLM API Wrapper Library

4.1 A Complete API Wrapper Library

python
"""
LLM API wrapper library.
Supports multiple LLM APIs behind a unified interface.
"""

from typing import Optional, Dict, Any, Iterator, Callable
import requests
import json
from openai import OpenAI
import anthropic
import google.generativeai as genai

class LLMClient:
    """Base class for LLM clients"""
    
    def __init__(self, api_key: str, model: str):
        self.api_key = api_key
        self.model = model
    
    def call(self, messages: list, **kwargs) -> Dict[str, Any]:
        """Call the LLM"""
        raise NotImplementedError
    
    def stream(self, messages: list, **kwargs) -> Iterator[str]:
        """Stream the output"""
        raise NotImplementedError
    
    def get_text(self, prompt: str, **kwargs) -> str:
        """Get the text output"""
        response = self.call([{"role": "user", "content": prompt}], **kwargs)
        return self._extract_text(response)
    
    def _extract_text(self, response: Dict[str, Any]) -> str:
        """Extract the text from a raw response"""
        raise NotImplementedError


class OpenAIClient(LLMClient):
    """OpenAI client"""
    
    def __init__(self, api_key: str, model: str = "gpt-4"):
        super().__init__(api_key, model)
        self.client = OpenAI(api_key=api_key)
    
    def call(self, messages: list, **kwargs) -> Dict[str, Any]:
        response = self.client.chat.completions.create(
            model=self.model,
            messages=messages,
            **kwargs
        )
        return response.model_dump()
    
    def stream(self, messages: list, **kwargs) -> Iterator[str]:
        stream = self.client.chat.completions.create(
            model=self.model,
            messages=messages,
            stream=True,
            **kwargs
        )
        
        for chunk in stream:
            if chunk.choices[0].delta.content:
                yield chunk.choices[0].delta.content
    
    def _extract_text(self, response: Dict[str, Any]) -> str:
        return response['choices'][0]['message']['content']


class AnthropicClient(LLMClient):
    """Anthropic client"""
    
    def __init__(self, api_key: str, model: str = "claude-3-opus-20240229"):
        super().__init__(api_key, model)
        self.client = anthropic.Anthropic(api_key=api_key)
    
    def call(self, messages: list, **kwargs) -> Dict[str, Any]:
        # Note: the Anthropic API requires max_tokens, so pass it via kwargs
        message = self.client.messages.create(
            model=self.model,
            messages=messages,
            **kwargs
        )
        return message.model_dump()
        return message.model_dump()
    
    def stream(self, messages: list, **kwargs) -> Iterator[str]:
        with self.client.messages.stream(
            model=self.model,
            messages=messages,
            **kwargs
        ) as stream:
            for text in stream.text_stream:
                yield text
    
    def _extract_text(self, response: Dict[str, Any]) -> str:
        return response['content'][0]['text']


class GeminiClient(LLMClient):
    """Gemini client"""
    
    def __init__(self, api_key: str, model: str = "gemini-1.5-pro"):
        super().__init__(api_key, model)
        genai.configure(api_key=api_key)
        # Keep self.model as the model name; store the SDK object separately
        self.client = genai.GenerativeModel(model)
    
    def call(self, messages: list, **kwargs) -> Dict[str, Any]:
        # Convert the message format
        prompt = self._convert_messages(messages)
        
        response = self.client.generate_content(
            prompt,
            generation_config=genai.types.GenerationConfig(**kwargs)
        )
        
        return {
            'content': [{'text': response.text}],
            'usage': {
                'prompt_tokens': response.usage_metadata.prompt_token_count,
                'completion_tokens': response.usage_metadata.candidates_token_count,
                'total_tokens': response.usage_metadata.total_token_count
            }
        }
    
    def stream(self, messages: list, **kwargs) -> Iterator[str]:
        prompt = self._convert_messages(messages)
        
        response = self.client.generate_content(
            prompt,
            stream=True,
            generation_config=genai.types.GenerationConfig(**kwargs)
        )
        
        for chunk in response:
            if chunk.text:
                yield chunk.text
    
    def _convert_messages(self, messages: list) -> str:
        """Flatten chat messages into a single prompt string"""
        return "\n".join([f"{msg['role']}: {msg['content']}" for msg in messages])
    
    def _extract_text(self, response: Dict[str, Any]) -> str:
        return response['content'][0]['text']


class GenericLLMClient(LLMClient):
    """Generic LLM client (OpenAI-compatible APIs)"""
    
    def __init__(self, api_key: str, base_url: str, model: str):
        super().__init__(api_key, model)
        self.base_url = base_url
    
    def call(self, messages: list, **kwargs) -> Dict[str, Any]:
        url = f"{self.base_url}/chat/completions"
        
        headers = {
            "Content-Type": "application/json",
            "Authorization": f"Bearer {self.api_key}"
        }
        
        payload = {
            "model": self.model,
            "messages": messages,
            **kwargs
        }
        
        response = requests.post(url, headers=headers, json=payload)
        response.raise_for_status()
        
        return response.json()
    
    def stream(self, messages: list, **kwargs) -> Iterator[str]:
        url = f"{self.base_url}/chat/completions"
        
        headers = {
            "Content-Type": "application/json",
            "Authorization": f"Bearer {self.api_key}"
        }
        
        payload = {
            "model": self.model,
            "messages": messages,
            "stream": True,
            **kwargs
        }
        
        response = requests.post(url, headers=headers, json=payload, stream=True)
        response.raise_for_status()
        
        # Parse the SSE stream line by line
        for line in response.iter_lines():
            if line:
                line = line.decode('utf-8')
                if line.startswith('data: '):
                    data = line[6:]
                    if data == '[DONE]':
                        break
                    
                    try:
                        chunk = json.loads(data)
                        if 'choices' in chunk and len(chunk['choices']) > 0:
                            delta = chunk['choices'][0].get('delta', {})
                            content = delta.get('content', '')
                            if content:
                                yield content
                    except json.JSONDecodeError:
                        pass
    
    def _extract_text(self, response: Dict[str, Any]) -> str:
        return response['choices'][0]['message']['content']


class LLMFactory:
    """Factory for LLM clients"""
    
    @staticmethod
    def create_client(provider: str, api_key: str, model: str, **kwargs) -> LLMClient:
        """
        Create an LLM client.
        
        Args:
            provider: provider name (openai, anthropic, gemini, generic)
            api_key: API key
            model: model name
            **kwargs: extra parameters
        
        Returns:
            client: LLM client
        """
        if provider == "openai":
            return OpenAIClient(api_key, model)
        elif provider == "anthropic":
            return AnthropicClient(api_key, model)
        elif provider == "gemini":
            return GeminiClient(api_key, model)
        elif provider == "generic":
            base_url = kwargs.get("base_url")
            if not base_url:
                raise ValueError("base_url is required for generic client")
            return GenericLLMClient(api_key, base_url, model)
        else:
            raise ValueError(f"Unknown provider: {provider}")


class LLMManager:
    """Registry that manages multiple LLM clients"""
    
    def __init__(self):
        self.clients = {}
    
    def add_client(self, name: str, provider: str, api_key: str, model: str, **kwargs):
        """
        Add a client.
        
        Args:
            name: client name
            provider: provider name
            api_key: API key
            model: model name
            **kwargs: extra parameters
        """
        self.clients[name] = LLMFactory.create_client(provider, api_key, model, **kwargs)
    
    def get_client(self, name: str) -> LLMClient:
        """
        Get a client by name.
        
        Args:
            name: client name
        
        Returns:
            client: LLM client
        """
        if name not in self.clients:
            raise ValueError(f"Client {name} not found")
        return self.clients[name]

4.2 Usage Example

python
# Create the manager
manager = LLMManager()

# Add clients
manager.add_client(
    name="gpt4",
    provider="openai",
    api_key="your-openai-api-key",
    model="gpt-4"
)

manager.add_client(
    name="claude",
    provider="anthropic",
    api_key="your-anthropic-api-key",
    model="claude-3-opus-20240229"
)

manager.add_client(
    name="gemini",
    provider="gemini",
    api_key="your-google-api-key",
    model="gemini-1.5-pro"
)

# Call
gpt4_client = manager.get_client("gpt4")
response = gpt4_client.get_text("Give me an overview of GPT-4")
print(response)

# Streaming output
print("Streaming output:")
for chunk in gpt4_client.stream([{"role": "user", "content": "Write a poem about spring"}]):
    print(chunk, end="", flush=True)
print()

# Compare multiple models
prompt = "Write a quicksort implementation in Python"
for name in ["gpt4", "claude", "gemini"]:
    print(f"\n{name}:")
    client = manager.get_client(name)
    response = client.get_text(prompt)
    print(response)

Practice Task

Task: Build an LLM API Wrapper Library

Goal: implement a complete LLM API wrapper library

Requirements

  1. Support multiple LLM APIs (OpenAI, Anthropic, Gemini, etc.)
  2. Implement streaming output
  3. Implement Function Calling
  4. Provide a unified interface

Code framework

python
# TODO: implement the LLM API wrapper library
# 1. Define the base class
# 2. Implement the individual clients
# 3. Implement streaming output
# 4. Implement Function Calling
# 5. Provide a factory class
# 6. Write test code

Homework

Assignment 1: API Wrapper Library Design

Task: design a complete LLM API wrapper library

Requirements

  1. Design the class structure
  2. Define the interface contract
  3. Plan for extensibility
  4. Write a design document

Assignment 2: Streaming Output Optimization

Task: optimize streaming output performance

Requirements

  1. Analyze the performance bottlenecks of streaming output
  2. Propose an optimization plan
  3. Implement the optimizations

Assignment 3: Function Calling Extensions

Task: extend the Function Calling feature

Requirements

  1. Support async function calls
  2. Support parallel function calls
  3. Implement function-call caching

References

Official documentation

  1. OpenAI API: https://platform.openai.com/docs/api-reference
  2. Anthropic API: https://docs.anthropic.com/claude/reference/getting-started-with-the-api
  3. Google AI API: https://ai.google.dev/docs

Recommended reading

  1. OpenAI Cookbook: https://github.com/openai/openai-cookbook

    • OpenAI API examples
  2. Anthropic Prompt Library: https://docs.anthropic.com/claude/prompt-library

    • Anthropic prompt library

Online resources

  1. LangChain: https://python.langchain.com/

    • LLM application framework
  2. LlamaIndex: https://docs.llamaindex.ai/

    • Data framework for LLM applications

Further Reading

API optimization

  • Batching: reduce the number of API calls
  • Caching: cache frequent requests
  • Retries: handle transient network errors
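The last two points can be sketched with a retry decorator (exponential backoff) and a small in-memory cache. This is a minimal illustration under stated assumptions: `flaky_api` is a stand-in for a real API call, not part of any library.

```python
import functools
import time

def retry(max_attempts=3, base_delay=0.1):
    """Retry with exponential backoff: wait base_delay * 2**attempt between tries."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(max_attempts):
                try:
                    return fn(*args, **kwargs)
                except Exception:
                    if attempt == max_attempts - 1:
                        raise
                    time.sleep(base_delay * (2 ** attempt))
        return wrapper
    return decorator

_cache = {}

def cached_call(call, prompt, **kwargs):
    """Serve repeated identical requests from memory instead of the API."""
    key = (prompt, tuple(sorted(kwargs.items())))
    if key not in _cache:
        _cache[key] = call(prompt, **kwargs)
    return _cache[key]

# Demo: a fake API that fails twice, then succeeds
attempts = {"n": 0}

@retry(max_attempts=5, base_delay=0.01)
def flaky_api(prompt):
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("transient network error")
    return f"answer to: {prompt}"

print(flaky_api("hello"))               # answer to: hello (third attempt succeeds)
print(cached_call(flaky_api, "hello"))  # answer to: hello (now cached)
```

In production you would typically retry only on retryable status codes (429, 5xx) and honor the `Retry-After` header rather than catching every exception.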

Security

  • API key management: store keys securely (e.g. environment variables, secret managers)
  • Rate limiting: stay within provider limits
  • Content filtering: filter sensitive content
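The first two points can be sketched directly: read the key from an environment variable instead of hard-coding it, and gate outgoing requests with a simple token-bucket limiter. Both names below (`load_api_key`, `RateLimiter`) are illustrative, not from any SDK.

```python
import os
import time

def load_api_key(env_var="OPENAI_API_KEY"):
    """Read the API key from the environment instead of committing it to code."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key

class RateLimiter:
    """Token bucket: allow at most `rate` requests per second on average."""
    def __init__(self, rate):
        self.rate = rate
        self.tokens = float(rate)
        self.last = time.monotonic()

    def acquire(self):
        now = time.monotonic()
        # Refill tokens for the elapsed time, capped at the bucket size
        self.tokens = min(self.rate, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens < 1:
            # Sleep just long enough for one token to accumulate
            time.sleep((1 - self.tokens) / self.rate)
            self.tokens = 0.0
        else:
            self.tokens -= 1

limiter = RateLimiter(rate=10)
for _ in range(3):
    limiter.acquire()  # blocks if we are sending faster than 10 req/s
    # ... make the API call here ...
```

A thread-safe version would wrap `acquire` in a lock; provider SDKs and gateways often provide this for you.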

Next Lesson

Next we will study Prompt Engineering: how to design effective prompts, few-shot learning, CoT (Chain of Thought), and related techniques.

