
Day 15: Hands-On LLM API Development

Learning Objectives

  • Learn how to wrap LLM APIs
  • Implement streaming output
  • Implement Function Calling
  • Build an LLM API wrapper library

Course Content

1. LLM API Basics

1.1 API Call Flow

Basic flow

1. Prepare the input (prompt, messages)
2. Call the API
3. Process the response
4. Return the result

Code example

python
import requests
import json

class LLMAPIClient:
    def __init__(self, api_key, base_url, model):
        self.api_key = api_key
        self.base_url = base_url
        self.model = model
    
    def call(self, prompt, **kwargs):
        """
        Call the LLM API.
        
        Args:
            prompt: input prompt
            **kwargs: extra parameters (temperature, max_tokens, etc.)
        
        Returns:
            response: parsed JSON response from the API
        """
        url = f"{self.base_url}/chat/completions"
        
        headers = {
            "Content-Type": "application/json",
            "Authorization": f"Bearer {self.api_key}"
        }
        
        payload = {
            "model": self.model,
            "messages": [
                {"role": "user", "content": prompt}
            ],
            **kwargs
        }
        
        # json= serializes the payload for us (no manual json.dumps needed)
        response = requests.post(url, headers=headers, json=payload)
        response.raise_for_status()
        
        return response.json()
    
    def get_text(self, prompt, **kwargs):
        """
        Get the text output.
        
        Args:
            prompt: input prompt
            **kwargs: extra parameters
        
        Returns:
            text: output text
        """
        response = self.call(prompt, **kwargs)
        return response["choices"][0]["message"]["content"]

1.2 Multi-Model Support

Unified interface

python
class MultiModelLLMClient:
    def __init__(self):
        self.clients = {}
    
    def add_client(self, name, api_key, base_url, model):
        """
        Add a model client.
        
        Args:
            name: client alias used for lookup
            api_key: API key
            base_url: base URL
            model: model name
        """
        self.clients[name] = LLMAPIClient(api_key, base_url, model)
    
    def call(self, model_name, prompt, **kwargs):
        """
        Call the specified model.
        
        Args:
            model_name: client alias
            prompt: input prompt
            **kwargs: extra parameters
        
        Returns:
            response: API response
        """
        if model_name not in self.clients:
            raise ValueError(f"Model {model_name} not found")
        
        return self.clients[model_name].call(prompt, **kwargs)
    
    def get_text(self, model_name, prompt, **kwargs):
        """
        Get the text output.
        
        Args:
            model_name: client alias
            prompt: input prompt
            **kwargs: extra parameters
        
        Returns:
            text: output text
        """
        if model_name not in self.clients:
            raise ValueError(f"Model {model_name} not found")
        
        return self.clients[model_name].get_text(prompt, **kwargs)

2. Streaming Output

2.1 How Streaming Works

Streaming flow

1. Send the request (stream=True)
2. Receive data chunks
3. Process each chunk
4. Output in real time

2.2 Implementing Streaming Output

OpenAI streaming

python
from openai import OpenAI

class StreamingLLMClient:
    def __init__(self, api_key):
        self.client = OpenAI(api_key=api_key)
    
    def stream(self, model, prompt, **kwargs):
        """
        Stream the output.
        
        Args:
            model: model name
            prompt: input prompt
            **kwargs: extra parameters
        
        Yields:
            chunk: text chunk
        """
        stream = self.client.chat.completions.create(
            model=model,
            messages=[
                {"role": "user", "content": prompt}
            ],
            stream=True,
            **kwargs
        )
        
        for chunk in stream:
            if chunk.choices[0].delta.content:
                yield chunk.choices[0].delta.content
    
    def stream_print(self, model, prompt, **kwargs):
        """
        Print the stream as it arrives.
        
        Args:
            model: model name
            prompt: input prompt
            **kwargs: extra parameters
        """
        for chunk in self.stream(model, prompt, **kwargs):
            print(chunk, end="", flush=True)
        print()  # trailing newline

Generic streaming (OpenAI-compatible APIs)

python
import requests
import json

class GenericStreamingLLMClient:
    def __init__(self, api_key, base_url):
        self.api_key = api_key
        self.base_url = base_url
    
    def stream(self, model, prompt, **kwargs):
        """
        Stream the output.
        
        Args:
            model: model name
            prompt: input prompt
            **kwargs: extra parameters
        
        Yields:
            chunk: text chunk
        """
        url = f"{self.base_url}/chat/completions"
        
        headers = {
            "Content-Type": "application/json",
            "Authorization": f"Bearer {self.api_key}"
        }
        
        payload = {
            "model": model,
            "messages": [
                {"role": "user", "content": prompt}
            ],
            "stream": True,
            **kwargs
        }
        
        response = requests.post(url, headers=headers, json=payload, stream=True)
        response.raise_for_status()
        
        # Parse the Server-Sent Events (SSE) stream line by line
        for line in response.iter_lines():
            if line:
                line = line.decode('utf-8')
                if line.startswith('data: '):
                    data = line[6:]
                    if data == '[DONE]':
                        break
                    
                    try:
                        chunk = json.loads(data)
                        if 'choices' in chunk and len(chunk['choices']) > 0:
                            delta = chunk['choices'][0].get('delta', {})
                            content = delta.get('content', '')
                            if content:
                                yield content
                    except json.JSONDecodeError:
                        pass
    
    def stream_print(self, model, prompt, **kwargs):
        """
        Print the stream as it arrives.
        
        Args:
            model: model name
            prompt: input prompt
            **kwargs: extra parameters
        """
        for chunk in self.stream(model, prompt, **kwargs):
            print(chunk, end="", flush=True)
        print()  # trailing newline

2.3 Streaming to a File

python
class StreamingToFile:
    def __init__(self, client):
        self.client = client
    
    def stream_to_file(self, model, prompt, file_path, **kwargs):
        """
        Stream the output into a file.
        
        Args:
            model: model name
            prompt: input prompt
            file_path: output file path
            **kwargs: extra parameters
        """
        with open(file_path, 'w', encoding='utf-8') as f:
            for chunk in self.client.stream(model, prompt, **kwargs):
                f.write(chunk)
                f.flush()

3. Function Calling

3.1 How Function Calling Works

Function Calling flow

1. Define the functions
2. Call the LLM
3. The LLM returns a function call
4. Execute the function
5. Return the result to the LLM
6. The LLM generates the final answer

3.2 Implementing Function Calling

OpenAI Function Calling

python
from openai import OpenAI
import json

class FunctionCallingClient:
    def __init__(self, api_key):
        self.client = OpenAI(api_key=api_key)
        self.functions = {}
    
    def register_function(self, name, func, description, parameters):
        """
        Register a function.
        
        Args:
            name: function name
            func: the callable itself
            description: function description
            parameters: JSON Schema describing the parameters
        """
        self.functions[name] = {
            'func': func,
            'definition': {
                'name': name,
                'description': description,
                'parameters': parameters
            }
        }
    
    def call(self, model, prompt, **kwargs):
        """
        Call the LLM and resolve any function calls.
        
        Args:
            model: model name
            prompt: input prompt
            **kwargs: extra parameters
        
        Returns:
            response: final text response
        """
        messages = [
            {"role": "user", "content": prompt}
        ]
        
        functions = [f['definition'] for f in self.functions.values()]
        
        # First call (note: functions/function_call is the legacy API;
        # newer SDK versions prefer the tools/tool_calls parameters)
        response = self.client.chat.completions.create(
            model=model,
            messages=messages,
            functions=functions,
            function_call="auto",
            **kwargs
        )
        
        message = response.choices[0].message
        
        # Resolve function calls until the model returns a plain answer
        while message.function_call:
            function_name = message.function_call.name
            function_args = json.loads(message.function_call.arguments)
            
            # Execute the function
            function_result = self.functions[function_name]['func'](**function_args)
            
            # Append the assistant's function call (as a plain dict) and its result
            messages.append({
                "role": "assistant",
                "content": None,
                "function_call": {
                    "name": function_name,
                    "arguments": message.function_call.arguments
                }
            })
            messages.append({
                "role": "function",
                "name": function_name,
                "content": json.dumps(function_result)
            })
            
            # Call again with the function result in context
            response = self.client.chat.completions.create(
                model=model,
                messages=messages,
                functions=functions,
                **kwargs
            )
            
            message = response.choices[0].message
        
        return message.content

Usage example

python
# Create the client
client = FunctionCallingClient(api_key="your-api-key")

# Define the functions
def get_weather(city):
    """
    Get the weather.
    
    Args:
        city: city name
    
    Returns:
        weather: weather information
    """
    # Simulated API call
    weather_data = {
        "Beijing": {"temperature": 25, "weather": "sunny"},
        "Shanghai": {"temperature": 28, "weather": "cloudy"},
        "Guangzhou": {"temperature": 32, "weather": "thunderstorms"}
    }
    return weather_data.get(city, {"temperature": 0, "weather": "unknown"})

def get_stock_price(symbol):
    """
    Get a stock price.
    
    Args:
        symbol: ticker symbol
    
    Returns:
        price: stock price
    """
    # Simulated API call
    stock_data = {
        "AAPL": 150.25,
        "GOOGL": 2800.50,
        "MSFT": 300.75
    }
    return stock_data.get(symbol, 0)

# Register the functions
client.register_function(
    name="get_weather",
    func=get_weather,
    description="Get the weather for a given city",
    parameters={
        "type": "object",
        "properties": {
            "city": {
                "type": "string",
                "description": "City name"
            }
        },
        "required": ["city"]
    }
)

client.register_function(
    name="get_stock_price",
    func=get_stock_price,
    description="Get a stock price",
    parameters={
        "type": "object",
        "properties": {
            "symbol": {
                "type": "string",
                "description": "Ticker symbol"
            }
        },
        "required": ["symbol"]
    }
)

# Call
response = client.call(
    model="gpt-4",
    prompt="What's the weather like in Beijing today?"
)
print(response)

response = client.call(
    model="gpt-4",
    prompt="What is Apple's stock price?"
)
print(response)

3.3 Generic Function Calling

python
import json

class UniversalFunctionCalling:
    def __init__(self, llm_client):
        self.llm_client = llm_client
        self.functions = {}
    
    def register_function(self, name, func, description, parameters):
        """
        Register a function.
        """
        self.functions[name] = {
            'func': func,
            'definition': {
                'name': name,
                'description': description,
                'parameters': parameters
            }
        }
    
    def call(self, model, prompt, **kwargs):
        """
        Generic function-calling loop.
        """
        messages = [
            {"role": "system", "content": "You are an AI assistant that can use tools to answer questions."},
            {"role": "user", "content": prompt}
        ]
        
        functions = [f['definition'] for f in self.functions.values()]
        
        max_iterations = 5
        iteration = 0
        
        while iteration < max_iterations:
            # Call the LLM
            response = self.llm_client.call(
                model=model,
                messages=messages,
                functions=functions,
                **kwargs
            )
            
            message = response['choices'][0]['message']
            
            # Check for a function call
            if 'function_call' not in message:
                return message['content']
            
            # Execute the function
            function_name = message['function_call']['name']
            function_args = json.loads(message['function_call']['arguments'])
            
            function_result = self.functions[function_name]['func'](**function_args)
            
            # Append to the message history
            messages.append(message)
            messages.append({
                "role": "function",
                "name": function_name,
                "content": json.dumps(function_result)
            })
            
            iteration += 1
        
        return "Reached the maximum number of iterations without completing the request"
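The iterate-until-plain-answer loop can be exercised without a live API by swapping in a stub client. The sketch below is illustrative only: `StubLLM`, its response shape, and `run_loop` are stand-ins for a real client and the `call` method above, not part of any SDK.

```python
import json

def get_weather(city):
    """Toy tool: canned weather lookup."""
    return {"Beijing": "sunny"}.get(city, "unknown")

TOOLS = {"get_weather": get_weather}

class StubLLM:
    """Fake LLM: first turn returns a function call, second turn a final answer."""
    def __init__(self):
        self.turn = 0

    def call(self, model, messages, functions, **kwargs):
        self.turn += 1
        if self.turn == 1:
            msg = {"role": "assistant", "content": None,
                   "function_call": {"name": "get_weather",
                                     "arguments": json.dumps({"city": "Beijing"})}}
        else:
            # Read the tool result appended by the loop and answer with it
            result = json.loads(messages[-1]["content"])
            msg = {"role": "assistant", "content": f"The weather is {result}."}
        return {"choices": [{"message": msg}]}

def run_loop(client, prompt, max_iterations=5):
    """Same shape as the loop above: resolve tool calls until a plain answer."""
    messages = [{"role": "user", "content": prompt}]
    for _ in range(max_iterations):
        message = client.call("stub", messages, list(TOOLS))["choices"][0]["message"]
        if "function_call" not in message:
            return message["content"]
        name = message["function_call"]["name"]
        args = json.loads(message["function_call"]["arguments"])
        messages.append(message)
        messages.append({"role": "function", "name": name,
                         "content": json.dumps(TOOLS[name](**args))})
    return None

print(run_loop(StubLLM(), "What's the weather in Beijing?"))  # The weather is sunny.
```

Driving the loop with a stub like this is also a convenient way to unit-test the dispatch logic before wiring in a real provider.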

4. LLM API Wrapper Library

4.1 A Complete API Wrapper Library

python
"""
LLM API wrapper library.
Supports multiple LLM APIs behind a unified interface.
"""

from typing import Optional, Dict, Any, Iterator, Callable
import requests
import json
from openai import OpenAI
import anthropic
import google.generativeai as genai

class LLMClient:
    """Base class for LLM clients"""
    
    def __init__(self, api_key: str, model: str):
        self.api_key = api_key
        self.model = model
    
    def call(self, messages: list, **kwargs) -> Dict[str, Any]:
        """Call the LLM"""
        raise NotImplementedError
    
    def stream(self, messages: list, **kwargs) -> Iterator[str]:
        """Stream the output"""
        raise NotImplementedError
    
    def get_text(self, prompt: str, **kwargs) -> str:
        """Get the text output"""
        response = self.call([{"role": "user", "content": prompt}], **kwargs)
        return self._extract_text(response)
    
    def _extract_text(self, response: Dict[str, Any]) -> str:
        """Extract the text from a raw response"""
        raise NotImplementedError


class OpenAIClient(LLMClient):
    """OpenAI client"""
    
    def __init__(self, api_key: str, model: str = "gpt-4"):
        super().__init__(api_key, model)
        self.client = OpenAI(api_key=api_key)
    
    def call(self, messages: list, **kwargs) -> Dict[str, Any]:
        response = self.client.chat.completions.create(
            model=self.model,
            messages=messages,
            **kwargs
        )
        return response.model_dump()
    
    def stream(self, messages: list, **kwargs) -> Iterator[str]:
        stream = self.client.chat.completions.create(
            model=self.model,
            messages=messages,
            stream=True,
            **kwargs
        )
        
        for chunk in stream:
            if chunk.choices[0].delta.content:
                yield chunk.choices[0].delta.content
    
    def _extract_text(self, response: Dict[str, Any]) -> str:
        return response['choices'][0]['message']['content']


class AnthropicClient(LLMClient):
    """Anthropic client"""
    
    def __init__(self, api_key: str, model: str = "claude-3-opus-20240229"):
        super().__init__(api_key, model)
        self.client = anthropic.Anthropic(api_key=api_key)
    
    def call(self, messages: list, **kwargs) -> Dict[str, Any]:
        # Note: the Anthropic API requires max_tokens, so pass it via kwargs
        message = self.client.messages.create(
            model=self.model,
            messages=messages,
            **kwargs
        )
        return message.model_dump()
        return message.model_dump()
    
    def stream(self, messages: list, **kwargs) -> Iterator[str]:
        with self.client.messages.stream(
            model=self.model,
            messages=messages,
            **kwargs
        ) as stream:
            for text in stream.text_stream:
                yield text
    
    def _extract_text(self, response: Dict[str, Any]) -> str:
        return response['content'][0]['text']


class GeminiClient(LLMClient):
    """Gemini client"""
    
    def __init__(self, api_key: str, model: str = "gemini-1.5-pro"):
        super().__init__(api_key, model)
        genai.configure(api_key=api_key)
        # Keep self.model as the model name; store the SDK object separately
        self.client = genai.GenerativeModel(model)
    
    def call(self, messages: list, **kwargs) -> Dict[str, Any]:
        # Convert the message format
        prompt = self._convert_messages(messages)
        
        response = self.client.generate_content(
            prompt,
            generation_config=genai.types.GenerationConfig(**kwargs)
        )
        
        return {
            'content': [{'text': response.text}],
            'usage': {
                'prompt_tokens': response.usage_metadata.prompt_token_count,
                'completion_tokens': response.usage_metadata.candidates_token_count,
                'total_tokens': response.usage_metadata.total_token_count
            }
        }
    
    def stream(self, messages: list, **kwargs) -> Iterator[str]:
        prompt = self._convert_messages(messages)
        
        response = self.client.generate_content(
            prompt,
            stream=True,
            generation_config=genai.types.GenerationConfig(**kwargs)
        )
        
        for chunk in response:
            if chunk.text:
                yield chunk.text
    
    def _convert_messages(self, messages: list) -> str:
        """Flatten chat messages into a single prompt string"""
        return "\n".join([f"{msg['role']}: {msg['content']}" for msg in messages])
    
    def _extract_text(self, response: Dict[str, Any]) -> str:
        return response['content'][0]['text']


class GenericLLMClient(LLMClient):
    """Generic LLM client (OpenAI-compatible APIs)"""
    
    def __init__(self, api_key: str, base_url: str, model: str):
        super().__init__(api_key, model)
        self.base_url = base_url
    
    def call(self, messages: list, **kwargs) -> Dict[str, Any]:
        url = f"{self.base_url}/chat/completions"
        
        headers = {
            "Content-Type": "application/json",
            "Authorization": f"Bearer {self.api_key}"
        }
        
        payload = {
            "model": self.model,
            "messages": messages,
            **kwargs
        }
        
        response = requests.post(url, headers=headers, json=payload)
        response.raise_for_status()
        
        return response.json()
    
    def stream(self, messages: list, **kwargs) -> Iterator[str]:
        url = f"{self.base_url}/chat/completions"
        
        headers = {
            "Content-Type": "application/json",
            "Authorization": f"Bearer {self.api_key}"
        }
        
        payload = {
            "model": self.model,
            "messages": messages,
            "stream": True,
            **kwargs
        }
        
        response = requests.post(url, headers=headers, json=payload, stream=True)
        response.raise_for_status()
        
        # Parse the SSE stream line by line
        for line in response.iter_lines():
            if line:
                line = line.decode('utf-8')
                if line.startswith('data: '):
                    data = line[6:]
                    if data == '[DONE]':
                        break
                    
                    try:
                        chunk = json.loads(data)
                        if 'choices' in chunk and len(chunk['choices']) > 0:
                            delta = chunk['choices'][0].get('delta', {})
                            content = delta.get('content', '')
                            if content:
                                yield content
                    except json.JSONDecodeError:
                        pass
    
    def _extract_text(self, response: Dict[str, Any]) -> str:
        return response['choices'][0]['message']['content']


class LLMFactory:
    """Factory for LLM clients"""
    
    @staticmethod
    def create_client(provider: str, api_key: str, model: str, **kwargs) -> LLMClient:
        """
        Create an LLM client.
        
        Args:
            provider: provider name (openai, anthropic, gemini, generic)
            api_key: API key
            model: model name
            **kwargs: extra parameters
        
        Returns:
            client: LLM client
        """
        if provider == "openai":
            return OpenAIClient(api_key, model)
        elif provider == "anthropic":
            return AnthropicClient(api_key, model)
        elif provider == "gemini":
            return GeminiClient(api_key, model)
        elif provider == "generic":
            base_url = kwargs.get("base_url")
            if not base_url:
                raise ValueError("base_url is required for generic client")
            return GenericLLMClient(api_key, base_url, model)
        else:
            raise ValueError(f"Unknown provider: {provider}")


class LLMManager:
    """Registry that manages multiple LLM clients"""
    
    def __init__(self):
        self.clients = {}
    
    def add_client(self, name: str, provider: str, api_key: str, model: str, **kwargs):
        """
        Add a client.
        
        Args:
            name: client name
            provider: provider name
            api_key: API key
            model: model name
            **kwargs: extra parameters
        """
        self.clients[name] = LLMFactory.create_client(provider, api_key, model, **kwargs)
    
    def get_client(self, name: str) -> LLMClient:
        """
        Get a client by name.
        
        Args:
            name: client name
        
        Returns:
            client: LLM client
        """
        if name not in self.clients:
            raise ValueError(f"Client {name} not found")
        return self.clients[name]

4.2 Usage Example

python
# Create the manager
manager = LLMManager()

# Add clients
manager.add_client(
    name="gpt4",
    provider="openai",
    api_key="your-openai-api-key",
    model="gpt-4"
)

manager.add_client(
    name="claude",
    provider="anthropic",
    api_key="your-anthropic-api-key",
    model="claude-3-opus-20240229"
)

manager.add_client(
    name="gemini",
    provider="gemini",
    api_key="your-google-api-key",
    model="gemini-1.5-pro"
)

# Call
gpt4_client = manager.get_client("gpt4")
response = gpt4_client.get_text("Give me an overview of GPT-4")
print(response)

# Streaming output
print("Streaming output:")
for chunk in gpt4_client.stream([{"role": "user", "content": "Write a poem about spring"}]):
    print(chunk, end="", flush=True)
print()

# Compare multiple models
prompt = "Write a quicksort implementation in Python"
for name in ["gpt4", "claude", "gemini"]:
    print(f"\n{name}:")
    client = manager.get_client(name)
    response = client.get_text(prompt)
    print(response)

Practice Task

Task: Build an LLM API Wrapper Library

Goal: implement a complete LLM API wrapper library

Requirements

  1. Support multiple LLM APIs (OpenAI, Anthropic, Gemini, etc.)
  2. Implement streaming output
  3. Implement Function Calling
  4. Provide a unified interface

Code framework

python
# TODO: implement the LLM API wrapper library
# 1. Define the base class
# 2. Implement the individual clients
# 3. Implement streaming output
# 4. Implement Function Calling
# 5. Provide a factory class
# 6. Write test code

Homework

Assignment 1: API Wrapper Library Design

Task: design a complete LLM API wrapper library

Requirements

  1. Design the class structure
  2. Define the interface contract
  3. Plan for extensibility
  4. Write a design document

Assignment 2: Streaming Output Optimization

Task: optimize streaming output performance

Requirements

  1. Analyze the performance bottlenecks of streaming output
  2. Propose an optimization plan
  3. Implement the optimizations

Assignment 3: Function Calling Extensions

Task: extend the Function Calling feature

Requirements

  1. Support async function calls
  2. Support parallel function calls
  3. Implement function-call caching

References

Official documentation

  1. OpenAI API: https://platform.openai.com/docs/api-reference
  2. Anthropic API: https://docs.anthropic.com/claude/reference/getting-started-with-the-api
  3. Google AI API: https://ai.google.dev/docs

Recommended reading

  1. OpenAI Cookbook: https://github.com/openai/openai-cookbook

    • OpenAI API examples
  2. Anthropic Prompt Library: https://docs.anthropic.com/claude/prompt-library

    • Anthropic prompt library

Online resources

  1. LangChain: https://python.langchain.com/

    • LLM application framework
  2. LlamaIndex: https://docs.llamaindex.ai/

    • Data framework for LLM applications

Further Reading

API optimization

  • Batching: reduce the number of API calls
  • Caching: cache frequent requests
  • Retries: handle transient network errors
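The last two points can be sketched with a retry decorator (exponential backoff) and a small in-memory cache. This is a minimal illustration under stated assumptions: `flaky_api` is a stand-in for a real API call, not part of any library.

```python
import functools
import time

def retry(max_attempts=3, base_delay=0.1):
    """Retry with exponential backoff: wait base_delay * 2**attempt between tries."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(max_attempts):
                try:
                    return fn(*args, **kwargs)
                except Exception:
                    if attempt == max_attempts - 1:
                        raise
                    time.sleep(base_delay * (2 ** attempt))
        return wrapper
    return decorator

_cache = {}

def cached_call(call, prompt, **kwargs):
    """Serve repeated identical requests from memory instead of the API."""
    key = (prompt, tuple(sorted(kwargs.items())))
    if key not in _cache:
        _cache[key] = call(prompt, **kwargs)
    return _cache[key]

# Demo: a fake API that fails twice, then succeeds
attempts = {"n": 0}

@retry(max_attempts=5, base_delay=0.01)
def flaky_api(prompt):
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("transient network error")
    return f"answer to: {prompt}"

print(flaky_api("hello"))               # answer to: hello (third attempt succeeds)
print(cached_call(flaky_api, "hello"))  # answer to: hello (now cached)
```

In production you would typically retry only on retryable status codes (429, 5xx) and honor the `Retry-After` header rather than catching every exception.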

Security

  • API key management: store keys securely (e.g. environment variables, secret managers)
  • Rate limiting: stay within provider limits
  • Content filtering: filter sensitive content
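The first two points can be sketched directly: read the key from an environment variable instead of hard-coding it, and gate outgoing requests with a simple token-bucket limiter. Both names below (`load_api_key`, `RateLimiter`) are illustrative, not from any SDK.

```python
import os
import time

def load_api_key(env_var="OPENAI_API_KEY"):
    """Read the API key from the environment instead of committing it to code."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key

class RateLimiter:
    """Token bucket: allow at most `rate` requests per second on average."""
    def __init__(self, rate):
        self.rate = rate
        self.tokens = float(rate)
        self.last = time.monotonic()

    def acquire(self):
        now = time.monotonic()
        # Refill tokens for the elapsed time, capped at the bucket size
        self.tokens = min(self.rate, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens < 1:
            # Sleep just long enough for one token to accumulate
            time.sleep((1 - self.tokens) / self.rate)
            self.tokens = 0.0
        else:
            self.tokens -= 1

limiter = RateLimiter(rate=10)
for _ in range(3):
    limiter.acquire()  # blocks if we are sending faster than 10 req/s
    # ... make the API call here ...
```

A thread-safe version would wrap `acquire` in a lock; provider SDKs and gateways often provide this for you.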

Next Lesson

Next we will study Prompt Engineering: how to design effective prompts, few-shot learning, CoT (Chain of Thought), and related techniques.

