
Day 34: Skills Module Summary and Project

Learning Objectives

  • Review the core concepts of the Skills module
  • Apply Skills in integrated, end-to-end scenarios
  • Complete a hands-on Skills project
  • Learn project deployment and optimization
  • Integrate Skills with the wider ecosystem
  • Understand future directions

Module Knowledge Summary

Core Concepts Review

The Skills Standard

Skills is a document-driven AI capability standard proposed by Anthropic. Its key characteristics:

Core Features

  1. Document-driven: skill.md is the core artifact; the document is the code
  2. Structured definition: metadata is defined in YAML Front Matter
  3. Standardized interface: a uniform format for parameters and return values
  4. Automatic discovery: skills can be discovered and loaded automatically
  5. Version management: multiple versions of a skill can coexist

Technical Architecture

┌───────────────────────────────────────┐
│            Skills Standard            │
├───────────────────────────────────────┤
│                                       │
│  ┌─────────────┐                      │
│  │  skill.md   │ ───▶ Front Matter    │
│  └─────────────┘                      │
│         │                             │
│         ▼                             │
│  ┌─────────────┐                      │
│  │ Parameters  │ ───▶ Validation      │
│  └─────────────┘                      │
│         │                             │
│         ▼                             │
│  ┌─────────────┐                      │
│  │   Returns   │ ───▶ Formatting      │
│  └─────────────┘                      │
│         │                             │
│         ▼                             │
│  ┌─────────────┐                      │
│  │  Examples   │ ───▶ Usage           │
│  └─────────────┘                      │
│                                       │
└───────────────────────────────────────┘

MCP vs Skills vs SubAgent

| Feature | MCP | Skills | SubAgent |
| --- | --- | --- | --- |
| Primary purpose | Model-tool communication | AI capability definition | Task decomposition and execution |
| Communication protocol | JSON-RPC | Document-driven | Message passing |
| Discovery mechanism | Manual configuration | Automatic discovery | Dynamic registration |
| Dependency management | Simple | Complex | Moderate |
| Typical use case | Tool integration | Capability encapsulation | Complex tasks |

Document-Driven Development

Core Principles

  1. Write the document first, then the code
  2. Keep documentation and code in sync
  3. Generate the code skeleton from the document
  4. The document doubles as the API documentation

Development Flow

Requirements analysis → Write docs → Review docs → Generate code → Implement → Test and verify → Release

Key Technical Points

1. The skill.md Specification

Front Matter Structure

yaml
---
name: "skill-name"
version: "1.0.0"
author: "Author <email@example.com>"
description: "Skill description"
tags: ["tag1", "tag2"]
license: "MIT"
python_requires: ">=3.8"
dependencies:
  - package1>=1.0.0
  - package2>=2.0.0
---

Content Structure

  • Description: detailed description of the skill
  • Features: feature list
  • Parameters: parameter definitions
  • Returns: return value description
  • Examples: usage examples
  • Limitations: known limitations

2. Parameter Validation

Pydantic Validation

python
from pydantic import BaseModel, Field, validator

class SkillParameters(BaseModel):
    text: str = Field(..., min_length=1, max_length=10000)
    analysis_type: str = Field(..., regex="^(sentiment|keywords|summary)$")
    
    @validator('text')
    def validate_text(cls, v):
        if not v.strip():
            raise ValueError('Text cannot be empty')
        return v
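
The model can be exercised on its own; a quick usage sketch (Pydantic v1 style, matching the `validator`/`regex` usage above):

python
from pydantic import ValidationError

try:
    params = SkillParameters(text="Great product!", analysis_type="sentiment")
except ValidationError as e:
    # Invalid input surfaces as a structured error instead of failing deep inside the skill
    print(e.json())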

3. Capability Discovery

Automatic Discovery Mechanism

python
from pathlib import Path
from typing import List

class SkillDiscovery:
    def __init__(self, registry):
        self.registry = registry

    def discover(self, search_path: str) -> List[str]:
        # Walk the tree, parse each skill.md, and register its parent dir as a skill
        discovered = []
        for skill_md in Path(search_path).glob("**/skill.md"):
            skill_info = self._parse_skill_md(skill_md)
            self.registry.register(str(skill_md.parent), skill_info)
            discovered.append(str(skill_md.parent))
        return discovered
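
The `_parse_skill_md` helper is elided above; a self-contained sketch of what it might do, assuming PyYAML and front matter delimited by `---` markers as in the spec earlier:

python
import yaml
from pathlib import Path

def parse_skill_md(skill_md: Path) -> dict:
    # skill.md begins with "---\n<yaml front matter>\n---\n<markdown body>"
    text = skill_md.read_text(encoding="utf-8")
    _, front_matter, _body = text.split("---", 2)
    return yaml.safe_load(front_matter)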

4. Dependency Resolution

Topological Sort

python
from typing import List

def resolve_load_order(skill_names: List[str]) -> List[str]:
    load_order = []
    visited = set()
    visiting = set()
    
    def visit(skill_name: str):
        if skill_name in visited:
            return
        
        if skill_name in visiting:
            raise ValueError(f"Circular dependency: {skill_name}")
        
        visiting.add(skill_name)
        
        for dep in get_dependencies(skill_name):
            visit(dep)
        
        visiting.remove(skill_name)
        visited.add(skill_name)
        load_order.append(skill_name)
    
    for name in skill_names:
        visit(name)
    
    return load_order
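
A minimal, self-contained demo of the resolver (the dependency map and `get_dependencies` here are hypothetical stand-ins):

python
DEPS = {
    "text-summarizer": ["keyword-extractor"],
    "keyword-extractor": ["tokenizer"],
    "tokenizer": [],
}

def get_dependencies(skill_name: str):
    return DEPS.get(skill_name, [])

print(resolve_load_order(["text-summarizer"]))
# -> ['tokenizer', 'keyword-extractor', 'text-summarizer']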

Best Practices Summary

Naming Conventions

  • Skill names: lowercase, hyphen-separated
  • Function names: lowercase, underscore-separated
  • Parameter names: descriptive; avoid abbreviations
  • Class names: PascalCase

Documentation Conventions

  • Keep descriptions clear and concise
  • Document every parameter completely
  • Make example code runnable
  • Include error-handling notes

Error Handling

  • Validate parameters with Pydantic
  • Provide clear error messages
  • Write detailed logs
  • Handle exceptions gracefully

Performance Optimization

  • Implement a caching layer
  • Support batch processing
  • Optimize memory usage
  • Process requests in parallel

Hands-On Project: Intelligent Text Analysis Platform

Project Overview

Project name: Intelligent Text Analysis Platform (ITAP)

Project Goal

Build an intelligent text analysis platform based on the Skills standard that integrates multiple text analysis capabilities behind one unified API.

Core Features

  1. Sentiment analysis
  2. Keyword extraction
  3. Text summarization
  4. Entity recognition
  5. Language detection
  6. Text classification

Tech Stack

  • Backend: FastAPI
  • Database: PostgreSQL + Redis
  • Skills: custom Skills
  • Frontend: React + Ant Design
  • Deployment: Docker + Nginx

Project Architecture

System Architecture

┌───────────────────────────────────────┐
│                Client                 │
└───────────────────┬───────────────────┘
                    │
                    ▼
┌───────────────────────────────────────┐
│              API Gateway              │
└───────────────────┬───────────────────┘
                    │
      ┌─────────────┼─────────────┐
      │             │             │
      ▼             ▼             ▼
┌───────────┐ ┌───────────┐ ┌───────────┐
│  Skill 1  │ │  Skill 2  │ │  Skill 3  │
└───────────┘ └───────────┘ └───────────┘
      │             │             │
      └─────────────┼─────────────┘
                    │
                    ▼
┌───────────────────────────────────────┐
│            Database Layer             │
├───────────────────────────────────────┤
│  PostgreSQL  │  Redis  │  File Storage│
└───────────────────────────────────────┘

Directory Structure

itap/
├── backend/
│   ├── app/
│   │   ├── api/
│   │   │   ├── routes/
│   │   │   └── dependencies.py
│   │   ├── core/
│   │   │   ├── config.py
│   │   │   └── security.py
│   │   ├── models/
│   │   │   └── database.py
│   │   ├── services/
│   │   │   ├── skill_service.py
│   │   │   └── analysis_service.py
│   │   └── main.py
│   ├── skills/
│   │   ├── sentiment-analyzer/
│   │   ├── keyword-extractor/
│   │   ├── text-summarizer/
│   │   ├── entity-recognizer/
│   │   └── language-detector/
│   ├── tests/
│   ├── Dockerfile
│   └── requirements.txt
├── frontend/
│   ├── src/
│   │   ├── components/
│   │   ├── pages/
│   │   ├── services/
│   │   └── App.js
│   ├── package.json
│   └── Dockerfile
├── docker-compose.yml
└── README.md

Backend Implementation

Main Application

python
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from app.api.routes import analysis, skills
from app.core.config import settings

app = FastAPI(
    title="Intelligent Text Analysis Platform",
    description="AI-powered text analysis using Skills standard",
    version="1.0.0"
)

app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],  # wildcard is fine for development; restrict in production
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

app.include_router(analysis.router, prefix="/api/analysis", tags=["Analysis"])
app.include_router(skills.router, prefix="/api/skills", tags=["Skills"])

@app.get("/")
async def root():
    return {
        "message": "Intelligent Text Analysis Platform API",
        "version": "1.0.0",
        "docs": "/docs"
    }

@app.get("/health")
async def health_check():
    return {"status": "healthy"}

Analysis Service

python
from typing import Dict, Any, List
from app.services.skill_service import SkillService
import logging

logger = logging.getLogger(__name__)

class AnalysisService:
    def __init__(self, skill_service: SkillService):
        self.skill_service = skill_service
    
    async def analyze_text(
        self,
        text: str,
        analysis_types: List[str]
    ) -> Dict[str, Any]:
        results = {}
        
        for analysis_type in analysis_types:
            try:
                skill_name = self._get_skill_for_analysis(analysis_type)
                result = await self.skill_service.execute_skill(
                    skill_name,
                    {"text": text, "analysis_type": analysis_type}
                )
                
                if result["success"]:
                    results[analysis_type] = result["data"]
                else:
                    results[analysis_type] = {
                        "error": result["error"]
                    }
                    logger.error(
                        f"Analysis failed for {analysis_type}: {result['error']}"
                    )
            except Exception as e:
                logger.error(f"Error in {analysis_type}: {str(e)}")
                results[analysis_type] = {"error": str(e)}
        
        return {
            "success": True,
            "results": results,
            "metadata": {
                "text_length": len(text),
                "word_count": len(text.split()),
                "analysis_types": analysis_types
            }
        }
    
    def _get_skill_for_analysis(self, analysis_type: str) -> str:
        skill_mapping = {
            "sentiment": "sentiment-analyzer",
            "keywords": "keyword-extractor",
            "summary": "text-summarizer",
            "entities": "entity-recognizer",
            "language": "language-detector",
            "classification": "text-classifier"
        }
        
        # Fail loudly on unknown types instead of silently returning None
        skill_name = skill_mapping.get(analysis_type)
        if skill_name is None:
            raise ValueError(f"Unknown analysis type: {analysis_type}")
        return skill_name
    
    async def batch_analyze(
        self,
        texts: List[str],
        analysis_types: List[str]
    ) -> List[Dict[str, Any]]:
        results = []
        
        for text in texts:
            result = await self.analyze_text(text, analysis_types)
            results.append(result)
        
        return results
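
Each text is analyzed independently, so the sequential loop above can run concurrently instead; a sketch using `asyncio.gather` (the subclass name is hypothetical):

python
import asyncio

class ConcurrentAnalysisService(AnalysisService):
    async def batch_analyze(
        self,
        texts: List[str],
        analysis_types: List[str]
    ) -> List[Dict[str, Any]]:
        # Schedule every analysis at once; gather preserves input order
        tasks = [self.analyze_text(text, analysis_types) for text in texts]
        return await asyncio.gather(*tasks)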

API Routes

python
from fastapi import APIRouter, HTTPException, Depends
from pydantic import BaseModel
from typing import List
from app.api.dependencies import get_analysis_service  # provides the shared AnalysisService
from app.services.analysis_service import AnalysisService

router = APIRouter()

class AnalysisRequest(BaseModel):
    text: str
    analysis_types: List[str]

class BatchAnalysisRequest(BaseModel):
    texts: List[str]
    analysis_types: List[str]

@router.post("/analyze")
async def analyze_text(
    request: AnalysisRequest,
    service: AnalysisService = Depends(get_analysis_service)
):
    if not request.text.strip():
        raise HTTPException(status_code=400, detail="Text cannot be empty")
    
    valid_types = [
        "sentiment", "keywords", "summary",
        "entities", "language", "classification"
    ]
    
    for analysis_type in request.analysis_types:
        if analysis_type not in valid_types:
            raise HTTPException(
                status_code=400,
                detail=f"Invalid analysis type: {analysis_type}"
            )
    
    result = await service.analyze_text(
        request.text,
        request.analysis_types
    )
    
    return result

@router.post("/batch-analyze")
async def batch_analyze(
    request: BatchAnalysisRequest,
    service: AnalysisService = Depends(get_analysis_service)
):
    if not request.texts:
        raise HTTPException(status_code=400, detail="Texts cannot be empty")
    
    if len(request.texts) > 100:
        raise HTTPException(
            status_code=400,
            detail="Maximum 100 texts allowed per batch"
        )
    
    result = await service.batch_analyze(
        request.texts,
        request.analysis_types
    )
    
    return result
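
A quick client-side sketch against these routes (assumes the backend is reachable at localhost:8000 and the `requests` package is available):

python
import requests

resp = requests.post(
    "http://localhost:8000/api/analysis/analyze",
    json={
        "text": "I love this product, it works great!",
        "analysis_types": ["sentiment", "keywords"],
    },
)
resp.raise_for_status()
print(resp.json()["results"])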

Skills Implementation

Sentiment Analysis Skill

python
from typing import Dict, Any
from pydantic import BaseModel, Field, validator

class SentimentParameters(BaseModel):
    text: str = Field(..., min_length=1, max_length=10000)
    model: str = Field(default="rule-based", regex="^(rule-based|ml|transformer)$")
    include_scores: bool = Field(default=False)
    
    @validator('text')
    def validate_text(cls, v):
        if not v.strip():
            raise ValueError('Text cannot be empty')
        return v.strip()

class SentimentAnalyzer:
    def __init__(self):
        self.positive_words = [
            'good', 'great', 'excellent', 'amazing',
            'wonderful', 'fantastic', 'awesome', 'love'
        ]
        self.negative_words = [
            'bad', 'terrible', 'awful', 'horrible',
            'poor', 'worst', 'hate', 'dislike'
        ]
    
    def execute(self, params: SentimentParameters) -> Dict[str, Any]:
        try:
            result = self._analyze(params)
            
            return {
                "success": True,
                "data": result,
                "metadata": {
                    "model": params.model,
                    "text_length": len(params.text)
                },
                "error": None
            }
        except Exception as e:
            return {
                "success": False,
                "data": None,
                "metadata": {},
                "error": str(e)
            }
    
    def _analyze(self, params: SentimentParameters) -> Dict[str, Any]:
        if params.model == "rule-based":
            return self._rule_based_analysis(params)
        else:
            return self._ml_analysis(params)
    
    def _rule_based_analysis(self, params: SentimentParameters) -> Dict[str, Any]:
        text_lower = params.text.lower()
        positive_count = sum(1 for word in self.positive_words if word in text_lower)
        negative_count = sum(1 for word in self.negative_words if word in text_lower)
        
        if positive_count > negative_count:
            sentiment = "positive"
            score = positive_count / (positive_count + negative_count + 1)
        elif negative_count > positive_count:
            sentiment = "negative"
            score = negative_count / (positive_count + negative_count + 1)
        else:
            sentiment = "neutral"
            score = 0.5
        
        result = {
            "sentiment": sentiment,
            "score": score,
            "positive_words": [w for w in self.positive_words if w in text_lower],
            "negative_words": [w for w in self.negative_words if w in text_lower]
        }
        
        if params.include_scores:
            result["confidence"] = min(0.9, 0.5 + abs(score - 0.5))
        
        return result
    
    def _ml_analysis(self, params: SentimentParameters) -> Dict[str, Any]:
        # Placeholder: fall back to the rule-based path until an ML model is wired in
        return self._rule_based_analysis(params)
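
A short demo of the skill in isolation:

python
analyzer = SentimentAnalyzer()
params = SentimentParameters(
    text="This is a great and wonderful tool",
    include_scores=True,
)
result = analyzer.execute(params)
print(result["data"]["sentiment"])  # -> "positive"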

Keyword Extraction Skill

python
from typing import Dict, Any, List
from pydantic import BaseModel, Field, validator
import re
from collections import Counter

class KeywordParameters(BaseModel):
    text: str = Field(..., min_length=1, max_length=10000)
    max_keywords: int = Field(default=10, ge=1, le=50)
    min_word_length: int = Field(default=3, ge=2, le=10)
    include_scores: bool = Field(default=True)
    
    @validator('text')
    def validate_text(cls, v):
        if not v.strip():
            raise ValueError('Text cannot be empty')
        return v.strip()

class KeywordExtractor:
    def __init__(self):
        self.stop_words = {
            'the', 'a', 'an', 'and', 'or', 'but',
            'in', 'on', 'at', 'to', 'for', 'of',
            'with', 'by', 'is', 'are', 'was', 'were'
        }
    
    def execute(self, params: KeywordParameters) -> Dict[str, Any]:
        try:
            result = self._extract(params)
            
            return {
                "success": True,
                "data": result,
                "metadata": {
                    "text_length": len(params.text),
                    "max_keywords": params.max_keywords
                },
                "error": None
            }
        except Exception as e:
            return {
                "success": False,
                "data": None,
                "metadata": {},
                "error": str(e)
            }
    
    def _extract(self, params: KeywordParameters) -> Dict[str, Any]:
        words = re.findall(r'\b\w+\b', params.text.lower())
        
        filtered_words = [
            word for word in words
            if len(word) >= params.min_word_length
            and word not in self.stop_words
        ]
        
        word_counts = Counter(filtered_words)
        
        total_words = len(filtered_words)
        keywords = []
        
        for word, count in word_counts.most_common(params.max_keywords):
            score = count / total_words
            keyword_data = {"word": word, "count": count}
            
            if params.include_scores:
                keyword_data["score"] = score
            
            keywords.append(keyword_data)
        
        return {
            "keywords": keywords,
            "total_words": total_words,
            "unique_words": len(word_counts)
        }
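
And a matching demo for the extractor:

python
extractor = KeywordExtractor()
params = KeywordParameters(
    text="Skills make skills discoverable because skills are documents",
    max_keywords=3,
)
print(extractor.execute(params)["data"]["keywords"])
# -> [{"word": "skills", "count": 3, "score": ...}, ...]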

Frontend Implementation

Main Page

javascript
import React, { useState } from 'react';
import { Card, Button, Select, Input, message, Spin } from 'antd';
import { analyzeText } from '../services/api';

const { TextArea } = Input;
const { Option } = Select;

function AnalysisPage() {
  const [text, setText] = useState('');
  const [analysisTypes, setAnalysisTypes] = useState(['sentiment']);
  const [loading, setLoading] = useState(false);
  const [results, setResults] = useState(null);

  const handleAnalyze = async () => {
    if (!text.trim()) {
      message.error('Please enter some text');
      return;
    }

    setLoading(true);
    try {
      const response = await analyzeText(text, analysisTypes);
      setResults(response.data);
      message.success('Analysis completed');
    } catch (error) {
      message.error('Analysis failed: ' + error.message);
    } finally {
      setLoading(false);
    }
  };

  return (
    <div style={{ padding: '24px' }}>
      <Card title="Text Analysis">
        <TextArea
          rows={6}
          value={text}
          onChange={(e) => setText(e.target.value)}
          placeholder="Enter text to analyze..."
        />
        
        <div style={{ marginTop: '16px' }}>
          <Select
            mode="multiple"
            style={{ width: '100%' }}
            placeholder="Select analysis types"
            value={analysisTypes}
            onChange={setAnalysisTypes}
          >
            <Option value="sentiment">Sentiment Analysis</Option>
            <Option value="keywords">Keyword Extraction</Option>
            <Option value="summary">Text Summarization</Option>
            <Option value="entities">Entity Recognition</Option>
            <Option value="language">Language Detection</Option>
          </Select>
        </div>
        
        <Button
          type="primary"
          onClick={handleAnalyze}
          loading={loading}
          style={{ marginTop: '16px' }}
        >
          Analyze
        </Button>
      </Card>

      {results && (
        <Card title="Results" style={{ marginTop: '24px' }}>
          <Spin spinning={loading}>
            <pre>{JSON.stringify(results, null, 2)}</pre>
          </Spin>
        </Card>
      )}
    </div>
  );
}

export default AnalysisPage;

Deployment Configuration

Docker Compose

yaml
version: '3.8'

services:
  backend:
    build: ./backend
    ports:
      - "8000:8000"
    environment:
      - DATABASE_URL=postgresql://user:password@db:5432/itap
      - REDIS_URL=redis://redis:6379
    depends_on:
      - db
      - redis
    volumes:
      - ./backend/skills:/app/skills

  frontend:
    build: ./frontend
    ports:
      - "3000:3000"
    depends_on:
      - backend

  db:
    image: postgres:13
    environment:
      - POSTGRES_USER=user
      - POSTGRES_PASSWORD=password
      - POSTGRES_DB=itap
    volumes:
      - postgres_data:/var/lib/postgresql/data

  redis:
    image: redis:6
    volumes:
      - redis_data:/data

  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf
    depends_on:
      - backend
      - frontend

volumes:
  postgres_data:
  redis_data:

Nginx Configuration

nginx
upstream backend {
    server backend:8000;
}

upstream frontend {
    server frontend:3000;
}

server {
    listen 80;
    server_name localhost;

    location /api/ {
        proxy_pass http://backend;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }

    location / {
        proxy_pass http://frontend;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}

Project Optimization

Performance Optimization

1. Caching Strategy

python
import hashlib

class CachedAnalysisService(AnalysisService):
    def __init__(self, skill_service: SkillService):
        super().__init__(skill_service)
        self.cache = {}  # naive in-process cache: unbounded and per-worker
    
    async def analyze_text(
        self,
        text: str,
        analysis_types: List[str]
    ) -> Dict[str, Any]:
        cache_key = self._generate_cache_key(text, analysis_types)
        
        if cache_key in self.cache:
            return self.cache[cache_key]
        
        result = await super().analyze_text(text, analysis_types)
        self.cache[cache_key] = result
        
        return result
    
    def _generate_cache_key(self, text: str, analysis_types: List[str]) -> str:
        key_str = f"{text}:{','.join(sorted(analysis_types))}"
        return hashlib.md5(key_str.encode()).hexdigest()
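
The dict above never evicts and is local to a single process. A Redis-backed variant fits the project's declared stack better; a sketch, assuming redis-py >= 4.2 (the `redis.asyncio` API) and the REDIS_URL wired up in docker-compose.yml (the class name is hypothetical):

python
import hashlib
import json
from typing import Any, Dict, List

import redis.asyncio as aioredis

class RedisCachedAnalysisService(AnalysisService):
    def __init__(self, skill_service: SkillService, redis_url: str = "redis://redis:6379"):
        super().__init__(skill_service)
        self.redis = aioredis.from_url(redis_url, decode_responses=True)

    async def analyze_text(
        self,
        text: str,
        analysis_types: List[str]
    ) -> Dict[str, Any]:
        key_str = f"{text}:{','.join(sorted(analysis_types))}"
        cache_key = "itap:analysis:" + hashlib.md5(key_str.encode()).hexdigest()

        cached = await self.redis.get(cache_key)
        if cached is not None:
            return json.loads(cached)

        result = await super().analyze_text(text, analysis_types)
        # Expire entries after an hour so stale results age out across workers
        await self.redis.set(cache_key, json.dumps(result), ex=3600)
        return result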

2. Asynchronous Processing

python
import uuid

from fastapi import BackgroundTasks

@router.post("/analyze-async")
async def analyze_text_async(
    request: AnalysisRequest,
    background_tasks: BackgroundTasks,
    service: AnalysisService = Depends(get_analysis_service)
):
    task_id = str(uuid.uuid4())
    
    # Note: the computed result is discarded here; persist it keyed by task_id
    # (see the sketch below) so clients can poll for it
    background_tasks.add_task(
        service.analyze_text,
        request.text,
        request.analysis_types
    )
    
    return {
        "task_id": task_id,
        "status": "processing",
        "message": "Analysis started in background"
    }
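
As written, the background task's result vanishes once computed. A minimal sketch of keeping it retrievable (an in-memory, per-worker store, so illustrative only; Redis would be the durable choice for this stack):

python
from typing import Any, Dict

task_results: Dict[str, Any] = {}

async def run_and_store(task_id: str, service: AnalysisService, text: str, analysis_types: List[str]):
    # Pass this wrapper to background_tasks.add_task in place of analyze_text
    task_results[task_id] = await service.analyze_text(text, analysis_types)

@router.get("/analyze-async/{task_id}")
async def get_async_result(task_id: str):
    result = task_results.get(task_id)
    if result is None:
        return {"task_id": task_id, "status": "processing"}
    return {"task_id": task_id, "status": "done", "result": result}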

Monitoring and Logging

Logging Configuration

python
import logging
from logging.handlers import RotatingFileHandler

def setup_logging():
    logger = logging.getLogger()
    logger.setLevel(logging.INFO)
    
    file_handler = RotatingFileHandler(
        'app.log',
        maxBytes=1024 * 1024,
        backupCount=5
    )
    file_handler.setFormatter(
        logging.Formatter(
            '%(asctime)s - %(name)s - %(levelname)s - %(message)s'
        )
    )
    
    logger.addHandler(file_handler)

Performance Monitoring

python
import time
from prometheus_client import Counter, Histogram

REQUEST_COUNT = Counter('requests_total', 'Total requests')
REQUEST_LATENCY = Histogram('request_latency_seconds', 'Request latency')

async def monitor_request(request, call_next):
    start_time = time.time()
    
    REQUEST_COUNT.inc()
    
    response = await call_next(request)
    
    latency = time.time() - start_time
    REQUEST_LATENCY.observe(latency)
    
    response.headers["X-Process-Time"] = str(latency)
    return response
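
One way to wire the monitor into the application (a sketch; assumes the FastAPI `app` from main.py is importable where this runs):

python
# Register the coroutine above as HTTP middleware on the existing app
app.middleware("http")(monitor_request)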

Future Directions

Evolution of the Skills Standard

Possible Improvements

  1. Richer type system: support for more complex data types
  2. Async support: native support for asynchronous operations
  3. Streaming: support for streaming data processing
  4. Multilingual support: better handling of multiple languages
  5. Performance: more efficient execution mechanisms

Ecosystem Growth

Directions

  1. More tooling: integration with more development tools
  2. Better IDE support: richer editor integration
  3. Automated testing: automated testing and validation
  4. AI-assisted development: AI assistance for authoring Skills
  5. Cross-platform support: more platforms and languages

Summary

This module summary and project concludes our study of Skills-standard development. In this lesson we:

  1. Summarized the core knowledge of the Skills module
  2. Completed the Intelligent Text Analysis Platform project
  3. Implemented a full backend and frontend
  4. Configured the deployment environment
  5. Covered performance optimization options
  6. Looked ahead to the future of Skills

Through this project we combined the Skills standard, document-driven development, capability discovery, and dependency management to build a complete AI application platform.
