Day 34: Skills Module Summary and Project
Learning Objectives
- Review the core knowledge of the Skills module
- Apply Skills in a comprehensive, end-to-end scenario
- Complete a hands-on Skills project
- Learn how to deploy and optimize the project
- Integrate Skills with the surrounding ecosystem
- Understand future directions
Module Knowledge Summary
Core Concepts Review
The Skills Standard
Skills is a documentation-driven AI capability standard proposed by Anthropic, with the following characteristics:
Core features:
- Documentation-driven: skill.md is the core artifact; the documentation is the source of truth
- Structured definition: metadata is declared in YAML Front Matter
- Standardized interface: uniform parameter and return-value formats
- Automatic discovery: skills can be discovered and loaded automatically
- Version management: multiple versions of a skill can coexist
Technical architecture:
┌─────────────────────────────────────────┐
│ Skills Standard │
├─────────────────────────────────────────┤
│ │
│ ┌─────────────┐ │
│ │ skill.md │───▶ Front Matter │
│ └─────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────┐ │
│ │ Parameters │───▶ Validation │
│ └─────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────┐ │
│ │ Returns │───▶ Formatting │
│ └─────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────┐ │
│ │ Examples │───▶ Usage │
│ └─────────────┘ │
│ │
└─────────────────────────────────────────┘
MCP vs Skills vs SubAgent
| Aspect | MCP | Skills | SubAgent |
|---|---|---|---|
| Primary purpose | Model-tool communication | AI capability definition | Task decomposition and execution |
| Communication protocol | JSON-RPC | Documentation-driven | Message passing |
| Discovery mechanism | Manual configuration | Automatic discovery | Dynamic registration |
| Dependency management | Simple | Complex | Moderate |
| Typical scenario | Tool integration | Capability packaging | Complex tasks |
Documentation-Driven Development
Core ideas:
- Write the documentation first, then the code
- Keep the documentation and the code in sync
- Generate code skeletons from the documentation (see the sketch after the flow below)
- The documentation doubles as the API reference
Development flow:
Requirements analysis → Write documentation → Documentation review → Generate code → Implement → Test and verify → Release
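To make the "generate code from documentation" step concrete, here is a minimal sketch that reads a skill.md's Front Matter (like the example in the next section) and emits a Python stub. It assumes PyYAML is installed; generate_stub is illustrative and not part of the standard.
python
from pathlib import Path
import yaml

def generate_stub(skill_md_path: str) -> str:
    """Derive a function stub from the YAML Front Matter of a skill.md file."""
    text = Path(skill_md_path).read_text(encoding="utf-8")
    # The Front Matter sits between the first two '---' markers
    _, front_matter, _ = text.split("---", 2)
    meta = yaml.safe_load(front_matter)
    func_name = meta["name"].replace("-", "_")
    return (
        f'def {func_name}(**params):\n'
        f'    """{meta.get("description", "")} (v{meta.get("version", "0.0.0")})"""\n'
        f'    raise NotImplementedError("Implement according to skill.md")\n'
    )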
Key Technical Points
1. skill.md Specification
Front Matter structure:
yaml
---
name: "skill-name"
version: "1.0.0"
author: "Author <email@example.com>"
description: "Skill description"
tags: ["tag1", "tag2"]
license: "MIT"
python_requires: ">=3.8"
dependencies:
- package1>=1.0.0
- package2>=2.0.0
---
Content structure (a minimal body sketch follows the list below):
- Description: detailed description
- Features: list of capabilities
- Parameters: parameter definitions
- Returns: description of return values
- Examples: usage examples
- Limitations: known limitations
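To make the structure concrete, a minimal skill.md body following these sections might look like this (the wording is illustrative, not a normative template):
markdown
## Description
Analyzes the sentiment of an input text and returns a label plus a score.

## Features
- Rule-based and ML analysis modes
- Optional confidence scores

## Parameters
| Name | Type | Required | Description |
|---|---|---|---|
| text | string | yes | Text to analyze (1-10000 characters) |
| model | string | no | "rule-based", "ml" or "transformer" |

## Returns
A JSON object with `sentiment`, `score` and, optionally, `confidence`.

## Examples
Calling the skill with `text="I love this product"` returns `{"sentiment": "positive", ...}`.

## Limitations
English-only word lists; texts longer than 10000 characters are rejected.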
2. Parameter Validation
Pydantic validation:
python
from pydantic import BaseModel, Field, validator
class SkillParameters(BaseModel):
text: str = Field(..., min_length=1, max_length=10000)
analysis_type: str = Field(..., regex="^(sentiment|keywords|summary)$")
@validator('text')
def validate_text(cls, v):
if not v.strip():
raise ValueError('Text cannot be empty')
        return v
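A quick, hypothetical usage check of the model above: invalid input raises a ValidationError, which a skill runtime can turn into a structured error response.
python
from pydantic import ValidationError

try:
    SkillParameters(text="   ", analysis_type="sentiment")
except ValidationError as e:
    print(e.errors())  # includes a 'Text cannot be empty' message for the text field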
3. Capability Discovery
Automatic discovery mechanism:
python
from pathlib import Path
from typing import List

class SkillDiscovery:
    def discover(self, search_path: str) -> List[str]:
        discovered = []
        skill_md_files = Path(search_path).glob("**/skill.md")
        for skill_md in skill_md_files:
            skill_info = self._parse_skill_md(skill_md)
            self.registry.register(str(skill_md.parent), skill_info)
            discovered.append(str(skill_md.parent))
        return discovered
4. Dependency Resolution
Topological sort:
python
from typing import List

def resolve_load_order(skill_names: List[str]) -> List[str]:
load_order = []
visited = set()
visiting = set()
def visit(skill_name: str):
if skill_name in visited:
return
if skill_name in visiting:
raise ValueError(f"Circular dependency: {skill_name}")
visiting.add(skill_name)
        for dep in get_dependencies(skill_name):  # get_dependencies: looks up the skill's declared dependencies
visit(dep)
visiting.remove(skill_name)
visited.add(skill_name)
load_order.append(skill_name)
for name in skill_names:
visit(name)
    return load_order
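A small, hypothetical check of the resolver. Here get_dependencies is stubbed with a dict lookup (in the real loader it would read each skill's declared dependencies); dependencies come out before the skills that need them.
python
deps = {
    "text-summarizer": ["keyword-extractor"],
    "keyword-extractor": ["tokenizer"],
    "tokenizer": [],
}

def get_dependencies(name):
    return deps[name]

print(resolve_load_order(["text-summarizer"]))
# ['tokenizer', 'keyword-extractor', 'text-summarizer']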
Best Practices Summary
Naming Conventions
- Skill names: lowercase, hyphen-separated
- Function names: lowercase, underscore-separated
- Parameter names: descriptive, avoid abbreviations
- Class names: UpperCamelCase
Documentation Conventions
- Keep descriptions clear and concise
- Document every parameter
- Make example code runnable
- Include error-handling notes
Error Handling
- Validate parameters with Pydantic
- Provide clear error messages
- Log in detail
- Handle exceptions gracefully
Performance Optimization
- Implement caching
- Support batch processing
- Optimize memory usage
- Process requests in parallel
Hands-On Project: Intelligent Text Analysis Platform
Project Overview
Project name: Intelligent Text Analysis Platform (ITAP)
Project goal:
Build an intelligent text analysis platform based on the Skills standard that integrates multiple text analysis capabilities behind a unified API.
Core features:
- Sentiment analysis
- Keyword extraction
- Text summarization
- Entity recognition
- Language detection
- Text classification
Technology stack:
- Backend: FastAPI
- Database: PostgreSQL + Redis
- Skills: custom Skills
- Frontend: React + Ant Design
- Deployment: Docker + Nginx
Project Architecture
System Architecture
┌─────────────────────────────────────────────────┐
│ Client │
└───────────────────┬─────────────────────┘
│
▼
┌─────────────────────────────────────────────────┐
│ API Gateway │
└───────────────────┬─────────────────────┘
│
┌───────────┼───────────┐
│ │ │
▼ ▼ ▼
┌───────────┐ ┌───────────┐ ┌───────────┐
│ Skill 1 │ │ Skill 2 │ │ Skill 3 │
└───────────┘ └───────────┘ └───────────┘
│ │ │
└───────────┼───────────┘
│
▼
┌─────────────────────────────────────────────────┐
│ Database Layer │
├─────────────────────────────────────────────────┤
│ PostgreSQL │ Redis │ File Storage │
└─────────────────────────────────────────────────┘
Directory Structure
itap/
├── backend/
│ ├── app/
│ │ ├── api/
│ │ │ ├── routes/
│ │ │ └── dependencies.py
│ │ ├── core/
│ │ │ ├── config.py
│ │ │ └── security.py
│ │ ├── models/
│ │ │ └── database.py
│ │ ├── services/
│ │ │ ├── skill_service.py
│ │ │ └── analysis_service.py
│ │ └── main.py
│ ├── skills/
│ │ ├── sentiment-analyzer/
│ │ ├── keyword-extractor/
│ │ ├── text-summarizer/
│ │ ├── entity-recognizer/
│ │ └── language-detector/
│ ├── tests/
│ ├── Dockerfile
│ └── requirements.txt
├── frontend/
│ ├── src/
│ │ ├── components/
│ │ ├── pages/
│ │ ├── services/
│ │ └── App.js
│ ├── package.json
│ └── Dockerfile
├── docker-compose.yml
└── README.md
Backend Implementation
Main Application
python
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from app.api.routes import analysis, skills
from app.core.config import settings
app = FastAPI(
title="Intelligent Text Analysis Platform",
description="AI-powered text analysis using Skills standard",
version="1.0.0"
)
app.add_middleware(
CORSMiddleware,
allow_origins=["*"],
allow_credentials=True,
allow_methods=["*"],
allow_headers=["*"],
)
app.include_router(analysis.router, prefix="/api/analysis", tags=["Analysis"])
app.include_router(skills.router, prefix="/api/skills", tags=["Skills"])
@app.get("/")
async def root():
return {
"message": "Intelligent Text Analysis Platform API",
"version": "1.0.0",
"docs": "/docs"
}
@app.get("/health")
async def health_check():
return {"status": "healthy"}分析服务
python
from typing import Dict, Any, List
from app.services.skill_service import SkillService
import logging
logger = logging.getLogger(__name__)
class AnalysisService:
def __init__(self, skill_service: SkillService):
self.skill_service = skill_service
async def analyze_text(
self,
text: str,
analysis_types: List[str]
) -> Dict[str, Any]:
results = {}
for analysis_type in analysis_types:
try:
skill_name = self._get_skill_for_analysis(analysis_type)
result = await self.skill_service.execute_skill(
skill_name,
{"text": text, "analysis_type": analysis_type}
)
if result["success"]:
results[analysis_type] = result["data"]
else:
results[analysis_type] = {
"error": result["error"]
}
logger.error(
f"Analysis failed for {analysis_type}: {result['error']}"
)
except Exception as e:
logger.error(f"Error in {analysis_type}: {str(e)}")
results[analysis_type] = {"error": str(e)}
return {
"success": True,
"results": results,
"metadata": {
"text_length": len(text),
"word_count": len(text.split()),
"analysis_types": analysis_types
}
}
def _get_skill_for_analysis(self, analysis_type: str) -> str:
skill_mapping = {
"sentiment": "sentiment-analyzer",
"keywords": "keyword-extractor",
"summary": "text-summarizer",
"entities": "entity-recognizer",
"language": "language-detector",
"classification": "text-classifier"
}
return skill_mapping.get(analysis_type)
async def batch_analyze(
self,
texts: List[str],
analysis_types: List[str]
) -> List[Dict[str, Any]]:
results = []
for text in texts:
result = await self.analyze_text(text, analysis_types)
results.append(result)
        return results
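batch_analyze above processes texts one after another. Because the underlying skill calls are async, a hedged variant could fan them out concurrently with asyncio.gather; this is a sketch, not necessarily how the platform should schedule heavy workloads.
python
import asyncio
from typing import Any, Dict, List

class ParallelAnalysisService(AnalysisService):
    async def batch_analyze(
        self,
        texts: List[str],
        analysis_types: List[str]
    ) -> List[Dict[str, Any]]:
        # Run every text concurrently; analyze_text already catches per-type errors.
        tasks = [self.analyze_text(text, analysis_types) for text in texts]
        return list(await asyncio.gather(*tasks))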
API Routes
python
from fastapi import APIRouter, HTTPException, Depends
from pydantic import BaseModel
from typing import List
from app.services.analysis_service import AnalysisService
from app.api.dependencies import get_analysis_service  # assumed provider in app/api/dependencies.py
router = APIRouter()
class AnalysisRequest(BaseModel):
text: str
analysis_types: List[str]
class BatchAnalysisRequest(BaseModel):
texts: List[str]
analysis_types: List[str]
@router.post("/analyze")
async def analyze_text(
request: AnalysisRequest,
service: AnalysisService = Depends(get_analysis_service)
):
if not request.text.strip():
raise HTTPException(status_code=400, detail="Text cannot be empty")
valid_types = [
"sentiment", "keywords", "summary",
"entities", "language", "classification"
]
for analysis_type in request.analysis_types:
if analysis_type not in valid_types:
raise HTTPException(
status_code=400,
detail=f"Invalid analysis type: {analysis_type}"
)
result = await service.analyze_text(
request.text,
request.analysis_types
)
return result
@router.post("/batch-analyze")
async def batch_analyze(
request: BatchAnalysisRequest,
service: AnalysisService = Depends(get_analysis_service)
):
if not request.texts:
raise HTTPException(status_code=400, detail="Texts cannot be empty")
if len(request.texts) > 100:
raise HTTPException(
status_code=400,
detail="Maximum 100 texts allowed per batch"
)
result = await service.batch_analyze(
request.texts,
request.analysis_types
)
    return result
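Assuming the backend is running on localhost:8000 and the requests package is installed, the analyze endpoint can be exercised like this (illustrative):
python
import requests

resp = requests.post(
    "http://localhost:8000/api/analysis/analyze",
    json={
        "text": "I love this product, it works great!",
        "analysis_types": ["sentiment", "keywords"],
    },
)
print(resp.json())  # {"success": true, "results": {...}, "metadata": {...}}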
Skills Implementation
Sentiment Analysis Skill
python
from typing import Dict, Any
from pydantic import BaseModel, Field, validator
class SentimentParameters(BaseModel):
text: str = Field(..., min_length=1, max_length=10000)
model: str = Field(default="rule-based", regex="^(rule-based|ml|transformer)$")
include_scores: bool = Field(default=False)
@validator('text')
def validate_text(cls, v):
if not v.strip():
raise ValueError('Text cannot be empty')
return v.strip()
class SentimentAnalyzer:
def __init__(self):
self.positive_words = [
'good', 'great', 'excellent', 'amazing',
'wonderful', 'fantastic', 'awesome', 'love'
]
self.negative_words = [
'bad', 'terrible', 'awful', 'horrible',
'poor', 'worst', 'hate', 'dislike'
]
def execute(self, params: SentimentParameters) -> Dict[str, Any]:
try:
result = self._analyze(params)
return {
"success": True,
"data": result,
"metadata": {
"model": params.model,
"text_length": len(params.text)
},
"error": None
}
except Exception as e:
return {
"success": False,
"data": None,
"metadata": {},
"error": str(e)
}
def _analyze(self, params: SentimentParameters) -> Dict[str, Any]:
if params.model == "rule-based":
return self._rule_based_analysis(params)
else:
return self._ml_analysis(params)
def _rule_based_analysis(self, params: SentimentParameters) -> Dict[str, Any]:
text_lower = params.text.lower()
positive_count = sum(1 for word in self.positive_words if word in text_lower)
negative_count = sum(1 for word in self.negative_words if word in text_lower)
if positive_count > negative_count:
sentiment = "positive"
score = positive_count / (positive_count + negative_count + 1)
elif negative_count > positive_count:
sentiment = "negative"
score = negative_count / (positive_count + negative_count + 1)
else:
sentiment = "neutral"
score = 0.5
result = {
"sentiment": sentiment,
"score": score,
"positive_words": [w for w in self.positive_words if w in text_lower],
"negative_words": [w for w in self.negative_words if w in text_lower]
}
if params.include_scores:
result["confidence"] = min(0.9, 0.5 + abs(score - 0.5))
return result
def _ml_analysis(self, params: SentimentParameters) -> Dict[str, Any]:
        # ML/transformer models are not implemented in this example; fall back to the rule-based path
        return self._rule_based_analysis(params)
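A quick local run of the analyzer above (illustrative output):
python
analyzer = SentimentAnalyzer()
params = SentimentParameters(text="This library is great, I love it", include_scores=True)
result = analyzer.execute(params)
print(result["data"]["sentiment"])  # "positive"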
Keyword Extraction Skill
python
from typing import Dict, Any, List
from pydantic import BaseModel, Field, validator
import re
from collections import Counter
class KeywordParameters(BaseModel):
text: str = Field(..., min_length=1, max_length=10000)
max_keywords: int = Field(default=10, ge=1, le=50)
min_word_length: int = Field(default=3, ge=2, le=10)
include_scores: bool = Field(default=True)
@validator('text')
def validate_text(cls, v):
if not v.strip():
raise ValueError('Text cannot be empty')
return v.strip()
class KeywordExtractor:
def __init__(self):
self.stop_words = {
'the', 'a', 'an', 'and', 'or', 'but',
'in', 'on', 'at', 'to', 'for', 'of',
'with', 'by', 'is', 'are', 'was', 'were'
}
def execute(self, params: KeywordParameters) -> Dict[str, Any]:
try:
result = self._extract(params)
return {
"success": True,
"data": result,
"metadata": {
"text_length": len(params.text),
"max_keywords": params.max_keywords
},
"error": None
}
except Exception as e:
return {
"success": False,
"data": None,
"metadata": {},
"error": str(e)
}
def _extract(self, params: KeywordParameters) -> Dict[str, Any]:
words = re.findall(r'\b\w+\b', params.text.lower())
filtered_words = [
word for word in words
if len(word) >= params.min_word_length
and word not in self.stop_words
]
word_counts = Counter(filtered_words)
total_words = len(filtered_words)
keywords = []
for word, count in word_counts.most_common(params.max_keywords):
score = count / total_words
keyword_data = {"word": word, "count": count}
if params.include_scores:
keyword_data["score"] = score
keywords.append(keyword_data)
return {
"keywords": keywords,
"total_words": total_words,
"unique_words": len(word_counts)
        }
Frontend Implementation
Main Page
javascript
import React, { useState } from 'react';
import { Card, Button, Select, Input, message, Spin } from 'antd';
import { analyzeText } from '../services/api';
const { TextArea } = Input;
const { Option } = Select;
function AnalysisPage() {
const [text, setText] = useState('');
const [analysisTypes, setAnalysisTypes] = useState(['sentiment']);
const [loading, setLoading] = useState(false);
const [results, setResults] = useState(null);
const handleAnalyze = async () => {
if (!text.trim()) {
message.error('Please enter some text');
return;
}
setLoading(true);
try {
const response = await analyzeText(text, analysisTypes);
setResults(response.data);
message.success('Analysis completed');
} catch (error) {
message.error('Analysis failed: ' + error.message);
} finally {
setLoading(false);
}
};
return (
<div style={{ padding: '24px' }}>
<Card title="Text Analysis">
<TextArea
rows={6}
value={text}
onChange={(e) => setText(e.target.value)}
placeholder="Enter text to analyze..."
/>
<div style={{ marginTop: '16px' }}>
<Select
mode="multiple"
style={{ width: '100%' }}
placeholder="Select analysis types"
value={analysisTypes}
onChange={setAnalysisTypes}
>
<Option value="sentiment">Sentiment Analysis</Option>
<Option value="keywords">Keyword Extraction</Option>
<Option value="summary">Text Summarization</Option>
<Option value="entities">Entity Recognition</Option>
<Option value="language">Language Detection</Option>
</Select>
</div>
<Button
type="primary"
onClick={handleAnalyze}
loading={loading}
style={{ marginTop: '16px' }}
>
Analyze
</Button>
</Card>
{results && (
<Card title="Results" style={{ marginTop: '24px' }}>
<Spin spinning={loading}>
<pre>{JSON.stringify(results, null, 2)}</pre>
</Spin>
</Card>
)}
</div>
);
}
export default AnalysisPage;
Deployment Configuration
Docker Compose
yaml
version: '3.8'
services:
backend:
build: ./backend
ports:
- "8000:8000"
environment:
- DATABASE_URL=postgresql://user:password@db:5432/itap
- REDIS_URL=redis://redis:6379
depends_on:
- db
- redis
volumes:
- ./skills:/app/skills
frontend:
build: ./frontend
ports:
- "3000:3000"
depends_on:
- backend
db:
image: postgres:13
environment:
- POSTGRES_USER=user
- POSTGRES_PASSWORD=password
- POSTGRES_DB=itap
volumes:
- postgres_data:/var/lib/postgresql/data
redis:
image: redis:6
volumes:
- redis_data:/data
nginx:
image: nginx:alpine
ports:
- "80:80"
volumes:
- ./nginx.conf:/etc/nginx/nginx.conf
depends_on:
- backend
- frontend
volumes:
postgres_data:
  redis_data:
Nginx Configuration
nginx
upstream backend {
server backend:8000;
}
upstream frontend {
server frontend:3000;
}
server {
listen 80;
server_name localhost;
location /api/ {
proxy_pass http://backend;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
}
location / {
proxy_pass http://frontend;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
}
}
Project Optimization
Performance Optimization
1. Caching Strategy
python
import hashlib
class CachedAnalysisService(AnalysisService):
def __init__(self, skill_service: SkillService):
super().__init__(skill_service)
self.cache = {}
async def analyze_text(
self,
text: str,
analysis_types: List[str]
) -> Dict[str, Any]:
cache_key = self._generate_cache_key(text, analysis_types)
if cache_key in self.cache:
return self.cache[cache_key]
result = await super().analyze_text(text, analysis_types)
self.cache[cache_key] = result
return result
def _generate_cache_key(self, text: str, analysis_types: List[str]) -> str:
key_str = f"{text}:{','.join(sorted(analysis_types))}"
        return hashlib.md5(key_str.encode()).hexdigest()
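The in-memory dict above grows without bound and is lost on restart. Since the stack already includes Redis, a hedged variant could persist cache entries there; this sketch assumes the redis Python package (with its asyncio client) and JSON-serializable analysis results.
python
import hashlib
import json
from typing import Any, Dict, List
import redis.asyncio as redis

class RedisCachedAnalysisService(AnalysisService):
    def __init__(self, skill_service: SkillService, redis_url: str = "redis://redis:6379"):
        super().__init__(skill_service)
        self.redis = redis.from_url(redis_url)

    async def analyze_text(self, text: str, analysis_types: List[str]) -> Dict[str, Any]:
        key_str = f"{text}:{','.join(sorted(analysis_types))}"
        cache_key = "analysis:" + hashlib.md5(key_str.encode()).hexdigest()
        cached = await self.redis.get(cache_key)
        if cached:
            return json.loads(cached)
        result = await super().analyze_text(text, analysis_types)
        await self.redis.set(cache_key, json.dumps(result), ex=3600)  # cache for one hour
        return result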
2. Asynchronous Processing
python
import uuid
from fastapi import BackgroundTasks
@router.post("/analyze-async")
async def analyze_text_async(
request: AnalysisRequest,
background_tasks: BackgroundTasks,
service: AnalysisService = Depends(get_analysis_service)
):
task_id = str(uuid.uuid4())
background_tasks.add_task(
service.analyze_text,
request.text,
request.analysis_types
)
return {
"task_id": task_id,
"status": "processing",
"message": "Analysis started in background"
    }
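As written, the background result is discarded once computed. A minimal way to make the returned task_id useful is to store results in a shared structure and expose a status endpoint; the sketch below uses an in-memory dict purely for illustration (a real deployment would use Redis or the database), and /analyze-async would schedule run_and_store instead of calling service.analyze_text directly.
python
# In-memory task store, illustration only
task_results: dict = {}

async def run_and_store(task_id: str, service: AnalysisService, text: str, analysis_types: list):
    task_results[task_id] = await service.analyze_text(text, analysis_types)

@router.get("/analyze-async/{task_id}")
async def get_async_result(task_id: str):
    if task_id not in task_results:
        return {"task_id": task_id, "status": "processing"}
    return {"task_id": task_id, "status": "done", "result": task_results[task_id]}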
Monitoring and Logging
Logging Configuration
python
import logging
from logging.handlers import RotatingFileHandler
def setup_logging():
logger = logging.getLogger()
logger.setLevel(logging.INFO)
file_handler = RotatingFileHandler(
'app.log',
maxBytes=1024 * 1024,
backupCount=5
)
file_handler.setFormatter(
logging.Formatter(
'%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)
)
    logger.addHandler(file_handler)
Performance Monitoring
python
import time
from prometheus_client import Counter, Histogram
REQUEST_COUNT = Counter('requests_total', 'Total requests')
REQUEST_LATENCY = Histogram('request_latency_seconds', 'Request latency')
async def monitor_request(request, call_next):
start_time = time.time()
REQUEST_COUNT.inc()
response = await call_next(request)
latency = time.time() - start_time
REQUEST_LATENCY.observe(latency)
response.headers["X-Process-Time"] = str(latency)
    return response
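For the middleware to have any effect it still has to be attached to the FastAPI app, and the collected metrics need to be exposed somewhere Prometheus can scrape them. One way, assuming this lives alongside the app created in main.py:
python
from prometheus_client import make_asgi_app

# Attach the timing middleware to every HTTP request
app.middleware("http")(monitor_request)
# Expose collected metrics at /metrics for Prometheus to scrape
app.mount("/metrics", make_asgi_app())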
Future Directions
Evolution of the Skills Standard
Possible improvements:
- Richer type system: support for more complex data types
- Async support: native support for asynchronous operations
- Streaming: support for streaming data processing
- Multi-language support: better coverage across natural and programming languages
- Performance: more efficient execution mechanisms
Ecosystem Expansion
Directions:
- Broader tooling: integration with more development tools
- Better IDE support: richer editor integrations
- Automated testing: automated testing and validation of skills
- AI-assisted development: AI assistance for authoring Skills
- Cross-platform support: more platforms and languages
Summary
This Skills module summary and project wraps up our study of developing against the Skills standard. In this lesson we:
- Reviewed the core knowledge of the Skills module
- Completed the Intelligent Text Analysis Platform project
- Implemented the full backend and frontend
- Configured the deployment environment
- Provided performance optimization options
- Looked ahead at where Skills may go next
Through this project we combined the Skills standard, documentation-driven development, capability discovery, and dependency management to build a complete AI application platform.
