# Day 30: Documentation-Driven Development for Skills

## Learning Objectives

By the end of this section you should be able to:

- Understand the documentation-driven development philosophy
- Structure documents effectively with Markdown
- Write a professional skill.md
- Generate code from documentation
- Use advanced Markdown features
- Use documentation generation tools
## Core Content

### The Documentation-Driven Development Philosophy

#### What Is Documentation-Driven Development?

Documentation-Driven Development (DDD) is a methodology in which documentation is written before code. In the Skills standard, documentation-driven development means:

- **Documentation first**: Write the skill.md file first, defining the skill's functionality, parameters, return values, and so on
- **Documentation as code**: The documentation is part of the codebase and can be used to generate code scaffolding
- **Documentation in sync**: Documentation and code stay in sync; when the documentation is updated, the code is updated with it
- **Automatic generation**: Code scaffolding, test code, and more can be generated automatically from the documentation
#### Benefits of Documentation-Driven Development

**Better readability**:

- Documentation is human-readable and easy to understand
- A skill's functionality can be understood without reading its code
- Easier team collaboration and knowledge transfer

**Easier maintenance**:

- Documentation and code live in the same place
- Documentation is updated alongside feature changes
- Fewer inconsistencies between documentation and code

**Automatic generation**:

- Code scaffolding can be generated from the documentation
- Test code can be generated
- API documentation can be generated

**Version control**:

- Documentation is kept under version control
- Documentation changes can be tracked over time
- Rollbacks and comparisons are straightforward
#### Documentation-Driven Development Workflow

```
1. Requirements analysis
        ↓
2. Write the documentation (skill.md)
        ↓
3. Documentation review
        ↓
4. Generate the code scaffolding
        ↓
5. Implement the code
        ↓
6. Write tests
        ↓
7. Keep documentation and code in sync
        ↓
8. Release
```

### Markdown Structuring Techniques
#### Front Matter

Front matter is the metadata block at the top of a Markdown file, written in YAML:

```yaml
---
name: "my-skill"
version: "1.0.0"
author: "Your Name <email@example.com>"
description: "A brief description"
tags: ["tag1", "tag2"]
license: "MIT"
python_requires: ">=3.8"
dependencies:
  - requests>=2.28.0
  - pandas>=1.5.0
---
```

**Front matter best practices**:

- Use kebab-case for the skill name
- Follow Semantic Versioning
- Include the author's email for contact
- Use tags to categorize the skill
- State the license explicitly
- Specify the required Python version
- List all dependencies
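Several of these best practices can be checked mechanically. Below is a minimal, illustrative validator — the rules it encodes (kebab-case names, three-part semantic versions, a handful of required fields) are our reading of the guidelines above, not part of any formal specification:

```python
import re

import yaml  # PyYAML


KEBAB_CASE = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")
SEMVER = re.compile(r"^\d+\.\d+\.\d+$")


def validate_front_matter(text: str) -> list:
    """Return a list of problems found in a YAML front-matter block."""
    data = yaml.safe_load(text) or {}
    problems = []
    if not KEBAB_CASE.match(data.get("name", "")):
        problems.append("name should be kebab-case, e.g. 'my-skill'")
    if not SEMVER.match(data.get("version", "")):
        problems.append("version should follow Semantic Versioning, e.g. '1.0.0'")
    for key in ("author", "description", "license"):
        if not data.get(key):
            problems.append(f"missing required field: {key}")
    return problems


print(validate_front_matter('name: "My Skill"\nversion: "1.0"'))
```

A name like `"My Skill"` fails the kebab-case check and `"1.0"` fails the semver check, so the call above reports both, plus the three missing fields.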
#### Heading Levels

Use Markdown heading levels to structure content:

```markdown
# Level-1 heading
## Level-2 heading
### Level-3 heading
#### Level-4 heading
##### Level-5 heading
```

**Heading guidelines**:

- Level 1: the document title
- Level 2: major sections
- Level 3: subsections
- Level 4: detailed explanations
- Level 5: supplementary information
#### Lists

Use lists to organize information.

**Unordered list**:

```markdown
- Item 1
- Item 2
  - Sub-item 2.1
  - Sub-item 2.2
- Item 3
```

**Ordered list**:

```markdown
1. First step
2. Second step
3. Third step
```

**Task list**:

```markdown
- [x] Completed task
- [ ] Pending task
```

#### Tables
Use tables to present structured data:

```markdown
| Parameter | Type | Required | Description | Default |
|-----------|------|----------|-------------|---------|
| param1 | string | Yes | Description of param1 | - |
| param2 | integer | No | Description of param2 | 10 |
| param3 | boolean | No | Description of param3 | false |
```

**Table best practices**:

- Use clear column names
- Keep tables concise
- Use appropriate alignment
- Add explanatory notes where needed
#### Code Blocks

Use fenced code blocks to present code:

```python
def hello_world():
    print("Hello, World!")
```

```bash
pip install my-skill
```

**Code block best practices**:

- Specify the language for syntax highlighting
- Use proper indentation
- Add comments where necessary
- Keep examples concise
#### Links and References

Use links and references to enrich documentation:

```markdown
[Link text](https://example.com)
[Reference link][link-id]

[link-id]: https://example.com "Link title"

> Quoted text
> spanning multiple lines
```

#### Images and Diagrams
Use images and diagrams to improve readability:

```markdown
![Image description](https://example.com/image.png)
![Architecture diagram](./images/architecture.png)
[![Clickable image](https://example.com/image.png)](https://example.com)
```

### Advanced Features of skill.md
#### Admonitions (Custom Containers)

Use container markers to call out important content. Note that these are renderer-specific extensions, supported by tools such as VitePress and Docusaurus rather than by plain Markdown:

```markdown
:::info
This is an informational note
:::

:::warning
This is a warning
:::

:::danger
This is a danger notice
:::

:::note
This is a note
:::
```

#### Tabs
Use tabs to group related content (also a renderer-specific extension):

```markdown
:::tabs
=== Tab 1
Content of tab 1
=== Tab 2
Content of tab 2
=== Tab 3
Content of tab 3
:::
```

#### Collapsible Content
Use collapsible sections to hide detailed information:

```markdown
<details>
<summary>Click to expand</summary>

Detailed content

</details>
```

#### Mermaid Diagrams
Use Mermaid to draw flowcharts and sequence diagrams:

```mermaid
graph TD
    A[Start] --> B[Process]
    B --> C[End]
```

```mermaid
sequenceDiagram
    participant A as User
    participant B as System
    A->>B: Request
    B-->>A: Response
```

### Code Generation Tools
#### Template-Based Generation

Use a template engine to generate code from the documentation:

```python
from jinja2 import Template
import yaml


def parse_front_matter(content: str):
    """Split skill.md content into its YAML front matter and Markdown body."""
    lines = content.split('\n')
    if lines[0] != '---':
        return {}, content
    end_index = lines[1:].index('---') + 1
    front_matter_text = '\n'.join(lines[1:end_index])
    body = '\n'.join(lines[end_index + 1:])
    front_matter = yaml.safe_load(front_matter_text)
    return front_matter, body


def generate_code_from_skill(skill_md_path: str, template_path: str, output_path: str):
    """Render a Jinja2 template using the front-matter fields of a skill.md."""
    with open(skill_md_path, 'r') as f:
        content = f.read()
    front_matter, _ = parse_front_matter(content)

    with open(template_path, 'r') as f:
        template = Template(f.read())

    code = template.render(**front_matter)
    with open(output_path, 'w') as f:
        f.write(code)
```
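As a quick end-to-end illustration of the same idea without any file I/O, the sketch below renders a tiny class-skeleton template directly from a front-matter dictionary. The template text itself is invented for this example:

```python
from jinja2 import Template

# A minimal stub template; the variables match the front-matter
# fields used throughout this section.
TEMPLATE = '''\
class Skill:
    """{{ description }} (v{{ version }})"""
    NAME = "{{ name }}"
'''

front_matter = {
    "name": "my-skill",
    "version": "1.0.0",
    "description": "A brief description",
}

code = Template(TEMPLATE).render(**front_matter)
print(code)
```

Rendering substitutes each `{{ ... }}` placeholder, producing a valid Python class definition whose docstring and `NAME` constant come straight from the documentation.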
#### Generating Models with Pydantic

Use Pydantic to build a data model from the documentation:

```python
from pydantic import BaseModel, Field

# parse_front_matter is defined in the template-based example above


def generate_model_from_skill(skill_md_path: str):
    """Build a Pydantic model whose defaults come from the front matter."""
    with open(skill_md_path, 'r') as f:
        content = f.read()
    front_matter, _ = parse_front_matter(content)

    class SkillParameters(BaseModel):
        name: str = Field(default=front_matter.get('name'))
        version: str = Field(default=front_matter.get('version'))
        description: str = Field(default=front_matter.get('description'))

    return SkillParameters
```
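Hard-coding the field names works for a fixed schema; Pydantic's `create_model` can instead build the class dynamically from whatever keys the front matter contains. A sketch, under the simplifying assumption that every front-matter value should become a string field:

```python
from pydantic import create_model

front_matter = {
    "name": "my-skill",
    "version": "1.0.0",
    "description": "A brief description",
}

# Each front-matter key becomes a str field whose default is the
# documented value; callers may override any of them.
SkillParameters = create_model(
    "SkillParameters",
    **{key: (str, value) for key, value in front_matter.items()},
)

params = SkillParameters()
print(params.name, params.version)
```

This keeps the generated model in lockstep with the documentation: adding a front-matter key adds a model field, with no generator code to update.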
#### Generating a JSON Schema

Generate a JSON Schema from the parameter table in the documentation:

```python
import json


def parse_parameters(content: str):
    """Extract parameter definitions from a Markdown table.

    Expects columns in the order: Name | Type | Required | Description | Default.
    """
    parameters = []
    for line in content.split('\n'):
        if not line.startswith('|'):
            continue
        parts = [p.strip() for p in line.split('|')]
        # Skip short rows, the header row, and the |---|---| separator row
        if len(parts) < 6 or parts[1] in ('', 'Parameter') or set(parts[1]) <= {'-', ':'}:
            continue
        parameters.append({
            "name": parts[1],
            "type": parts[2],
            "required": parts[3] == 'Yes',
            "description": parts[4],
            "default": parts[5] if len(parts) > 6 else None,
        })
    return parameters


def generate_json_schema_from_skill(skill_md_path: str, output_path: str):
    """Write a JSON Schema built from the skill.md parameter table."""
    with open(skill_md_path, 'r') as f:
        content = f.read()
    parameters = parse_parameters(content)

    schema = {"type": "object", "properties": {}, "required": []}
    for param in parameters:
        schema["properties"][param["name"]] = {
            "type": param["type"],
            "description": param["description"],
        }
        if param["required"]:
            schema["required"].append(param["name"])
        if param.get("default") not in (None, '-'):
            schema["properties"][param["name"]]["default"] = param["default"]

    with open(output_path, 'w') as f:
        json.dump(schema, f, indent=2)
```
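One payoff of emitting a JSON Schema is that incoming parameters can then be validated with an off-the-shelf library. The sketch below uses the third-party `jsonschema` package (not otherwise used in this section) against a hand-written schema shaped like the generator's output:

```python
from jsonschema import validate, ValidationError

# A schema shaped like what generate_json_schema_from_skill produces,
# written out by hand here for illustration.
schema = {
    "type": "object",
    "properties": {
        "text": {"type": "string", "description": "Text to analyze"},
        "max_keywords": {"type": "integer", "default": 10},
    },
    "required": ["text"],
}

validate({"text": "hello", "max_keywords": 5}, schema)  # passes silently

try:
    validate({"max_keywords": 5}, schema)  # 'text' is missing
except ValidationError as e:
    print("invalid:", e.message)
```

Validation errors are raised before the skill's own logic runs, so the documentation's parameter table effectively becomes the runtime contract.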
### Documentation Generation Tools

#### Skill Documentation Generator

````python
import os
import yaml
from pathlib import Path
from typing import Dict


class SkillDocumentationGenerator:
    """Generate a README, API docs, and examples from a skill.md file."""

    def __init__(self, skill_path: str):
        self.skill_path = Path(skill_path)
        self.skill_md_path = self.skill_path / "skill.md"
        self.output_path = self.skill_path / "docs"

    def generate(self):
        os.makedirs(self.output_path, exist_ok=True)
        front_matter, content = self._parse_skill_md()
        self._generate_readme(front_matter, content)
        self._generate_api_doc(front_matter, content)
        self._generate_examples(front_matter, content)

    def _parse_skill_md(self) -> tuple:
        with open(self.skill_md_path, 'r', encoding='utf-8') as f:
            content = f.read()
        lines = content.split('\n')
        if lines[0] != '---':
            return {}, content
        end_index = lines[1:].index('---') + 1
        front_matter_text = '\n'.join(lines[1:end_index])
        body = '\n'.join(lines[end_index + 1:])
        front_matter = yaml.safe_load(front_matter_text)
        return front_matter, body

    def _generate_readme(self, front_matter: Dict, content: str):
        readme_content = f"""# {front_matter.get('name', 'Skill')}

{front_matter.get('description', '')}

## Installation

```bash
pip install {front_matter.get('name', 'skill')}
```

## Usage

See skill.md for detailed usage.

## License

{front_matter.get('license', '')}
"""
        with open(self.skill_path / "README.md", 'w', encoding='utf-8') as f:
            f.write(readme_content)

    def _generate_api_doc(self, front_matter: Dict, content: str):
        # The original text was truncated here; this minimal version
        # copies the Parameters section of skill.md into docs/api.md.
        api_doc_content = f"""# API Documentation

## Parameters

{self._extract_section(content, 'Parameters')}
"""
        with open(self.output_path / "api.md", 'w', encoding='utf-8') as f:
            f.write(api_doc_content)

    def _generate_examples(self, front_matter: Dict, content: str):
        examples_content = self._extract_section(content, 'Examples')
        with open(self.output_path / "examples.md", 'w', encoding='utf-8') as f:
            f.write(examples_content)

    def _extract_section(self, content: str, section_name: str) -> str:
        lines = content.split('\n')
        section_lines = []
        in_section = False
        for line in lines:
            if line.startswith(f'## {section_name}'):
                in_section = True
            elif in_section and line.startswith('## '):
                break
            elif in_section:
                section_lines.append(line)
        return '\n'.join(section_lines)


# Usage example
generator = SkillDocumentationGenerator("path/to/skill")
generator.generate()
````
#### Code Framework Generator

```python
import os
import yaml
from pathlib import Path
from typing import Dict, List


class CodeFrameworkGenerator:
    """Generate a Python code skeleton from a skill.md file."""

    def __init__(self, skill_path: str):
        self.skill_path = Path(skill_path)
        self.skill_md_path = self.skill_path / "skill.md"
        self.src_path = self.skill_path / "src"

    def generate(self):
        os.makedirs(self.src_path, exist_ok=True)
        front_matter, content = self._parse_skill_md()
        self._generate_main_py(front_matter, content)
        self._generate_init_py(front_matter)
        self._generate_test_py(front_matter, content)

    def _parse_skill_md(self) -> tuple:
        with open(self.skill_md_path, 'r', encoding='utf-8') as f:
            content = f.read()
        lines = content.split('\n')
        if lines[0] != '---':
            return {}, content
        end_index = lines[1:].index('---') + 1
        front_matter_text = '\n'.join(lines[1:end_index])
        body = '\n'.join(lines[end_index + 1:])
        front_matter = yaml.safe_load(front_matter_text)
        return front_matter, body

    def _generate_main_py(self, front_matter: Dict, content: str):
        skill_name = front_matter.get('name', 'skill').replace('-', '_')
        class_name = ''.join(word.capitalize() for word in skill_name.split('_'))
        parameters = self._parse_parameters(content)

        main_py_content = f'''from typing import Dict, Any, Optional
from pydantic import BaseModel, Field


class SkillParameters(BaseModel):
    """Skill parameters"""
{self._generate_parameter_fields(parameters)}


class {class_name}:
    def __init__(self):
        """Initialize {class_name}"""
        pass

    def execute(self, params: SkillParameters) -> Dict[str, Any]:
        """
        Execute skill with given parameters.

        Args:
            params: SkillParameters containing input parameters

        Returns:
            Dict containing success status, data, metadata, and error
        """
        try:
            result = self._process(params)
            return {{
                "success": True,
                "data": result,
                "metadata": {{}},
                "error": None
            }}
        except Exception as e:
            return {{
                "success": False,
                "data": None,
                "metadata": {{}},
                "error": str(e)
            }}

    def _process(self, params: SkillParameters) -> Any:
        """
        Process the skill logic.

        Args:
            params: SkillParameters

        Returns:
            Processed result
        """
        # Implement your skill logic here
        pass


if __name__ == "__main__":
    skill = {class_name}()
    # Test the skill
    print("Skill initialized successfully")
'''
        with open(self.src_path / "main.py", 'w', encoding='utf-8') as f:
            f.write(main_py_content)

    def _generate_parameter_fields(self, parameters: List[Dict]) -> str:
        fields = []
        for param in parameters:
            field_def = f'    {param["name"]}: {self._get_python_type(param["type"])}'
            if param.get("required"):
                field_def += ' = Field(...'
            else:
                default = param.get("default", "None")
                field_def += f' = Field(default={default}'
            field_def += f', description="{param["description"]}")'
            fields.append(field_def)
        return '\n'.join(fields)

    def _get_python_type(self, param_type: str) -> str:
        type_mapping = {
            'string': 'str',
            'integer': 'int',
            'number': 'float',
            'boolean': 'bool',
            'array': 'List[Any]',
            'object': 'Dict[str, Any]'
        }
        return type_mapping.get(param_type, 'Any')

    def _parse_parameters(self, content: str) -> List[Dict]:
        parameters = []
        for line in content.split('\n'):
            if not line.startswith('|'):
                continue
            parts = [p.strip() for p in line.split('|')]
            # Skip short rows, the header row, and the |---|---| separator row
            if len(parts) < 6 or parts[1] in ('', 'Parameter') or set(parts[1]) <= {'-', ':'}:
                continue
            parameters.append({
                "name": parts[1],
                "type": parts[2],
                "required": parts[3] == 'Yes',
                "description": parts[4],
                "default": parts[5] if len(parts) > 6 else None
            })
        return parameters

    def _generate_init_py(self, front_matter: Dict):
        skill_name = front_matter.get('name', 'skill').replace('-', '_')
        class_name = ''.join(word.capitalize() for word in skill_name.split('_'))
        # The original text omitted this template; a minimal re-export
        init_py_content = f'from .main import {class_name}, SkillParameters\n'
        with open(self.src_path / "__init__.py", 'w', encoding='utf-8') as f:
            f.write(init_py_content)

    def _generate_test_py(self, front_matter: Dict, content: str):
        skill_name = front_matter.get('name', 'skill').replace('-', '_')
        class_name = ''.join(word.capitalize() for word in skill_name.split('_'))

        test_py_content = f'''import unittest
from {skill_name} import {class_name}, SkillParameters


class Test{class_name}(unittest.TestCase):
    def setUp(self):
        self.skill = {class_name}()

    def test_execute_success(self):
        """Test successful execution"""
        params = SkillParameters(
            # Add required parameters here
        )
        result = self.skill.execute(params)
        self.assertTrue(result["success"])
        self.assertIsNotNone(result["data"])
        self.assertIsNone(result["error"])


if __name__ == "__main__":
    unittest.main()
'''
        os.makedirs(self.skill_path / "tests", exist_ok=True)
        with open(self.skill_path / "tests" / "test_skill.py", 'w', encoding='utf-8') as f:
            f.write(test_py_content)


# Usage example
generator = CodeFrameworkGenerator("path/to/skill")
generator.generate()
```

## Practical Tasks
### Task 1: Write a Professional skill.md

Write a professional skill.md file using advanced Markdown features.

**Steps**:

1. Write the front matter
2. Structure the content with heading levels
3. Present parameters in tables
4. Show examples in code blocks
5. Add admonitions for important notes
6. Organize related content with tabs

**Deliverables**:

- A professional skill.md file
- Advanced Markdown features put to use
- A clear structure that is easy to read

### Task 2: Implement a Code Generation Tool

Implement a simple code generation tool.

**Steps**:

1. Parse the skill.md file
2. Extract the parameter definitions
3. Generate the code scaffolding
4. Generate the test code
5. Test the generated code

**Deliverables**:

- A working code generation tool
- The generated code scaffolding
- The generated test code

### Task 3: Improve the Documentation Structure

Improve the structure of an existing skill.md.

**Steps**:

1. Analyze the current document structure
2. Identify areas for improvement
3. Refine the heading hierarchy
4. Add explanations where needed
5. Improve readability

**Deliverables**:

- The improved skill.md file
- Notes describing the improvements
- A before/after comparison
## Code Examples

### Example 1: A Complete skill.md File

````markdown
---
name: "advanced-text-analyzer"
version: "2.0.0"
author: "Your Name <email@example.com>"
description: "Advanced text analysis with multiple capabilities"
tags: ["text", "analysis", "nlp", "ai"]
license: "MIT"
python_requires: ">=3.8"
dependencies:
  - pydantic>=2.0.0
  - requests>=2.28.0
  - numpy>=1.24.0
---

# Advanced Text Analyzer Skill

## Description

:::info
This skill provides comprehensive text analysis capabilities including sentiment analysis, keyword extraction, text summarization, and entity recognition.
:::

### Features

- **Sentiment Analysis**: Determine the emotional tone of text
- **Keyword Extraction**: Identify important keywords and phrases
- **Text Summarization**: Generate concise summaries
- **Entity Recognition**: Extract named entities including people, organizations, and locations

### Use Cases

This skill is suitable for:

- Analyzing customer feedback and reviews
- Processing social media content
- Summarizing articles and documents
- Extracting insights from text data
- Building content recommendation systems

### Limitations

:::warning
The following limitations should be noted:

- Sentiment analysis is rule-based and may not be accurate for complex or sarcastic text
- Keyword extraction does not consider word context or semantic relationships
- Summarization is extractive and may miss important information
- Entity recognition accuracy depends on text quality and domain
:::

## Parameters

| Parameter | Type | Required | Description | Default | Constraints |
|-----------|------|----------|-------------|---------|-------------|
| text | string | Yes | Text to analyze | - | minLength: 1, maxLength: 10000 |
| analysis_type | string | Yes | Type of analysis to perform | - | enum: ["sentiment", "keywords", "summary", "entities"] |
| options | object | No | Additional options for analysis | {} | - |

### options

:::tabs
=== Sentiment Options

| Property | Type | Required | Description | Default |
|----------|------|----------|-------------|---------|
| model | string | No | Sentiment model to use | "rule-based" |
| include_scores | boolean | No | Include confidence scores | false |

=== Keyword Options

| Property | Type | Required | Description | Default | Constraints |
|----------|------|----------|-------------|---------|-------------|
| max_keywords | integer | No | Maximum number of keywords | 10 | minimum: 1, maximum: 50 |
| min_word_length | integer | No | Minimum word length | 3 | minimum: 2, maximum: 10 |
| include_scores | boolean | No | Include keyword scores | true | - |

=== Summary Options

| Property | Type | Required | Description | Default | Constraints |
|----------|------|----------|-------------|---------|-------------|
| max_sentences | integer | No | Maximum sentences in summary | 3 | minimum: 1, maximum: 10 |
| min_length | integer | No | Minimum summary length | 50 | minimum: 10, maximum: 500 |

=== Entity Options

| Property | Type | Required | Description | Default | Constraints |
|----------|------|----------|-------------|---------|-------------|
| entity_types | array | No | Entity types to extract | ["PERSON", "ORG", "LOC"] | enum: ["PERSON", "ORG", "LOC", "DATE", "MISC"] |
| include_confidence | boolean | No | Include confidence scores | false | - |
:::

## Returns

Returns a dictionary containing:

- `success` (boolean): Whether the operation was successful
- `data` (any): The analysis results
- `metadata` (object): Additional metadata
- `error` (string): Error message if the operation failed

### Success Response

```json
{
  "success": true,
  "data": {
    "sentiment": "positive",
    "score": 0.85,
    "confidence": 0.92
  },
  "metadata": {
    "analysis_type": "sentiment",
    "text_length": 100,
    "word_count": 20,
    "processing_time": 0.05
  },
  "error": null
}
```

### Error Response

```json
{
  "success": false,
  "data": null,
  "metadata": {},
  "error": "Invalid analysis type: invalid_type"
}
```

## Examples

### Example 1: Sentiment Analysis

```python
from advanced_text_analyzer import AdvancedTextAnalyzer

analyzer = AdvancedTextAnalyzer()
result = analyzer.analyze(
    text="This is an amazing product! I love it!",
    analysis_type="sentiment",
    options={
        "model": "rule-based",
        "include_scores": True
    }
)
print(result)
```

Output:

```json
{
  "success": true,
  "data": {
    "sentiment": "positive",
    "score": 0.85,
    "confidence": 0.92,
    "positive_words": ["amazing", "love"],
    "negative_words": []
  },
  "metadata": {
    "analysis_type": "sentiment",
    "text_length": 45,
    "word_count": 9,
    "processing_time": 0.03
  },
  "error": null
}
```

### Example 2: Keyword Extraction

```python
from advanced_text_analyzer import AdvancedTextAnalyzer

analyzer = AdvancedTextAnalyzer()
result = analyzer.analyze(
    text="AI and machine learning are transforming the world of technology.",
    analysis_type="keywords",
    options={
        "max_keywords": 5,
        "min_word_length": 3,
        "include_scores": True
    }
)
print(result)
```

Output:

```json
{
  "success": true,
  "data": {
    "keywords": ["AI", "machine", "learning", "transforming", "world"],
    "scores": [0.2, 0.2, 0.2, 0.13, 0.13],
    "counts": [1, 1, 1, 1, 1]
  },
  "metadata": {
    "analysis_type": "keywords",
    "text_length": 68,
    "word_count": 10,
    "processing_time": 0.04
  },
  "error": null
}
```

### Example 3: Text Summarization

```python
from advanced_text_analyzer import AdvancedTextAnalyzer

analyzer = AdvancedTextAnalyzer()
result = analyzer.analyze(
    text="This is a long text that needs to be summarized. It contains multiple sentences. Each sentence provides important information. The summary should capture the main points.",
    analysis_type="summary",
    options={
        "max_sentences": 2,
        "min_length": 50
    }
)
print(result)
```

Output:

```json
{
  "success": true,
  "data": {
    "summary": "This is a long text that needs to be summarized. It contains multiple sentences.",
    "original_length": 178,
    "summary_length": 72,
    "compression_ratio": 0.4
  },
  "metadata": {
    "analysis_type": "summary",
    "text_length": 178,
    "word_count": 32,
    "processing_time": 0.05
  },
  "error": null
}
```

## Architecture

```mermaid
graph TD
    A[Input Text] --> B[Preprocessing]
    B --> C{Analysis Type}
    C -->|Sentiment| D[Sentiment Analyzer]
    C -->|Keywords| E[Keyword Extractor]
    C -->|Summary| F[Summarizer]
    C -->|Entities| G[Entity Recognizer]
    D --> H[Result Formatter]
    E --> H
    F --> H
    G --> H
    H --> I[Output]
```

## Performance

| Analysis Type | Avg. Processing Time | Accuracy |
|---------------|----------------------|----------|
| Sentiment | 0.03s | 85% |
| Keywords | 0.04s | 80% |
| Summary | 0.05s | 75% |
| Entities | 0.06s | 70% |
````
## Summary

Documentation-driven development is a core philosophy of the Skills standard: by writing documentation before code, we improve readability and maintainability. In this section we covered:

- The documentation-driven development philosophy
- Markdown structuring techniques
- Advanced features of skill.md
- Code generation tools
- Documentation generation tools

Next, we will look at Skills capability discovery and learn how to implement automatic discovery and loading of skills.
## References

- [Markdown Guide](https://www.markdownguide.org/)
- [YAML Specification](https://yaml.org/spec/)
- [Jinja2 Documentation](https://jinja.palletsprojects.com/)
- [Pydantic Documentation](https://docs.pydantic.dev/)
- [Mermaid Documentation](https://mermaid-js.github.io/)
