Initialize ai-video-skills repository with core skills and preview assets.
Co-authored-by: Cursor <cursoragent@cursor.com>
commit 04075dee76
21
LICENSE
Normal file
@ -0,0 +1,21 @@
MIT License

Copyright (c) 2026

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
27
README.md
Normal file
@ -0,0 +1,27 @@
|
||||
# AI Video Skills

A collection of skills for automated AI video creation, focused on the `HyperFrames` workflow.
Goal of this repository: let creators reuse the skill documents, scripts, and examples directly to produce consistent videos quickly.

## Included Skills

- `skills/ai-tech-news-video`: AI news-flash ("information gap") videos
  - Preview:
- `skills/product-intro-video`: product introduction videos
  - Preview:
- `skills/sound-fx-for-video`: sound-effect search, download, and compositing
  - Preview: none yet
- `skills/sketch-animation-video`: stick-figure / line-art animation videos
  - Preview:

## Account

- Name: 拓扑同学
- Platform: Xiaohongshu (小红书)
- Xiaohongshu ID: 26431840972
- QR code: `docs/assets/xiaohongshu-profile-qr.png`
|
||||
BIN
docs/assets/preview-ai-tech-news-video.gif
Normal file
Binary file not shown. Size: 2.5 MiB
BIN
docs/assets/preview-product-intro-video.gif
Normal file
Binary file not shown. Size: 4.2 MiB
BIN
docs/assets/preview-sketch-animation-video.gif
Normal file
Binary file not shown. Size: 627 KiB
BIN
docs/assets/preview-sound-fx-for-video.gif
Normal file
Binary file not shown. Size: 753 KiB
BIN
docs/assets/xiaohongshu-profile-qr.png
Normal file
Binary file not shown. Size: 94 KiB
430
skills/ai-tech-news-video/SKILL.md
Normal file
@ -0,0 +1,430 @@
|
||||
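---
name: ai-tech-news-video
description: "For producing AI tech news-flash videos. Covers: news search and selection, real asset download, edge-tts voiceover, HyperFrames scene layout, audio and subtitle sync, and final render/export. Trigger phrases: AI快报, 科技资讯视频, AI新闻视频, news flash video, weekly AI digest."
---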
|
||||
|
||||
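# AI Tech News Flash Video Skill

Produce professional AI news-flash videos with `HyperFrames + edge-tts + asset search`.

## Workflow Overview

```
1. Search and select news → 2. Collect real assets → 3. Generate voiceover → 4. Write the HyperFrames page → 5. Pre-render sound effects → 6. Render and export
```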
|
||||
|
||||
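## Default Specs

- **Default aspect ratio: 16:9 landscape** at `1920x1080`, suited to Bilibili, YouTube, web embeds, landscape players, and presentations.
- **Do not default to 3:4 or 9:16.** Switch to `1080x1920` only when the user explicitly says "Douyin / Xiaohongshu / WeChat Channels / Shorts / Reels / vertical / mobile-first".
- The default visual language is **news images or video footage + lively explanatory animation + subtitles**. Do not lay the whole news script out like a slide deck.
- **Sentence-by-sentence voiceover subtitles are mandatory.** Every voiceover clip needs matching, readable subtitles that appear in step with the narration. Titles, info bars, on-screen copy, and card summaries are not a substitute.
- Subtitles go in the bottom safe area by default, on a semi-transparent dark backing or with an outline for legibility. They must not cover the core headline of a news screenshot, a product UI, tables, or faces.
- Subtitle text may be slightly shorter than the narration but must cover the key information of each clip. Aim for 1-2 lines per screen, at most 22 Chinese characters per line.
- **Fixed short opening line**: `X月X日的 AI 信息差来了。`, then go straight into the first item. No warm-up, no self-introduction, no explanation of the show.
- **No unrelated wrap-up, follow prompts, or self-introduction by default**: when the last item is done, end immediately with a fade-out, a hard cut, or a 0.5-1 second visual close. Do not add lines such as "以上就是本期", "总结一下", or "关注我".
- **Hard requirement: every news item must show real assets.** Real assets include news page screenshots, official blog or announcement screenshots, product website screenshots, launch-event footage, official company or product images, media photos, publicly licensed photos, and verifiable real UI screenshots. Homemade fake logos, fake product UIs, plain icons, abstract geometry, or AI-fabricated screenshots cannot stand in for them.
- **Asset priority: official screenshot or announcement page > news report screenshot > real product UI or company image > licensed stock image.** Stock images may serve only as background or atmosphere, never as the sole visual evidence for an item.
- **No real asset, no final render.** If no usable real asset can be found for an item, pick another story, keep searching, or tell the user plainly that this item cannot be made into a valid news card. Do not quietly paper over it with a fake UI.
- Give each item its own visual structure and animation metaphor so viewers grasp the change at a glance. For example, for "an agent moves onto the desktop", animate a small agent travelling from a web page window to local files, desktop windows, and task cards, instead of showing one image next to a block of text.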
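The subtitle limits above (1-2 lines per screen, at most 22 characters per line) can be sketched as a small helper. This is only an illustration: `wrap_subtitle` is a hypothetical name, it counts every character (not only Chinese ones), and it splits at fixed widths without regard to punctuation.

```python
def wrap_subtitle(text: str, max_chars: int = 22, max_lines: int = 2) -> list[list[str]]:
    """Split narration text into subtitle screens.

    Each screen holds at most `max_lines` lines, and each line holds at
    most `max_chars` characters. The split is a naive fixed-width cut.
    """
    lines = [text[i:i + max_chars] for i in range(0, len(text), max_chars)]
    return [lines[i:i + max_lines] for i in range(0, len(lines), max_lines)]


if __name__ == "__main__":
    for screen in wrap_subtitle("Anthropic 让 Claude 学会了做梦。Claude 可在会话间隙回顾历史,发现错误模式并自我改进。"):
        print(screen)
```

A production version would probably prefer to break at punctuation so that a line never ends mid-phrase.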
|
||||
|
||||
## Step 1: Search and Select AI News

### Recommended Sources (ordered by `web_fetch` availability)

| Source | URL | Fetchable with `web_fetch`? | Best For / Notes |
|--------|-----|-----------|----------|
| The Verge AI | theverge.com/ai-artificial-intelligence | ✅ Yes | Comprehensive AI coverage |
| Hacker News | news.ycombinator.com | ✅ Yes | Tech community trending |
| GitHub trending | github.com/trending | ✅ Yes | Dev tools & open source |
| Ars Technica | arstechnica.com/ai | ❌ JS-rendered | Skip unless using browser |
| 36kr | 36kr.com/information/AI | ❌ JS-rendered | Skip unless using browser |
| TechCrunch | techcrunch.com/category/artificial-intelligence | ❌ JS-rendered | Skip unless using browser |

**JS-rendered sites**: prefer the `browser` tool; do not rely on `web_fetch` alone.
|
||||
|
||||
### News Selection Rules
|
||||
|
||||
- Pick **4-5 items** per video (sweet spot for 60-90s)
|
||||
- Each item needs: **headline (≤20 chars)** + **one-line summary** + **category tag**
|
||||
- Balance categories: mix of features, company moves, infrastructure, legal, societal impact
|
||||
- Verify recency: all items should be from the **past 7 days**
|
||||
- Include at least one **China/domestic AI** item if audience is Chinese
|
||||
|
||||
### Category Tags
|
||||
|
||||
| Tag | Emoji | Color | CSS Class |
|
||||
|-----|-------|-------|-----------|
|
||||
| 新功能 | 🧠 | Yellow | `tag-feature` |
|
||||
| 公司动态 | 🚀 | Indigo | `tag-company` |
|
||||
| 基础设施 | ⚡ | Green | `tag-infra` |
|
||||
| 法律 | ⚖️ | Red | `tag-legal` |
|
||||
| 能源/社会 | 💡 | Blue | `tag-energy` |
|
||||
| 开源 | 📦 | Purple | `tag-opensource` |
|
||||
| 研究 | 🔬 | Cyan | `tag-research` |
|
||||
|
||||
## Step 2: Collect and Download Real News Assets
|
||||
|
||||
<HARD-GATE>
|
||||
Before writing HyperFrames HTML or rendering the final MP4, create an `assets/images/` folder and save at least one real, topic-matched visual asset for every selected news item. Also write an `assets/images/manifest.md` file that maps each news item to:
|
||||
|
||||
- asset filename
|
||||
- source page URL or citation label
|
||||
- asset type: official screenshot, news screenshot, product screenshot, public photo, or licensed stock support
|
||||
- why this asset proves or supports this specific news item
|
||||
|
||||
If any selected item has no real asset, stop and replace that news item or ask the user how to proceed. Do not substitute fake logos, fake dashboards, fake screenshots, generic icon cards, abstract AI art, or purely CSS-drawn panels as the only visual for a news item.
|
||||
|
||||
Before final render, add timed subtitles for every voiceover clip. If there is narration audio without corresponding visible subtitle text, stop and add subtitles before rendering.
|
||||
</HARD-GATE>
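The manifest gate above can be partly automated. The sketch below assumes, hypothetically, that `manifest.md` mentions each asset by its exact filename; the skill does not prescribe a manifest layout, so `unlisted_assets` and its suffix list are illustrative only.

```python
from pathlib import Path

# Assumed set of image extensions; extend as needed.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp", ".gif"}


def unlisted_assets(images_dir: str) -> list[str]:
    """Return image filenames in `images_dir` that manifest.md never mentions."""
    folder = Path(images_dir)
    manifest = folder / "manifest.md"
    text = manifest.read_text(encoding="utf-8") if manifest.exists() else ""
    return sorted(
        p.name
        for p in folder.iterdir()
        if p.suffix.lower() in IMAGE_SUFFIXES and p.name not in text
    )


if __name__ == "__main__":
    if Path("assets/images").is_dir():
        missing = unlisted_assets("assets/images")
        print("Missing from manifest:", missing or "none")
```

An empty result only shows that every file is named in the manifest; whether each asset is real and on-topic still needs a human check.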
|
||||
|
||||
### Asset Sources
|
||||
|
||||
| Source | URL | License | Notes |
|
||||
|--------|-----|---------|-------|
|
||||
| Official announcement/blog | company site | Varies | Best source for product/news screenshots |
|
||||
| News article screenshot | The Verge / TechCrunch / Reuters / 36kr / etc. | Editorial fair use considerations | Use as visual proof of a reported item |
|
||||
| Product website/app screenshot | official product page/app | Varies | Best for feature and launch stories |
|
||||
| Unsplash | unsplash.com | Free | Best quality, use `?w=640&q=80` for optimized downloads |
|
||||
| Pexels | pexels.com | Free | Good variety |
|
||||
| Wikimedia | commons.wikimedia.org | Varies | Logos, diagrams |
|
||||
|
||||
### Hard Rules for Real Assets

- **One real asset per news card, minimum.** A 5-item video needs at least 5 downloaded images/screenshots, one for each story.
- **Every real asset must be visible in the video.** It can be cropped, masked, color-graded, blurred slightly behind text, or combined with explanatory animation, but it must appear on-screen long enough for viewers to recognize it.
- **Use screenshots for web-only sources.** If the story is from a web page and no downloadable image is available, capture the article/announcement/product page as an image and use it in the card.
- **Pass Cloudflare or similar human verification before taking the screenshot.** If the capture shows "Verify you are human", "Performing security verification", or a CAPTCHA box, do not save that verification page as a news asset. Open the page in an interactive browser, click the verification box, wait for the real article or official page to load, then capture it. If the page is still unreachable after verification, switch to another source or another story.
- **Do not invent brand visuals.** Homemade "OpenAI Ads Manager" panels, fake product windows, fake legal files, fake company logos, and fake charts are allowed only as secondary explanation layers, never as the main news visual.
- **Do not use generic stock as proof.** A data-center stock photo can support an infrastructure story, but if the story is about a specific company deal, also include the article/official page screenshot or a real company/product image.
- **Keep a source manifest.** Future maintainers must be able to open `assets/images/manifest.md` and understand where each image came from.
|
||||
|
||||
### Download Example
|
||||
|
||||
```bash
|
||||
cd assets/images
|
||||
curl -sL -o ai-brain.jpg "https://images.unsplash.com/photo-XXXXXXXXX?w=640&q=80"
|
||||
```
|
||||
|
||||
### Screenshot Example
|
||||
|
||||
Use a browser or Playwright screenshot when an official/news page is the best available visual:
|
||||
|
||||
```bash
|
||||
mkdir -p assets/images
|
||||
# Example filename convention:
|
||||
# assets/images/01-perplexity-official-page.png
|
||||
# assets/images/02-anthropic-announcement.png
|
||||
```
|
||||
|
||||
After every screenshot, quickly inspect the saved image. If it captured a Cloudflare/security verification page instead of the real article, discard it and redo the capture through an interactive browser after clicking the human verification checkbox. The manifest should cite the real source page, not the verification page.
|
||||
|
||||
Crop or resize screenshots only after saving the original. Keep names numbered by card order so it is obvious which asset belongs to which news item.
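The numbered naming convention can be generated rather than typed by hand. This is a minimal sketch; `asset_filename` is a hypothetical helper, and the slug rule (lowercase ASCII letters and digits joined by hyphens) is an assumption modelled on the example filenames above.

```python
import re


def asset_filename(card_index: int, label: str, ext: str = "png") -> str:
    """Build a card-ordered asset name such as 01-perplexity-official-page.png."""
    # Collapse every run of non-alphanumeric characters into a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", label.lower()).strip("-")
    return f"{card_index:02d}-{slug}.{ext}"


if __name__ == "__main__":
    print(asset_filename(1, "Perplexity official page"))
    print(asset_filename(2, "Anthropic announcement"))
```

Labels written only in Chinese would produce an empty slug under this rule, so give each asset a short English label.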
|
||||
|
||||
### Image Selection Principles
|
||||
|
||||
- **Match the topic**: AI → neural/brain visuals, SpaceX → rockets/space, legal → courthouse/gavel, data centers → servers
|
||||
- **Prefer real/news-specific visuals over abstract/tech images.** Abstract images are only allowed as background texture, not as the primary evidence for the story.
|
||||
- **Dark backgrounds** work best with the dark UI theme
|
||||
- **Optimize size**: download at 640px width, quality 80 — sufficient for 1920px video
|
||||
|
||||
### Search Keywords by Topic
|
||||
|
||||
| Topic | Unsplash Search Terms |
|
||||
|-------|----------------------|
|
||||
| AI/ML | artificial intelligence, neural network, machine learning, brain digital |
|
||||
| Space | rocket launch, space, nasa, spacex |
|
||||
| Legal | law, courtroom, gavel, justice |
|
||||
| Infrastructure | server room, data center, technology, network |
|
||||
| Energy | power lines, electricity, energy, solar panels |
|
||||
| Open source | code, programming, developer, github |
|
||||
| Research | laboratory, science, microscope, quantum |
|
||||
|
||||
## Step 3: Generate the Voiceover with edge-tts

### Installation
|
||||
|
||||
```bash
|
||||
pip3 install --break-system-packages edge-tts
|
||||
```
|
||||
|
||||
### Recommended Voices
|
||||
|
||||
| Voice | Language | Style | Best For |
|
||||
|-------|----------|-------|----------|
|
||||
| zh-CN-YunxiNeural | Chinese | Young male, energetic | Tech news, dynamic delivery |
|
||||
| zh-CN-XiaoxiaoNeural | Chinese | Female, professional | Formal news, corporate |
|
||||
| zh-CN-YunjianNeural | Chinese | Male, deep | Dramatic reveals, impacts |
|
||||
| en-US-GuyNeural | English | Male, mature | English news |
|
||||
|
||||
### Generation Example

```bash
mkdir -p assets/audio && cd assets/audio

# Intro (brief, punchy): the dated opening line only, no "本周"
edge-tts --voice zh-CN-YunxiNeural --rate="+30%" \
  --text "5月7日的 AI 信息差来了。" \
  --write-media intro.mp3

# News cards: state the content directly, never "第X条"
edge-tts --voice zh-CN-YunxiNeural --rate="+30%" \
  --text "Anthropic 让 Claude 学会了做梦。Claude 可在会话间隙回顾历史,发现错误模式并自我改进。" \
  --write-media card1.mp3

# No default outro
# News-flash videos do not generate closing lines such as "以上就是本期 / 关注我 / 下期见" by default.
```
|
||||
|
||||
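### Voiceover Script Rules

- **Fast pace**: always pass `--rate="+30%"`. The default rate is too slow for fast-paced news.
- **No "第X条"**: go straight to the content and skip ordinal filler such as "第一条" or "第二条".
- **No time words such as "本周" or "今天"**: the video is already a news flash, so viewers know it is current.
- **Open with the dated line only**: the first sentence must be `X月X日的 AI 信息差来了。`, for example `5月9日的 AI 信息差来了。`, followed directly by the news.
- **No closing CTA**: do not default to "以上就是本期", "我是 XXX", "关注我", or "下期见". Add them only when the user explicitly asks for an account sign-off.
- **No unrelated wrap-up**: no closing summary, editorializing, or persona lines by default. Stop when the news is done.
- **Headline + 1 sentence max**: each item is a headline plus a one-sentence explanation, at most 40 Chinese characters.
- **Subtitle sync**: every `intro.mp3` and `card*.mp3` needs a matching subtitle clip in the HTML. Subtitle start and end times follow the voiceover duration, with up to 0.1-0.2 s of padding on each side.
- Duration target: **8-12 seconds per card** (after speed-up)
- Check duration: `ffprobe -v quiet -show_entries format=duration -of csv=p=0 card1.mp3`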
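Several of these script rules are mechanical enough to check before calling edge-tts. The sketch below is an assumption-laden illustration: `lint_narration` is a hypothetical helper, the banned-phrase list covers only the examples named above, and the 40-character limit is interpreted as counting Chinese characters only.

```python
import re

# Ordinal filler such as "第一条" or "第2条".
ORDINAL = re.compile(r"第[一二三四五六七八九十\d]+条")
# Time words and closing CTA phrases named in the rules above.
BANNED_PHRASES = ("本周", "今天", "以上就是本期", "关注我", "下期见")
# Basic CJK Unified Ideographs block, used to count Chinese characters.
CJK = re.compile(r"[\u4e00-\u9fff]")


def lint_narration(text: str, max_cjk_chars: int = 40) -> list[str]:
    """Return a list of rule violations found in one card's narration text."""
    problems = []
    match = ORDINAL.search(text)
    if match:
        problems.append(f"ordinal filler: {match.group()}")
    for phrase in BANNED_PHRASES:
        if phrase in text:
            problems.append(f"banned phrase: {phrase}")
    count = len(CJK.findall(text))
    if count > max_cjk_chars:
        problems.append(f"too long: {count} Chinese characters")
    return problems


if __name__ == "__main__":
    print(lint_narration("第一条,今天 OpenAI 发布新模型。"))
```

An empty list means none of these checks fired. It does not verify the opening line or the 8-12 second duration target, which still needs `ffprobe`.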
|
||||
|
||||
### BGM (Background Music)

**Background music is required.** A video without BGM feels empty and unprofessional.

#### BGM Lookup Rules (local first)

1. **Look for a local BGM file in the current project first**, checking in this order:
   - `assets/audio/bgm.mp3`
   - `assets/audio/bgm.wav`
   - any other usable music file under `assets/audio/` (`.mp3/.wav/.m4a`)
2. **If no local BGM is found, do not download or generate one automatically.**
3. **Ask the user whether they want BGM.**
4. If the user confirms, **ask them to provide the audio file** (or its path inside the local project), then wire it into the timeline.

Example check:

```bash
ls assets/audio
```
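The local-first lookup can also be written as a small function. This is a sketch under stated assumptions: `find_local_bgm` is a hypothetical name, and skipping files that start with `intro` or `card` is a guess based on the voiceover filenames used in this skill, so that narration clips are not mistaken for music.

```python
from pathlib import Path
from typing import Optional

PREFERRED = ("bgm.mp3", "bgm.wav")
AUDIO_SUFFIXES = {".mp3", ".wav", ".m4a"}
# Assumed voiceover naming (intro.mp3, card1.mp3, ...); these are never BGM.
VOICE_PREFIXES = ("intro", "card")


def find_local_bgm(audio_dir: str = "assets/audio") -> Optional[Path]:
    """Return a local BGM file, or None so the caller can ask the user."""
    folder = Path(audio_dir)
    if not folder.is_dir():
        return None
    for name in PREFERRED:
        candidate = folder / name
        if candidate.is_file():
            return candidate
    others = sorted(
        p
        for p in folder.iterdir()
        if p.is_file()
        and p.suffix.lower() in AUDIO_SUFFIXES
        and not p.name.startswith(VOICE_PREFIXES)
    )
    return others[0] if others else None


if __name__ == "__main__":
    print(find_local_bgm() or "No local BGM found: ask the user.")
```

Returning `None` instead of downloading anything keeps the function in line with rules 2-4: the next move is a question to the user.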
|
||||
|
||||
#### BGM Selection Tips

- **Style**: tech / electronic / minimal / synthwave / ambient tech
- **Tempo**: medium-fast (110-130 BPM), matching the news-flash pace
- **No vocals**: instrumental only, to avoid clashing with the voiceover
- **Volume**: set `data-volume` on the BGM track to **0.08-0.12**, far below the voiceover (0.85), so it acts as an atmosphere layer
- **Search keywords**: tech ambient, synthwave minimal, future bass, corporate tech, digital pulse
|
||||
|
||||
#### Using BGM in HyperFrames
|
||||
|
||||
```html
|
||||
<audio id="bgm" class="clip"
|
||||
data-start="0" data-duration="86"
|
||||
data-track-index="20" data-volume="0.10"
|
||||
src="assets/audio/bgm.mp3"></audio>
|
||||
```
|
||||
|
||||
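- Put the BGM on a **high-numbered track** (for example track 20) so it does not collide with the voiceover tracks
- Play it for the full video: `data-start="0"`, with `data-duration` equal to the total video length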
|
||||
|
||||
## Step 4: Write the HyperFrames Composition Page

### Project Structure
|
||||
|
||||
```
|
||||
tech-news-flash/
|
||||
├── index.html # Main composition
|
||||
├── assets/
|
||||
│ ├── images/ # Downloaded images
|
||||
│ │ ├── ai-brain.jpg
|
||||
│ │ ├── spacex-rocket.jpg
|
||||
│ │ └── ...
|
||||
│ └── audio/ # Voiceover files
|
||||
│ ├── intro.mp3
|
||||
│ ├── card1.mp3
|
||||
│ └── ...
|
||||
├── hyperframes.json
|
||||
├── meta.json
|
||||
└── package.json
|
||||
```
|
||||
|
||||
### Default Canvas
|
||||
|
||||
```html
|
||||
<meta name="viewport" content="width=1920, height=1080" />
|
||||
<div
|
||||
id="root"
|
||||
data-composition-id="main"
|
||||
data-start="0"
|
||||
data-duration="86"
|
||||
data-width="1920"
|
||||
data-height="1080"
|
||||
>
|
||||
```
|
||||
|
||||
Landscape render command:
|
||||
|
||||
```bash
|
||||
npx hyperframes render --resolution landscape -o renders/ai-tech-news.mp4
|
||||
```
|
||||
|
||||
Use this only when vertical output is explicitly requested:
|
||||
|
||||
```bash
|
||||
npx hyperframes render --resolution portrait -o renders/ai-tech-news-portrait.mp4
|
||||
```
|
||||
|
||||
### Card Layout: Image/Text Split
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────┐
|
||||
│ ┌──────────────┐ ┌──────────────────────────┐ │
|
||||
│ │ │ │ NO. 01 │ │
|
||||
│ │ IMAGE │ │ 🧠 新功能 │ │
|
||||
│ │ (45%) │ │ │ │
|
||||
│ │ │ │ Headline Text │ │
|
||||
│ │ │ │ │ │
|
||||
│ │ │ │ Description text... │ │
|
||||
│ └──────────────┘ └──────────────────────────┘ │
|
||||
│ ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓░░░░░░░░░ progress bar │
|
||||
└─────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
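### Key Design Principles

1. **16:9 landscape first**: use the horizontal space. One side holds the news image, video, or product screenshot; the other holds keywords, explanatory animation, or a short headline.
2. **Subtitles carry the full narration**: do not repeat the whole news script on screen. Show only short headlines, keywords, numbers, and visual metaphors.
3. **Each item gets its own animation idea**: do not run all 5 items through the same card template. Every item needs at least one motion metaphor or asset switch that fits the story.
4. **Real assets are part of the main frame**: at least 35% of each card's area comes from real assets, official site screenshots, news screenshots, or real photos. Explanatory animation may overlay them but cannot fully replace them.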
|
||||
5. **Progress bar at card bottom** — shows reading progress, adds motion
|
||||
6. **Numbered cards (01-05)** — creates sequence feel
|
||||
7. **Category-colored tags** — visual scanning aid
|
||||
8. **Subtle image overlay** — gradient from image edge to dark background
|
||||
9. **Card watermark number** — large transparent number on image for depth
|
||||
|
||||
### Pre-render Checklist
|
||||
|
||||
Before `npx hyperframes render`, verify:
|
||||
|
||||
- `assets/images/` contains one real asset per news item.
|
||||
- `assets/images/manifest.md` maps every card to a source and asset file.
|
||||
- Every card's HTML references its real asset with an `<img>` or visible background image.
|
||||
- Fake UI, fake logos, CSS diagrams, generated abstract art, or stock photos are not the only visual for any news card.
|
||||
- `npx hyperframes inspect` has no layout errors.
|
||||
|
||||
### Audio Integration (critical)
|
||||
|
||||
**Web Audio API does NOT render into HyperFrames MP4.** Always use `<audio>` tags:
|
||||
|
||||
```html
|
||||
<audio id="voice1" class="clip"
|
||||
data-start="6" data-duration="17"
|
||||
data-track-index="5" data-volume="0.85"
|
||||
src="assets/audio/card1.mp3"></audio>
|
||||
```
|
||||
|
||||
### Timing Strategy
|
||||
|
||||
| Element | Duration | Notes |
|
||||
|---------|----------|-------|
|
||||
| Intro | 5-6s | Title + date animation |
|
||||
| News Card | voice_duration + 1s | Let voice finish, then transition |
|
||||
| Card Transition | 0.3s | Fade/slide out |
|
||||
| End beat | 0.5-1s | Last news finishes, then fade/hard cut. No CTA unless requested |
|
||||
|
||||
**Always check voice duration first**, then set card `data-duration` to match.
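The timing table can be turned into start times once the voiceover durations are known. The sketch below assumes cards play back to back after the intro and that each card lasts `voice_duration + 1s`, as in the table; `build_timeline` and its dictionary keys are hypothetical, and transitions are assumed to happen inside the 1 s hold rather than adding time.

```python
def build_timeline(intro_duration: float, voice_durations: list[float], hold: float = 1.0):
    """Compute data-start / data-duration values for each news card.

    Returns (clips, total_duration), where each clip is a dict with the
    card number, its start time, and its duration in seconds.
    """
    clips = []
    t = intro_duration
    for index, voice in enumerate(voice_durations, start=1):
        duration = voice + hold
        clips.append({"card": index, "start": round(t, 2), "duration": round(duration, 2)})
        t += duration
    return clips, round(t, 2)


if __name__ == "__main__":
    clips, total = build_timeline(5.0, [9.5, 10.0, 11.2])
    for clip in clips:
        print(clip)
    print("total:", total)
```

The returned total is a candidate for the root `data-duration` and for the BGM clip length, before adding any end beat.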
|
||||
|
||||
### Common HyperFrames Pitfalls
|
||||
|
||||
- **No `Math.random()`** — use `mulberry32` seeded PRNG
|
||||
- **Track indices**: persistent elements (header/ticker) on track 10+, cards on track 1, voice on track 5, SFX on track 6-7, BGM on track 20
|
||||
- **`class="clip"`** required on all timed elements
|
||||
- **Audio must be `<audio>` tags** with `data-track-index` — not Web Audio API
|
||||
|
||||
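## Step 5: Pre-render Sound Effects to WAV Files

**Sound effects synthesized with the Web Audio API are not rendered into the HyperFrames MP4 output.**

Workaround: pre-render the effects to WAV files with a Python script, then include them with `<audio>` tags.

### Generate the Sound Files

```bash
python3 scripts/render_sfx.py assets/audio/sfx
```

By default this produces 6 common effects (the **base reference pack**):

| File | Duration | Use |
|------|------|------|
| whoosh.wav | 350ms | Card transition |
| impact.wav | 400ms | Card reveal / accent |
| pop.wav | 150ms | Notification / light feedback |
| click.wav | 100ms | UI click |
| sparkle.wav | 600ms | Reveal / achievement / outro |
| rise.wav | 1000ms | Tension / build-up |

These 6 are a default reference, not an upper limit.
If the storyboard needs other sounds (for example machine start-up, scanning, fault alarm, data stream, countdown, snap-in transition, or a victory cue), add a matching synthesis function to `scripts/render_sfx.py`, output a new WAV file, and place it on the timeline.

When adding an effect:

- Define the purpose first (which shot triggers it, and what emotion or action it conveys)
- Then choose the parameters (duration, base frequency, envelope, filtering, volume)
- Finally, name it consistently (for example `scan.wav`, `alarm.wav`, `countdown.wav`) and record it in the SFX mapping comment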
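As an illustration of the three steps above, here is a minimal sketch of a new "scan" effect written with the standard library only. `synth_scan` and all of its parameter values are hypothetical choices, not part of the shipped script; it returns floats in the range [-1, 1], the same kind of sample list that `write_wav(filename, samples)` in `scripts/render_sfx.py` accepts.

```python
import math

SAMPLE_RATE = 44100


def synth_scan(duration=0.5, f_start=400.0, f_end=1600.0, volume=0.3):
    """Rising sine sweep with a 20 ms fade-in and fade-out.

    Purpose: a short "scanning" cue. Parameters: duration in seconds,
    start/end frequency in Hz, and peak volume (0..1).
    """
    n = int(SAMPLE_RATE * duration)
    fade = int(SAMPLE_RATE * 0.02)
    samples = []
    phase = 0.0
    for i in range(n):
        # Linear frequency sweep; accumulate phase to avoid clicks.
        freq = f_start + (f_end - f_start) * i / n
        phase += 2 * math.pi * freq / SAMPLE_RATE
        # Envelope: ramp up at the start, ramp down at the end.
        env = min(1.0, i / fade, (n - 1 - i) / fade)
        samples.append(volume * env * math.sin(phase))
    return samples


if __name__ == "__main__":
    data = synth_scan()
    print(len(data), "samples, peak", max(abs(s) for s in data))
```

Inside `render_sfx.py` the result could then be written with something like `write_wav(os.path.join(OUTPUT_DIR, "scan.wav"), synth_scan())`, following the naming rule above.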
|
||||
|
||||
### Using the Effects in HyperFrames
|
||||
|
||||
```html
|
||||
<!-- Card entrance SFX: whoosh + impact combined -->
|
||||
<audio id="sfx-card1" class="clip"
|
||||
data-start="4.0" data-duration="0.5"
|
||||
data-track-index="6" data-volume="0.4"
|
||||
src="assets/audio/sfx/whoosh.wav"></audio>
|
||||
<audio id="sfx-card1b" class="clip"
|
||||
data-start="4.1" data-duration="0.5"
|
||||
data-track-index="7" data-volume="0.35"
|
||||
src="assets/audio/sfx/impact.wav"></audio>
|
||||
```
|
||||
|
||||
### Customizing Sound Effects

Edit the synthesis functions in `scripts/render_sfx.py`. Each function is pure math (sine waves + noise + envelopes) and needs no external dependencies.
|
||||
|
||||
## Step 6: Render and Export
|
||||
|
||||
```bash
|
||||
source ~/.nvm/nvm.sh && nvm use 22
|
||||
cd tech-news-flash
|
||||
npx hyperframes lint # Check for errors
|
||||
npx hyperframes snapshot --at 3,8,25,42,58,82 # Verify key frames
|
||||
npx hyperframes render -o output.mp4 # Render final video
|
||||
```
|
||||
|
||||
## Full Workflow Quick Reference
|
||||
|
||||
```bash
|
||||
# 1. Init project
|
||||
source ~/.nvm/nvm.sh && nvm use 22
|
||||
npx hyperframes init ai-news-$(date +%Y%m%d)
|
||||
cd ai-news-$(date +%Y%m%d)
|
||||
|
||||
# 2. Create asset directories
|
||||
mkdir -p assets/images assets/audio
|
||||
|
||||
# 3. Download images (edit URLs for your news)
|
||||
curl -sL -o assets/images/topic1.jpg "https://images.unsplash.com/photo-XXX?w=640&q=80"
|
||||
|
||||
# 4. Generate voiceover
|
||||
edge-tts --voice zh-CN-YunxiNeural --rate="+30%" --text "..." --write-media assets/audio/card1.mp3
|
||||
|
||||
# 5. Write index.html (use the card layout template)
|
||||
|
||||
# 6. Lint, preview, render
|
||||
npx hyperframes lint
|
||||
npx hyperframes preview
|
||||
npx hyperframes render -o output.mp4
|
||||
```
|
||||
80
skills/ai-tech-news-video/scripts/generate_voiceover.py
Normal file
@ -0,0 +1,80 @@
|
||||
#!/usr/bin/env python3
"""Generate all voiceover audio for AI Tech News Flash video using edge-tts."""

import asyncio
import os
import subprocess

VOICE = "zh-CN-YunxiNeural"
RATE = "+30%"  # Fast pace for news flash style
OUTPUT_DIR = "assets/audio"
INCLUDE_OUTRO = False  # SKILL.md: no outro/CTA unless the user explicitly asks

# Script template - edit these for each episode
# Rule: NO "第X条", NO time filler words, just content directly
SCRIPTS = {
    "intro": "{date}的 AI 信息差来了。",  # fixed opening line required by SKILL.md
    "card1": "{headline1}。{desc1}",
    "card2": "{headline2}。{desc2}",
    "card3": "{headline3}。{desc3}",
    "card4": "{headline4}。{desc4}",
    "card5": "{headline5}。{desc5}",
    "outro": "以上就是本期 AI 快报。关注拓扑同学,下期见。",
}


async def generate_audio(text, output_path, voice=VOICE, rate=RATE):
    """Generate a single audio file with edge-tts."""
    import edge_tts
    communicate = edge_tts.Communicate(text, voice, rate=rate)
    await communicate.save(output_path)
    print(f"  Generated: {output_path}")


def get_duration(filepath):
    """Get audio duration in seconds (0.0 if ffprobe prints nothing usable)."""
    result = subprocess.run(
        ["ffprobe", "-v", "quiet", "-show_entries", "format=duration",
         "-of", "csv=p=0", filepath],
        capture_output=True, text=True
    )
    try:
        return float(result.stdout.strip())
    except ValueError:
        return 0.0


async def main():
    os.makedirs(OUTPUT_DIR, exist_ok=True)

    # TODO: Replace with actual episode content
    date = "5月7日"
    headlines = {
        "1": ("Anthropic 让 Claude 学会了做梦", "Claude 可在会话间隙回顾历史记录,发现自身错误模式并自我改进"),
        "2": ("xAI 并入 SpaceX 改名 SpaceXAI", "Elon Musk 宣布 xAI 不再独立,同时与 Anthropic 达成算力合作"),
        "3": ("OpenAI 联合 AMD 和 NVIDIA 发布 MRC 协议", "提升大规模 AI 训练集群的 GPU 网络性能与弹性"),
        "4": ("Musk 诉 Altman 庭审关键证人阶段", "Zilis 出庭作证,邮件曝光 Musk 曾计划将 OpenAI 纳入 Tesla"),
        "5": ("43% 美国人认为数据中心推高了电费", "Pew Research 调查显示数据中心能耗已成两党共识议题"),
    }

    # Generate all audio files
    names = ["intro"]
    tasks = [generate_audio(SCRIPTS["intro"].format(date=date), f"{OUTPUT_DIR}/intro.mp3")]

    for i, (headline, desc) in headlines.items():
        card_text = SCRIPTS[f"card{i}"].format(**{f"headline{i}": headline, f"desc{i}": desc})
        tasks.append(generate_audio(card_text, f"{OUTPUT_DIR}/card{i}.mp3"))
        names.append(f"card{i}")

    if INCLUDE_OUTRO:
        tasks.append(generate_audio(SCRIPTS["outro"], f"{OUTPUT_DIR}/outro.mp3"))
        names.append("outro")

    await asyncio.gather(*tasks)

    # Print durations for timing reference
    print("\n📊 Audio durations (for HyperFrames timing):")
    total = 0
    for name in names:
        path = f"{OUTPUT_DIR}/{name}.mp3"
        if os.path.exists(path):
            dur = get_duration(path)
            total += dur
            print(f"  {name}: {dur:.1f}s")
    print(f"  TOTAL: {total:.1f}s")


if __name__ == "__main__":
    asyncio.run(main())
|
||||
194
skills/ai-tech-news-video/scripts/render-sfx.js
Normal file
@ -0,0 +1,194 @@
|
||||
#!/usr/bin/env node
|
||||
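/**
 * Render Web Audio API synthesized sound effects to WAV files.
 * These WAV files can then be used in HyperFrames via <audio> tags.
 *
 * Note: Node.js does not provide OfflineAudioContext as a global, so a
 * Web Audio implementation must define it before this script can run.
 * SKILL.md uses scripts/render_sfx.py instead, which has no such dependency.
 *
 * Usage: node render-sfx.js [output-dir]
 * Output: output-dir/whoosh.wav, impact.wav, pop.wav, sparkle.wav, click.wav, rise.wav
 */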
|
||||
|
||||
const fs = require('fs');
|
||||
const path = require('path');
|
||||
|
||||
const OUTPUT_DIR = process.argv[2] || 'assets/audio/sfx';
|
||||
|
||||
// ── Sound synthesis functions (deterministic, no Math.random) ──
|
||||
|
||||
function mulberry32(seed) {
|
||||
return function() {
|
||||
seed |= 0; seed = seed + 0x6D2B79F5 | 0;
|
||||
let t = Math.imul(seed ^ seed >>> 15, 1 | seed);
|
||||
t = t + Math.imul(t ^ t >>> 7, 61 | t) ^ t;
|
||||
return ((t ^ t >>> 14) >>> 0) / 4294967296;
|
||||
};
|
||||
}
|
||||
|
||||
async function renderSound(name, duration, renderFn) {
|
||||
const sampleRate = 44100;
|
||||
const frameCount = Math.ceil(sampleRate * duration);
|
||||
const offlineCtx = new OfflineAudioContext(1, frameCount, sampleRate);
|
||||
|
||||
await renderFn(offlineCtx, duration);
|
||||
|
||||
const buffer = await offlineCtx.startRendering();
|
||||
const wav = audioBufferToWav(buffer);
|
||||
const outPath = path.join(OUTPUT_DIR, `${name}.wav`);
|
||||
fs.writeFileSync(outPath, Buffer.from(wav));
|
||||
console.log(` ✅ ${name}.wav (${(duration * 1000).toFixed(0)}ms, ${(wav.byteLength / 1024).toFixed(1)}KB)`);
|
||||
return outPath;
|
||||
}
|
||||
|
||||
// ── Sound: Whoosh (transition sweep) ──
|
||||
async function synthWhoosh(ctx, dur) {
|
||||
const bufSize = ctx.sampleRate * 0.35;
|
||||
const buf = ctx.createBuffer(1, bufSize, ctx.sampleRate);
|
||||
const d = buf.getChannelData(0);
|
||||
let rng = mulberry32(42);
|
||||
for (let i = 0; i < bufSize; i++) d[i] = (rng() * 2 - 1) * Math.pow(1 - i / bufSize, 2);
|
||||
|
||||
const src = ctx.createBufferSource(); src.buffer = buf;
|
||||
const flt = ctx.createBiquadFilter(); flt.type = 'bandpass'; flt.Q.value = 5;
|
||||
flt.frequency.setValueAtTime(200, 0);
|
||||
flt.frequency.exponentialRampToValueAtTime(3500, 0.15);
|
||||
flt.frequency.exponentialRampToValueAtTime(200, 0.35);
|
||||
const g = ctx.createGain();
|
||||
g.gain.setValueAtTime(0, 0);
|
||||
g.gain.linearRampToValueAtTime(0.35, 0.08);
|
||||
g.gain.linearRampToValueAtTime(0, 0.35);
|
||||
src.connect(flt).connect(g).connect(ctx.destination);
|
||||
src.start(0);
|
||||
}
|
||||
|
||||
// ── Sound: Impact (bass hit) ──
|
||||
async function synthImpact(ctx, dur) {
|
||||
const osc = ctx.createOscillator(); osc.type = 'sine';
|
||||
const g = ctx.createGain();
|
||||
osc.frequency.setValueAtTime(120, 0);
|
||||
osc.frequency.exponentialRampToValueAtTime(30, 0.25);
|
||||
g.gain.setValueAtTime(0.6, 0);
|
||||
g.gain.exponentialRampToValueAtTime(0.001, 0.3);
|
||||
osc.connect(g).connect(ctx.destination);
|
||||
osc.start(0); osc.stop(0.3);
|
||||
|
||||
// Noise burst
|
||||
const bufSize = ctx.sampleRate * 0.1;
|
||||
const buf = ctx.createBuffer(1, bufSize, ctx.sampleRate);
|
||||
const d = buf.getChannelData(0);
|
||||
let rng = mulberry32(99);
|
||||
for (let i = 0; i < bufSize; i++) d[i] = (rng() * 2 - 1);
|
||||
const noise = ctx.createBufferSource(); noise.buffer = buf;
|
||||
const ng = ctx.createGain();
|
||||
ng.gain.setValueAtTime(0.4, 0);
|
||||
ng.gain.exponentialRampToValueAtTime(0.001, 0.12);
|
||||
noise.connect(ng).connect(ctx.destination);
|
||||
noise.start(0);
|
||||
}
|
||||
|
||||
// ── Sound: Pop (notification) ──
|
||||
async function synthPop(ctx, dur) {
|
||||
const osc = ctx.createOscillator(); osc.type = 'sine';
|
||||
const g = ctx.createGain();
|
||||
osc.frequency.setValueAtTime(880, 0);
|
||||
osc.frequency.exponentialRampToValueAtTime(1400, 0.04);
|
||||
g.gain.setValueAtTime(0.3, 0);
|
||||
g.gain.exponentialRampToValueAtTime(0.001, 0.12);
|
||||
osc.connect(g).connect(ctx.destination);
|
||||
osc.start(0); osc.stop(0.12);
|
||||
}
|
||||
|
||||
// ── Sound: Click (UI feedback) ──
|
||||
async function synthClick(ctx, dur) {
|
||||
const osc = ctx.createOscillator(); osc.type = 'sine';
|
||||
const g = ctx.createGain();
|
||||
osc.frequency.setValueAtTime(1200, 0);
|
||||
osc.frequency.exponentialRampToValueAtTime(600, 0.05);
|
||||
g.gain.setValueAtTime(0.25, 0);
|
||||
g.gain.exponentialRampToValueAtTime(0.001, 0.08);
|
||||
osc.connect(g).connect(ctx.destination);
|
||||
osc.start(0); osc.stop(0.08);
|
||||
}
|
||||
|
||||
// ── Sound: Sparkle (reveal/achievement) ──
|
||||
async function synthSparkle(ctx, dur) {
|
||||
const notes = [523.25, 659.25, 783.99, 1046.5, 1318.5];
|
||||
notes.forEach((freq, i) => {
|
||||
const osc = ctx.createOscillator(); osc.type = 'sine';
|
||||
const g = ctx.createGain();
|
||||
osc.connect(g).connect(ctx.destination);
|
||||
const s = i * 0.07;
|
||||
osc.frequency.value = freq;
|
||||
g.gain.setValueAtTime(0, s);
|
||||
g.gain.linearRampToValueAtTime(0.15, s + 0.02);
|
||||
g.gain.exponentialRampToValueAtTime(0.001, s + 0.4);
|
||||
osc.start(s); osc.stop(s + 0.4);
|
||||
});
|
||||
}
|
||||
|
||||
// ── Sound: Rise (tension builder) ──
|
||||
async function synthRise(ctx, dur) {
|
||||
const osc = ctx.createOscillator(); osc.type = 'sawtooth';
|
||||
const flt = ctx.createBiquadFilter(); flt.type = 'lowpass'; flt.Q.value = 8;
|
||||
const g = ctx.createGain();
|
||||
osc.connect(flt).connect(g).connect(ctx.destination);
|
||||
flt.frequency.setValueAtTime(100, 0);
|
||||
flt.frequency.exponentialRampToValueAtTime(3000, 1.0);
|
||||
g.gain.setValueAtTime(0.01, 0);
|
||||
g.gain.linearRampToValueAtTime(0.25, 1.0);
|
||||
osc.start(0); osc.stop(1.0);
|
||||
}
|
||||
|
||||
// ── WAV encoder ──
|
||||
function audioBufferToWav(buffer) {
|
||||
const numCh = buffer.numberOfChannels;
|
||||
const sampleRate = buffer.sampleRate;
|
||||
const bitDepth = 16;
|
||||
const bytesPerSample = bitDepth / 8;
|
||||
const data = buffer.getChannelData(0);
|
||||
const dataLength = data.length * bytesPerSample;
|
||||
const headerLength = 44;
|
||||
const totalLength = headerLength + dataLength;
|
||||
const ab = new ArrayBuffer(totalLength);
|
||||
const view = new DataView(ab);
|
||||
|
||||
writeStr(view, 0, 'RIFF');
|
||||
view.setUint32(4, totalLength - 8, true);
|
||||
writeStr(view, 8, 'WAVE');
|
||||
writeStr(view, 12, 'fmt ');
|
||||
view.setUint32(16, 16, true);
|
||||
view.setUint16(20, 1, true);
|
||||
view.setUint16(22, numCh, true);
|
||||
view.setUint32(24, sampleRate, true);
|
||||
view.setUint32(28, sampleRate * numCh * bytesPerSample, true);
|
||||
view.setUint16(32, numCh * bytesPerSample, true);
|
||||
view.setUint16(34, bitDepth, true);
|
||||
writeStr(view, 36, 'data');
|
||||
view.setUint32(40, dataLength, true);
|
||||
|
||||
let offset = 44;
|
||||
for (let i = 0; i < data.length; i++, offset += 2) {
|
||||
const s = Math.max(-1, Math.min(1, data[i]));
|
||||
view.setInt16(offset, s < 0 ? s * 0x8000 : s * 0x7FFF, true);
|
||||
}
|
||||
return ab;
|
||||
}
|
||||
|
||||
function writeStr(view, offset, str) {
|
||||
for (let i = 0; i < str.length; i++) view.setUint8(offset + i, str.charCodeAt(i));
|
||||
}
|
||||
|
||||
// ── Main ──
async function main() {
  fs.mkdirSync(OUTPUT_DIR, { recursive: true });
  console.log(`🔊 Rendering SFX to ${OUTPUT_DIR}/\n`);

  await renderSound('whoosh', 0.4, synthWhoosh);
  await renderSound('impact', 0.4, synthImpact);
  await renderSound('pop', 0.15, synthPop);
  await renderSound('click', 0.1, synthClick);
  await renderSound('sparkle', 0.6, synthSparkle);
  await renderSound('rise', 1.1, synthRise);

  console.log('\n✨ Done! Use these WAV files in HyperFrames <audio> tags.');
}

// Report the error and signal failure through the exit code
main().catch((err) => {
  console.error(err);
  process.exitCode = 1;
});
191
skills/ai-tech-news-video/scripts/render_sfx.py
Normal file
@ -0,0 +1,191 @@
#!/usr/bin/env python3
"""
Synthesize sound effects as WAV files using only the Python standard library.
These WAV files can be used in HyperFrames via <audio> tags.

Usage: python3 render_sfx.py [output-dir]
Output: whoosh.wav, impact.wav, pop.wav, click.wav, sparkle.wav, rise.wav
"""

import struct
import math
import os
import sys

OUTPUT_DIR = sys.argv[1] if len(sys.argv) > 1 else "assets/audio/sfx"
SAMPLE_RATE = 44100


def mulberry32(seed):
    """Seeded mulberry32-style PRNG for deterministic output; returns floats in [0, 1)."""
    state = [seed]

    def rand():
        state[0] = (state[0] + 0x6D2B79F5) & 0xFFFFFFFF
        t = state[0]
        t = ((t ^ (t >> 15)) * (1 | t)) & 0xFFFFFFFF
        t = (t + (((t ^ (t >> 7)) * (61 | t)) ^ t)) & 0xFFFFFFFF
        return (t ^ (t >> 14)) / 4294967296

    return rand


def write_wav(filename, samples):
    """Write 16-bit mono WAV file."""
    num_samples = len(samples)
    data_size = num_samples * 2
    with open(filename, 'wb') as f:
        f.write(b'RIFF')
        f.write(struct.pack('<I', 36 + data_size))
        f.write(b'WAVE')
        f.write(b'fmt ')
        # fmt chunk: size 16, PCM, 1 channel, sample rate, byte rate, block align, 16 bits
        f.write(struct.pack('<IHHIIHH', 16, 1, 1, SAMPLE_RATE, SAMPLE_RATE * 2, 2, 16))
        f.write(b'data')
        f.write(struct.pack('<I', data_size))
        for s in samples:
            s = max(-1.0, min(1.0, s))
            f.write(struct.pack('<h', int(s * 32767)))


def synth_whoosh():
    """Whoosh / transition sweep (350ms)."""
    duration = 0.35
    n = int(SAMPLE_RATE * duration)
    rng = mulberry32(42)
    samples = []
    for i in range(n):
        t = i / SAMPLE_RATE
        # Noise with a quadratic decay envelope
        noise = (rng() * 2 - 1) * (1 - i/n) ** 2
        # Frequency sweep: bandpass center 200→3500→200
        progress = i / n
        if progress < 0.43:
            center = 200 * (3500/200) ** (progress / 0.43)
        else:
            center = 3500 * (200/3500) ** ((progress - 0.43) / 0.57)
        # Intended as a simple bandpass approximation: modulate noise with the sweep.
        # NOTE: `mod` is computed but not applied to the output below, so the
        # result is envelope-shaped noise without a frequency sweep.
        mod = math.sin(2 * math.pi * center * t)
        # Gain envelope: 80 ms linear attack, then linear release
        if t < 0.08:
            gain = t / 0.08
        else:
            gain = 1 - (t - 0.08) / (duration - 0.08)
        samples.append(noise * 0.35 * max(0, gain))
    return samples


def synth_impact():
    """Impact / bass hit (400ms)."""
    duration = 0.4
    n = int(SAMPLE_RATE * duration)
    rng = mulberry32(99)
    samples = []
    phase = 0.0
    for i in range(n):
        t = i / SAMPLE_RATE
        # Low sine sweep 120→30Hz over the first 250 ms, then hold at 30 Hz
        freq = 120 * (30/120) ** (t / 0.25) if t < 0.25 else 30
        # Accumulate phase (as synth_pop does) so the pitch actually follows `freq`
        phase += 2 * math.pi * freq / SAMPLE_RATE
        sine = math.sin(phase)
        gain_s = max(0.001, math.exp(-t * 10)) * 0.6
        # Noise burst (first 120ms)
        if t < 0.12:
            noise = (rng() * 2 - 1) * max(0.001, math.exp(-t * 25)) * 0.4
        else:
            noise = 0
        samples.append(sine * gain_s + noise)
    return samples


def synth_pop():
    """Pop / notification (150ms)."""
    duration = 0.15
    n = int(SAMPLE_RATE * duration)
    samples = []
    phase = 0
    for i in range(n):
        t = i / SAMPLE_RATE
        freq = 880 * (1400/880) ** (min(t, 0.04) / 0.04) if t < 0.04 else 1400
        phase += 2 * math.pi * freq / SAMPLE_RATE
        sine = math.sin(phase)
        gain = max(0.001, math.exp(-t * 30)) * 0.3
        samples.append(sine * gain)
    return samples


def synth_click():
    """Click / UI feedback (100ms)."""
    duration = 0.1
    n = int(SAMPLE_RATE * duration)
    samples = []
    phase = 0
    for i in range(n):
        t = i / SAMPLE_RATE
        freq = 1200 * (600/1200) ** (min(t, 0.05) / 0.05) if t < 0.05 else 600
        phase += 2 * math.pi * freq / SAMPLE_RATE
        sine = math.sin(phase)
        gain = max(0.001, math.exp(-t * 40)) * 0.25
        samples.append(sine * gain)
    return samples


def synth_sparkle():
    """Sparkle / reveal (600ms)."""
    duration = 0.6
    n = int(SAMPLE_RATE * duration)
    notes = [523.25, 659.25, 783.99, 1046.5, 1318.5]
    samples = [0.0] * n
    for note_i, freq in enumerate(notes):
        start = note_i * 0.07
        phase = 0
        for i in range(n):
            t = i / SAMPLE_RATE
            if t < start:
                continue
            local_t = t - start
            if local_t > 0.4:
                break
            phase += 2 * math.pi * freq / SAMPLE_RATE
            sine = math.sin(phase)
            # Envelope: quick attack, exponential decay
            if local_t < 0.02:
                gain = (local_t / 0.02) * 0.15
            else:
                gain = max(0.001, math.exp(-(local_t - 0.02) * 10)) * 0.15
            samples[i] += sine * gain
    return samples


def synth_rise():
    """Rise / tension builder (1s)."""
    duration = 1.0
    n = int(SAMPLE_RATE * duration)
    samples = []
    for i in range(n):
        t = i / SAMPLE_RATE
        # Sawtooth approximation: sum of the first 7 harmonics of 100 Hz
        val = 0
        for h in range(1, 8):
            val += (1/h) * math.sin(2 * math.pi * 100 * h * t)
        # Crude stand-in for an opening lowpass: fade the whole tone in
        # quadratically (all harmonics are scaled equally, not filtered)
        cutoff_ratio = t / duration
        val *= cutoff_ratio ** 2
        # Gain: slow linear rise from 0.01 to 0.25
        gain = 0.01 + 0.24 * (t / duration)
        samples.append(val * gain)
    return samples


def main():
    os.makedirs(OUTPUT_DIR, exist_ok=True)
    print(f"🔊 Rendering SFX to {OUTPUT_DIR}/\n")

    sounds = {
        'whoosh': synth_whoosh,
        'impact': synth_impact,
        'pop': synth_pop,
        'click': synth_click,
        'sparkle': synth_sparkle,
        'rise': synth_rise,
    }

    for name, synth_fn in sounds.items():
        samples = synth_fn()
        path = os.path.join(OUTPUT_DIR, f"{name}.wav")
        write_wav(path, samples)
        dur = len(samples) / SAMPLE_RATE
        size = os.path.getsize(path)
        print(f"  ✅ {name}.wav ({dur*1000:.0f}ms, {size/1024:.1f}KB)")

    print("\n✨ Done! Add these to HyperFrames with <audio> tags.")


if __name__ == "__main__":
    main()
592
skills/product-intro-video/SKILL.md
Normal file
@ -0,0 +1,592 @@
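---
name: product-intro-video
description: "Create product introduction videos. Narration defaults to Chinese unless the user explicitly asks for another language. Workflow: visit the official site → collect assets (VL verification) → supplement with web search → user confirmation → HyperFrames render. Any aspect ratio is supported. Triggers: 产品视频, product video, 官网视频, intro video, 产品介绍."
---

# Product Intro Video Skill

Builds a product introduction video from the official website's information, rendered with `HyperFrames + GSAP`.

## Default language rule (Chinese first)

**Introduce the product in Chinese by default.** Unless the user explicitly specifies another language, the narration, main on-screen copy, explanatory subtitles, storyboard notes, and delivery summary are all written in Chinese.

If the official site is in English, do not read its copy out verbatim as narration. Understand the product first, then rewrite it in natural Chinese, keeping necessary English brand names, product names, technical terms, and short CTAs, for example `HyperFrames`, `HTML`, `CSS`, `GSAP`, `MP4`.

Use a language other than Chinese only when:

1. The user explicitly asks for English, bilingual output, or another specific language.
2. A brand slogan must stay in its original wording because the original works better as a brand asset than a translation would.
3. Code snippets, commands, API names, or product-specific terms need to stay in English.
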
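## Chinese narration and audio rules

### Decide first whether narration is needed

Not every shot in an ad needs narration. Before writing the script, decide whether this video needs narration at all, and which segments need it.

`SCRIPT.md` or `STORYBOARD.md` must state clearly:

1. **Narrated segments**: which scenes need narration to explain the product, selling points, differentiation, or CTA.
2. **Unnarrated segments**: which scenes keep only the source audio, music, sound effects, product visuals, or an emotional pause.
3. **Footage-first segments**: if the user supplies video footage, product demos, interviews, live-action shots, screen recordings, or an official demo, first judge whether that material already explains itself. Do not force narration over it.
4. **Breathing-room segments**: brand reveals, visual impact moments, transitions, and UI detail shots can leave 0.5-2 seconds for the music and visuals to breathe.

Guidelines:

| Situation | Narration strategy |
|------|----------|
| The product concept is complex and not obvious at a glance | Use short narration to explain "what it is" and "why it is useful" |
| The shot is a complete product demo or user-supplied footage | Let the footage speak for itself; narration only adds necessary context |
| Templates, UI, or workflows are flashing by quickly | A few keyword-style lines are fine; do not read the screen item by item |
| Emotional peak, brand outro, visual transition | Narration can be dropped; use music, sound effects, and visual pauses |
| The user explicitly asks for no narration | Generate no narration; use subtitles, music, and sound effects only |

Suggested narration density:

- 15-second ad: usually 1-3 narration sentences are enough.
- 30-second ad: usually 4-8 sentences, with room left in the middle for music and visuals.
- 60-second product intro: full narration is fine, but still keep key unnarrated showcase segments.

Do not write narration just to fill the audio track. Too much narration makes an ad feel like a manual and weakens the persuasive power of the visuals and product footage.
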
### Chinese narration defaults to Edge TTS

Generate Chinese narration with `edge-tts` by default. Do not use macOS `say`, system read-aloud voices, or low-quality offline TTS unless Edge TTS is confirmed unavailable.

Recommended Chinese voices:

| Voice | Best for | Notes |
|------|----------|------|
| `zh-CN-YunyangNeural` | Product intros, technical products, corporate promos | Professional and reliable; the default first choice |
| `zh-CN-YunxiNeural` | Faster-paced product videos | Young and energetic |
| `zh-CN-XiaoxiaoNeural` | Friendly, warm, gentle products | Female voice, warm and natural |

First list the voices that are available:

```bash
edge-tts --list-voices | rg 'zh-CN.*Neural'
```

Generate the Chinese narration:

```bash
edge-tts --voice zh-CN-YunyangNeural --rate +6% \
  --text "中文旁白内容" \
  --write-media narration.mp3
```

Convert it to WAV, which HyperFrames handles reliably:

```bash
ffmpeg -y -i narration.mp3 -ar 48000 -ac 2 narration.wav
```

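### Handling narration length

Always check the duration after generating narration:

```bash
ffprobe -v error -show_entries format=duration -of default=nw=1:nk=1 narration.wav
```

If the narration runs longer than the video, shorten the script first rather than speeding the audio up significantly. A slight speed-up is acceptable only when the overrun is small, ideally no more than `+8%` or `atempo=1.08`. Heavy speed-up makes Chinese narration sound shrill, rushed, and stiff, and the final video will sound poor.
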
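The length rule above can be turned into a small decision helper. This is a minimal sketch under stated assumptions: `plan_narration_fit` and its return shape are hypothetical, and the only number taken from the text is the 1.08 ceiling.

```python
MAX_SPEEDUP = 1.08  # ceiling from the guideline above (`atempo=1.08`)


def plan_narration_fit(narration_s: float, target_s: float) -> dict:
    """Decide whether narration fits, needs a slight speed-up, or needs a shorter script."""
    if narration_s <= target_s:
        return {"action": "keep", "atempo": 1.0}

    factor = narration_s / target_s
    if factor <= MAX_SPEEDUP:
        # Small overrun: a gentle tempo change is enough
        return {"action": "speed_up", "atempo": round(factor, 3)}

    # Large overrun: report how many seconds must be cut from the script
    # even if the maximum allowed speed-up were applied afterwards
    overflow = narration_s - target_s * MAX_SPEEDUP
    return {"action": "shorten_script", "atempo": 1.0, "trim_seconds": round(overflow, 2)}
```

For a 31.2 s narration against a 30 s target this suggests a 1.04 tempo factor; for 36 s it asks for a shorter script instead.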
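### Video timing must follow the audio

**Confirm the real narration length first, then build the video timeline.** Do not lock the visuals to 30 seconds and squeeze the audio in afterwards; that easily produces a video that starts fine but ends with the audio finishing early or the visuals trailing on.

Correct workflow:

1. Write the Chinese narration script first.
2. Generate the narration audio with Edge TTS.
3. Read the real narration duration with `ffprobe`.
4. Split the script into scenes based on the narration content and assign time to each scene.
5. `STORYBOARD.md`, `index.html`, `data-duration`, transition timing, and BGM length all follow this real audio length.
6. After rendering, use `ffprobe` again to confirm that the final video duration matches the audio duration.

If the real narration length is `31.2s`, the video should be about `31.2s`. The exception is when the user explicitly requires exactly 30 seconds: in that case shorten the narration script first instead of forcing visuals and audio out of sync.
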
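Step 4 of the workflow (assigning time to each scene) can be sketched as below. This is only an illustration, assuming speaking time is roughly proportional to character count; `allocate_scene_times` and the `pauses` parameter are hypothetical names, not part of HyperFrames.

```python
def allocate_scene_times(scene_texts, total_seconds, pauses=None):
    """Split total_seconds across scenes in proportion to narration length.

    scene_texts: narration text per scene ("" for an unnarrated scene).
    pauses: optional silent seconds reserved per scene (illustrative parameter).
    """
    pauses = pauses or [0.0] * len(scene_texts)
    speech_budget = total_seconds - sum(pauses)
    weights = [len(text.strip()) for text in scene_texts]
    total_weight = sum(weights)
    if speech_budget <= 0 or total_weight == 0:
        raise ValueError("no time or no narration text to distribute")

    timeline, start = [], 0.0
    for weight, pause in zip(weights, pauses):
        # Spoken share of the budget plus any reserved silence for this scene
        duration = speech_budget * weight / total_weight + pause
        timeline.append({"start": round(start, 2), "duration": round(duration, 2)})
        start += duration
    return timeline
```

The resulting `start` and `duration` values are candidates for each scene's timing attributes; they still need a listening pass, because real TTS pacing is not perfectly proportional to character count.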
Check commands:

```bash
ffprobe -v error -show_entries format=duration -of default=nw=1:nk=1 narration.wav
ffprobe -v error -show_entries stream=codec_type,duration -show_entries format=duration -of json renders/final.mp4
```

Spot-check before delivery:

1. Whether the first sentence is in sync with the opening scene.
2. Whether the core selling points in the middle are in sync with their visuals.
3. Whether the final CTA line lands on the closing brand shot rather than finishing a few seconds early.

### Background music rules

If BGM is needed, prefer an option with no copyright risk: generate the music with the Web Audio API or an equivalent offline synthesis script, then add it to HyperFrames as a separate audio track.

BGM must sit far below the narration. Suggested setup:

```html
<audio
  id="bgm"
  data-start="0"
  data-duration="30"
  data-track-index="20"
  src="assets/audio/bgm-webaudio.wav"
  data-volume="0.12"
></audio>
```

BGM design principles:

1. No vocals, to avoid clashing with the Chinese narration.
2. Keep the volume as an ambience layer, typically `0.08-0.18`.
3. Tech products suit low-frequency pulses, soft chords, subtle sweeps, and transition whooshes.
4. Before every render, confirm that narration and BGM are both local files and that the audio covers the full video duration.

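The "audio covers the full video" check in principle 4 can be automated for PCM WAV files with the standard library. A minimal sketch: `find_short_tracks` and the 0.05 s tolerance are assumptions for illustration, and the `wave` module only reads uncompressed PCM WAV.

```python
import wave


def wav_duration(path: str) -> float:
    """Duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wf:
        return wf.getnframes() / wf.getframerate()


def find_short_tracks(audio_paths, video_seconds, tolerance=0.05):
    """Return {path: duration} for every track that ends before the video does."""
    short = {}
    for path in audio_paths:
        duration = wav_duration(path)
        if duration + tolerance < video_seconds:
            short[path] = round(duration, 3)
    return short
```

An empty result means every listed track is at least as long as the video, within the tolerance.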
## Overall workflow

```
1. Visit the official site and collect assets (VL verification) → 2. Supplement with web search → 3. User confirmation → 4. Write HTML → 5. Render
```

**Key rule: do not start creating until the user has confirmed in Step 3!**

---

## Step 1: Visit the official site and collect assets

### Fetching method

| Site type | Method | Notes |
|---------|------|------|
| Static HTML | web_fetch | Fetch the text content directly |
| JS-rendered | browser tool | Open the page → snapshot → screenshot |
| API docs | web_fetch | Usually fetchable |

### Asset checklist

1. **Logo**: from the site header or navigation bar; save the image or take a screenshot
2. **Product screenshots**: hero area, feature showcases, UI screenshots
3. **Brand colors**: extracted from CSS variables and the logo colors
4. **Core copy**: tagline, feature descriptions, CTA button text
5. **Product images**: illustrations or screenshots from the feature sections
6. **Video preview**: if there is a demo video, record its URL

### Screenshot workflow

```bash
# Take screenshots with the browser tool (full page + above the fold)
browser(screenshot, fullPage=true) → save the full-page screenshot
browser(screenshot, fullPage=false) → save the above-the-fold screenshot
# Copy into the project directory
cp /path/to/screenshot.jpg <project>/assets/images/
```

### Verify assets with a VL model

Verify each screenshot with the image tool:

```python
image(
    image="assets/images/hero-screenshot.jpg",
    prompt="Describe the content of this screenshot. Does it contain: 1) the product logo 2) the product UI 3) the brand colors 4) a core feature showcase? List the elements that can be used in the video."
)
```

---

## Step 2: Supplement with web search

### What to search for

1. **Product news**: latest releases, funding, user reviews
2. **Competitor comparison**: which similar products exist and what sets this one apart
3. **Community discussion**: Hacker News / Reddit feedback
4. **Technical details**: GitHub star count, open-source license, API capabilities
5. **Use cases**: who uses it and how

### Search keyword templates

```
"<product name> review"
"<product name> vs <competitor name>"
"<product name> site:news.ycombinator.com"
"<product name> site:reddit.com"
"<product name> open source" / "<product name> API"
```

---

## Step 3: User confirmation

### Output format

```
📋 Product information confirmation

[Basic information]
Product: HyperFrames
Company: HeyGen
Positioning: Open-source, agent-native HTML-to-video rendering framework
Brand colors: #00D9A5 (mint green) + #000000 (pure black)
Logo: abstract geometric play icon

[Key selling points]
1. HTML → Video: write video with the web stack
2. Agent-Native: AI agents can generate and modify video directly
3. Open Source: open-source framework, local rendering
4. 51+ template catalog: Notion/Stripe/Raycast Showcase and more

[Video specs]
Aspect ratio: 3:4 portrait (1080×1440)
Style: dark tech
Emphasis: AI agents can generate and modify video directly

[Available assets]
✅ Full-page screenshot of the official site
✅ Community Playground screenshot
✅ Local CLI environment available

Is this information accurate? What needs to change?
```

### Spec options

| Item | Options |
|------|------|
| Aspect ratio | 16:9 (1920×1080) / 9:16 (1080×1920) / 3:4 (1080×1440) / 1:1 (1080×1080) |
| Style | Follow brand colors / dark tech / warm and soothing / light minimal / flowing gradient |
| Voice-over | With narration / without narration |
| BGM | User-supplied / free library search / Web Audio API synthesis |

---

## Step 4: Write the HyperFrames HTML

### Project initialization

```bash
source ~/.nvm/nvm.sh && nvm use 22
npx hyperframes init <product-name> --width 1080 --height 1440
cd <product-name>
mkdir -p assets/images assets/audio/sfx
```

### Video structure (product intro)

```
1. Logo entrance (0-2s): brand logo animation
2. Product name + positioning (2-4s): one sentence on what it is
3. Key selling points (4-Ns): 2-4 features, 3-5s each
   - Each point: icon/keyword → detail expansion → transition
4. Product UI / results showcase (Ns-Ms): screenshots or demo preview
5. CTA close (Ms-end): website link / download / trial
```

### Brand color extraction rules

After extracting colors from the official site, write them into CSS variables. The variable set differs completely depending on the chosen visual style:

#### Dark tech palette

```css
:root {
  --brand-primary: #00D9A5;        /* primary brand color */
  --brand-secondary: #00B886;      /* one step darker */
  --brand-glow: rgba(0,217,165,0.15); /* glow */
  --bg-primary: #000000;           /* background */
  --bg-card: rgba(20,20,20,0.95);
  --text-primary: #ffffff;
  --text-secondary: rgba(255,255,255,0.55);
}
```

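Deriving the translucent glow value from the primary brand hex can be scripted instead of done by hand. The sketch below is illustrative only: `hex_to_rgb` and `glow` are hypothetical helpers, and the default alpha of 0.15 simply mirrors the `--brand-glow` value in the dark palette.

```python
def hex_to_rgb(hex_color: str) -> tuple:
    """Convert '#RRGGBB' or '#RGB' to an (r, g, b) tuple of ints."""
    h = hex_color.lstrip("#")
    if len(h) == 3:
        h = "".join(ch * 2 for ch in h)  # expand shorthand, e.g. 'fa0' -> 'ffaa00'
    if len(h) != 6:
        raise ValueError(f"unsupported color: {hex_color!r}")
    return tuple(int(h[i:i + 2], 16) for i in (0, 2, 4))


def glow(hex_color: str, alpha: float = 0.15) -> str:
    """Build a CSS rgba() string for a translucent version of the brand color."""
    r, g, b = hex_to_rgb(hex_color)
    return f"rgba({r},{g},{b},{alpha})"
```

For the palette above, `glow("#00D9A5")` reproduces the `rgba(0,217,165,0.15)` glow value.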
#### Warm and soothing palette

```css
:root {
  --peach: #FFD4C2;                /* peach */
  --cream: #FFF8F0;                /* cream white */
  --warm-bg: #FFF1E6;              /* warm apricot background */
  --coral: #E8785A;                /* coral orange (main accent) */
  --coral-soft: rgba(232,120,90,0.15);
  --amber: #FFB347;                /* amber gold */
  --amber-soft: rgba(255,179,71,0.12);
  --rose: #F4A0A0;                 /* rose pink */
  --rose-soft: rgba(244,160,160,0.18);
  --lavender: #C9B1FF;             /* lavender */
  --lavender-soft: rgba(201,177,255,0.12);
  --text: #3D2C2C;                 /* deep charcoal brown (never pure black!) */
  --text-dim: #8B7B75;             /* light brown-gray */
  --text-light: #B5A8A3;
  --card: rgba(255,255,255,0.65);
  --card-border: rgba(255,255,255,0.8);
  --shadow: 0 8px 40px rgba(180,130,110,0.12);
}
```

---

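### Visual style guide (important: flat = boring)

**Core principle: a product video must not be a moving slide deck. It needs depth, lighting, texture, and warmth.**

#### ❌ Avoid in every style

- Solid background + text (looks like a Keynote slide)
- Plain fade in / fade out (no texture)
- Stacked flat color blocks (no depth)
- Static text that stays on screen the whole time (boring)
- Emoji used as icons (looks cheap)
- Pure black + pure blue + cold tones = hard, cold, slide-like

---
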
### Style A: dark tech

Best for: developer tools, SaaS, AI products. References: Linear, Stripe Showcase, Raycast.

| Technique | Implementation | Effect |
|------|---------|------|
| **Glassmorphism** | `backdrop-filter: blur(20px); background: rgba(255,255,255,0.08); border: 1px solid rgba(255,255,255,0.1)` | Translucent frosted-glass card |
| **Glow** | `box-shadow: 0 0 60px 20px var(--brand-glow);` | Brand color radiating outward |
| **Gradient mesh** | Several stacked `radial-gradient`s + `mix-blend-mode: screen` | Apple-style multicolor gradient |
| **3D tilt** | `transform: perspective(1000px) rotateY(-5deg)` | Gives screenshots depth |
| **Floating motion** | GSAP `yoyo: true, repeat: N, y: -8` | Gentle hovering |

Dark-style background stack (at least three layers):

```css
.bg-base { background: #000; }
.bg-glow { /* brand-color radial gradient + blur(100px) */ }
.bg-grid { /* thin-line grid, 60px spacing */ }
```

Dark-style card:

```css
.feature-card {
  background: rgba(18, 18, 22, 0.7);
  backdrop-filter: blur(20px);
  border: 1px solid rgba(255, 255, 255, 0.06);
  border-radius: 20px;
  box-shadow: 0 4px 24px rgba(0, 0, 0, 0.3), inset 0 1px 0 rgba(255, 255, 255, 0.05);
}
```

---

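### Style B: warm and soothing ⭐ Recommended

Best for: any product that should make people "feel comfortable". References: Notion, Figma, Japanese stationery brands, Calm/Headspace.

**Core feeling: morning light through a sheer curtain. Like being treated gently.**

#### Color philosophy

- **No cold colors**: do not use blue, purple, or green as the primary color. Stay within warm yellow → off-white → light camel → terracotta red
- **No pure-black text**: use deep charcoal brown `#3D2C2C`, like ink soaking into rice paper
- **Low saturation**: every color is "de-sharpened", with contrast kept within 1:4 to 1:6
- **Accent color**: coral orange `#E8785A`, like warmth at the fingertips
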
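One way to check a color pair against the 1:4 to 1:6 range is to compute a contrast ratio. The sketch below assumes the WCAG 2.x relative-luminance definition of contrast, which the text does not specify; `relative_luminance` and `contrast_ratio` are illustrative helper names.

```python
def _linear(channel_8bit: int) -> float:
    """Convert an 8-bit sRGB channel to linear light."""
    c = channel_8bit / 255
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a '#RRGGBB' color (0 = black, 1 = white)."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(color_a: str, color_b: str) -> float:
    """Contrast ratio between two colors, from 1 (identical) to 21 (black on white)."""
    lighter, darker = sorted(
        (relative_luminance(color_a), relative_luminance(color_b)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)
```

Calling `contrast_ratio("#8B7B75", "#FFF8F0")`, for example, shows where the dim text color sits on the cream background relative to the target range.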
#### Background stack (warm version, at least four layers)

```css
/* Layer 1: warm gradient base */
.bg-base { background: linear-gradient(160deg, #FFF8F0 0%, #FFE8D6 40%, #FFF0E8 100%); }

/* Layer 2-4: organic color blobs (like light dispersing in sunshine) */
.bg-blob1 {
  width: 900px; height: 900px; border-radius: 50%;
  background: radial-gradient(circle, rgba(232,120,90,0.15) 0%, transparent 70%);
  filter: blur(80px); /* heavy blur = soft diffusion */
}
.bg-blob2 { /* amber blob */ }
.bg-blob3 { /* rose blob */ }
.bg-blob4 { /* lavender blob */ }
```

The blobs should drift slowly (GSAP `sine.inOut`, 7-10s cycle), like moving sunlight.

#### Warm particles (like dust floating in sunlight)

Skip tech-style grids and dot matrices. Use 20+ small warm-toned dots that drift slowly upward across the frame:

```css
.warm-particle {
  position: absolute; border-radius: 50%; opacity: 0;
  /* each particle has a different size (4-10px), warm color, and position */
}
```

```js
// GSAP: rise from the bottom → fade out → loop
tl.fromTo("#wp1", { opacity: 0, scale: 0.3, y: 0 },
  { opacity: 0.7, scale: 1, y: -60, duration: 1.8, ease: "sine.inOut" }, 0.5)
  .to("#wp1", { y: -120, opacity: 0, duration: 1.8, ease: "sine.in" }, 2.3);
```

⚠️ Particles must be written into the static HTML (not created dynamically with JS); otherwise GSAP cannot find its targets when HyperFrames compiles.

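Since the particles have to exist in the static HTML, the markup can be generated once offline and pasted in. A minimal sketch, assuming a 1080×1440 canvas and the `wp1`, `wp2`, ... ID scheme used above; `particle_html`, the color list, and the seed are illustrative choices.

```python
import random

# Warm tones borrowed from the warm palette variables
WARM_COLORS = ["#E8785A", "#FFB347", "#F4A0A0", "#FFD4C2"]


def particle_html(count=20, width=1080, height=1440, seed=7):
    """Return static <div> markup for `count` warm particles with stable IDs."""
    rng = random.Random(seed)  # seeded so regenerating gives identical markup
    lines = []
    for i in range(1, count + 1):
        size = rng.randint(4, 10)                      # 4-10px, as suggested above
        left = rng.randint(0, width - size)
        top = rng.randint(height // 2, height - size)  # start in the lower half
        color = rng.choice(WARM_COLORS)
        lines.append(
            f'<div id="wp{i}" class="warm-particle" '
            f'style="left:{left}px;top:{top}px;width:{size}px;'
            f'height:{size}px;background:{color}"></div>'
        )
    return "\n".join(lines)
```

Paste the output of `print(particle_html())` into `index.html`, then target `#wp1` through `#wp20` from the timeline.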
#### Cards: creamy frosted glass

```css
.soft-card {
  background: rgba(255,255,255,0.65);
  backdrop-filter: blur(20px);
  border: 1px solid rgba(255,255,255,0.8);
  border-radius: 28px;
  box-shadow: 0 8px 40px rgba(180,130,110,0.12), 0 2px 12px rgba(255,200,180,0.08);
}
```

More translucent and softer than the dark-style glassmorphism, with a larger corner radius (28px vs 20px).

#### Feature icons: warm gradient base

```css
.feat-icon {
  width: 48px; height: 48px; border-radius: 16px;
  background: linear-gradient(135deg, rgba(232,120,90,0.12), rgba(255,179,71,0.12));
  /* each icon uses a different warm gradient combination */
}
```

#### Terminal window: warm brown base

```css
.terminal {
  background: rgba(61,44,44,0.92);   /* deep charcoal brown, not pure black */
  color: #F5E6E0;                    /* warm white, not cold white */
  border: 1px solid rgba(180,130,110,0.15);
  border-radius: 24px;
}
/* prompt in amber, success output in coral (not cold green) */
.tp { color: var(--amber); }
.ts { color: var(--coral); }
```

#### CTA button: coral gradient + light sweep

```css
.cta-btn {
  background: linear-gradient(135deg, var(--coral), #E06848);
  color: #fff;
  border-radius: 18px;
  box-shadow: 0 8px 28px rgba(232,120,90,0.25);
}
/* the sweep is a real child element (not an ::after pseudo-element, which GSAP cannot animate) */
.cta-sweep {
  position: absolute; inset: 0;
  background: linear-gradient(105deg, transparent 35%, rgba(255,255,255,0.25) 50%, transparent 65%);
  transform: translateX(-100%);
}
```

#### Font choices

- **Warm style → Nunito** (rounded, friendly, warm)
- **Dark style → Inter** (geometric, rational, professional)
- **Code → JetBrains Mono** (shared by both styles)

#### Easing functions

| Style | Recommended ease | Feel |
|------|----------|------|
| Dark tech | `back.out(1.7)` | Springy, forceful, crisp |
| Warm and soothing | `sine.out` | Smooth, gentle, unhurried |
| Dark tech | `power3.out` | Snaps into place quickly |
| Warm and soothing | `sine.inOut` | Even rhythm, like breathing |

Warm-style motion should be slow: 0.5-1.0s entrances (0.3-0.5s for the dark style), 0.5s exits.

#### Logo entrance animation

- Dark style: springy scale-up, `scale: 0.5→1.0 + back.out`
- Warm style: soft fade-up, `scale: 0.6→1.0 + sine.out`, combined with a breathing scale `scale: 1.0↔1.05 + sine.inOut`

---

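### General motion rules (shared by both styles)

#### Text must not appear all at once

- Title: slide in from below + fade in (`y: 20-40, opacity: 0→1`)
- Subtitle: appears after a 0.3s delay
- Details: another 0.5s delay, then line by line

#### Product screenshots should "float"

```css
.product-screenshot {
  border-radius: 16px;
  box-shadow: 0 20px 60px rgba(0,0,0,0.15); /* light styles use low-opacity shadows */
}
```

#### Depth

Every frame needs at least 2 layers (foreground + background); nothing should be entirely flat.

#### Breathing room

After text appears, leave 0.5-1s for the viewer to absorb it. Do not bombard them continuously.

#### Brand light and color effects

Each time a key element appears, pulse the brand-color glow or blob once.

#### Light sweep

Buttons and cards can get a light sweep on entrance. ⚠️ Implement it with a real DOM child element (such as `<span class="sweep">`), not a CSS `::after` pseudo-element (GSAP cannot animate pseudo-elements).

#### Number count-up

If there are numbers (user count, GitHub stars), animate a count-up with GSAP, for example by tweening a numeric value and writing it into the text on each update.

#### Code typing

For developer tools, reveal terminal output line by line with a blinking cursor.
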
### Audio

#### BGM

Sourcing order:

1. Search Pixabay/Mixkit for free BGM
2. Nothing suitable found → synthesize with the Web Audio API (see the sound-fx-for-video Skill)
3. User-supplied track → use it directly

```bash
# Search Pixabay
web_fetch("https://pixabay.com/music/search/tech%20background/")
# Download
curl -sL -o assets/audio/bgm.mp3 "<URL>"
```

#### Sound effects

Generate them with the sound-fx-for-video Skill:

- Product name appears → sparkle
- Selling point expands → whoosh
- Number grows → rise
- CTA appears → impact

### HyperFrames pitfalls

Same as the pitfalls list in the `ai-tech-news-video` Skill. Additional notes:

- **3:4 aspect ratio**: pass `--width 1080 --height 1440` at init
- **Portrait layout**: horizontal space for text is narrower, so increase the font size accordingly
- **Mobile preview**: portrait videos are mostly watched on phones, so make sure text is large enough

---

## Step 5: Render

```bash
source ~/.nvm/nvm.sh && nvm use 22
npx hyperframes lint
npx hyperframes snapshot --at 1,3,5,10,15   # verify key frames
npx hyperframes render -o output.mp4
```
190
skills/sketch-animation-video/SKILL.md
Normal file
@ -0,0 +1,190 @@
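---
name: sketch-animation-video
description: Create simple sketch / hand-drawn line-art animation shorts with HyperFrames. Suited to sketch animation, whiteboard style, line-art motion, and 3-10 second explainer clips, with an emphasis on action-driven storytelling that is understandable without subtitles, reusable templates, and a self-check-and-repair loop.
disable-model-invocation: true
---

# Sketch Animation Video Skill (HyperFrames)

For producing short videos (3-10 seconds) in a "simple sketch / hand-drawn line-art" style.
Core goals: **information first**, **clear action semantics**, **fast iteration**.

## 1) When to use (trigger phrases)

Use this Skill when the user asks for any of the following:

- Sketch animation, hand-drawn animation, whiteboard style, line-art style (简笔画动画、手绘动画、白板风、线稿风)
- A 3-5 second opener, feature explanation, or concept-metaphor animation
- Meaning conveyed directly through action (without relying on long text)
- A HyperFrames project that can be previewed, validated, and exported to MP4

---
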
## 2) Tech stack (unified)

- **HyperFrames**: composition and rendering (`data-*` timing attributes)
- **HTML + CSS**: layout and visuals (paper base, grid, line-art styling)
- **SVG**: sketch elements such as characters, icons, symbols, and props
- **GSAP**: timeline animation (entrance / handoff / activation / output / settle)
- **Commands**: `npm run dev`, `npm run check`, `npm run render`

Suggested versions:

- `hyperframes@0.5.x`
- `gsap@3.14.x`

---

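## 3) Style and storytelling principles

### Visual style

- Off-white paper background + low-contrast grid texture
- Mostly dark lines (recommended `stroke-width: 8~13`)
- Limit accent colors to 2~3
- Keep the hand-drawn feel; avoid a heavy UI-card look

### Storytelling priorities

1. **Meaning first, polish second**
2. **Understandable without subtitles**: with the explanatory text hidden, viewers still grasp the core meaning
3. **Action metaphors first**: contrast / cause and effect / conflict / outcome

### Typical expression patterns

- "Efficiency gain": `traditional workflow, busy and sweating` vs `AI workflow, relaxed output`
- "Unified method": `input -> controller -> output`
- "Cost structure": `high-cost items crossed out` + `main cost highlighted`

---
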
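## 4) Motion rhythm template (4-second reference)

- `0.0~0.8s`: scene and subject enter
- `0.8~1.8s`: information handoff (elements fly in or swap)
- `1.8~3.0s`: core activation (glow, bounce, burst)
- `3.0~4.0s`: settle into a steady state (gentle breathing, nothing abrupt)

> For 8~16 second scenes: extend the same logic, but do not spread actions evenly across the timeline. Keep the "key actions" and the "pauses".
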
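The 4-second template can serve as a starting point for other durations by scaling its phase boundaries. This is a rough sketch: `scale_phases` and the English phase names are hypothetical, and for longer scenes the scaled windows are only budgets within which key actions and pauses still need to be placed by hand.

```python
# Phase boundaries of the 4-second reference template, in seconds
PHASES = [
    ("entrance", 0.0, 0.8),
    ("handoff", 0.8, 1.8),
    ("activation", 1.8, 3.0),
    ("settle", 3.0, 4.0),
]


def scale_phases(total_seconds: float, reference: float = 4.0):
    """Scale the reference phase windows proportionally to a new total duration."""
    k = total_seconds / reference
    return [(name, round(start * k, 2), round(end * k, 2)) for name, start, end in PHASES]
```

`scale_phases(8.0)` doubles every window; treat the result as time budgets rather than a schedule to fill evenly.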
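---

## 5) Suggested reusable components

- `paper-grid`: paper grid background
- `person-*`: character groups
- `input-*`: input elements (notes / keywords / cards)
- `core-*`: core modules (brain / console / controller)
- `output-*`: output elements (video strip / result card)
- `fx-*`: emphasis elements (starbursts, sweat drops, loop arrows)

Naming rule: elements in the same semantic group share one prefix, which makes batch animation and refactoring easier.
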
---

## 6) Required implementation rules (avoiding pitfalls)

### A. Element ID rule

Give a stable `id` (for example `id="scene-01-main"`) to every node that enters the timeline or may be edited later, to avoid the Studio warning `studio_missing_editable_id`.

### B. Transform conflict rule

If CSS sets `transform: translateX(-50%)`, a later GSAP animation of `y/scale` can overwrite that transform.

Suggested fixes:

- Use GSAP's `xPercent: -50` instead of the CSS `translateX(-50%)`
- Or use `fromTo` to explicitly preserve the same transform semantics

### C. Caption toggle rule

If "voice-over helper captions" are needed, put them behind a controllable toggle. Do not hard-code them into the final export.

### D. Check after deleting elements

After deleting a node, check the GSAP selectors as well, so that no empty selector (such as `""`) is left behind to cause a runtime error.

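Rules A and D can be partly automated with a text scan of `index.html`. The sketch below is regex-based and only catches simple cases: string selectors passed directly to `.to()`, `.from()`, `.fromTo()`, or `.set()`. `audit_gsap_targets` is a hypothetical helper, not a HyperFrames or GSAP feature.

```python
import re

# id="..." attributes (ignores data-id and similar)
ID_ATTR = re.compile(r'(?<![\w-])id\s*=\s*["\']([^"\']+)["\']')
# First string argument of common GSAP tween calls
GSAP_CALL = re.compile(r'\.(?:to|from|fromTo|set)\(\s*(["\'])(.*?)\1')


def audit_gsap_targets(source: str):
    """Return (problem, selector) pairs for empty selectors and #ids missing from the HTML."""
    ids = set(ID_ATTR.findall(source))
    problems = []
    for match in GSAP_CALL.finditer(source):
        selector = match.group(2)
        if not selector.strip():
            problems.append(("empty-selector", selector))
            continue
        for ref in re.findall(r'#([\w-]+)', selector):
            if ref not in ids:
                problems.append(("missing-id", "#" + ref))
    return problems
```

Run it on the file contents before `npm run check`; an empty list means no obvious dangling targets, not that the timeline is correct.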
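---

## 7) Delivery workflow (mandatory)

1. Clarify the goal: duration, aspect ratio, semantic focus, whether to keep the corner badge
2. Build a static Hero Frame first, then wire up the GSAP timeline
3. Run `npm run check`
4. Export with `npm run render`
5. Spot-check screenshots and fix issues (see Section 9)
6. Report back to the user: what changed + path to the latest video

---

## 8) Quality bar (acceptance)

- The message is complete within 3-10 seconds (or the target duration)
- The subject is not occluded, out of bounds, or overcrowded
- The line-art style is consistent (line width, outlines, color logic)
- The animation rhythm is natural, not stiff, and does not outpace reading
- **Understandable without subtitles** (key)
- Only claim completion after `check` passes

---
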
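## 9) Visual spot check and self-repair loop

After producing the MP4, always spot-check screenshots; a passing command is not enough.

### Spot-check steps

1. Extract at least 4 key frames (suggested `20% / 45% / 70% / 90%`)
2. Check: structure, rhythm, line-art quality, semantic clarity
3. If anything is wrong, fix the HTML/CSS/SVG/GSAP and rerun `check + render`
4. Deliver only after the recheck passes
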
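Step 1 can be scripted by turning the suggested percentages into `ffmpeg` frame-grab commands. A minimal sketch, assuming `ffmpeg` is installed and the duration has already been read (for example with `ffprobe`); `spot_check_commands` and the `qa/` output directory are illustrative names.

```python
def spot_check_commands(video_path, duration_s,
                        fractions=(0.20, 0.45, 0.70, 0.90), out_dir="qa"):
    """Build one ffmpeg command per key frame, as argument lists for subprocess.run."""
    commands = []
    for fraction in fractions:
        timestamp = duration_s * fraction
        out_file = f"{out_dir}/frame_{round(fraction * 100):02d}.png"
        commands.append([
            "ffmpeg", "-y",
            "-ss", f"{timestamp:.2f}",   # seek to the key-frame time
            "-i", video_path,
            "-frames:v", "1",            # grab a single frame
            out_file,
        ])
    return commands
```

Each list can be passed to `subprocess.run(cmd, check=True)` after creating the output directory.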
### Repair priority

1. Structure (position, layering, proportion, cropping)
2. Line art (recognizable outlines, uniform thickness, node shapes)
3. Semantics (who is doing what, why, and with what result)
4. Motion (duration, easing, staggering, motion paths)
5. Details (color, shadow, accents)

### Report format

- Issues found (if any)
- Repairs performed
- Path to the latest video
- Whether the spot check passed

---

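## 10) General prompt templates

These templates reliably trigger a "structure first, motion second, verifiable" way of working, and are more controllable than a casual one-line request.

### Template A: basic single scene (3-5 seconds)

"Please make a **3-5 second** sketch animation, aspect ratio **[16:9 / 3:4 / 9:16]**.
The topic is **[one-sentence topic]**.
Visual flow: **input elements appear -> enter the core module -> output the result**.
Style: off-white paper background, dark line art, no more than 3 accent colors.
Constraint: do not rely on long subtitles; viewers watching on mute must still understand the main meaning.
Delivery: provide the static composition first, then add the GSAP timeline, and finally run `check` and `render`."

### Template B: contrast narrative (8-12 seconds)

"Please make an **8-12 second** sketch contrast animation.
The left side shows the **traditional workflow** and the right side the **smart workflow**; the efficiency gap must be clear.
Both sides need a complete action chain (start -> process -> result), with an obvious contrast in the middle section.
Keep the style unified as hand-drawn line art, with consistent line width and color logic.
Constraint: subtitles are only supporting; the core message must be carried by action.
Delivery: provide key-frame spot-check results (20%/45%/70%/90%) and repair notes."

### Template C: reusable-component oriented (for batch scenes)

"Please build the sketch animation from reusable components:
Use the component prefixes `paper-*`, `person-*`, `input-*`, `core-*`, `output-*`, `fx-*`.
Duration **[X seconds]**, target meaning **[target meaning]**.
Finish the static Hero Frame first, then add the four motion stages: entrance, handoff, activation, settle.
Required: nothing out of bounds, nothing occluded, no empty-selector errors, and `check` must pass.
Export the final MP4 and attach a report in the form 'issue -> fix -> video path -> spot check passed or not'."
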
BIN
skills/sound-fx-for-video/.DS_Store
vendored
Normal file
Binary file not shown.
335
skills/sound-fx-for-video/SKILL.md
Normal file
@ -0,0 +1,335 @@
---
name: sound-fx-for-video
description: "Sound-effect solutions for video production: searching and downloading free sound effects, synthesizing sound effects with the Web Audio API, and integrating audio into HyperFrames/Remotion/HTML5 videos. Trigger words include: 音效, SFX, 配乐, BGM, whoosh, click, transition, impact, typing, and similar."
---

# Video Sound Effects Skill

Videos need sound. This Skill covers two main paths:

1. **Search and download** free sound-effect assets
2. **Synthesize** custom sound effects (Web Audio API, no external files needed)

Choose the path that fits the need; real projects usually mix both.

---

## Path 1: Search and download free sound effects

### Recommended sources

| Source | URL | License | API | Best For |
|--------|-----|---------|-----|----------|
| Freesound | freesound.org | CC licenses (check per-file) | REST API (key required) | Huge library, specific sounds |
| Mixkit | mixkit.co/free-sound-effects | Free, no attribution | No | Quick grabs, curated quality |
| Pixabay | pixabay.com/sound-effects | Free, no attribution | No | Clean UI, good variety |
| BBC SFX | bbcsfx.acropolis.org.uk | Free for personal/educational | No | Premium BBC quality |
| ZapSplat | zapsplat.com | Free with attribution | No | Game/comedy/cartoon sounds |
| SoundBible | soundbible.com | Mixed (check per-file) | No | Quick one-off downloads |

### Search strategy

1. **Be specific with keywords.**
   - Bad: `click sound`
   - Good: `mouse click sharp UI feedback`
   - Bad: `whoosh`
   - Good: `fast whoosh air sweep transition`

2. **Combine keywords:**
   - Object + action: "glass shatter", "paper crumple"
   - Mood + type: "cinematic impact bass", "playful pop notification"
   - Context: "UI button click feedback", "slide transition whoosh"

3. **Common sound-effect categories and search terms:**

| Category | Search Terms |
|----------|-------------|
| Transitions | whoosh, sweep, swoosh, riser, dive |
| UI feedback | click, tap, pop, blip, notification, ding |
| Impacts | boom, hit, slam, thud, punch, bass drop |
| Typing | keyboard, typing, keystroke, mechanical |
| Reveals | shimmer, sparkle, magic, chime, glow |
| Movement | slide, swoop, flutter, bounce, elastic |
| Atmosphere | ambient, drone, hum, tension, pulse |

### Programmatic download

**Freesound API** (best for automation):

```bash
# 1. Get API key from https://freesound.org/apiv2/apply/
# 2. Search
curl "https://freesound.org/apiv2/search/text/?query=whoosh+transition&fields=id,name,previews,duration,license&token=YOUR_API_KEY"

# 3. Download preview (mp3, no auth needed): use the `preview-hq-mp3` URL
#    from the `previews` field of a search result
curl -o whoosh.mp3 "<preview-hq-mp3 URL from the search response>"

# 4. Download full quality (OAuth2 needed)
```

**Simple curl download** (sources with direct links):

```bash
# Pixabay (find the download URL from browser network tab)
curl -L -o click.mp3 "https://cdn.pixabay.com/audio/..."

# Mixkit
curl -L -o transition.wav "https://assets.mixkit.co/active_storage/sfx/..."
```

**Python helper** (for batch downloads):

```python
import json
import urllib.parse
import urllib.request

FREESOUND_TOKEN = "YOUR_TOKEN"

def search_sfx(query, max_results=5):
    # URL-encode the query so spaces and non-ASCII keywords are sent correctly
    params = urllib.parse.urlencode({
        "query": query,
        "fields": "id,name,previews,duration,license",
        "token": FREESOUND_TOKEN,
    })
    url = f"https://freesound.org/apiv2/search/text/?{params}"
    with urllib.request.urlopen(url) as r:
        return json.loads(r.read())["results"][:max_results]

def download_preview(sound, filename):
    # `sound` is one search result; its `previews` field carries the preview URLs
    url = sound["previews"]["preview-hq-mp3"]
    urllib.request.urlretrieve(url, filename)
```

### License check

Always check the license before publishing:

- **CC0**: Use freely, no attribution
- **CC-BY**: Use with attribution (add to video description)
- **CC-BY-NC**: Non-commercial only. Do NOT use for monetized videos
- **CC-BY-SA**: Derivatives must share same license

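The license rules above can be encoded as a lookup so that batch downloads are filtered before use. This sketch assumes the license arrives as a Creative Commons URL string (as in the `license` field requested in the search examples); `license_policy` is a hypothetical helper, and unknown licenses are treated conservatively.

```python
def license_policy(license_url: str) -> dict:
    """Map a Creative Commons license URL to usage rules for video projects."""
    url = license_url.lower()
    if "publicdomain/zero" in url:
        return {"name": "CC0", "attribution": False, "commercial": True, "share_alike": False}
    if "/by-nc" in url:  # also covers by-nc-sa and by-nc-nd
        return {"name": "CC-BY-NC", "attribution": True, "commercial": False, "share_alike": False}
    if "/by-sa" in url:
        return {"name": "CC-BY-SA", "attribution": True, "commercial": True, "share_alike": True}
    if "/licenses/by/" in url:
        return {"name": "CC-BY", "attribution": True, "commercial": True, "share_alike": False}
    # Unknown license: require a manual check before any commercial use
    return {"name": "unknown", "attribution": True, "commercial": False, "share_alike": False}
```

For a monetized video, keep only results where `license_policy(result["license"])["commercial"]` is true, and collect credits for every result that needs attribution.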
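---

## Path 2: Synthesize sound effects with the Web Audio API

No audio files are needed: the sounds are synthesized in real time in the browser, which suits code-driven video workflows (HyperFrames, Remotion, HTML5).

### Quick reference: common video sound-effect patterns
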
#### UI Click / Tap
```javascript
function playClick(audioCtx) {
  const osc = audioCtx.createOscillator();
  const gain = audioCtx.createGain();
  osc.connect(gain).connect(audioCtx.destination);
  osc.frequency.setValueAtTime(1200, audioCtx.currentTime);
  osc.frequency.exponentialRampToValueAtTime(600, audioCtx.currentTime + 0.05);
  gain.gain.setValueAtTime(0.3, audioCtx.currentTime);
  gain.gain.exponentialRampToValueAtTime(0.001, audioCtx.currentTime + 0.08);
  osc.start(); osc.stop(audioCtx.currentTime + 0.08);
}
```

#### Notification Pop / Ding
```javascript
function playPop(audioCtx) {
  const osc = audioCtx.createOscillator();
  const gain = audioCtx.createGain();
  osc.type = 'sine';
  osc.connect(gain).connect(audioCtx.destination);
  osc.frequency.setValueAtTime(880, audioCtx.currentTime);
  osc.frequency.exponentialRampToValueAtTime(1760, audioCtx.currentTime + 0.05);
  gain.gain.setValueAtTime(0.4, audioCtx.currentTime);
  gain.gain.exponentialRampToValueAtTime(0.001, audioCtx.currentTime + 0.2);
  osc.start(); osc.stop(audioCtx.currentTime + 0.2);
}
```

#### Whoosh / Transition Sweep
|
||||
```javascript
|
||||
function playWhoosh(audioCtx) {
|
||||
const bufferSize = audioCtx.sampleRate * 0.4;
|
||||
const buffer = audioCtx.createBuffer(1, bufferSize, audioCtx.sampleRate);
|
||||
const data = buffer.getChannelData(0);
|
||||
for (let i = 0; i < bufferSize; i++) {
|
||||
data[i] = (Math.random() * 2 - 1) * Math.pow(1 - i / bufferSize, 2);
|
||||
}
|
||||
const source = audioCtx.createBufferSource();
|
||||
source.buffer = buffer;
|
||||
const filter = audioCtx.createBiquadFilter();
|
||||
filter.type = 'bandpass'; filter.Q.value = 5;
|
||||
filter.frequency.setValueAtTime(200, audioCtx.currentTime);
|
||||
filter.frequency.exponentialRampToValueAtTime(4000, audioCtx.currentTime + 0.2);
|
||||
filter.frequency.exponentialRampToValueAtTime(200, audioCtx.currentTime + 0.4);
|
||||
const gain = audioCtx.createGain();
|
||||
gain.gain.setValueAtTime(0, audioCtx.currentTime);
|
||||
gain.gain.linearRampToValueAtTime(0.5, audioCtx.currentTime + 0.1);
|
||||
gain.gain.linearRampToValueAtTime(0, audioCtx.currentTime + 0.4);
|
||||
source.connect(filter).connect(gain).connect(audioCtx.destination);
|
||||
source.start();
|
||||
}
|
||||
```
|
||||
|
||||
#### Impact / Bass Hit
|
||||
```javascript
|
||||
function playImpact(audioCtx) {
|
||||
const osc = audioCtx.createOscillator();
|
||||
const gain = audioCtx.createGain();
|
||||
osc.type = 'sine';
|
||||
osc.connect(gain).connect(audioCtx.destination);
|
||||
osc.frequency.setValueAtTime(150, audioCtx.currentTime);
|
||||
osc.frequency.exponentialRampToValueAtTime(30, audioCtx.currentTime + 0.3);
|
||||
gain.gain.setValueAtTime(1, audioCtx.currentTime);
|
||||
gain.gain.exponentialRampToValueAtTime(0.001, audioCtx.currentTime + 0.4);
|
||||
osc.start(); osc.stop(audioCtx.currentTime + 0.4);
|
||||
// Add noise burst layer
|
||||
const bufferSize = audioCtx.sampleRate * 0.1;
|
||||
const buffer = audioCtx.createBuffer(1, bufferSize, audioCtx.sampleRate);
|
||||
const data = buffer.getChannelData(0);
|
||||
for (let i = 0; i < bufferSize; i++) data[i] = (Math.random() * 2 - 1);
|
||||
const noise = audioCtx.createBufferSource();
|
||||
noise.buffer = buffer;
|
||||
const noiseGain = audioCtx.createGain();
|
||||
noiseGain.gain.setValueAtTime(0.5, audioCtx.currentTime);
|
||||
noiseGain.gain.exponentialRampToValueAtTime(0.001, audioCtx.currentTime + 0.15);
|
||||
noise.connect(noiseGain).connect(audioCtx.destination);
|
||||
noise.start();
|
||||
}
|
||||
```
|
||||
|
||||
#### Typing / Keystroke
|
||||
```javascript
|
||||
function playKey(audioCtx) {
|
||||
const bufferSize = audioCtx.sampleRate * 0.03;
|
||||
const buffer = audioCtx.createBuffer(1, bufferSize, audioCtx.sampleRate);
|
||||
const data = buffer.getChannelData(0);
|
||||
for (let i = 0; i < bufferSize; i++) {
|
||||
data[i] = (Math.random() * 2 - 1) * Math.pow(1 - i / bufferSize, 4);
|
||||
}
|
||||
const source = audioCtx.createBufferSource();
|
||||
source.buffer = buffer;
|
||||
const filter = audioCtx.createBiquadFilter();
|
||||
filter.type = 'highpass'; filter.frequency.value = 2000;
|
||||
const gain = audioCtx.createGain();
|
||||
gain.gain.value = 0.15;
|
||||
source.connect(filter).connect(gain).connect(audioCtx.destination);
|
||||
source.start();
|
||||
}
|
||||
```
|
||||
|
||||
#### Rise / Tension Builder
|
||||
```javascript
|
||||
function playRise(audioCtx, duration = 1.5) {
|
||||
const osc = audioCtx.createOscillator();
|
||||
osc.type = 'sawtooth';
|
||||
const filter = audioCtx.createBiquadFilter();
|
||||
filter.type = 'lowpass'; filter.Q.value = 8;
|
||||
const gain = audioCtx.createGain();
|
||||
osc.connect(filter).connect(gain).connect(audioCtx.destination);
|
||||
filter.frequency.setValueAtTime(100, audioCtx.currentTime);
|
||||
filter.frequency.exponentialRampToValueAtTime(3000, audioCtx.currentTime + duration);
|
||||
gain.gain.setValueAtTime(0.01, audioCtx.currentTime);
|
||||
gain.gain.linearRampToValueAtTime(0.3, audioCtx.currentTime + duration);
|
||||
osc.start(); osc.stop(audioCtx.currentTime + duration);
|
||||
}
|
||||
```
|
||||
|
||||
#### Sparkle / Shimmer
|
||||
```javascript
|
||||
function playSparkle(audioCtx) {
|
||||
const notes = [261.63, 329.63, 392.00, 523.25, 659.25]; // C4 E4 G4 C5 E5
|
||||
notes.forEach((freq, i) => {
|
||||
const osc = audioCtx.createOscillator();
|
||||
osc.type = 'sine';
|
||||
const gain = audioCtx.createGain();
|
||||
osc.connect(gain).connect(audioCtx.destination);
|
||||
const start = audioCtx.currentTime + i * 0.06;
|
||||
osc.frequency.value = freq;
|
||||
gain.gain.setValueAtTime(0, start);
|
||||
gain.gain.linearRampToValueAtTime(0.2, start + 0.02);
|
||||
gain.gain.exponentialRampToValueAtTime(0.001, start + 0.4);
|
||||
osc.start(start); osc.stop(start + 0.4);
|
||||
});
|
||||
}
|
||||
```
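
The values in the `notes` array are twelve-tone equal temperament pitches, where 261.63 Hz is C4 (middle C). To transpose the arpeggio without looking up a frequency table, the pitches can be computed. This is a small sketch assuming A4 = 440 Hz tuning; `note_to_freq` is a hypothetical helper name.

```python
# Semitone offsets of the natural notes within one octave
NOTE_OFFSETS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def note_to_freq(note, octave, a4=440.0):
    """Equal-temperament frequency in Hz for a natural note such as ("C", 4)."""
    midi = 12 * (octave + 1) + NOTE_OFFSETS[note]  # MIDI number; A4 is 69
    return a4 * 2 ** ((midi - 69) / 12)


# The five notes used by playSparkle, then the same chord one octave higher
sparkle = [note_to_freq(n, o) for n, o in [("C", 4), ("E", 4), ("G", 4), ("C", 5), ("E", 5)]]
brighter = [f * 2 for f in sparkle]
```

Doubling every frequency raises the chord by one octave, which usually reads as a brighter sparkle.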
|
||||
|
||||
### Pattern cheat sheet
|
||||
|
||||
| Sound | Core Technique | Key Parameters |
|
||||
|-------|---------------|----------------|
|
||||
| Click/Tap | Short sine + fast decay | freq 800-1500Hz, dur < 100ms |
|
||||
| Pop/Bubble | Sine + freq ramp up | freq sweep up, short |
|
||||
| Whoosh | Filtered noise + bandpass sweep | bandpass sweep 200→4k→200 |
|
||||
| Impact | Low sine + noise burst | freq 150→30Hz, noise < 150ms |
|
||||
| Typing | Highpass filtered noise | HPF 2kHz+, dur < 50ms |
|
||||
| Rise/Tension | Sawtooth + filter sweep | LPF 100→3kHz over duration |
|
||||
| Sparkle | Arpeggiated sine cluster | C-E-G-C-E, staggered 60ms |
|
||||
| Boom/Rumble | Very low sine + slow decay | freq 40-80Hz, dur 0.5-2s |
|
||||
| Slide/Move | Sine with pitch bend | freq ramp up or down |
|
||||
| Error/Buzz | Square wave buzz | low freq square, short burst |
|
||||
|
||||
---
|
||||
|
||||
## Path 3: HyperFrames / Remotion integration
|
||||
|
||||
### HyperFrames audio integration
|
||||
|
||||
In HyperFrames, audio is commonly handled in one of two ways:
|
||||
|
||||
**Option A: Web Audio API (real-time synthesis)**
|
||||
|
||||
```javascript
|
||||
// In your HyperFrames component
|
||||
const audioCtx = new AudioContext();
|
||||
|
||||
function MyAnimation() {
|
||||
const handleClick = () => playClick(audioCtx);
|
||||
|
||||
return (
|
||||
<div onClick={handleClick}>
|
||||
<h1>Click Me</h1>
|
||||
</div>
|
||||
);
|
||||
}
|
||||
```
|
||||
|
||||
**Option B: Preloaded audio files**
|
||||
|
||||
```javascript
|
||||
// 1. Import audio file
|
||||
import whooshSfx from "./sfx/whoosh.mp3";
|
||||
|
||||
// 2. Play on animation event
|
||||
const audio = new Audio(whooshSfx);
|
||||
audio.play();
|
||||
```
|
||||
|
||||
**Option C: Remotion `<Audio>` component** (when using Remotion)
|
||||
|
||||
```jsx
import { Audio, staticFile } from "remotion";

export const MyComp = () => {
  return (
    <>
      {/* startFrom skips the first 30 frames of the clip; to delay playback instead, wrap <Audio> in <Sequence from={30}> */}
      <Audio src={staticFile("whoosh.mp3")} startFrom={30} volume={0.5} />
    </>
  );
};
```
|
||||
|
||||
### Audio-visual sync tips
|
||||
|
||||
1. **Trigger sounds at animation keyframes**, not at random times
|
||||
2. **Keep sounds short** (50-300ms for UI, 300-800ms for transitions)
|
||||
3. **Layer sounds** — a whoosh + impact combo feels better than either alone
|
||||
4. **Volume hierarchy**: BGM (0.1-0.2) < Transition SFX (0.2-0.3) < Key SFX (0.3-0.5) < Voiceover (1.0)
|
||||
5. **Always test with audio on** — silent playback hides timing issues
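
The first and fourth tips can be turned into a small cue sheet: keyframe times become frame numbers, and each sound gets a volume from the hierarchy. This is only a sketch; the `VOLUME` values are assumed midpoints of the ranges listed above, and `build_cues` is a hypothetical helper name.

```python
# Assumed midpoints of the volume ranges in tip 4
VOLUME = {"bgm": 0.15, "transition": 0.25, "key": 0.4, "voiceover": 1.0}


def build_cues(events, fps=30):
    """Turn (time_in_seconds, sound_name, role) tuples into frame-based cues."""
    return [
        {"frame": round(t * fps), "sound": name, "volume": VOLUME[role]}
        for t, name, role in sorted(events)  # sorted by time
    ]


cues = build_cues([(1.0, "whoosh", "transition"), (0.5, "click", "key")], fps=30)
for cue in cues:
    print(cue)
```

Keeping cues as data makes it easy to check that every sound lands on an animation keyframe before rendering.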
|
||||
|
||||
---
|
||||
|
||||
## Decision flow
|
||||
|
||||
```
|
||||
Need a sound effect?
|
||||
├── Is it a simple UI/click/pop? → Synthesize with Web Audio API (instant, no files)
|
||||
├── Is it a complex/natural sound (footsteps, crowd, rain)? → Download from Freesound/Pixabay
|
||||
├── Need precise control over timing? → Web Audio API synthesis
|
||||
├── Need realism? → Download real recordings
|
||||
└── Both? → Mix: download base layer + synthesize accents on top
|
||||
```
|
||||
168
skills/sound-fx-for-video/references/web-audio-synthesis.md
Normal file
@ -0,0 +1,168 @@
|
||||
# Web Audio API SFX Synthesis Reference
|
||||
|
||||
Detailed parameter tuning guide for custom sound effects.
|
||||
|
||||
## AudioContext Setup
|
||||
|
||||
```javascript
|
||||
// Always create one AudioContext and reuse it
|
||||
const audioCtx = new (window.AudioContext || window.webkitAudioContext)();
|
||||
|
||||
// Resume if suspended (browser autoplay policy)
|
||||
if (audioCtx.state === 'suspended') audioCtx.resume();
|
||||
```
|
||||
|
||||
## Oscillator Types & Character
|
||||
|
||||
| Type | Character | Best For |
|
||||
|------|-----------|----------|
|
||||
| `sine` | Clean, pure, smooth | Dings, pops, sparkles, bass |
|
||||
| `square` | Buzzing, retro, harsh | Errors, 8-bit, alerts |
|
||||
| `sawtooth` | Bright, rich, buzzy | Rises, tension, synth pads |
|
||||
| `triangle` | Soft, mellow, warm | Soft UI feedback, ambient |
|
||||
|
||||
## Envelope Shapes (Gain Automation)
|
||||
|
||||
### Short percussive (clicks, pops)
|
||||
```
|
||||
Gain: 0 → 0.3 (instant) → 0.001 (80ms, exponential)
|
||||
```
|
||||
|
||||
### Medium decay (notifications, reveals)
|
||||
```
|
||||
Gain: 0 → 0.4 (5ms) → 0.001 (200ms, exponential)
|
||||
```
|
||||
|
||||
### Long fade (whooshes, ambience)
|
||||
```
|
||||
Gain: 0 → 0.5 (100ms, linear) → 0 (400ms, linear)
|
||||
```
|
||||
|
||||
### Rise & fall (tension builders)
|
||||
```
|
||||
Gain: 0.01 → 0.3 (1.5s, linear) → stop immediately
|
||||
```
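
To see what these envelope shapes do between their endpoints, the two ramp curves can be evaluated by hand. The sketch below follows the interpolation formulas given in the Web Audio specification for `linearRampToValueAtTime` and `exponentialRampToValueAtTime`; the function names are hypothetical, and raising an error for a zero start value is a simplification of this sketch (the API itself only rejects a zero target).

```python
def linear_ramp(v0, v1, t0, t1, t):
    """Value at time t of a linear ramp from v0 (at t0) to v1 (at t1)."""
    if t <= t0:
        return v0
    if t >= t1:
        return v1
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def exp_ramp(v0, v1, t0, t1, t):
    """Value at time t of an exponential ramp from v0 (at t0) to v1 (at t1)."""
    if v0 == 0 or v1 == 0:
        # The curve is a ratio raised to a power, so zero has no valid ramp
        raise ValueError("exponential ramps need non-zero endpoints; use 0.001")
    if t <= t0:
        return v0
    if t >= t1:
        return v1
    return v0 * (v1 / v0) ** ((t - t0) / (t1 - t0))


# Short percussive envelope: 0.3 -> 0.001 over 80 ms, sampled at the midpoint
print(exp_ramp(0.3, 0.001, 0.0, 0.08, 0.04))
print(linear_ramp(0.3, 0.001, 0.0, 0.08, 0.04))
```

Halfway through, the exponential curve has already dropped far below the linear one, which is why it sounds like a natural decay rather than a fade.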
|
||||
|
||||
## Frequency Ranges & Perception
|
||||
|
||||
| Range | Perception | Use For |
|
||||
|-------|-----------|---------|
|
||||
| 20-60 Hz | Deep rumble | Impacts, bass drops |
|
||||
| 60-200 Hz | Body/thump | Booms, thuds |
|
||||
| 200-800 Hz | Warmth/mid | UI feedback, soft clicks |
|
||||
| 800-2000 Hz | Presence | Notification dings, alerts |
|
||||
| 2000-5000 Hz | Sharp/bright | Crisp clicks, typing, snaps |
|
||||
| 5000-12000 Hz | Air/sizzle | Shimmer, sparkle, sizzle |
|
||||
|
||||
## Noise Types
|
||||
|
||||
```javascript
|
||||
// White noise (equal energy across spectrum)
|
||||
for (let i = 0; i < len; i++) data[i] = Math.random() * 2 - 1;
|
||||
|
||||
// Pink noise (more bass, natural sounding)
|
||||
let b0=0,b1=0,b2=0,b3=0,b4=0,b5=0,b6=0;
|
||||
for (let i = 0; i < len; i++) {
|
||||
const w = Math.random() * 2 - 1;
|
||||
b0 = 0.99886*b0 + w*0.0555179;
|
||||
b1 = 0.99332*b1 + w*0.0750759;
|
||||
b2 = 0.96900*b2 + w*0.1538520;
|
||||
b3 = 0.86650*b3 + w*0.3104856;
|
||||
b4 = 0.55000*b4 + w*0.5329522;
|
||||
b5 = -0.7616*b5 - w*0.0168980;
|
||||
data[i] = (b0+b1+b2+b3+b4+b5+b6+w*0.5362) * 0.11;
|
||||
b6 = w * 0.115926;
|
||||
}
|
||||
|
||||
// Brown noise (deep rumble)
|
||||
let last = 0;
|
||||
for (let i = 0; i < len; i++) {
|
||||
const w = Math.random() * 2 - 1;
|
||||
data[i] = (last + 0.02 * w) / 1.02;
|
||||
last = data[i];
|
||||
data[i] *= 3.5;
|
||||
}
|
||||
```
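
The difference between the noise colors is easier to trust once it is measured. The following sketch ports the white and brown loops above to Python and compares how much each signal jumps from one sample to the next; `roughness` is a hypothetical metric chosen for illustration, not a standard audio measure.

```python
import random


def white_noise(n, rng):
    """Uniform white noise in [-1, 1]."""
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]


def brown_noise(n, rng):
    """Leaky integration of white noise, same constants as the loop above."""
    out, last = [], 0.0
    for _ in range(n):
        w = rng.uniform(-1.0, 1.0)
        last = (last + 0.02 * w) / 1.02
        out.append(last * 3.5)  # gain compensation
    return out


def roughness(samples):
    """Mean absolute step between neighbouring samples."""
    steps = [abs(b - a) for a, b in zip(samples, samples[1:])]
    return sum(steps) / len(steps)


rng = random.Random(0)  # seeded so runs are repeatable
print(roughness(white_noise(44100, rng)), roughness(brown_noise(44100, rng)))
```

Brown noise moves in much smaller steps, so most of its energy sits in the low end, which matches the "deep rumble" description.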
|
||||
|
||||
## Layering Patterns
|
||||
|
||||
### Impact + Debris
|
||||
- Layer 1: Low sine sweep (150→30Hz, 300ms)
|
||||
- Layer 2: Noise burst (100ms, highpass 1kHz)
|
||||
- Layer 3: Sub hit (40Hz sine, 500ms)
|
||||
|
||||
### Whoosh + Pass-by
|
||||
- Layer 1: Bandpass noise sweep (200→4k→200Hz)
|
||||
- Layer 2: Subtle pitch-shifted sine doppler effect
|
||||
|
||||
### Success / Achievement
|
||||
- Layer 1: Rising chime arpeggio (C-E-G-C)
|
||||
- Layer 2: Soft sparkle noise (highpass 8kHz, low volume)
|
||||
- Layer 3: Warm bass pad underneath (sine 130Hz, slow attack)
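
When layers are pre-rendered as sample arrays rather than played through separate gain nodes, they have to be summed by hand, and stacked layers can clip. This is a minimal sketch of that step, assuming every layer uses the same sample rate; `mix_layers` and the `headroom` default are illustrative choices, not part of this skill.

```python
def mix_layers(layers, headroom=0.9):
    """Sum sample lists of different lengths; scale down only if the peak would clip."""
    length = max((len(layer) for layer in layers), default=0)
    mixed = [sum(layer[i] for layer in layers if i < len(layer)) for i in range(length)]
    peak = max((abs(s) for s in mixed), default=0.0)
    if peak > headroom:
        scale = headroom / peak
        mixed = [s * scale for s in mixed]
    return mixed


# A long low layer plus a short loud burst on top
print(mix_layers([[0.5, 0.5, 0.5], [0.8]]))
```

Scaling the whole mix keeps the balance between layers intact, whereas hard clipping would distort the loudest moment.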
|
||||
|
||||
## Rendering Web Audio to File (for video export)
|
||||
|
||||
When HyperFrames or Remotion renders to MP4, audio played through a live browser `AudioContext` is not captured, so pre-render each sound to a file first:
|
||||
|
||||
```javascript
|
||||
// Offline render: generate WAV file from Web Audio API
|
||||
async function renderSoundToFile(renderFunction, duration, filename) {
|
||||
const sampleRate = 44100;
|
||||
const offlineCtx = new OfflineAudioContext(1, Math.ceil(sampleRate * duration), sampleRate);
|
||||
renderFunction(offlineCtx);
|
||||
const rendered = await offlineCtx.startRendering();
|
||||
const wav = audioBufferToWav(rendered);
|
||||
const blob = new Blob([wav], { type: 'audio/wav' });
|
||||
// Save or use in video
|
||||
return URL.createObjectURL(blob);
|
||||
}
|
||||
|
||||
function audioBufferToWav(buffer) {
|
||||
const numCh = 1; // only channel 0 is written below, so the header must declare mono
|
||||
const sampleRate = buffer.sampleRate;
|
||||
const format = 1; // PCM
|
||||
const bitDepth = 16;
|
||||
const bytesPerSample = bitDepth / 8;
|
||||
const blockAlign = numCh * bytesPerSample;
|
||||
const data = buffer.getChannelData(0);
|
||||
const dataLength = data.length * bytesPerSample;
|
||||
const headerLength = 44;
|
||||
const totalLength = headerLength + dataLength;
|
||||
const arrayBuffer = new ArrayBuffer(totalLength);
|
||||
const view = new DataView(arrayBuffer);
|
||||
// WAV header
|
||||
writeString(view, 0, 'RIFF');
|
||||
view.setUint32(4, totalLength - 8, true);
|
||||
writeString(view, 8, 'WAVE');
|
||||
writeString(view, 12, 'fmt ');
|
||||
view.setUint32(16, 16, true);
|
||||
view.setUint16(20, format, true);
|
||||
view.setUint16(22, numCh, true);
|
||||
view.setUint32(24, sampleRate, true);
|
||||
view.setUint32(28, sampleRate * blockAlign, true);
|
||||
view.setUint16(32, blockAlign, true);
|
||||
view.setUint16(34, bitDepth, true);
|
||||
writeString(view, 36, 'data');
|
||||
view.setUint32(40, dataLength, true);
|
||||
// PCM data
|
||||
let offset = 44;
|
||||
for (let i = 0; i < data.length; i++, offset += 2) {
|
||||
const s = Math.max(-1, Math.min(1, data[i]));
|
||||
view.setInt16(offset, s < 0 ? s * 0x8000 : s * 0x7FFF, true);
|
||||
}
|
||||
return arrayBuffer;
|
||||
}
|
||||
|
||||
function writeString(view, offset, string) {
|
||||
for (let i = 0; i < string.length; i++)
|
||||
view.setUint8(offset + i, string.charCodeAt(i));
|
||||
}
|
||||
```
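
If no browser is available in the render pipeline, a simple sound can also be pre-rendered with Python's standard library. The sketch below is an alternative to the `OfflineAudioContext` route, not a port of it: it approximates the click pattern (1200 to 600 Hz sine, 0.3 to 0.001 exponential decay) and writes a 16-bit mono WAV with the `wave` module. `render_click` and `write_wav` are hypothetical names.

```python
import math
import struct
import wave


def render_click(sample_rate=44100, duration=0.08):
    """Sine sweep 1200 -> 600 Hz over 50 ms with an exponential gain decay."""
    n = round(sample_rate * duration)
    samples, phase = [], 0.0
    for i in range(n):
        t = i / sample_rate
        freq = 1200.0 * (600.0 / 1200.0) ** min(t / 0.05, 1.0)
        gain = 0.3 * (0.001 / 0.3) ** (t / duration)
        phase += 2 * math.pi * freq / sample_rate  # accumulate phase for a smooth sweep
        samples.append(gain * math.sin(phase))
    return samples


def write_wav(path, samples, sample_rate=44100):
    """Write float samples in [-1, 1] as 16-bit mono PCM."""
    frames = b"".join(
        struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767)) for s in samples
    )
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)
        wf.setframerate(sample_rate)
        wf.writeframes(frames)
```

Calling `write_wav("click.wav", render_click())` produces a file that can be placed in the project's audio folder and referenced like any downloaded sound.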
|
||||
|
||||
## Common Tuning Tips
|
||||
|
||||
- **Too harsh?** Add lowpass filter (cutoff 3-5kHz)
|
||||
- **Too quiet?** Check gain staging; ramp exponentially to 0.001, not 0 (`exponentialRampToValueAtTime` throws a `RangeError` when the target value is 0)
|
||||
- **Timing off?** Use `audioCtx.currentTime + offset` for precise scheduling
|
||||
- **Want variation?** Add slight random delay (±30ms) and pitch shift (±5%) for repeated sounds
|
||||
- **Need stereo?** Use `StereoPannerNode` — `pan.value` from -1 (left) to 1 (right)
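
The variation tip can be expressed as a tiny helper that jitters the start time and pitch of each repeat. This is a sketch under the ranges stated above (±30 ms, ±5%); `vary` is a hypothetical name, and the same arithmetic carries over directly to JavaScript with `Math.random()`.

```python
import random


def vary(base_time, base_freq, rng, max_delay=0.03, max_detune=0.05):
    """Return (start_time, frequency) with a small random offset applied to each."""
    delay = rng.uniform(-max_delay, max_delay)
    detune = rng.uniform(-max_detune, max_detune)
    return max(0.0, base_time + delay), base_freq * (1.0 + detune)


rng = random.Random(42)  # seeded so a render is repeatable
keystrokes = [vary(0.2 * i, 1200.0, rng) for i in range(5)]
for start, freq in keystrokes:
    print(f"{start:.3f}s  {freq:.1f} Hz")
```

Seeding the generator keeps the variation identical between renders, which matters when a video is re-exported.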
|
||||
69
skills/sound-fx-for-video/scripts/freesound_search.py
Normal file
@ -0,0 +1,69 @@
|
||||
#!/usr/bin/env python3
"""Search and download sound effects from Freesound.org API."""

import urllib.request
import urllib.parse
import json
import sys
import os

FREESOUND_API_URL = "https://freesound.org/apiv2"

def search(query, token, max_results=10, duration_max=5.0, fields="id,name,previews,duration,license,username"):
    """Search Freesound for sound effects matching query."""
    params = urllib.parse.urlencode({
        "query": query,
        "fields": fields,
        "filter": f"duration:[0 TO {duration_max}]" if duration_max else "",
        "sort": "rating_desc",
        "page_size": max_results,
        "token": token,
    })
    url = f"{FREESOUND_API_URL}/search/text/?{params}"
    with urllib.request.urlopen(url) as resp:
        data = json.loads(resp.read())
    results = data.get("results", [])
    for i, r in enumerate(results):
        print(f" [{i+1}] {r['name']} ({r['duration']:.1f}s) by {r['username']} | License: {r['license']} | ID: {r['id']}")
    return results

def download_preview(sound_id, token, output_dir="."):
    """Download the preview MP3 for a sound (no OAuth needed)."""
    # First get the sound info to find the preview URL
    url = f"{FREESOUND_API_URL}/sounds/{sound_id}/?fields=previews,name&token={token}"
    with urllib.request.urlopen(url) as resp:
        data = json.loads(resp.read())
    preview_url = data["previews"]["preview-hq-mp3"]
    name = data["name"].replace(" ", "_").replace("/", "_")
    filename = os.path.join(output_dir, f"{name}.mp3")
    urllib.request.urlretrieve(preview_url, filename)
    print(f"Downloaded: {filename}")
    return filename

def main():
    if len(sys.argv) < 3:
        print("Usage: python freesound_search.py <command> <token> [args]")
        print("Commands:")
        print(" search <token> <query> [max_results] [max_duration]")
        print(" download <token> <sound_id> [output_dir]")
        sys.exit(1)

    command = sys.argv[1]
    token = sys.argv[2]

    if command == "search":
        query = sys.argv[3] if len(sys.argv) > 3 else "whoosh"
        max_results = int(sys.argv[4]) if len(sys.argv) > 4 else 10
        max_duration = float(sys.argv[5]) if len(sys.argv) > 5 else 5.0
        print(f"Searching: '{query}' (max {max_results} results, max {max_duration}s)")
        search(query, token, max_results, max_duration)
    elif command == "download":
        sound_id = sys.argv[3]
        output_dir = sys.argv[4] if len(sys.argv) > 4 else "."
        os.makedirs(output_dir, exist_ok=True)
        download_preview(sound_id, token, output_dir)
    else:
        print(f"Unknown command: {command}")

if __name__ == "__main__":
    main()
|
||||