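I need to write a BP (business plan) this week, which reminded me of a Li Ziran (李自然) video I watched a few years ago (cover below; it left a deep impression). Back then he would even revise BPs for free 🥹 That perk is probably gone by now. So I decided to build a digital Li Ziran to teach me!
Here is "Digital Li Ziran v1" (『数字李自然v1』), built from 153 episodes (~69 hours) of his YouTube show:
Below I share how I built it.
Demo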
![notion image](https://www.notion.so/image/https%3A%2F%2Fprod-files-secure.s3.us-west-2.amazonaws.com%2F214b3c70-e1d8-41da-a501-4d9dddbd65c6%2F329a1dc0-c0a7-4992-a3a8-c42b15ea7b61%2FUntitled.png?table=block&id=463f8187-1259-4d97-b7b7-b9e72bb3d8c2&t=463f8187-1259-4d97-b7b7-b9e72bb3d8c2&width=2016&cache=v2)
![notion image](https://www.notion.so/image/https%3A%2F%2Fprod-files-secure.s3.us-west-2.amazonaws.com%2F214b3c70-e1d8-41da-a501-4d9dddbd65c6%2Fed264b24-2888-4318-8a59-1ff56f041049%2FUntitled.png?table=block&id=bbac6fbd-a1cf-4198-81a4-71b422163a3d&t=bbac6fbd-a1cf-4198-81a4-71b422163a3d&width=1820&cache=v2)
![notion image](https://www.notion.so/image/https%3A%2F%2Fprod-files-secure.s3.us-west-2.amazonaws.com%2F214b3c70-e1d8-41da-a501-4d9dddbd65c6%2Fa239b101-b870-411c-8a2f-0bd6cc49b783%2FUntitled.png?table=block&id=b727692f-0934-4b7f-8107-760059f16762&t=b727692f-0934-4b7f-8107-760059f16762&width=2034&cache=v2)
![notion image](https://www.notion.so/image/https%3A%2F%2Fprod-files-secure.s3.us-west-2.amazonaws.com%2F214b3c70-e1d8-41da-a501-4d9dddbd65c6%2F8abcff4e-10a9-48aa-bcf2-d2fb8965ebae%2FUntitled.png?table=block&id=d8976f07-d008-479e-ae23-3c4383387f4d&t=d8976f07-d008-479e-ae23-3c4383387f4d&width=2034&cache=v2)
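The main challenge is certainly not creating the GPTs (that part probably takes a minute) but building the knowledge base: none of Li Ziran's videos come with subtitle tracks, so I could not simply batch-download the subtitles from YouTube...
Which brings us to step 1.
Step 1 - Get the URLs of all videos from the YouTube page
I followed this tutorial.
That gave me the following download list: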
![notion image](https://www.notion.so/image/https%3A%2F%2Fprod-files-secure.s3.us-west-2.amazonaws.com%2F214b3c70-e1d8-41da-a501-4d9dddbd65c6%2F356034b3-45f5-49f5-bcca-65299ad8f637%2FUntitled.png?table=block&id=4d452e37-f4bf-4ebd-9574-6b8286f75d31&t=4d452e37-f4bf-4ebd-9574-6b8286f75d31&width=3120&cache=v2)
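The tutorial link is not preserved here, so the following is only one possible way to collect such a list, not necessarily what the tutorial describes. It is a minimal sketch that assumes the `yt-dlp` command-line tool is installed and on `PATH`; the function names are made up for illustration, and the channel URL in the usage note is a placeholder.

```python
import subprocess


def build_list_command(channel_url: str) -> list[str]:
    """Build a yt-dlp command that prints one video id per line without downloading."""
    return ["yt-dlp", "--flat-playlist", "--print", "id", channel_url]


def ids_to_urls(output: str) -> list[str]:
    """Turn id-per-line output into watch URLs, skipping blank lines."""
    urls = []
    for line in output.splitlines():
        video_id = line.strip()
        if video_id:
            urls.append(f"https://www.youtube.com/watch?v={video_id}")
    return urls


def list_channel_urls(channel_url: str) -> list[str]:
    """Run yt-dlp (assumed to be installed) and return the channel's video URLs."""
    result = subprocess.run(
        build_list_command(channel_url),
        capture_output=True,
        text=True,
        check=True,
    )
    return ids_to_urls(result.stdout)
```

Calling `list_channel_urls("https://www.youtube.com/@CHANNEL_HANDLE/videos")` with the real channel page in place of the placeholder, then writing the result to a text file with one URL per line, would give a list like the one in the screenshot.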
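Step 2 - Batch-download the video files
I was too lazy to write the script myself, so I just asked ChatGPT. Running it produced the following video files: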
![notion image](https://www.notion.so/image/https%3A%2F%2Fprod-files-secure.s3.us-west-2.amazonaws.com%2F214b3c70-e1d8-41da-a501-4d9dddbd65c6%2F3e0844ad-2f58-4c6c-8138-131caecf069f%2FUntitled.png?table=block&id=b75983c0-ba13-43a8-ab27-f6934b0816c9&t=b75983c0-ba13-43a8-ab27-f6934b0816c9&width=2154&cache=v2)
![notion image](https://www.notion.so/image/https%3A%2F%2Fprod-files-secure.s3.us-west-2.amazonaws.com%2F214b3c70-e1d8-41da-a501-4d9dddbd65c6%2F8c22a9e1-f89e-4674-b2f4-5cab765f498f%2FUntitled.png?table=block&id=467af942-0d4b-491b-a2c7-697bd0ded3f2&t=467af942-0d4b-491b-a2c7-697bd0ded3f2&width=3182&cache=v2)
![notion image](https://www.notion.so/image/https%3A%2F%2Fprod-files-secure.s3.us-west-2.amazonaws.com%2F214b3c70-e1d8-41da-a501-4d9dddbd65c6%2F55989ca8-b247-4f4f-a6bb-2e880989b2b4%2FUntitled.png?table=block&id=c4d7e8ef-486e-45ea-afeb-253b51ef822b&t=c4d7e8ef-486e-45ea-afeb-253b51ef822b&width=1758&cache=v2)
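The ChatGPT-generated script itself is not shown, so here is a hedged sketch of what a batch download could look like. It assumes `yt-dlp` (plus `ffmpeg` for audio extraction) is installed and that the URL list from step 1 is saved as a text file with one URL per line. The function name, file names, and output template are illustrative choices, and the `audio_only` option is a deviation from the post, which downloaded full video files.

```python
import os


def build_download_command(url_file: str, out_dir: str, audio_only: bool = True) -> list[str]:
    """Build a yt-dlp command that downloads every URL listed in url_file."""
    cmd = [
        "yt-dlp",
        "--batch-file", url_file,  # text file with one URL per line
        "--ignore-errors",         # keep going if a single video fails
        "-o", os.path.join(out_dir, "%(upload_date)s_%(id)s.%(ext)s"),
    ]
    if audio_only:
        # Assumption: only the audio track is needed for transcription.
        cmd += ["-x", "--audio-format", "mp3"]
    return cmd
```

To actually run it, pass the list to `subprocess.run(build_download_command("urls.txt", "downloads"), check=True)`. Prefixing file names with the upload date keeps them sortable in chronological order, which helps if the files are merged later.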
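Step 3 - Merge into a single file
Processing the files one by one would be too tedious, so I went for brute force this time: I dragged all the files into 剪映 (Jianying/CapCut) and exported a single 68.5-hour (4 GB) audio file.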
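For anyone who prefers the command line to a video editor, the merge could also be scripted. This is a swapped-in alternative, not what was done here: a minimal sketch using ffmpeg's concat demuxer, assuming `ffmpeg` is installed and all inputs share the same codec and parameters (a requirement for `-c copy`). The function names and file names are hypothetical.

```python
def make_concat_list(paths: list[str]) -> str:
    """Render the list file expected by ffmpeg's concat demuxer."""
    lines = []
    for path in paths:
        # Inside single quotes, a literal quote is written as '\''
        escaped = path.replace("'", "'\\''")
        lines.append(f"file '{escaped}'\n")
    return "".join(lines)


def build_concat_command(list_path: str, output_path: str) -> list[str]:
    """Build an ffmpeg command that joins the listed files without re-encoding."""
    return [
        "ffmpeg",
        "-f", "concat",
        "-safe", "0",      # allow arbitrary paths in the list file
        "-i", list_path,
        "-c", "copy",      # stream copy; inputs must share codec settings
        output_path,
    ]
```

Write the output of `make_concat_list(sorted(files))` to a file such as `list.txt`, then run `build_concat_command("list.txt", "all.mp3")` through `subprocess.run`. If the inputs differ in codec or sample rate, re-encoding instead of `-c copy` would be needed.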
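Step 4 - High-speed transcription
I happened to come across this tweet - the high-speed transcription tool Insanely Fast Whisper
For quick validation this is probably the best choice; otherwise, existing tools simply cannot transcribe 60+ hours in one go.
Running on my 3090 with default settings, transcribing the 68 hours of audio took a little over an hour. Impressive! (Of course, the fans also roared for that hour...) The output is one huge JSON file (~15 MB).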
![notion image](https://www.notion.so/image/https%3A%2F%2Fprod-files-secure.s3.us-west-2.amazonaws.com%2F214b3c70-e1d8-41da-a501-4d9dddbd65c6%2F651e6666-afcb-4e27-a365-e9bbbdbd3b8a%2FUntitled.png?table=block&id=2a8eff50-d6ee-4061-ad1a-bd62eeba485a&t=2a8eff50-d6ee-4061-ad1a-bd62eeba485a&width=1710&cache=v2)
![notion image](https://www.notion.so/image/https%3A%2F%2Fprod-files-secure.s3.us-west-2.amazonaws.com%2F214b3c70-e1d8-41da-a501-4d9dddbd65c6%2F785f0c1b-3719-44e2-91c5-15abceb7e4c7%2FUntitled.png?table=block&id=3eeef1a4-fd01-48ce-a323-9ccd83e73d29&t=3eeef1a4-fd01-48ce-a323-9ccd83e73d29&width=3022&cache=v2)
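Before uploading, the JSON has to be turned into plain text. The sketch below assumes the output has a top-level `"chunks"` list whose items carry a `"text"` field, plus a top-level `"text"` field as a fallback. That schema is an assumption based on the tool's usual output and should be checked against the actual file. The function names are hypothetical.

```python
import json


def transcript_to_text(data: dict) -> str:
    """Join chunk texts one per line; fall back to the top-level text field."""
    chunks = data.get("chunks") or []
    lines = [(chunk.get("text") or "").strip() for chunk in chunks]
    lines = [line for line in lines if line]
    if lines:
        return "\n".join(lines)
    return (data.get("text") or "").strip()


def convert_file(json_path: str, txt_path: str) -> None:
    """Read the transcription JSON and write a plain-text file."""
    with open(json_path, encoding="utf-8") as f:
        data = json.load(f)
    with open(txt_path, "w", encoding="utf-8") as f:
        f.write(transcript_to_text(data))
```

Something like `convert_file("output.json", "liziran.txt")` would then produce a text file; both file names are placeholders. Dropping the timestamps keeps the file smaller, at the cost of no longer being able to trace a sentence back to its position in the audio.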
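Finally, upload the text to the GPTs and it serves as the knowledge base! Of course, this is just a baby project that took a few hours. In particular, Li Ziran has said that much of the show is improvised on the spot, so the transcript contains a lot of filler talk. Compressing and tidying up the text later should improve the results considerably.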
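As a starting point for that cleanup, here is a deliberately crude sketch: it drops blank lines and collapses lines that repeat back to back, which is one common kind of noise in speech transcripts. It is not something applied in this project, and real filler-word removal would need more than this. The function name is hypothetical.

```python
def collapse_repeats(text: str) -> str:
    """Drop blank lines and collapse consecutive duplicate lines into one."""
    kept = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip empty lines
        if kept and kept[-1] == line:
            continue  # skip an immediate repeat of the previous line
        kept.append(line)
    return "\n".join(kept)
```

Only back-to-back repeats are removed, so a phrase that legitimately recurs later in the transcript is kept.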
Thanks for reading! Feel free to try "Digital Li Ziran v1":
- Author:Yucheng L
- URL:https://lyc.fyi/article/liziran-gpts-bot
- Copyright:All articles in this blog, except for special statements, adopt BY-NC-SA agreement. Please indicate the source!