Generating Transcripts and Subtitles from Any YouTube Video with Python

Getting a transcript is easy when a YouTube video already has subtitles, but what if you need a transcript for a video where subtitles are missing or not enabled?

In this article, we will use Python to generate subtitles for YouTube videos that do not have any.

Python libraries required:

1. For multithreading: _thread

2. For converting video to text: moviepy, speech_recognition

3. For the GUI: pysimplegui

youtube video with no subtitles available

Let's follow the steps below:

Step 1: Download the YouTube video you want to transcribe and store it in a folder. (Just add "ss" in front of "youtube" in the video link, e.g. change the link “” to”, and press Enter in the URL bar. You can then download the video as an MP4 file by clicking “Save Link As”.)

Step 2: Create a new folder in which all your generated audio files will be saved.

Step 3: Create a new folder in which all your generated text files will be saved.

Step 4: Use the given GUI. You can use the code over here. Before running the code, make sure that the libraries listed in the prerequisites are installed by following these instructions. Once the libraries are installed, run the code with:

python gui2.py

Here we first use multithreading to convert the video file into audio files (.wav), and then use multithreading again to convert the audio files into text files. Multithreading here means that we start a thread, let it run, and immediately start the next one without waiting for the first to finish. So the pieces that were previously converted one after another are now all converted at the same time.

This gives us better time efficiency.
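
As a minimal sketch of this start-everything-then-wait pattern, assuming the higher-level threading module instead of _thread (run_in_threads and worker are hypothetical names used for illustration, not taken from the original code):

```python
import threading


def run_in_threads(worker, items):
    """Start one thread per item, then wait for every thread to finish."""
    results = {}
    lock = threading.Lock()

    def task(item):
        value = worker(item)
        with lock:  # guard the shared dict while storing the result
            results[item] = value

    threads = [threading.Thread(target=task, args=(item,)) for item in items]
    for thread in threads:
        thread.start()  # returns immediately, so the next thread starts right away
    for thread in threads:
        thread.join()  # block until this thread is done
    return results
```

For example, run_in_threads(len, ["a", "bb"]) returns {"a": 1, "bb": 2}. Threads are likely to pay off in this project because most of the waiting happens outside the Python interpreter, in audio encoding and in network requests to the speech service.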

multithreading code for the conversion of video to audio files.
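
A rough sketch of what such a step could look like, assuming moviepy 1.x (where VideoFileClip is imported from moviepy.editor and clips have a subclip method) and the threading module. The function names, the bounds argument (a list of start and end times in seconds), and the chunk_000.wav naming scheme are all hypothetical:

```python
import os
import threading


def extract_chunk(video_path, start, end, out_path):
    """Write the audio between start and end (in seconds) to a .wav file."""
    # Imported lazily; assumes moviepy 1.x is installed.
    from moviepy.editor import VideoFileClip

    clip = VideoFileClip(video_path)  # each thread opens its own clip
    try:
        clip.audio.subclip(start, end).write_audiofile(out_path)
    finally:
        clip.close()


def video_to_audio(video_path, audio_dir, bounds, extract=extract_chunk):
    """Start one thread per (start, end) interval and wait for all of them."""
    threads = []
    out_paths = []
    for index, (start, end) in enumerate(bounds):
        # Hypothetical naming scheme: chunk_000.wav, chunk_001.wav, ...
        out_path = os.path.join(audio_dir, f"chunk_{index:03d}.wav")
        thread = threading.Thread(
            target=extract, args=(video_path, start, end, out_path)
        )
        thread.start()
        threads.append(thread)
        out_paths.append(out_path)
    for thread in threads:
        thread.join()
    return out_paths
```

The extract parameter is only there so the threading logic can be tried out with a stand-in function before touching a real video. For a very long video, one thread per interval may start more encoders than the machine handles comfortably, so capping the number of threads could be worth considering.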

multithreading code to convert audio files to text files
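
A sketch of how this step might be written, assuming the speech_recognition library with its recognize_google method (which sends the audio to Google's web speech service and therefore needs an internet connection). The function names and the one-.txt-per-.wav layout are assumptions made for illustration:

```python
import os
import threading


def transcribe_file(wav_path):
    """Return the recognized text for one .wav file ("" if nothing is understood)."""
    # Imported lazily; assumes the SpeechRecognition package is installed.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile(wav_path) as source:
        audio = recognizer.record(source)  # read the whole file
    try:
        return recognizer.recognize_google(audio)
    except sr.UnknownValueError:
        return ""  # the speech could not be understood


def audio_to_text(audio_dir, text_dir, transcribe=transcribe_file):
    """Transcribe every .wav file in audio_dir into a matching .txt file."""
    names = sorted(n for n in os.listdir(audio_dir) if n.endswith(".wav"))

    def task(name):
        text = transcribe(os.path.join(audio_dir, name))
        txt_name = os.path.splitext(name)[0] + ".txt"
        with open(os.path.join(text_dir, txt_name), "w", encoding="utf-8") as fh:
            fh.write(text)

    threads = [threading.Thread(target=task, args=(name,)) for name in names]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return len(names)
```

Network failures (sr.RequestError) are deliberately not caught in this sketch, so a failed request shows up as a traceback in the command line instead of silently producing an empty file.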

All these methods are called from the GUI function.

Let's have a look at the GUI of this functionality.

give path to video

First, provide the path of the video file. Then click next.

select the folder to store audio files

Step 5: The window shown above opens next. Select an empty folder to store the audio files, then click the convert from video to audio button and wait until it finishes. This should not take long; watch the command line to track progress. After it finishes, click the view result button to see all the files in the folder.

The whole video is divided into intervals of 60 seconds each, so one audio file is generated per interval and named accordingly, as seen in the output.
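
The interval arithmetic can be sketched as follows. chunk_bounds is a hypothetical helper, and letting the last interval be shorter than 60 seconds is an assumption about how the leftover part is handled:

```python
def chunk_bounds(duration, chunk_seconds=60):
    """Split a duration in seconds into consecutive (start, end) intervals."""
    bounds = []
    start = 0
    while start < duration:
        end = min(start + chunk_seconds, duration)  # last chunk may be shorter
        bounds.append((start, end))
        start = end
    return bounds
```

For a 150-second video, chunk_bounds(150) returns [(0, 60), (60, 120), (120, 150)], so three audio files would be generated.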

Step 6: Next, click on the convert from audio to text button at the bottom.

This will lead us to the following window:

give folder path to store .txt files

In this window, give the path of an empty folder to store the text files.

Then, click on the GO button.

This starts the conversion from audio to text files. Wait until it finishes; the final output will look like this.

final output

The final output will be stored in the final.txt file
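
One plausible way to assemble final.txt, assuming there is one text file per audio interval and that the file names sort in playback order (merge_text_files is a hypothetical helper, not the original code):

```python
import os


def merge_text_files(text_dir, final_name="final.txt"):
    """Join all per-interval .txt files in text_dir into one final file."""
    names = sorted(
        n for n in os.listdir(text_dir)
        if n.endswith(".txt") and n != final_name  # skip an earlier final file
    )
    parts = []
    for name in names:
        with open(os.path.join(text_dir, name), encoding="utf-8") as fh:
            parts.append(fh.read().strip())
    final_path = os.path.join(text_dir, final_name)
    with open(final_path, "w", encoding="utf-8") as fh:
        # Empty transcripts are dropped so they do not leave double spaces.
        fh.write(" ".join(part for part in parts if part))
    return final_path
```

Sorting by name only gives the right order if the interval numbers are zero-padded (for example 001, 002, 010); otherwise the names would need to be sorted by their numeric part.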

final.txt file output

Since this relies on Google's speech recognition, 100% word accuracy is not guaranteed.

Also, the output has no punctuation. These problems can be addressed further with NLP (natural language processing) techniques and by building an ML model.

For any further queries or anything related to Python development, coding, blogging, or tech documentation, you can DM me on LinkedIn or Instagram at id=acanubhav94.

Special credits to my team members: Siddhid and Anshika.