How I automated the download of Youtube videos with a Python script and AI

By Hugo Lassiège | Oct 28, 2024 | 8 min read

I make videos on YouTube in addition to this blog. (They are in French, unfortunately, sorry ^^)

And to make these videos, I regularly need to download other videos, again from Youtube, to illustrate what I'm talking about.

And that's a pain...

YouTube doesn't allow downloads, so you have to use external applications.

It's not too hard to find a site that does it, but these sites regularly change their terms.

Sometimes you have to pay.

Sometimes you can get the video, but without the sound.

Sometimes, the number of videos per day is limited.

Sometimes the site fills up with ads overnight.

Sometimes it crashes.

I admit I've had enough. What if I coded it?

Solo attempt

I opened Youtube and had a quick look at the development console.

And then I realized that it didn't look good.

There's no video stream, like an mp4 that would be sent to the browser and played progressively.

It's more like multiple API calls, each returning a piece of the video. And from what I've read online, YouTube varies the quality of the video according to the available bandwidth (unless I force a particular quality in the settings).
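To make the idea concrete, here is a toy sketch of bandwidth-based quality selection. It is not YouTube's actual algorithm: the rendition names, the bitrates and the `pick_rendition` helper are all made up for illustration.

```python
# Illustrative only: a toy model of adaptive quality selection.
# The rendition names and bitrates are invented values, not YouTube's.
RENDITIONS_KBPS = {"360p": 1000, "720p": 3000, "1080p": 6000}


def pick_rendition(bandwidth_kbps, forced=None):
    """Return the forced rendition, or the best one that fits the bandwidth."""
    if forced is not None:
        if forced not in RENDITIONS_KBPS:
            raise ValueError(f"unknown rendition: {forced}")
        return forced
    affordable = [
        name for name, kbps in RENDITIONS_KBPS.items() if kbps <= bandwidth_kbps
    ]
    if not affordable:
        # Nothing fits: fall back to the lowest quality.
        return min(RENDITIONS_KBPS, key=RENDITIONS_KBPS.get)
    return max(affordable, key=RENDITIONS_KBPS.get)


# Each piece is requested separately, so the quality can change mid-video.
measured_bandwidth = [800, 3500, 7000, 2000]
print([pick_rendition(b) for b in measured_bandwidth])
```

Because every piece is fetched on its own, there is no single file URL to grab, which is what makes a hand-rolled downloader unattractive.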

This is exactly the kind of case where a developer knows they don't want to solve a problem that costs them a few minutes from time to time with a development effort that will cost days or weeks, with no guarantee of success.

It's frustrating for a developer because it seems so easy to do it yourself.

But in fact it's not, and obviously coding it from scratch would be a very bad idea.

True, at this point it's a painful problem, but not a vital one. Do I want to spend hours understanding the YouTube API and finding an elegant way to solve it?

Absolutely not.

5 years ago I'd probably have given up, out of sheer laziness.
But one thing has changed since then. What if I asked ChatGPT?

Coding with an AI is like using a search engine

In this case, I had no idea how to start this script. I don't know how YouTube works.

Is there an official API to do what I wanted to do?

Are there any external libraries for this?

A Google search is possible. But depending on how the query is phrased, and on who has managed to rank at the top of the results, I know I can waste a lot of time.

How many times have you been sure you had finally found the right blog post, only to realize that it was written 5 years ago and led to a dead end?

That's where AI comes in, because it's already done the job of indexing the web, so it has some leads.

So here we go:

The first solution proposed by ChatGPT is as follows:

import ssl
from pytube import YouTube

# Disable SSL verification (not recommended for production)
ssl._create_default_https_context = ssl._create_unverified_context

def download_youtube_video(video_url):
    yt = YouTube(video_url)
    stream = yt.streams.get_highest_resolution()
    stream.download()
    print(f"Video '{yt.title}' has been successfully downloaded.")

video_url = input("Please enter the full URL of the YouTube video: ")
download_youtube_video(video_url)

Interesting, so I'm discovering the pytube library.

I test the code, and... it crashes.

urllib.error.HTTPError: HTTP Error 400: Bad Request

An AI knows how to propose alternatives

Despite several iterations with ChatGPT, nothing worked.

At this point, ChatGPT, perhaps despairing of its student (me), took the initiative and said:

If you continue to have problems with pytube, you might consider using another library like youtube-dl or yt-dlp, which are also very popular and stable for downloading YouTube videos.

Ok... Let's give it a try.

And so I tried a second script with another library, again suggested by ChatGPT:

import yt_dlp

def download_youtube_video(video_url):
    ydl_opts = {
        'format': 'best',  # Selects the best available format
    }
    with yt_dlp.YoutubeDL(ydl_opts) as ydl:
        ydl.download([video_url])

# Request video URL from user
video_url = input("Please enter the full URL of the YouTube video: ")

try:
    print(f"Downloading '{video_url}'")
    download_youtube_video(video_url)
    print("Download completed successfully.")
except Exception as e:
    print(f"An error has occurred: {e}")

I checked the official yt-dlp documentation: this is literally on its first page. I could have done it myself.

If I had known about yt-dlp.

If I'd known it was better than other libraries that exist on the Internet.

If I'd been less lazy, maybe?

On the other hand, damn, it still doesn't work:


yt_dlp.utils.DownloadError: ERROR: [youtube] Yu0B903O_CA: Requested format is not available. Use --list-formats for a list of available formats

And now that I've gone beyond the first page of the official doc, the answer is probably buried somewhere in the rest of it. But I don't feel like reading it, now that I have someone who can do it for me.

Please, ChatGPT, tell me what to do 🙂

OK, ChatGPT figured out that I first had to list the available video formats. It was literally written in the previous error message. But I told you: I'm lazy.

So this time, it offers to list the formats, have me select the one I want and start the download.

The answer:

import yt_dlp

def list_formats(video_url):
    ydl_opts = {
        'listformats': True,  # Lists all available formats
    }
    with yt_dlp.YoutubeDL(ydl_opts) as ydl:
        ydl.extract_info(video_url, download=False)

def download_youtube_video(video_url, format_code):
    ydl_opts = {
        'format': format_code,  # Uses specified format
    }
    with yt_dlp.YoutubeDL(ydl_opts) as ydl:
        ydl.download([video_url])

# Request video URL from user
video_url = input("Please enter the full URL of the YouTube video: ")

# List available formats
list_formats(video_url)

# Request format code
format_code = input("Please enter the format code to be downloaded (e.g. 137+140): ")


try:
    print(f"Downloading video in format {format_code}...")
    download_youtube_video(video_url, format_code)
    print("Download completed successfully.")
except ValueError as e:
    print(e)

And this time... it works :)
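A side note on why a code such as 137+140 does the trick: in yt-dlp's format selection syntax, `+` asks for two formats to be merged, typically a video-only stream and an audio-only stream (the merge relies on ffmpeg being installed). In the metadata yt-dlp returns, each format entry has `vcodec` and `acodec` fields, and the string `none` marks a missing track, which is also how you end up with a video without sound. The sketch below classifies entries on that basis. The sample entries are hand-written for illustration, not real output, and the helper names are mine.

```python
def classify_format(fmt):
    """Classify a yt-dlp style format entry by the tracks it carries.

    Simplification: a missing codec field is treated like "none".
    """
    has_video = fmt.get("vcodec", "none") != "none"
    has_audio = fmt.get("acodec", "none") != "none"
    if has_video and has_audio:
        return "video+audio"
    if has_video:
        return "video only"
    if has_audio:
        return "audio only"
    return "unknown"


def build_selector(video_id, audio_id):
    """Build a selector asking yt-dlp to merge one video and one audio format."""
    return f"{video_id}+{audio_id}"


# Hand-written sample entries shaped like items of info["formats"].
sample_formats = [
    {"format_id": "137", "ext": "mp4", "vcodec": "avc1.640028", "acodec": "none"},
    {"format_id": "140", "ext": "m4a", "vcodec": "none", "acodec": "mp4a.40.2"},
    {"format_id": "18", "ext": "mp4", "vcodec": "avc1.42001E", "acodec": "mp4a.40.2"},
]

for fmt in sample_formats:
    print(fmt["format_id"], fmt["ext"], classify_format(fmt))

print(build_selector("137", "140"))
```

Picking a single "video only" entry reproduces the silent-video problem; pairing it with an "audio only" entry through `+` avoids it.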

I spent maybe 20 minutes in total, dev included, and solved a small, seemingly insignificant problem that was regularly wasting my time.

Without being a Python expert.

Without being a Youtube expert.

Isn't the code terrible?

Perhaps.

If I wanted to industrialize this solution and offer yet another alternative to the YouTube download sites on the internet, of course I'd check it against the official documentation. I'd read every line carefully. Maybe I'd add some tests. I'd also check whether there was a better approach with another lib.
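As a rough idea of where that more careful version might start, here is a minimal sketch, not a finished tool. It assumes yt-dlp is installed, plus ffmpeg for merging. The function names (`build_options`, `parse_args`, `download`) are hypothetical; `format`, `outtmpl` and `merge_output_format` are yt-dlp options, and `bestvideo+bestaudio/best` is a selector that falls back to the best single file when merging is not possible.

```python
import argparse
import os

DEFAULT_FORMAT = "bestvideo+bestaudio/best"


def build_options(format_selector=DEFAULT_FORMAT, output_dir="."):
    """Build the yt-dlp options dict; pure, so it can be unit-tested offline."""
    return {
        "format": format_selector,
        # Name the file after the video title, inside output_dir.
        "outtmpl": os.path.join(output_dir, "%(title)s.%(ext)s"),
        # Container to use when video and audio are merged (needs ffmpeg).
        "merge_output_format": "mp4",
    }


def parse_args(argv=None):
    """Read the URL and options from the command line instead of input()."""
    parser = argparse.ArgumentParser(description="Download a video with yt-dlp.")
    parser.add_argument("url", help="full URL of the video")
    parser.add_argument("--format", default=DEFAULT_FORMAT)
    parser.add_argument("--output-dir", default=".")
    return parser.parse_args(argv)


def download(video_url, options):
    """Download one video; return True on success, False on a download error."""
    import yt_dlp  # imported lazily so the helpers above work without it

    try:
        with yt_dlp.YoutubeDL(options) as ydl:
            ydl.download([video_url])
    except yt_dlp.utils.DownloadError as error:
        print(f"Download failed: {error}")
        return False
    return True
```

A script would then call `args = parse_args()` followed by `download(args.url, build_options(args.format, args.output_dir))`. Keeping the option building separate from the network call is what makes the "maybe I'd add some tests" part cheap.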

But for this particular case, I wouldn't do any of that.

Is it time for me to create my own version of the sites that offer downloads from Youtube?

Isn't this my chance to get rich with a new SaaS coded in 20 minutes?

Well, that's the limit of the exercise. In the creation of a SaaS, coding is maybe, what, 10% of the time? The 20 minutes I spent gave me just a technical foundation, but it lacks all the packaging needed to turn it into a viable product.

But that's not the point, let's get back to AI.

For someone with little experience, there are risks: the code produced is sometimes not great, sometimes plainly wrong, and an inexperienced person won't necessarily have the perspective to see that the proposed solution is too complex or not the best one available.

On the other hand, experienced developers have told me that they don't see the point: it slows them down, it's nothing more than statistical models anyway, and since public code bases are statistically full of bugs, those bugs inevitably show up in the generated code.

What's more, AIs sometimes hallucinate answers with the assurance of a young business school graduate making his first sales call for an IT services company.

Personally, I think it's a tool, and you have to learn how to use it.

I started developing at a time when IDEs were not yet widely used. With IDEs came IDE debuggers.

I started developing in C and had to manage memory allocations. I stopped doing that when I learned Java.

When it comes to IDEs, debuggers and automatic memory management, I've heard the same objections every time: it doesn't make things faster, it's a source of errors, it creates bugs, and it distorts reasoning.

And in the end, these tools have become part of our daily lives.

What I do know is that I'm more productive than before, faster than before.

The question that then naturally comes to mind is the opposite one: will AI replace me? Not yet. But AI has changed the way I work.

It's a great replacement for a simple Google search. It provides leads. That doesn't mean I don't have to dig deeper, but at least I know which direction to go.

I can be productive in a language I don't know well, and learn something in the process.

This doesn't mean that AI writes perfect code, not yet. But it does get me unstuck.

AI also means I can stop worrying about repetitive portions of code (boilerplate, form validation, etc.).

Once again, it's changed the way I work.

And now, I can more easily add B-roll to my videos.

And that, frankly, is priceless :)



Written by Hugo Lassiège

Software Engineer with more than 20 years of experience. I love to share about technologies and startups

Copyright © 2024 Eventuallymaking