I asked three AI models to fix the same broken code. One outperformed the rest

One of the most helpful uses of AI for coding is debugging. Before the AI era, I had to read the code line by line, step through it with a debugger, and add print statements to find bugs. If all else failed, there was Stack Overflow and other help forums, where people would tear your code apart. But now? You share the code snippet with an AI, give it some context, ask it to fix the code, and you’re done.

This gave me an idea to test how far these AI models can go in terms of creative thinking while fixing your code. So I decided to do a small experiment with ChatGPT, Gemini, and Claude to see which model could fix my code best.

Understanding the problem and crafting the prompt

Keeping it open-ended for more fun


Last time, we tested the same three models on a password checker program. For this AI battle, I decided to go with a simple file organizer script written in Python. Here’s the script I tested with:

import os

source_dir = "Downloads"
dest_dir = "Images"

def clean_my_downloads():
    all_files = os.listdir(source_dir)
    
    for f in all_files
        if ".jpg" in f or ".png" in f:
            old_path = source_dir + "/" + f
            new_path = dest_dir + "/" + f
            
            shutil.move(old_path, new_path)
            print("Moved file: " + f)

clean_my_downloads()

Alright, before putting the models to the test, let me explain the bugs and traps I’ve prepared here.

The syntax errors (Easy)
- Missing colon: There is no ‘:’ at the end of the for f in all_files line.
- Missing import: The code uses shutil.move(), but import shutil is missing at the top.
- All three models should easily catch and fix these. If one fails here, it’s an instant fail.
The logic crash (Medium)
- Missing folder: The code assumes the Images folder already exists. If it doesn’t, shutil.move() will crash.
- A good model will add os.makedirs(dest_dir, exist_ok=True) to ensure the folder is created if it’s missing.
The cross-platform trap (Medium-Hard)
- Sloppy paths: The code uses string concatenation (+ "/" +) to build file paths. This is bad practice and can cause issues on Windows vs. Mac.
- A better model will rewrite this using os.path.join(source_dir, f) or, even better, upgrade the code to use Python’s modern pathlib library.
The extension flaw (Hard)
- Naive matching: The code uses if ".jpg" in f. This is dangerous because a file named “my_photo.jpg.zip” will get moved, which isn’t an image! Also, it ignores capital letters like “.JPG”.
- A smart model will change this to f.lower().endswith(('.jpg', '.png')).
The data loss trap (The ultimate test)
- Silent overwrites: If there is already a file named screenshot.png in the Images folder, and the script moves a new screenshot.png from Downloads, shutil.move() will instantly overwrite the old one. The user loses their original image forever.
- This is what will crown our winner. The best AI will realize that the user asked it to make it bulletproof and will add a check to see whether the file already exists. The true winner will add logic to rename the duplicate (e.g., screenshot_(1).png) so no data is lost.
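
To make the extension flaw concrete, here is a minimal sketch comparing the original check with the stricter one described above. The function names is_image_naive and is_image are my own, made up for illustration, and the two extensions are the same ones the broken script looks for.

```python
# Same two formats the original script looks for.
IMAGE_EXTENSIONS = (".jpg", ".png")


def is_image_naive(filename):
    """The original check: true if the text appears anywhere in the name."""
    return ".jpg" in filename or ".png" in filename


def is_image(filename):
    """Stricter check: the name must end with an image extension, in any case."""
    return filename.lower().endswith(IMAGE_EXTENSIONS)


# Print both results side by side for a few sample names.
for name in ("photo.jpg", "HOLIDAY.JPG", "my_photo.jpg.zip"):
    print(name, is_image_naive(name), is_image(name))
```

The naive version accepts my_photo.jpg.zip and rejects HOLIDAY.JPG, while the stricter version gets both of them right.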

With that out of the way, let’s talk about the prompt. I kept it a little vague, with just a nudge toward the problem and the goal. That way, I can test each model’s creativity without spelling out what to fix. Here’s the prompt I went with:

I'm a beginner trying to write a Python script to clean up my Downloads folder by moving all my images into an 'Images' folder. The code won't even run, and I keep getting errors. Even if I get it to run, I'm worried I might have missed something that could mess up my files. Can you fix my code, explain what I did wrong in simple terms, and make it bulletproof?

Notice that I didn’t just tell it to fix my code. I also asked it to make it bulletproof. This keeps it open for the model to implement different techniques to make it even better.

The battle begins

Testing each model

For comparison, I’ll be using similarly powered models from all three AIs. Also, note that all of them are the best freely available models, not part of their paid versions.

ChatGPT

We start with ChatGPT. Since there was no way to choose a model myself, I had to rely on the one the system chose. It went with GPT 5.5. The fixed code it provided was great. It fixed the syntax errors and used os.makedirs() to avoid the logic crash.

It used os.path.join() for different OS compatibility. It also did something brilliant here. It used os.path.expanduser("~/Downloads"). In the original code, source_dir = "Downloads" would only work if the user ran the script from their home folder. ChatGPT realized a beginner would get confused by this and dynamically found the actual Downloads folder path on their hard drive. That’s a massive pro point.
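
The difference between the relative and the home-based path can be sketched as follows. This is not ChatGPT’s actual output, only a small illustration; find_downloads_dir is a hypothetical helper name, and it assumes the Downloads folder sits directly inside the user’s home folder, which may not hold on systems where it has been renamed or relocated.

```python
import os


def find_downloads_dir():
    """Return the path to the Downloads folder inside the user's home folder."""
    # expanduser replaces "~" with the current user's home directory.
    return os.path.join(os.path.expanduser("~"), "Downloads")


# A bare "Downloads" is resolved against wherever the script is launched from.
print(os.path.abspath("Downloads"))

# The home-based path points to the same place regardless of the launch folder.
print(find_downloads_dir())
```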

It also passed the extension flaw by upgrading it to f.lower().endswith(...) so it catches uppercase .JPGs and won’t accidentally move a file named “my_photo.jpg.zip”. It also proactively added .jpeg, .gif, and .webp.

As for the data loss trap, it caught it successfully. The one thing lacking was that, instead of renaming the file to prevent data loss, it skipped the duplicate entirely, which is not the intended solution. But overall, it did pretty well.

The explanation was also top-notch. It explained each line of code that had an issue, provided some beginner tips, and even suggested a future improved version. You can see the full answer here.

Gemini

Next up is Gemini. I used Google AI Studio and went for 3.1 Pro Preview. If ChatGPT set the bar high, Gemini just vaulted right over it. It went full modern Python engineer with its solution.

It fixed the syntax errors and the logic crash. For the cross-platform trap, Gemini took a different route. Instead of using Python’s os module, it used pathlib. It’s the modern, Pythonic way to handle files. It also dynamically found the home directory like ChatGPT did. It handled the extension flaw using the same library.

As for the data loss trap, Gemini outshone ChatGPT here as well. Gemini implemented a flawless while loop that checks if the name exists, and if it does, it dynamically injects a number into the filename (e.g., photo_1.jpg) before moving it. Your data is saved, and your folder is cleaned.
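
A rename loop of this kind can be sketched with pathlib as shown below. This is my own minimal version, not Gemini’s actual code; the function names unique_destination and move_without_overwrite are hypothetical, and the _1, _2 suffix style simply follows the photo_1.jpg example above.

```python
import shutil
from pathlib import Path


def unique_destination(dest_dir: Path, filename: str) -> Path:
    """Return a path inside dest_dir that does not clash with an existing file."""
    candidate = dest_dir / filename
    stem, suffix = candidate.stem, candidate.suffix
    counter = 1
    # Keep trying photo_1.jpg, photo_2.jpg, ... until the name is free.
    while candidate.exists():
        candidate = dest_dir / f"{stem}_{counter}{suffix}"
        counter += 1
    return candidate


def move_without_overwrite(src: Path, dest_dir: Path) -> Path:
    """Move src into dest_dir, renaming it if the name is already taken."""
    dest_dir.mkdir(parents=True, exist_ok=True)  # create the folder if missing
    target = unique_destination(dest_dir, src.name)
    shutil.move(str(src), str(target))
    return target
```

One caveat with this sketch: the name is checked first and the file is moved afterwards, so another program creating the same filename in between could still cause a clash. For a personal cleanup script that is usually an acceptable trade-off.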

Moreover, Gemini used try/except for error handling, which is good practice for file operations that can fail. Overall, these make Gemini’s score higher than ChatGPT’s.

As for the explanation section, I’d say Gemini falls slightly behind ChatGPT. It didn’t explain things in the full detail a beginner needs. You can see the full code and explanation here (requires logging in with your Google account).

Claude

Our last contestant is Claude. For the model, I went with Sonnet 4.6.

Just like the other two AIs, it got the syntax errors and the logic crash right. One extra thing Claude added was a check that the Downloads folder actually exists. If the user runs the script from the wrong folder, it prints an error gracefully instead of crashing.
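
A guard like that can be sketched in a few lines. This is an illustration rather than Claude’s actual code; list_source_files is a hypothetical helper name and the error message wording is my own.

```python
import os


def list_source_files(source_dir):
    """Return the entries in source_dir, or None if the folder is missing."""
    if not os.path.isdir(source_dir):
        # Explain the problem instead of letting os.listdir raise an error.
        print(f"Error: could not find a folder named '{source_dir}'.")
        return None
    return os.listdir(source_dir)
```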

Coming to the cross-platform trap, unlike ChatGPT and Gemini, Claude failed to change the relative path to an absolute path. It left source_dir = "Downloads". This means if a beginner saves this script to their Desktop and double-clicks it, it will fail because it’s looking for a Downloads folder on the Desktop. Both ChatGPT and Gemini foresaw this and dynamically mapped to the user’s home directory.

Claude also did well in the extension trap by using os.path.splitext()[1].lower(). It proactively added .heic (Apple’s default photo format).
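
Here is a minimal sketch of the splitext approach. The helper name has_image_extension and the exact set of extensions are my own assumptions for illustration; the article only confirms that .heic was among the formats Claude added.

```python
import os

# Illustrative set of extensions; adjust to the formats you care about.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".heic"}


def has_image_extension(filename):
    """Check only the final extension of the name, ignoring letter case."""
    # splitext("my_photo.jpg.zip") returns ("my_photo.jpg", ".zip")
    extension = os.path.splitext(filename)[1].lower()
    return extension in IMAGE_EXTENSIONS
```

Because only the last extension is compared, my_photo.jpg.zip is rejected while IMG_0001.HEIC is accepted.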

Like Gemini, Claude nailed the duplicate file trap. It wrote a while loop to auto-rename files (photo_1.jpg) so nothing is deleted and the folder actually gets cleaned. The last thing that Claude added was having a moved_count variable to track how many images were moved to give you a nice summary.

The whole code is well commented, pointing out where each fix was made. It also generated a summary table of all the fixes. But the missed opportunity to replace the relative Downloads path costs it some points. Find Claude’s full answer here.

The final verdict

Who takes the crown?

Unlike last time, this one was harder to choose. ChatGPT passed most of the tests but got one wrong. Gemini excelled in all of them and even went beyond in some of them. Claude fought well but overlooked one slight mistake. So, after all the judging, I’ll have to announce that Gemini is the winner of this competition. ChatGPT was a great explainer, and Claude had the best code documentation.

AI makes debugging easy

Though the real point here was to showcase each AI model’s thinking and code-fixing capabilities, it’s safe to say that all of them did a great job. So, no matter which one you go for, it’s much easier than debugging by yourself the old-school way.
