    How to remove a file from git history

    A normal git rm followed by a commit removes a file from future commits, but every prior commit still contains it. For a leaked credential, an accidentally committed .env file, or a 500 MB binary that ballooned the repo, that's not enough — you have to rewrite every commit that ever touched the file.

    This is one of the few times in git where rewriting history is the correct answer. The trade-off is that every collaborator has to re-clone or rebase, because the SHA of every commit from the first one that touched the file onward changes.

    Before you do anything else: if you're scrubbing a secret (API key, password, private key), assume it's already been seen. Rotate the secret first, then clean history. History rewrites only buy you defense in depth.


    Pick the right tool

    There are three options. Use them in this order of preference:

    1. git filter-repo: a separate Python tool (not bundled with git) that git's own documentation recommends in place of filter-branch. Fast, scriptable, modern.
    2. BFG Repo-Cleaner: a Java tool focused on the common cases (delete files, replace strings). Faster than filter-branch, simpler than filter-repo for everyday tasks.
    3. git filter-branch: built into git, but slow, error-prone, and officially discouraged. Use only if you can't install anything else.

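    Both tools are meant to run on a fresh clone. filter-repo checks for this and refuses to run on anything that doesn't look like a fresh clone unless you pass --force. BFG doesn't enforce it, but its documentation recommends working on a git clone --mirror copy. Either way, do the rewrite in a throwaway clone, not in your day-to-day checkout.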


    Install git filter-repo

    # Debian / Ubuntu
    sudo apt install git-filter-repo
    
    # macOS
    brew install git-filter-repo
    
    # pip (any platform)
    pip install git-filter-repo
    

    Check it works:

    git filter-repo --version
    

    Remove a single file from all history

    Mirror-clone the repo so you have all refs and don't disturb your working clone:

    git clone --mirror git@github.com:org/repo.git repo.git
    cd repo.git
    git filter-repo --invert-paths --path path/to/secret.env
    

    --invert-paths means "keep everything except this path". Without it, you'd keep only that path.
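
    Before pushing, it can be worth confirming that no ref still reaches a commit touching the path. The sketch below uses only plain git; path_in_history is a hypothetical helper name, not a git command, and it assumes it is pointed at the clone you just rewrote.

```shell
# Sketch: succeed if any commit reachable from any ref touches the given path.
# path_in_history is a hypothetical helper, not part of git or filter-repo.
path_in_history() {
    # Usage: path_in_history <repo-dir> <path>
    # --all walks every ref; --full-history disables the merge simplification
    # that could otherwise hide a commit on a side branch.
    [ -n "$(git -C "$1" log --all --full-history --format=%H -- "$2")" ]
}

# Example, run after the rewrite:
#   if path_in_history repo.git path/to/secret.env; then
#       echo "path is still in history" >&2
#   fi
```

    An empty result only covers refs present in this clone; copies held by other clones, forks, or the hosting platform are not checked.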

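    When done, push the rewritten history. By default, filter-repo removes the origin remote after a rewrite so you can't push to it by accident. Check with git remote -v and re-add it if it is gone:

    git remote add origin git@github.com:org/repo.git
    git push --force --mirror origin
    

    (--mirror pushes every ref in the clone: branches, tags, and anything else under refs/. Some hosts reject updates to refs they manage themselves, such as GitHub's refs/pull/*; those rejections don't affect your branches and tags.)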


    Remove a directory

    git filter-repo --invert-paths --path config/secrets/
    

    Trailing slash is optional. Use --path-glob for wildcards:

    git filter-repo --invert-paths --path-glob '*.pem'
    git filter-repo --invert-paths --path-glob 'dist/*.zip'
    

    Replace specific strings (e.g. leaked tokens) without removing files

    If a secret is embedded inside a file that should remain in history, scrub the string instead of deleting the file:

    echo 'AKIA1234567890ABCDEF==>REDACTED' > replacements.txt
    git filter-repo --replace-text replacements.txt
    

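    Each line is pattern==>replacement. A pattern is a literal string by default; prefix it with regex: or glob: to change that. Regex patterns are Python regular expressions, so ^ and $ only match at line boundaries with the multiline flag, for example regex:(?m)^password=.*$. If you leave out ==>replacement, the match is replaced with ***REMOVED***.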
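
    To check that a replacement took effect, you can ask git whether any commit still adds or removes the literal string. This is a minimal sketch using git's pickaxe search; string_in_history is a hypothetical helper name, and it only inspects file contents, not commit messages.

```shell
# Sketch: succeed if any commit on any ref changes the number of occurrences
# of a literal string. string_in_history is a hypothetical helper name.
string_in_history() {
    # Usage: string_in_history <repo-dir> <literal-string>
    # -S<string> is git's "pickaxe": it lists commits whose diff adds or
    # removes occurrences of the string.
    [ -n "$(git -C "$1" log --all --full-history --format=%H -S"$2")" ]
}

# Example, run after the rewrite:
#   string_in_history repo.git AKIA1234567890ABCDEF && echo "token still present" >&2
```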


    Remove the largest blobs (size cleanup)

    If you don't know which file ballooned the repo, ask filter-repo:

    git filter-repo --analyze
    # writes reports to .git/filter-repo/analysis/ (filter-repo/analysis/ inside a bare or mirror clone)
    

    Then remove the offenders by name:

    git filter-repo --invert-paths --path huge-binary.bin
    

    Or by size cap (keep only blobs ≤ 10 MB):

    git filter-repo --strip-blobs-bigger-than 10M
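
    If filter-repo isn't installed yet, plain git can produce a rough version of the same size ranking. The sketch below lists the largest blobs reachable from any ref; largest_blobs is a hypothetical helper name, and the sizes are uncompressed blob sizes in bytes, not on-disk pack sizes.

```shell
# Sketch: print "<size-in-bytes> <path>" for the largest blobs in history.
# largest_blobs is a hypothetical helper, not a git command.
largest_blobs() {
    # Usage: largest_blobs <repo-dir> [count]
    # rev-list --objects prints "<id> <path>" for every reachable object;
    # cat-file --batch-check adds type and size, and %(rest) carries the path.
    git -C "$1" rev-list --objects --all |
        git -C "$1" cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
        sed -n 's/^blob //p' |
        sort -rn |
        head -n "${2:-10}"
}

# Example:
#   largest_blobs repo.git 20
```

    The same file committed under several versions appears once per version, which is usually what you want when hunting for the cause of a bloated repository.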
    

    Using BFG instead

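    BFG is sometimes simpler for the most common cases. Two things differ from filter-repo: --delete-files and --delete-folders match by name anywhere in the tree rather than by full path, and by default BFG leaves the contents of your latest commit (HEAD) untouched, so delete the file in a normal commit first.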

    git clone --mirror git@github.com:org/repo.git repo.git
    java -jar bfg.jar --delete-files secret.env repo.git
    java -jar bfg.jar --delete-folders config/secrets repo.git
    java -jar bfg.jar --replace-text replacements.txt repo.git
    cd repo.git
    git reflog expire --expire=now --all
    git gc --prune=now --aggressive
    git push --force
    

    The reflog expire + gc step is what actually deletes the now-unreachable objects. filter-repo does this for you automatically; BFG does not.
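
    The effect of that step can be observed with plain git: an object that no ref points to is still on disk until the reflog is expired and gc prunes it. A minimal sketch, with object_exists and purge_unreachable as hypothetical helper names:

```shell
# Sketch: check for an object, and purge unreachable objects from a repository.
# Both function names are hypothetical helpers, not git commands.
object_exists() {
    # Usage: object_exists <repo-dir> <object-id>
    # cat-file -e exits 0 only if the object is present in the object store.
    git -C "$1" cat-file -e "$2" 2>/dev/null
}

purge_unreachable() {
    # Usage: purge_unreachable <repo-dir>
    # Drop all reflog entries, then delete every object nothing refers to.
    git -C "$1" reflog expire --expire=now --all &&
        git -C "$1" gc --quiet --prune=now
}

# Example: note a blob id before the rewrite, then after the rewrite:
#   purge_unreachable repo.git
#   object_exists repo.git "$blob_id" || echo "blob is gone locally"
```

    This only says something about the local object store. The remote runs its own garbage collection on its own schedule.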


    Force-pushing the rewritten history

    After the rewrite, every rewritten commit has a new SHA, so your branches no longer fast-forward from what the remote has and a normal push is rejected:

    git push --force
    

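    A force-push updates each ref independently, so on a remote with many branches some refs can be updated while others are rejected (by branch protection, for example). Add --atomic, where the server supports it, if you want all ref updates applied or none, and check the push output for rejected refs either way.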
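
    Because refs are updated one by one, a quick comparison of local and remote refs after the push can catch a partial update. The sketch below assumes a mirror clone, where local branches and tags are expected to match the remote exactly; refs_match_remote is a hypothetical helper name.

```shell
# Sketch: succeed if every branch and tag in this clone has the same id on the
# remote, and vice versa. refs_match_remote is a hypothetical helper name.
refs_match_remote() {
    # Usage: refs_match_remote <repo-dir> <remote-name-or-url>
    local_refs=$(git -C "$1" for-each-ref \
        --format='%(objectname) %(refname)' refs/heads refs/tags | sort)
    # --refs hides peeled tag entries (the "^{}" lines) and pseudo-refs like HEAD.
    remote_refs=$(git -C "$1" ls-remote --heads --tags --refs "$2" |
        tr '\t' ' ' | sort)
    [ "$local_refs" = "$remote_refs" ]
}

# Example, run in the mirror clone after pushing:
#   refs_match_remote repo.git origin || echo "some refs were not updated" >&2
```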

    For shared repositories, coordinate with the team before force-pushing. A typical sequence:

    1. Announce: "Force-push to clean history at HH:MM. Stop pushing until then."
    2. Have everyone push or stash any in-flight work.
    3. Run the rewrite and force-push.
    4. Have everyone re-clone or hard-reset:
      git fetch
      git reset --hard origin/main
      
      Or simply re-clone — easier to reason about.
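
    For collaborators who choose the hard reset, step 4 can be sketched as a small helper that keeps the old branch tip around until unpushed work has been carried over. The function name resync_after_rewrite, the remote name origin, and the backup/ branch prefix are assumptions for illustration.

```shell
# Sketch: move a local branch onto the rewritten remote history, keeping a
# backup of the old tip. resync_after_rewrite is a hypothetical helper name;
# "origin" and the "backup/pre-rewrite-" prefix are assumptions.
resync_after_rewrite() {
    # Usage: resync_after_rewrite <repo-dir> <branch>
    repo=$1
    branch=$2
    # Remember the pre-rewrite tip so local-only commits can be cherry-picked.
    git -C "$repo" branch -f "backup/pre-rewrite-$branch" "$branch" &&
        git -C "$repo" fetch --quiet origin &&
        git -C "$repo" checkout --quiet "$branch" &&
        git -C "$repo" reset --quiet --hard "origin/$branch"
}

# Example:
#   resync_after_rewrite . main
#   git cherry-pick <commits you had not pushed>
#   git branch -D backup/pre-rewrite-main   # once nothing else is needed
```

    The backup branch still contains the old history, including the removed file, so delete it once you are done and never push it. Pushing it would put the file back on the remote.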

    What does NOT remove a file from history

    The first of these looks like it should work but doesn't; the second works but is the wrong tool:

    git rm path/to/file && git commit -m "remove file"
    

    This deletes the file going forward. Every prior commit still contains it. Cloning the repo at any earlier commit still gives you the file.
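
    The claim about prior commits is easy to check. The sketch below prints a "deleted" file straight out of the commit before the one that removed it; show_deleted_file is a hypothetical helper name.

```shell
# Sketch: print the last committed version of a file that was later removed
# with git rm. show_deleted_file is a hypothetical helper name.
show_deleted_file() {
    # Usage: show_deleted_file <repo-dir> <path>
    # Find the most recent commit that deleted the path...
    del=$(git -C "$1" log -1 --diff-filter=D --format=%H -- "$2")
    [ -n "$del" ] || return 1
    # ...and show the file as it existed in that commit's parent.
    git -C "$1" show "$del^:$2"
}

# Example:
#   show_deleted_file . path/to/file
```

    Anyone with a clone can do the same, which is why the removal has to happen in every commit and not only in the latest one.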

    git filter-branch --tree-filter 'rm -f path/to/file' HEAD
    

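    This does rewrite history, but as written it only rewrites the current branch (HEAD), leaves other branches and tags untouched, and keeps backup refs under refs/original/ that still point at the old commits. It is also slow on large repos, error-prone in subtle ways, and officially discouraged. Use filter-repo.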


    After the rewrite: post-cleanup checklist

    1. Force-push to all remotes. GitHub, mirrors, internal forks.
    2. Invalidate any leaked credentials. Even if the rewrite is complete, anyone who cloned the repo before still has the secret.
    3. Ask GitHub to purge cached refs. Open a support ticket if the leak was sensitive — git filter-repo removes commits from refs, but cached views on the hosting platform may linger. GitHub provides a documented process.
    4. Rebuild dependent forks. Forks on the platform are separate clones; they retain the old history. Either delete them or have their owners rewrite too.
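    5. Update CI / build references. The tags and branches in the repository itself were rewritten, but anything outside it that pins an old commit SHA (CI configs, deployment records, submodule pointers in other repos) now refers to a commit that is no longer on any branch.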
    6. Tell your team to re-clone. Their working clones diverge from the rewritten remote. Re-cloning is cleanest.

    Pitfalls

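    • A leaked secret can never be fully taken back. Rotate first, then clean. Treat the cleanup as hygiene, not damage control.
    • Tags need rewriting too. Otherwise tagged versions still point at the old commits. filter-repo rewrites tags by default; filter-branch only does so if you pass --tag-name-filter (typically --tag-name-filter cat) and include the tags in the revisions to rewrite.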
    • Submodules retain their own history. Removing a submodule pointer from the parent doesn't clean the submodule repo.
    • CI build artifacts may still contain the secret. Logs, caches, Docker images.
    • Don't run filter-repo on your live working clone. Always work from a fresh clone or mirror.

    Summary

    • For leaked secrets: rotate the secret first, then rewrite history.
    • Use git filter-repo for almost every case. BFG is fine for simple file/string deletes.
    • Mirror-clone, run the rewrite, force-push, coordinate with collaborators.
    • Re-clone (or hard-reset) every working copy after the rewrite.
    • A history rewrite is hygiene — don't trust it as your only line of defense for leaked credentials.