Taming untracked files in a large git repository

Posted by Matthew Watkins on February 25, 2017

Have you ever tried to add a NuGet package to a Visual Studio project but accidentally added it to the whole solution? And now you have 100 untracked .config files showing up in your changed files list? Or have you ever stopped git in the middle of a big rebase, ended up with a lot of new files on your machine, and needed to rewind to a pristine state?

Well, that’s why we have git clean -fd. But if you work in a very large codebase like I do at work, that command takes forever to run. I’m talking “go to your one-hour meeting and hope it finishes by the time you get back” forever.

Luckily, if the number of files that show as untracked is manageable, there’s a faster way to get rid of those pesky untracked files than running git clean:

git ls-files --others --exclude-standard | xargs -n 1 rm -fr

I’m not one to post a command or code snippet without some explanation, though hopefully this one is obvious enough. git ls-files --others --exclude-standard is the magical incantation that prints the list of untracked file paths to stdout, skipping anything matched by your ignore rules (these are the same files reported in the untracked section at the end of git status). We then pipe that list into the xargs utility, which iterates over the paths one at a time and passes each one to the delete (rm) command. When it finishes, the working directory contains no files that git doesn’t know about, apart from the ones you have told it to ignore.
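
One caveat with the pipeline above: by default xargs splits its input on whitespace, so a path containing a space gets broken into pieces. Here is a minimal sketch of a variant that avoids this by using NUL-terminated paths. The function name delete_untracked is my own invention for illustration; the flags themselves (-z for git ls-files, -0 for xargs) are standard.

```shell
# delete_untracked is a hypothetical helper name, not part of git.
delete_untracked() {
    # -z: end each path with a NUL byte and print it without quoting.
    # -0: tell xargs to split its input on NUL instead of whitespace.
    # --: stop rm from treating a path that starts with "-" as an option.
    git ls-files --others --exclude-standard -z | xargs -0 rm -rf --
}
```

Dropping -n 1 lets xargs hand many paths to each rm invocation instead of starting one process per file. Like the original command, this removes files only: git ls-files --others lists individual files, so directories that held nothing but untracked files are left behind empty.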

I maintain a bash script that auto-creates and synchronizes my git aliases across my computers. I’ve added these commands as git list-untracked-files and git delete-untracked-files since I use them so often.
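
For anyone who wants similar aliases, here is a rough sketch of how they could be registered with git config. The alias names come from this post, but the function name install_untracked_aliases and the exact alias bodies are assumptions for illustration, not a copy of my script. The delete alias uses NUL-terminated paths (-z and -0) so that file names containing spaces are handled.

```shell
# install_untracked_aliases is a hypothetical helper name.
# The alias bodies below are an assumed implementation.
install_untracked_aliases() {
    # Plain alias: expands to a git subcommand plus its arguments.
    git config --global alias.list-untracked-files \
        'ls-files --others --exclude-standard'

    # A leading "!" makes git run the alias as a shell command,
    # which is what allows the pipe into xargs.
    # Git runs shell aliases from the top-level directory of the repository.
    git config --global alias.delete-untracked-files \
        '!git ls-files --others --exclude-standard -z | xargs -0 rm -rf --'
}
```

After calling install_untracked_aliases once, git list-untracked-files previews what would be removed and git delete-untracked-files removes it. Because shell aliases run from the top of the repository, the delete alias cleans the whole working tree even when invoked from a subdirectory.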

Know of any better ways to clean untracked files from a large codebase in Git? Leave a comment below!

This post first appeared on Another Dev Blog