One of the vulnerabilities that are really easy to exploit is when people leave super-sensitive information in source code – and you get your hands on this source code. In early prototyping a lot of people will hardcode passwords and certificate keys in their code, and remove it later when moving to production code. Sometimes it is not even removed from production. But even in the case where you do remove it, this sensitive information can linger in your version history. What if your app is an open source app where you are sharing the code on github? You probably don’t want to share your passwords…
Getting this sensitive info out of your repository is not as easy as deleting the file from the repo and adding it to the .gitignore file – because this does not touch your version history. What you need to do is this:
- Merge any remote changes into your local repo, to make sure you don’t remove the work of your team if they have commited after your own last merge/commit
- Remove the file history for your sensitive files from your local repo using the filter-branch command:
git filter-branch –force –index-filter \
‘git rm –cached –ignore-unmatch \
PATH-TO-YOUR-FILE-WITH-SENSITIVE-DATA‘ cat — –all
Although the command above looks somewhat scary it is not that hard to dig out – you can find in the the Github doc files. When that’s done, there’s only a few more things to do:
- Add the files in question to your .gitignore file
- Force write to the repo (git push origin –force –all)
- Tell all your collaborator to clone the repo as a fresh start to avoid them merging in the sensitive files again
Also, if you have actually pushed sensitive info to a remote repository, particularly if it is an open source publicly available one, make sure you change all passwords and certificates that were included previously – this info should be considered compromised.
Like what you read? Sign up for free updates!
One thought on “Avoid keeping sensitive info in a code repo – how to remove files from git version history”
[…] So, while many employees will need training about identification of malicious links in e-mails, or understanding that they should not use the same password on every user account, other employees may have a higher level of security understanding; typically an understanding that is linked to the role they have and the responsibilities they take. So, while the awareness training for your salesforce may look quite similar to the awareness training you give to your managers and to your customer service specialists, the security awareness discussions you need to have with your more technical teams may look completely different. They already know about password strength. They already understand how to spot shaky URL’s and strange domains. But what they may not understand (without having thought about it and trained for it) is how their work practices can make products and services less secure – forcing us to rely even more on awareness training for the less technically inclined coworkers, customers and suppliers. One example of a topic for a security conversation with developers is the use of authentication information during development and how this information is treated throughout the code evolution. Basically, how to avoid keeping your secrets where bad guys can find them because you never considered the fact that they are still there – more or less hidden in plain site. Like this example, with hardcoded passwords in old versions of a git repository: Avoid keeping sensitive info in a code repo – how to remove files from git version history […]