Tuesday, May 26, 2015

Dotfiles Part 2: I Knew It Couldn't Be That Easy

    As I discussed in the first post, I wanted a place to store my dotfiles where I could easily pull them down onto a new system. GitHub has become a popular place for this, and it makes sense. GitHub hosts git repos in the cloud, so you can reach them from anywhere you have an internet connection. Git is a VCS that makes it easy to track changes to text files, which is exactly what dotfiles are. A simple git clone on a new system, and you have all your files ready and waiting for you.

    There are, of course, some issues with simply creating a git repo out of your entire home directory. So, many systems have been created, most of which use symlinks from the home directory into a repo-controlled directory instead. There is even a nice listing available in the unofficial guide on GitHub at http://dotfiles.github.io. It lists some bootstrap systems to handle the symlinking and setup. It moves on to some app-specific options for things like shells and editors, which may have extensive config and plugins that are better managed with a dedicated system. Lastly, it covers general-purpose dotfiles utilities, which may do more than just symlinking and syncing for you.

    I've set out to figure out how I want to manage my dotfiles many times over the years, and never got much further than looking at this massive page of options, opening up many of them in tabs, and then getting overwhelmed or distracted. This time, however, I was determined to make a choice and start trying to implement it. Even if it didn't end up being the solution I use in the end, I needed to get started at some point and figure out what would and wouldn't work for me. So, I made an initial pick.

    After looking over all of the bootstrap systems, I selected Ben Alman's dotfiles because they support multiple OSes. This is important to me, as I have switched between various Linux flavors and now OS X several times in the past. Most of my dotfiles work on either, but I have found some pieces that are OS-specific. The idea of having one repo for my dotfiles that works on either OS is very attractive to me.

    I pulled up the repository and read the README to understand what it does and how it works. Then I started looking at all the included files. Since this isn't just a repo for the program, but is actually the repo the author maintains for his personal use, it has the author's personal dotfiles already in place. This is one of the fringe benefits of using GitHub for storing dotfiles: it becomes easy to look at other people's files and get ideas for better setups and improvements to what you already have. There are, of course, serious risks to just using other people's configs, especially when you don't know what they actually do.

    There were a lot of files that wouldn't be relevant for me, and I didn't really want to replace my existing files completely either, so I chose not to fork this repo as a starting point. Instead, I took my own empty repo which I had created in the past for use whenever I got around to actually completing this project and cloned it to my machine. I also cloned Ben Alman's repo into a separate directory. This allows me to pick and choose what I want from his repo and copy it over into mine piece by piece. I left the license and README intact for the most part, since I am using someone else's work after all.

    The concept is pretty simple and explained well in the README. One directory for files that get linked into your home directory, one for files that get sourced, etc. Interestingly, there is also a directory for files that get copied. The source repo has only one file here, a .gitconfig. The author explains that this is for files that you will modify on your local system after install because they contain data that is either unique to that system or sensitive and you don't want it in a public git repo. In the case of .gitconfig this can be things like your email address and possibly SSH or API keys. This makes sense, but was also my first wrinkle. Yes, I don't want that stuff to be public (email I care less about, but private keys I certainly care about). However, I also don't want to have to do a bunch of hand edits after I install, especially of stuff like keys, which aren't exactly easy to memorize.
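
    To make the link-versus-copy split concrete, here is a minimal sketch in Python. It is not Ben Alman's script. The directory names link and copy, the install function, and the rule of skipping anything that already exists in the home directory are all assumptions made for illustration, so check the real README for the actual behavior.

```python
import shutil
from pathlib import Path


def install(repo: Path, home: Path) -> None:
    """Symlink files from repo/link and copy files from repo/copy into home.

    Hypothetical layout: "link" and "copy" are assumed directory names.
    Anything that already exists in home is left untouched.
    """
    link_dir = repo / "link"
    if link_dir.is_dir():
        for src in sorted(link_dir.iterdir()):
            dest = home / src.name
            if dest.is_symlink() or dest.exists():
                continue  # never clobber something already in place
            # Linked files stay under version control: edits made through
            # the symlink show up as changes in the repo.
            dest.symlink_to(src.resolve())

    copy_dir = repo / "copy"
    if copy_dir.is_dir():
        for src in sorted(copy_dir.iterdir()):
            dest = home / src.name
            if src.is_file() and not dest.exists():
                # Copied files are detached from the repo, so local edits
                # (an email address, a key) never end up in a commit.
                shutil.copy2(src, dest)
```

    Calling install(Path("~/.dotfiles").expanduser(), Path.home()) would then set up a new machine. The path is only an example. The copy half is exactly what leaves the hand edits to do afterwards.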

    I got a little further on adding my own stuff and hit my first snag. I copied some things from the author's gitconfig into my own, thinking I understood what they were doing. Well, they broke my ability to git push. So, this is just a cautionary tale to remind you that blindly copying config files is a recipe for a bad time. Thankfully, this is all version controlled now, so I just committed a change commenting those lines out and everything was happy again.

    Next I got into the bash configuration files. These are a big change from my current files, in that they are barebones files that source a directory of config files. This does several things. It lets you break your config (bash, in my case) up into logical pieces, which makes maintenance easier. It also lets you guard each piece with some logic that checks the OS, so you can have different configs on different OSes. This is good, but it also got me worried. Sure, this works because bash lets you source as many other files as you like. However, is this the only way to have OS-specific configs? What about config files that don't support sourcing like this? I thought, from the initial description, that this script had a built-in way of handling that instead of doing it this way.

    I also realized at this point that I can't put all of my bash config into this repo. I have a bunch of aliases that are work-specific and would reveal internal config information, so they shouldn't go into a public repo. This might be a problem. I spent some time thinking and looking, and found that I can set up git-crypt to encrypt the files before uploading and decrypt them upon download. That would let me transfer a key over to my new system, and then these files would work, but they wouldn't be exposed while on GitHub. That would work, but only because we can break the bash config into pieces. What about my SSH config, where I have sensitive information as well, like individual host configs that specify the username, port number, and SSH key to use? That isn't enough to get in, but it is enough to tell you where to start looking and what is needed to get in. Additionally, I'm not aware of a way to source additional files into an SSH config, although I admittedly haven't looked yet. I'm also not aware of a way to encrypt only certain parts of a file. It is all or nothing from what I have seen so far.

    This is the problem I've come up against so far. My dotfiles now carry many sensitive pieces of information, like API keys, internal API endpoints, etc. These are all pieces of information that I want synced, since they are the exact pieces I don't want to lose in case of a system failure or migration. However, they also cannot go on public GitHub in plain text. Encryption may be one option if they can live in standalone files, but that may not be the case. A second option may be to look at one of the general-purpose utilities that support using multiple repos as sources, so you can mix public and private repos. This would mean still having the information in plain text, but in a private repo instead of a public one. This is better, but I'm not sure if it is good enough. A private repo isn't open to the world, but both you and the repo's admins will have access to it, as will anyone who gets in via a security flaw of some sort. We have an internal GitHub at work I could use, but it would still be visible to anyone on my team unless I can lock that specific repo down properly. That works for work-related private info, but what about my personal sensitive data? GitHub only offers private repos on a paid account, so is it worth a fee?

    Another, theoretical, option would be to set up a script system that supports modifying the dotfiles on the fly. This would let you encrypt just the sensitive pieces, then have them decrypted, parsed, and inserted into their corresponding files on install. This gets around needing to source multiple files as a requirement for encryption. However, you can't modify the files in place, or they would include the secrets if you committed and pushed them to the repo again. This means more complexity is needed, a lot more.
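
    As a rough sketch of what that on-the-fly approach could look like, the Python below fills placeholders in a template and writes the result outside the repo. Everything here is hypothetical: the @@NAME@@ placeholder syntax, the render and install_rendered functions, and the assumption that the secrets have already been decrypted into a dictionary by some other step. It only illustrates the idea of keeping the committed template free of secrets.

```python
import re
from pathlib import Path

# Hypothetical placeholder syntax: @@NAME@@. It is chosen so that ordinary
# shell text such as $HOME or ${PATH} is left alone.
PLACEHOLDER = re.compile(r"@@([A-Z0-9_]+)@@")


def render(template_text: str, secrets: dict[str, str]) -> str:
    """Return template_text with every @@NAME@@ replaced by secrets[NAME]."""
    def lookup(match: re.Match) -> str:
        name = match.group(1)
        if name not in secrets:
            # Fail loudly instead of installing a half-filled config.
            raise KeyError(f"no secret provided for placeholder {name}")
        return secrets[name]

    return PLACEHOLDER.sub(lookup, template_text)


def install_rendered(template_path: Path, secrets: dict[str, str], dest: Path) -> None:
    """Write the rendered file to dest, which should live outside the repo."""
    dest.write_text(render(template_path.read_text(), secrets))
    dest.chmod(0o600)  # the rendered file now holds secrets: owner-only access
```

    The template stays in the repo and the rendered file lives only in the home directory, so nothing sensitive gets committed. The cost is the one described above: any edit has to be made to the template and re-rendered, never to the installed file.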

    So, this is where I have gotten to so far. I was just getting started adding my stuff to my new dotfiles repo, and I realized that this approach may not work for me at all. Interestingly, I haven't seen these issues touched on in almost anything I've read so far on the subject. Either I have much more sensitive data in my dotfiles than other people, or it just hasn't been talked about that much. So, my half-started repo is in an unsafe state. If I run the dotfiles script again, it will replace my existing bash profile, but I haven't added my changes into the repo yet because of the sensitive pieces. I'm going to have to think about how to solve this problem and do some more research. I should probably remove the new bash scripts from the repo too, so I don't accidentally replace mine.

    If you have any ideas for how to solve this, let me know. Otherwise, I'll hopefully be back when I've come up with an idea, if I don't forget about it entirely in the meantime.