GitHub is a well-known tool for sharing and managing repositories among developers. There are various ways to clone a GitHub repository, but in this article we are going to focus on how to clone a repository with a Deploy Key and how to make sure the right SSH key is used.
GitHub is one of the leading hosting solutions for public Git repositories when it comes to open-source projects and, since 2020, it has also offered unlimited private repositories and collaborators to free accounts.
While public repositories can easily be downloaded via their web URL (HTTPS), private repositories can be cloned only after authenticating, either with user credentials or with an SSH key through the SSH URL.
Every developer who collaborates on a GitHub project has their own personal account, where they store their public SSH keys. Development is usually done locally, on the developer's machine, but when the project reaches the staging or production stage the repository has to be cloned on one or more servers. So, how is this operation usually done? These are the most common solutions:
A Deploy Key is basically an SSH key generated for a specific server user, which allows a server to connect directly to a GitHub repository.
However, unlike some other Git hosting providers, GitHub does not let you reuse a Deploy Key for multiple repositories, so you have to create a different SSH key for every cloned repository (see the commands below):
ssh-keygen -t rsa -b 4096 -f ~/.ssh/github_repo1
ssh-keygen -t rsa -b 4096 -f ~/.ssh/github_repo2
The action will generate the following files:
github_repo1
github_repo1.pub
github_repo2
github_repo2.pub
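When there are more than a couple of repositories, the same command can be generated in a loop. The following is only a sketch: keygen_commands is a hypothetical helper name, and it assumes simple repository names without spaces. It prints the commands instead of running them, so they can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper: print one ssh-keygen command per repository name,
# using the same options and file naming as the commands shown above.
keygen_commands() {
  for repo in "$@"; do
    printf 'ssh-keygen -t rsa -b 4096 -f ~/.ssh/github_%s\n' "$repo"
  done
}

# Example: one key pair per repository.
keygen_commands repo1 repo2
```

Once the printed commands look right, they can be pasted into the shell of the server user that will perform the clone.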
At this point, paste the SSH public key (e.g. github_repo1.pub) into the Key field of the Deploy keys section in the GitHub repository settings. For the title of the Deploy Key, the following format is suggested:
<user name>@<server id>
Now we need to edit the user's SSH configuration file, usually located in the user's home folder (/home/user_name/.ssh/config), to tell Git which SSH key to use. Within the config file, we'll use the Match keyword with the Host and Exec criteria, combined with bash conditional expressions.
Match Host github.com Exec "[[ git@github.com:/repo1.git = $(git config --get remote.origin.url) ]]"
    IdentityFile ~/.ssh/github_repo1
Let's explain the expression a little. First we check whether the hostname we are connecting to is github.com. If it is, the bash command is executed; it has to return 0 on a match and a non-zero status such as 1 otherwise. The command compares the expected repository SSH URL:
git@github.com:/repo1.git
and the output of the command:
$(git config --get remote.origin.url)
This returns the same string if we are inside the right local repository. The whole procedure is easy when the repository has already been cloned, but that is just one particular case. What if the repository hasn't been cloned yet? And what if we clone a repository inside another repository? (The latter case may sound strange, but in a software project it is common to have additional modules linked to other repositories.)
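The comparison can be tried by hand before it goes into the SSH config. The sketch below wraps it in a hypothetical function, origin_is, written with the portable [ ] test rather than [[ ]] so it runs in any POSIX shell. It also shows why the check fails before a clone: with no origin configured, git prints nothing.

```shell
#!/bin/sh
# Hypothetical helper: succeed only when the current directory belongs to a
# clone whose origin URL equals the expected one ($1).
origin_is() {
  # An unset remote.origin.url yields an empty string, which never equals
  # a non-empty expected URL, so the function returns non-zero in that case.
  [ "$(git config --get remote.origin.url 2>/dev/null)" = "$1" ]
}
```

Running `origin_is git@github.com:/repo1.git && echo match` inside the cloned repository should print `match`; in an empty directory, or inside a different repository, it prints nothing.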
So, if we execute the command:
git config --get remote.origin.url
it could print the remote origin URL of the top-level repository. In this case, we have to evaluate the output of the ps command instead. Here things become a little more complicated, since the output is not always the same: it differs depending on whether we run a simple clone, a pull, or work on a branch in debug mode.
In this case, the best thing to do is to search for the string 'git-upload-pack', which gives output similar to this:
1074313 pts/3 S+ 0:00 /usr/bin/ssh git@github.com git-upload-pack '/repo1.git'
1074333 pts/3 S+ 0:00 /bin/bash -c ps ax | grep git-upload-pack
1074335 pts/3 S+ 0:00 git-upload-pack
Therefore we will use a command like this:
ps ax | grep git-upload-pack | head -n 1
Finally, we have to evaluate this output in order to be able to compare it with the repository URL:
1074313 pts/3 S+ 0:00 /usr/bin/ssh git@github.com git-upload-pack '/repo1.git'
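One caveat with this pipeline is that grep can match its own process line, and nothing guarantees that the ssh line is listed first. A common workaround, shown here as a sketch with the hypothetical function name first_upload_pack_line, is to put one character of the pattern in brackets so that grep's own command line no longer contains the literal string.

```shell
#!/bin/sh
# Hypothetical helper: read ps-style output on stdin and print the first
# line that mentions git-upload-pack, ignoring the grep process itself.
first_upload_pack_line() {
  # The regex [g]it-upload-pack matches "git-upload-pack", but a command
  # line containing the text "[g]it-upload-pack" does not match it.
  grep '[g]it-upload-pack' | head -n 1
}
```

On a live system it would be used as `ps ax | first_upload_pack_line`.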
The output of this command can vary. This is why, to check whether it matches the repository name, the best way is to save the output in an array and search its elements. The bash builtin mapfile command comes to our aid:
Match Host github.com Exec "unset MAPFILE; mapfile < <(ps ax | grep git-upload-pack | head -n 1 ); [[ ${MAPFILE[@]} =~ '/repo1.git' ]] && exit 0 || exit 1"
    IdentityFile ~/.ssh/github_repo1
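Let's explain the full command in detail. unset MAPFILE clears any previous content of the array. mapfile, called without an array name, reads its input into the default MAPFILE array; here the input is the first line of the ps output that contains git-upload-pack. The [[ ... =~ ... ]] test then checks whether that content contains the string '/repo1.git'. Finally, exit 0 or exit 1 tells ssh whether the Match block, and therefore the IdentityFile line, applies.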
This procedure, however, has a flaw: it can't handle checkouts executed in parallel.
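The matching logic can also be tried outside of ssh. The sketch below is a simplified stand-in for the Exec expression, not the expression itself: upload_pack_matches is a hypothetical name, the ps output is read from standard input so that saved samples can be tested, and a portable case statement replaces mapfile and [[ ]].

```shell
#!/bin/sh
# Hypothetical helper: read ps-style output on stdin and succeed when the
# first line mentioning git-upload-pack contains the repository path ($1).
upload_pack_matches() {
  case $(grep 'git-upload-pack' | head -n 1) in
    *"$1"*) return 0 ;;  # path found in the first matching line
    *) return 1 ;;       # no matching line, or a different repository
  esac
}
```

On a live system it would be called as `ps ax | upload_pack_matches '/repo1.git'`. Because only the first matching line is inspected, feeding it the output of two simultaneous transfers reproduces the parallel-checkout limitation: the repository on the second line is never seen.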
Now all that remains is to check that the matching works properly, and to do so we have to debug the SSH process:
GIT_SSH_COMMAND="ssh -vvv" git clone git@github.com:/repo1.git
Cloning into 'repo1'...
OpenSSH_8.2p1 Ubuntu-4ubuntu0.2, OpenSSL 1.1.1f 31 Mar 2020
debug1: Reading configuration data ~/.ssh/config
debug2: checking match for 'Host github.com Exec "echo $(ps ax | grep git-upload-pack ) > /tmp/git_ssh; [[ ${MAPFILE[@]} =~ '/repo1.git' ]] && exit 0 || exit 1"' host github.com originally github.com
debug3: ~/.ssh/config line 5: matched 'Host "github.com"'
debug1: Executing command: 'echo $(ps ax | grep git-upload-pack ) > /tmp/git_ssh; [[ ${MAPFILE[@]} =~ '/repo1.git' ]] && exit 0 || exit 1'
debug3: command returned status 0
debug3: ~/.ssh/config line 5: not matched 'Exec "echo $(ps ax | grep git-upload-pack ) > /tmp/git_ssh; [[ ${MAPFILE[@]} =~ '/repo1"'
debug2: match found
...
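The -vvv output is long, so it can help to keep only the lines related to Match evaluation. This is a sketch: match_debug_lines is a hypothetical helper, and the patterns are taken from the debug lines shown above, so they may need adjusting for other OpenSSH versions.

```shell
#!/bin/sh
# Hypothetical helper: keep only the ssh debug lines that describe how a
# Match block was evaluated. Reads the debug output on stdin.
match_debug_lines() {
  grep -E 'checking match|matched|Executing command|command returned|match found'
}
```

Since ssh writes its debug messages to standard error, a possible invocation is `GIT_SSH_COMMAND="ssh -vvv" git clone git@github.com:/repo1.git 2>&1 | match_debug_lines`.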