I recently moved offices and brought a bunch of build machines into the new office. Everything is running inside a WireGuard network. WireGuard is fantastic.
Since moving, a bunch of the machines have stopped responding over SSH. It seems to depend on where I’m located: if I’m in the same office as the machines, they DON’T respond, whereas if I’m outside, they do. Not always, though; strangely, only some coffee-shop WiFi networks work. Originally, I assumed there was some kind of packet filtering happening inside the office, but the behavior is very strange and difficult to troubleshoot. In addition, if I bypass the WireGuard tunnel and connect to the machine’s IP on the local network directly, it works reliably.
How to troubleshoot? Well, start with:
ssh -v machine
The -v switch gives verbose output (you can go as far as -vvv, the maximum, if you want!). That results in this output:
debug1: Remote protocol version 2.0, remote software version OpenSSH_for_Windows_7.7
debug1: compat_banner: match: OpenSSH_for_Windows_7.7 pat OpenSSH* compat 0x04000000
debug1: Authenticating to 10.42.0.8:22 as 'vivoh'
...
debug1: kex: client->server cipher: firstname.lastname@example.org MAC: <implicit> compression: none
debug1: expecting SSH2_MSG_KEX_ECDH_REPLY
(Yes, this is a Windows machine running an SSH server! The same thing also happens with my Linux machines.)
The last line is where it halts, so let’s search Google for it:
debug1: expecting SSH2_MSG_KEX_ECDH_REPLY
The correct advice is to lower the MTU on the client, the server, or both, so that the large key-exchange packets fit through the tunnel. I was looking for something less onerous, hoping to just adjust my client settings. Scrolling down, someone offered this advice:
ssh -o MACs=hmac-sha2-256 HOST
That’s worth a try. I added this to my ~/.ssh/config file like so:
Host windows-dev1
    User vivoh
    MACs hmac-sha2-256
    HostName 10.42.0.8
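Before connecting, it can be worth confirming that ssh actually picks up the new entry. This is only a sketch: `effective_macs` is a helper name I made up, and it relies on `ssh -G`, which prints the resolved client configuration without opening a connection (available in reasonably recent OpenSSH clients).

```shell
# Hypothetical helper: print the MAC list ssh would use for a host,
# based on the resolved configuration, without connecting to it.
# Any extra arguments (for example -F /path/to/config) are passed to ssh.
effective_macs() {
  # ssh -G prints one lowercase "keyword value" pair per line.
  ssh -G "$@" | awk '$1 == "macs" { print $2 }'
}

# Usage, assuming the Host entry above lives in ~/.ssh/config:
#   effective_macs windows-dev1
```

If the entry is being applied, the helper should report only hmac-sha2-256 rather than the client's full default list.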
And, eureka! It works.
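For anyone who would rather go the MTU route mentioned earlier, here is a rough sketch of how one might probe what fits through the tunnel. It makes several assumptions: a Linux client with iputils ping (the `-M do` flag sets "don't fragment"; other systems use different flags), IPv4 inside the tunnel, and a WireGuard interface called wg0. The function names and the 1380 value are mine, purely for illustration.

```shell
# Hypothetical helper: largest ICMP payload that fits in a single IPv4
# packet of the given MTU (20-byte IPv4 header + 8-byte ICMP header).
icmp_payload_for_mtu() {
  echo $(( $1 - 28 ))
}

# Hypothetical helper: ping a host through the tunnel with fragmentation
# forbidden. If the pings fail at one MTU but succeed at a smaller one,
# the path cannot carry packets of the larger size.
probe_mtu() {
  host=$1
  mtu=$2
  ping -M do -c 3 -s "$(icmp_payload_for_mtu "$mtu")" "$host"
}

# Usage (1420 is the MTU wg-quick typically ends up with on a 1500-byte link):
#   probe_mtu 10.42.0.8 1420
#   probe_mtu 10.42.0.8 1380
# If only the smaller value works, the interface MTU can be lowered, e.g.:
#   sudo ip link set dev wg0 mtu 1380
```

With wg-quick, the same setting can be made permanent with an `MTU = ...` line in the [Interface] section of the tunnel's config file.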