Using libssh2 on Windows: lessons learnt
After I managed to compile and link libssh2, the easiest way to get started was to look at the examples. Written in plain C, the examples cover some of the most common use cases and compensate for the fact that the rest of the documentation is rather sparse.
Managing threads
One key topic that unfortunately neither the examples nor the documentation address is thread-safety. The documentation for the libssh2_init function states that the function
… uses a global state, and is not thread safe -- you must make sure this function is not called concurrently.
But none of the other functions are annotated in a similar way, leaving it unclear whether it is safe to use the API concurrently from multiple threads or not.
The only other somewhat-authoritative information on thread-safety I was able to find was this reply from Daniel Stenberg to an old mailing list thread:
I claim libssh2 _is_ thread safe, as long as you don't share libssh2's structs and handles between threads and assuming you use the crypto engine's mutex callbacks properly.
Based on this information I concluded that it would be best to maintain one lock per session (which is the top-most data structure libssh2 uses) and to then use that lock to synchronize all API calls operating on that session. So I began writing code like this:
lock (this.sessionHandle.SyncRoot)
{
    var channelHandle = ...
    var request = "exec";
    var result = (LIBSSH2_ERROR)UnsafeNativeMethods.libssh2_channel_process_startup(
        channelHandle,
        request,
        (uint)request.Length,
        command,
        command == null ? 0 : (uint)command.Length);
    if (result != LIBSSH2_ERROR.NONE)
    {
        channelHandle.Dispose();
        throw new SshNativeException(result);
    }
}
This approach turned out to be too optimistic. Once I started calling the API from multiple threads, I started experiencing sporadic heap corruptions and buffer underruns. I double-checked my locking code and it all seemed okay on the surface. But looking closer at how libssh2 manages its data structures, I concluded that the safer route would be to limit access to a single thread (instead of a single thread at a time).
I removed the locks and changed the code so that a worker thread is spawned for each session. Akin to a single-threaded COM apartment, a worker thread “owns” the session, and all interactions with the session are marshalled to the worker thread first. The worker thread uses non-blocking I/O so that reads and writes do not block each other.
Once I made these changes, the heap corruptions and buffer underruns immediately disappeared.
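To illustrate the idea of a thread that owns a session, here is a minimal sketch of how calls could be marshalled to a dedicated worker thread. It is not the code IAP Desktop uses: the SessionWorker class and its RunAsync method are hypothetical names, and the sketch leaves out the non-blocking I/O loop entirely.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: owns one thread and runs all queued work on it.
public sealed class SessionWorker : IDisposable
{
    private readonly BlockingCollection<Action> queue =
        new BlockingCollection<Action>();
    private readonly Thread thread;

    public SessionWorker()
    {
        this.thread = new Thread(() =>
        {
            // Runs until CompleteAdding() is called and the queue is drained.
            foreach (var work in this.queue.GetConsumingEnumerable())
            {
                work();
            }
        });
        this.thread.IsBackground = true;
        this.thread.Start();
    }

    public int ThreadId => this.thread.ManagedThreadId;

    // Queues func for execution on the worker thread and returns a task
    // that completes with its result (or its exception).
    public Task<T> RunAsync<T>(Func<T> func)
    {
        var tcs = new TaskCompletionSource<T>(
            TaskCreationOptions.RunContinuationsAsynchronously);
        this.queue.Add(() =>
        {
            try { tcs.SetResult(func()); }
            catch (Exception e) { tcs.SetException(e); }
        });
        return tcs.Task;
    }

    // Must not be called from the worker thread itself (Join would deadlock).
    public void Dispose()
    {
        this.queue.CompleteAdding();
        this.thread.Join();
        this.queue.Dispose();
    }
}
```

With a helper like this, every native call on a session would be wrapped in a RunAsync call, so that the session's handles are only ever touched by the thread that created them.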
Managing memory
Another aspect of the libssh2 API that takes some getting used to as a Windows developer is its approach to memory management. Instead of expecting the caller to supply memory, many libssh2 functions simply return a pointer to an existing piece of memory. This memory is part of the session structure and therefore does not need to be freed by the caller. But as a caller, you also must not hold on to this memory because the next API call might overwrite it.
Some functions deviate from this pattern, for example libssh2_session_last_error, which can return a newly allocated buffer that the caller has to free. To use these functions, you must have specified a custom allocator in libssh2_session_init_ex. At least on Windows, you will otherwise not be able to free the returned memory because you do not know which heap the memory comes from.
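As a rough sketch of what such a custom allocator could look like from C#, the following class defines three callbacks backed by Marshal.AllocHGlobal and friends. The SessionAllocator class and the delegate names are made up for this example. The Cdecl calling convention and the mapping of size_t and the abstract pointer to IntPtr are assumptions that you should verify against your libssh2 build.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper, not part of libssh2.
public static class SessionAllocator
{
    // Delegate shapes mirror libssh2's alloc, realloc and free callbacks.
    // Assumption: cdecl calling convention, size_t mapped to IntPtr.
    [UnmanagedFunctionPointer(CallingConvention.Cdecl)]
    public delegate IntPtr AllocFunc(IntPtr size, IntPtr context);

    [UnmanagedFunctionPointer(CallingConvention.Cdecl)]
    public delegate IntPtr ReallocFunc(IntPtr ptr, IntPtr size, IntPtr context);

    [UnmanagedFunctionPointer(CallingConvention.Cdecl)]
    public delegate void FreeFunc(IntPtr ptr, IntPtr context);

    // Static fields keep the delegates reachable so that the garbage
    // collector does not reclaim them while native code still calls them.
    public static readonly AllocFunc Alloc =
        (size, context) => Marshal.AllocHGlobal(size);

    // Treat a null pointer like a plain allocation, as C's realloc does.
    public static readonly ReallocFunc Realloc =
        (ptr, size, context) => ptr == IntPtr.Zero
            ? Marshal.AllocHGlobal(size)
            : Marshal.ReAllocHGlobal(ptr, size);

    public static readonly FreeFunc Free =
        (ptr, context) => Marshal.FreeHGlobal(ptr);
}
```

The three delegates would be passed to a P/Invoke declaration of libssh2_session_init_ex. Because the memory then comes from a known heap, a buffer returned by libssh2 can be released with Marshal.FreeHGlobal. Note that Marshal.AllocHGlobal throws on failure instead of returning a null pointer, which a production implementation would need to handle before returning to native code.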
Managing key pairs
What’s great about libssh2 is that it does not restrict you to any specific way of managing your public/private key pair. SSH clients typically store the key pair as PEM-formatted files in the user profile, but libssh2 does not force you to do it this way. This flexibility is particularly useful on Windows because storing keys as files in the user profile is not only uncommon, but also rather insecure – a better way to store keys is to use a CNG Key Storage Provider, which is what I wanted to do.
Unfortunately, the function that allows you to take control of key management, libssh2_userauth_publickey, is pretty much undocumented. The API documentation only contains boilerplate, and the source code does not contain many comments either. But with some cues from the Guacamole sources, I was able to figure out that:
- The callback function is expected to allocate memory for the signature by using the allocator passed to libssh2_session_init_ex.
- The public key must be passed in RFC4253 format, often referred to as OpenSSH format.
System.Security.Cryptography does not contain any classes that would allow you to export a public key in RFC4253 format, but it’s not difficult to do that yourself:
public static byte[] ToSshRsaPublicKey(this RSACng key)
{
    var prefix = "ssh-rsa";
    var prefixEncoded = Encoding.ASCII.GetBytes(prefix);

    // Export the public parameters only, and only once.
    var parameters = key.ExportParameters(false);
    var modulus = parameters.Modulus;
    var exponent = parameters.Exponent;

    using (var buffer = new MemoryStream())
    {
        // Each field is preceded by its length as a 4-byte integer.
        buffer.Write(ToBytes(prefixEncoded.Length), 0, 4);
        buffer.Write(prefixEncoded, 0, prefixEncoded.Length);

        buffer.Write(ToBytes(exponent.Length), 0, 4);
        buffer.Write(exponent, 0, exponent.Length);

        // Add a leading zero (!)
        buffer.Write(ToBytes(modulus.Length + 1), 0, 4);
        buffer.Write(new byte[] { 0 }, 0, 1);
        buffer.Write(modulus, 0, modulus.Length);

        buffer.Flush();
        return buffer.ToArray();
    }
}
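As it turns out, the leading zero is important: RFC 4251 defines multiple-precision integers as signed, two's-complement values, so without the extra zero byte a modulus whose highest bit is set would be read as a negative number.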
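The ToBytes helper used above is not shown. The following sketch is a guess at what it needs to do (encode a length as a 4-byte big-endian integer), together with a more general mpint encoder that only adds the leading zero when the highest bit is set. The SshWireFormat class and the ToMpint method are hypothetical names introduced for this example.

```csharp
using System;
using System.IO;

// Hypothetical helper class for this example.
public static class SshWireFormat
{
    // Encodes a length as a 4-byte big-endian integer (RFC 4251 uint32).
    public static byte[] ToBytes(int value)
    {
        return new byte[]
        {
            unchecked((byte)(value >> 24)),
            unchecked((byte)(value >> 16)),
            unchecked((byte)(value >> 8)),
            unchecked((byte)value)
        };
    }

    // Encodes an unsigned, big-endian magnitude as an RFC 4251 mpint:
    // length prefix, then the value with a zero byte prepended if its
    // highest bit is set, so that it is not mistaken for a negative number.
    public static byte[] ToMpint(byte[] magnitude)
    {
        // Skip redundant leading zero bytes.
        var start = 0;
        while (start < magnitude.Length && magnitude[start] == 0)
        {
            start++;
        }

        var length = magnitude.Length - start;
        var pad = length > 0 && (magnitude[start] & 0x80) != 0;

        using (var buffer = new MemoryStream())
        {
            buffer.Write(ToBytes(length + (pad ? 1 : 0)), 0, 4);
            if (pad)
            {
                buffer.WriteByte(0);
            }
            buffer.Write(magnitude, start, length);
            return buffer.ToArray();
        }
    }
}
```

For an RSA modulus, whose highest bit is normally set, ToMpint produces the same bytes as the unconditional zero in the method above. For a typical exponent such as 65537, no padding is added.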
You can find the full source code of IAP Desktop’s wrapper classes for libssh2 in the IAP Desktop source tree.