[mosh-devel] Mosh and multi-path

john hood cgull at glup.org
Wed Jul 29 01:51:29 EDT 2015


On 7/26/15 11:39 AM, Matthieu Boutier wrote:
> Hi John,
> 
> You've invalidated the mail I wrote on the plane!  Thanks for that great
> review.
> 
>> it looks like good quality code.
> 
> Thanks!!
> 
>> * If I'm understanding correctly, the two major changes you make to the
>> network message format are 1) inserting a flow ID into the message
>> sequence number, and 2) adding a 16-bit field for a message type to the
>> encrypted message string, immediately after the two timestamps. That
>> second change allows you to add two new message types-- a probe message,
>> and a message to report server addresses to the client.  Have I got that
>> more or less right?
> 
> Yes.  I describe it in detail here (see figs. 3 and 7):
> 
>     http://www.pps.univ-paris-diderot.fr/~boutier/papiers/mpmosh.pdf

Thanks for reminding me about your paper, I'd forgotten about it.

>> * Does the flow ID need to be in the sequence number?  Is there any
>> reason it couldn't be in a separate protobuf field?  Along with the
>> compatibility issues, this is a significant change to Mosh's
>> cryptography.  I'm no cryptographic expert, but reducing the namespace
>> for sequence IDs and adding maybe-correlatable info to the sequence ID
>> worries me.
> 
> No.  And I have no objection to separate fields: a global sequence
> number, plus a per-flow seqno.  The message format would then look
> like:
> 
> |<-- nonce -->|
> [ dir & seqno | flowID | flow seqno | flags | timestamps | ... ]
> 
> It will just take a few more bits, and I don't think that 64 more bits
> is a big deal.

Adding extra fields to the outer packet header leaves them unprotected
by the encryption/authentication, and is probably not a good idea.  (I
think that's what you're suggesting here.)

If the fields are in a protobuf, they might be less expensive, since
protobuf's varint encoding strips leading 0 bits (and, with zigzag
encoding, leading 1 bits for negative values).  So adding a field that
carries a small integer only costs two bytes (one tag byte plus one
value byte), and each field can be optional.

>> * Does your message type need to be in the encrypted message string, but
>> outside the protobuf message?  In an ideal world, mosh would have used
>> another level of protobuf serialization for the timestamps, and adding
>> the message type wouldn't cause a compatibility problem.  But mosh is in
>> the same less-than-ideal world as the rest of us.
> 
> I think so.  If Chloé (the attacker) can change the message type, then
> she just has to flip the addr bit.  If the packet contains data, the
> mosh network layer will try to parse addresses from it (and will
> "probably" fail), and if the packet contains addresses... no clue, but
> I imagine the message will be considered corrupted (does that kill the
> mosh process, or just drop the packet?).

I've thought a little more about this.  My original idea was to add new
elements to transportinstruction.proto:

	optional bytes probe;
	optional bytes addresses;

Any Mosh message would be required to have one (and only one) of the
diff, probe, or addresses elements.  All of them would have mosh's
encryption applied.  In the case of the probe messages, the encrypted
string would be empty-- but it would still be authenticated, and thus
valuable for authenticating the entire message.  This authentication
eliminates your concern in the paper about malicious address-request
packets, I think.

I've spent a while trying to modify your branch to make it more
compatible with existing versions of Mosh.  Unfortunately, I've come to
the conclusion that it will be difficult:  The existing Mosh code really
expects every incoming packet to have a valid Instruction protobuf, and
every incoming Instruction affects the protocol state.  The existing
Mosh code throws exceptions on unexpected packet contents rather than
discarding or ignoring them, making it difficult to vary from the
existing packet format.  The best idea I've come up with is for
multipath-Mosh to inject a new flag or field into the Instruction
protobuf, and for it to send packets otherwise exactly compatible with
older versions of Mosh until it sees that flag/field from the other
side.  That would require significant changes to your code to handle
this behavior.

>> * Your code makes the server responsible for discovering and reporting
>> its listening addresses to the client.  Can you explain why you chose
>> this approach?
> 
> Yes.  First, the server should be able to learn its NATed addresses...
> that's not implemented.  Second, if you're at the office and you open
> a mosh session to your home server (or the converse), you will converge
> on the addresses of your local network.  Third, I like the idea of
> being connected by link-local addresses: it happens so often (not so
> much these days) that I'm tweaking my OpenWRT configuration... and it
> fails.  The only way in is to connect via a link-local address.

I have to disagree here.  First, there are situations where the server
does not and cannot know its NAT address:  consider NAT-464 on your
cellphone, for example.  The second situation should be handled by
having public and private DNS that correspond to your public and
private networks.  Link-local addresses *are* a plausible use case, but
there are difficulties:  Mosh must know which interface to use for
link-local addresses, and it may not know which one that is, unless an
address report was delivered using that link-local address.

>> My thinking is that the
>> *client* should be responsible for discovering server addresses-- it
>> should discover a list of possible server addresses at session start.
> 
> This is (also) already done:
> 
> At startup:
> 
>   AddrInfo ai( ip, port, &hints );
> 
>   for ( struct addrinfo *ra_it = ai.res; ra_it != NULL; ra_it = ra_it->ai_next ) {
>     if ( ra_it->ai_addr->sa_family == AF_INET || ra_it->ai_addr->sa_family == AF_INET6 ) {
>       remote_addr.push_back( Addr( *ra_it->ai_addr, ra_it->ai_addrlen ) );
>     }
>   }

Running this loop is not currently useful in standard mosh, because the
client is passed a numeric IPv4 or IPv6 address on the command line, so
we will never get any more addresses.  But I suspect you're not using
the Perl script at all for mp-mosh, since you haven't modified it.  For
multipath mosh, I think the Perl script will have to pass both the IP
address that was chosen for the SSH connection and the hostname.

> Then, in check_flows(), we use "remote_addr".
> 
>     for ( std::vector< Addr >::iterator ra_it = remote_addr.begin();
> 	  ra_it != remote_addr.end();
> 	  ra_it++ ) {
>       new_flow( *la_it, *ra_it );
>     }
> 
>> Perhaps it should also update that list when it roams, because the
>> client may roam from a public IPv4 network with public DNS records for
>> the server to a private network with different routing and private DNS
>> records with different addresses for the same server.  If the server is
>> behind NAT or an L3 proxy, then its local addresses might not be correct
>> for either of these cases.
> 
> This is not done! :-)
> 
>> * Do we need to have a message type to describe the probe messages?
>> Could we instead say that an empty TransportInstruction protobuf string
>> represents a probe message?
> 
> Good question again.  What you propose should be possible.  I feel
> that it's bad, but I have no good reasons (flexibility, control…).
> 
>> move some of it to session setup
> 
> That's clearly my opinion.
> 
>> or to optional protobuf fields that would offer better compatibility.
> 
> How does that work?
> 
>> I've recently done some work on the mosh script that I think might be
>> relevant to multipath session setup, see
>> <https://github.com/keithw/mosh/pull/653>.  This collects server
>> addresses in a way that might be useful for a multipath mosh
>> implementation.  I think clients may still need to do name lookups later
>> to really fully handle roaming, though.  (This isn't specific to your
>> multipath work, though I think the need might be stronger there.)
> 
> l. 250+:
>   - why are you connecting a "dummy" socket instead of directly trying to
>     do ssh?
>   - where is that socket closed?
> 
> (That's good at bootstrap.)

An ssh host can be configured in .ssh/config to have a name or IP
address that has no relation to the "host" that was given to ssh as an
argument.  In order to handle this, the mosh script configures ssh to
use a proxy program to connect to the host, and report the IP address
used.  That proxy program is actually the Perl script itself, run as
"mosh --fake-proxy".  It's not actually fake, it is a real proxy, and
must make a TCP connection to the remote SSH server.  That's the code
you're seeing there.

>> Also have a look at my thoughts on implementing a multiplexing moshd
>> server:
>> <https://github.com/keithw/mosh/issues/295#issuecomment-109861097>  This
>> is not the same problem, but I think a similar solution for discovering
>> unnamed flows might be appropriate here.
> 
> Interesting.  I'm considering using one daemon for probing and sharing
> the probing results across multiple sessions.  I think it will be
> great fun to implement, but I'm convinced it's clearly not urgent, and
> that the extra probing traffic generated is negligible.  But… who
> knows!  I like micro-optimisation. ;-)

I agree.  We also need to think a bit about how a single daemon might
affect security and information disclosure.

> (And I understand your reasons to use another separate protocol.)
> 
>> Also, do we need the full 5-tuple (ports, address, IPv4/6) to name
>> flows?  Can we safely use a 3-tuple of (client-port, client-address,
>> IPv4/6) instead?
> 
> Yes.  Both the client and the server can be multihomed.
> 
> 
> Ok, another point: I have around 20 patches, waiting for testing,
> which give mpmosh the ability to evaluate the loss ratio in both
> directions, and to duplicate packets among paths to achieve a given
> loss ratio.  This extension to my extension [...] cuts the 16-bit flag
> field into two 8-bit fields: flags and outgoing-loss-ratio.  The flow
> ID is needed in this extension to evaluate the loss ratio correctly.

I have two thoughts here:

* Replication of packets will increase the bandwidth Mosh uses, obviously.

* If the replicated packets share some or all of their network paths,
the replication is effectively a more-aggressive congestion-avoidance
algorithm.  This doesn't matter much for mosh, which limits its
bandwidth and doesn't really do congestion-avoidance, but it can be
important with a TCP stack.

> Now, I think that extending mosh should be done at session setup (by
> the perl script).  The session setup should be able to get the
> supported extensions and options of the server, and choose among
> them which ones to use.
> 
> To get the server's options, I think something like the attached
> patch should be sufficient.  We may be tempted to add a version for
> each extension, but that's clearly not urgent, and no version is
> equivalent to "an older one", so I prefer not to use it.  The option
> list, conversely, is IMHO good to have.
> 
> For the perl script...

* Mosh is a small project with linear development.  Maybe we can just
detect extensions based on the Mosh version number, or a protocol
version number.

* Your patch adds a way to describe extensions that have command line
options, but new features may take other forms.  I also think that the
extension types aren't necessary.


Regards,

  --jh



