It would be a great feature if a client could be connected to multiple snowflake proxies at once and split its traffic across all of them, similar to the Conflux proposal for Tor. It would be a hedge against getting assigned a single slow proxy.
This would require interposing some kind of sequencing and reliability layer, with possible retransmissions.
A potential benefit for privacy is that an individual snowflake proxy only sees a subset of a client's traffic flow.
However, instantly making N WebRTC connections is a tell for traffic classification.
I am a bit confused after reading the technical documentation of Snowflake (https://keroserene.net/snowflake/technical/#35--recovery-and-multiplexing). The "Recovery and Multiplexing" section says: "...In any case, snowflake seeks to maintain high reliability and connectivity, and a high quality browsing experience for the user in the Tor Browser use-case, by having snowflake clients and proxies multiplex each other. When an individual WebRTC DataChannel fails, the snowflake client renews with a new WebRTC peer...". Does that mean that the required feature is covered? As far as I understand, that would switch the connection when a proxy leaves, but not when it is very slow. Would it still be worthwhile to use several proxies, but in a coordinated way, sending traffic to them based on congestion?
Good question. The multiplexing described in that document is simpler than what this ticket is about. You are correct that the failover only helps when a proxy dies, not when it is slow. But also, there's no way for a client to use, say, two 50 KB/s proxies as a single 100 KB/s channel—you can only use one at a time. The problem is that the bridge would be getting two streams of data and would not know how they should be interleaved.
But beyond that, while it's true that the client manages a pool of proxies with the goal of switching between them, currently that failover doesn't work: after the first proxy dies, there's no way for the client to switch over to another proxy and resume the session. See legacy/trac#29206 (moved) and https://lists.torproject.org/pipermail/anti-censorship-team/2020-February/000059.html for how we are adding a meta-protocol to make it possible to recover a session after a proxy dies, and to make use of multiple proxies at once.
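To make the interleaving problem concrete, here is a minimal sketch in Go of how per-packet sequence numbers let the bridge reassemble a single flow from packets arriving via any number of proxies. This is an illustration only; in the Turbo Tunnel design it is the KCP layer, not hand-rolled code like this, that provides sequencing, acknowledgments, and retransmission.

```go
// Toy illustration: a 4-byte sequence number per packet lets the
// receiver reassemble one ordered stream out of packets that arrived
// over different proxies in arbitrary order.
package main

import (
	"encoding/binary"
	"fmt"
	"sort"
)

type packet struct {
	seq     uint32
	payload []byte
}

// encode prepends a big-endian sequence number to the payload.
func encode(p packet) []byte {
	buf := make([]byte, 4+len(p.payload))
	binary.BigEndian.PutUint32(buf, p.seq)
	copy(buf[4:], p.payload)
	return buf
}

// reassemble sorts received packets by sequence number and
// concatenates their payloads, regardless of which proxy carried them.
func reassemble(received [][]byte) []byte {
	pkts := make([]packet, len(received))
	for i, buf := range received {
		pkts[i] = packet{seq: binary.BigEndian.Uint32(buf), payload: buf[4:]}
	}
	sort.Slice(pkts, func(i, j int) bool { return pkts[i].seq < pkts[j].seq })
	var out []byte
	for _, p := range pkts {
		out = append(out, p.payload...)
	}
	return out
}

func main() {
	// Packets 0 and 2 travel via proxy A; packet 1 via proxy B.
	viaA := [][]byte{encode(packet{0, []byte("he")}), encode(packet{2, []byte("lo")})}
	viaB := [][]byte{encode(packet{1, []byte("l")})}
	fmt.Printf("%s\n", reassemble(append(viaA, viaB...))) // prints "hello"
}
```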
Thanks for your answer. Indeed, the problem of resuming the session, or any download, over an alternate connection needs to be faced. I have seen the Turbo Tunnel implementation and it seems interesting; I hope it can solve the usability problems. I will follow in more detail what you intend to integrate. I will also look more deeply into alternate methods, and I hope I can get your comments on those.
I think we can ultimately do a lot better, and make better use of the available proxy capacity. I'm thinking of "striping" packets across multiple snowflake proxies simultaneously. This could be done in a round-robin fashion or in a more sophisticated way (weighted by measured per-proxy bandwidth, for example). That way, when a proxy dies, any packets sent to it would be detected as lost (unacknowledged) by the KCP layer, and retransmitted over a different proxy, much quicker than the 30-second timeout. The way to do this would be to replace RedialPacketConn—which uses one connection at a time—with a MultiplexingPacketConn, which manages a set of currently live connections and uses all of them. I don't think it would require any changes on the server.
A design sketch for accomplishing this would extend the code changes that enabled Turbo Tunnel.
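Here is a minimal sketch of what such a MultiplexingPacketConn could look like. The names and details (the Add method, the internal receive channel, the buffer sizes) are assumptions for illustration, not the actual snowflake code, and deadline/close handling is omitted:

```go
// Sketch: a packet connection that stripes writes across a set of
// currently live connections and merges their reads, so the KCP
// layer above sees one lossy packet link.
package multiplex

import (
	"errors"
	"net"
	"sync"
)

type MultiplexingPacketConn struct {
	remoteAddr net.Addr
	recvCh     chan []byte

	mu    sync.Mutex
	conns []net.PacketConn
	next  int // round-robin index
}

func NewMultiplexingPacketConn(remoteAddr net.Addr) *MultiplexingPacketConn {
	return &MultiplexingPacketConn{
		remoteAddr: remoteAddr,
		recvCh:     make(chan []byte, 32),
	}
}

// Add puts a new live connection into the striping set and starts a
// loop that funnels its incoming packets into the shared receive
// channel. When the connection dies, it is removed from the set.
func (c *MultiplexingPacketConn) Add(conn net.PacketConn) {
	c.mu.Lock()
	c.conns = append(c.conns, conn)
	c.mu.Unlock()
	go func() {
		defer c.remove(conn)
		buf := make([]byte, 2048)
		for {
			n, _, err := conn.ReadFrom(buf)
			if err != nil {
				return
			}
			p := make([]byte, n)
			copy(p, buf[:n])
			c.recvCh <- p
		}
	}()
}

func (c *MultiplexingPacketConn) remove(conn net.PacketConn) {
	c.mu.Lock()
	defer c.mu.Unlock()
	for i, cc := range c.conns {
		if cc == conn {
			c.conns = append(c.conns[:i], c.conns[i+1:]...)
			return
		}
	}
}

// WriteTo stripes outgoing packets across the live connections in
// round-robin order. A weighted scheme would choose differently here.
func (c *MultiplexingPacketConn) WriteTo(p []byte, addr net.Addr) (int, error) {
	c.mu.Lock()
	if len(c.conns) == 0 {
		c.mu.Unlock()
		return 0, errors.New("no live connections")
	}
	conn := c.conns[c.next%len(c.conns)]
	c.next++
	c.mu.Unlock()
	return conn.WriteTo(p, addr)
}

// ReadFrom returns the next packet received on any member connection.
func (c *MultiplexingPacketConn) ReadFrom(p []byte) (int, net.Addr, error) {
	buf := <-c.recvCh
	return copy(p, buf), c.remoteAddr, nil
}
```

Because KCP treats this as a single lossy packet link, a packet sent into a proxy that dies moments later is simply detected as unacknowledged and retransmitted; the retransmission's WriteTo will by then pick a different live connection.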
Sounds great. As far as weighting it based on proxy bandwidth or latency, can we use something similar to the connection migration you mentioned in one of your initial TurboTunnel posts? That way if more packets are coming through one snowflake, more packets will be sent out to it? Or is this going to choke out the other snowflakes too easily?
I was actually thinking about weighting at the client, so connection migration at the server wouldn't come into it. But now that you mention it, some kind of prioritization may be necessary at the server as well: currently all proxies for a client pull from the outgoing queue equally, which could undo whatever prioritization the client tried to use, though I'm not sure what would actually happen. It may interact in weird ways with SCTP's congestion control in the DataChannel layer.
What I was thinking is that the client keeps a moving average of bandwidth for each of its current proxies (using EWMA or something), then allocates sends based on the ratios of those bandwidth averages. So if proxy A has recently done 100 KB/s and proxy B has 50 KB/s, you send 2/3 on proxy A and 1/3 on proxy B. But that may not be a good idea because there's not a way for a proxy to "recover" and get prioritized after the client has decided it's slower. Honestly for a first draft I would just prioritize proxies uniformly (either round-robin or uniform random selection per send) and see if the backpressure of SCTP or other layers automatically causes convergence to a good ratio.
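As a sketch of the weighted variant described above (the EWMA factor of 0.5 and these names are illustrative assumptions, not settled design):

```go
// Sketch: weight per-proxy send probability by an exponentially
// weighted moving average (EWMA) of measured bandwidth.
package multiplex

import "math/rand"

const ewmaAlpha = 0.5 // weight given to the most recent measurement

type weightedProxy struct {
	bandwidth float64 // EWMA of measured bytes/s
}

// update folds a new bandwidth measurement into the moving average.
func (p *weightedProxy) update(measured float64) {
	p.bandwidth = ewmaAlpha*measured + (1-ewmaAlpha)*p.bandwidth
}

// pick chooses a proxy with probability proportional to its average
// bandwidth: a proxy recently doing 100 KB/s is picked twice as often
// as one doing 50 KB/s. It assumes proxies is non-empty.
func pick(proxies []*weightedProxy) *weightedProxy {
	var total float64
	for _, p := range proxies {
		total += p.bandwidth
	}
	if total == 0 {
		return proxies[rand.Intn(len(proxies))] // no data yet: uniform
	}
	r := rand.Float64() * total
	for _, p := range proxies {
		if r < p.bandwidth {
			return p
		}
		r -= p.bandwidth
	}
	return proxies[len(proxies)-1] // floating-point edge case
}
```

One way to address the "no recovery" problem would be to mix a small uniform share into the weights, so that a proxy judged slow still gets occasional traffic and a chance to raise its average.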