Rendezvous Descriptor Upload Failures
Roger Dingledine wrote on June 20, 2008:
It looks like hid_serv_get_responsible_directories() looks at routerstatus entries, but ignores whether we actually have descriptors. So if I don't have the descriptor it chooses, I end up with this in my logs:
Jun 19 19:21:22.045 [warn] Requested exit point '$68333D0761BCF397A587A0C0B963E4A9E99EC4D3' is not known. Closing.
Jun 19 19:21:22.045 [warn] Making tunnel to dirserver failed.
Right, that's the same bug that is described in the NLnet mid-June report:
"Descriptor Upload Failures: The current logic to upload rendezvous service descriptors does not handle failures in a reasonable way. In case of a failure, Tor waits for a solid hour before making the next attempt. There should either be a smaller timeout or an individual handling of failures per directory."
In the mid-June measurements with v0 rendezvous descriptors, this bug affected 481 of 3270 cases (14.7%). Only in very few cases did it affect all three hidden service authorities and become visible, because clients then couldn't access the hidden service. In yesterday's evaluations with v2 rendezvous descriptors, the bug occurred in 1298 of 9460 attempts (13.7%). This means we really need to fix this bug to increase reliability.
The fix you suggested above only avoids those nasty warnings and unnecessary upload attempts, but it doesn't help to maintain rendezvous descriptor availability.
My first idea was to put all upload attempts that cannot be performed due to missing router descriptors in a queue. As soon as new router descriptors arrive, the queue would be checked and the pending rendezvous descriptors uploaded.
However, this idea does not work. Tor doesn't seem to make any attempts to download missing router descriptors when it thinks it has enough (I'm not so sure about its behavior, could you confirm?). In one test case a Tor providing a hidden service failed to upload v0 rendezvous descriptors to the three hidden service authorities for more than two hours.
So my second idea is to request router descriptors as soon as we realize that we need them, and to combine this with the queuing idea, so that a second upload attempt is made as soon as the router descriptors have arrived.
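A minimal sketch of the queuing part of this idea, to make the intended control flow concrete. It is not taken from the Tor source: the types and functions (pending_upload_t, upload_queue_t, upload_queue_defer, upload_queue_release) are hypothetical, the rendezvous descriptor is reduced to an integer id, and the actual router descriptor fetch is only counted rather than performed.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define HEX_DIGEST_LEN 40   /* length of a hex-encoded relay identity digest */
#define MAX_PENDING 16      /* arbitrary capacity chosen for this sketch */

/* Hypothetical record: one rendezvous descriptor upload that is waiting
 * for the router descriptor of its target directory. */
typedef struct {
  bool in_use;
  char dir_digest[HEX_DIGEST_LEN + 1];
  int rend_desc_id;         /* stand-in for the real rendezvous descriptor */
} pending_upload_t;

typedef struct {
  pending_upload_t slots[MAX_PENDING];
  int fetches_requested;    /* how many router descriptor fetches we asked for */
} upload_queue_t;

static void upload_queue_init(upload_queue_t *q)
{
  memset(q, 0, sizeof(*q));
}

/* Called when an upload cannot start because the router descriptor of
 * dir_digest is missing. Queues the upload and records that a fetch of
 * the missing router descriptor should be triggered (assumption: real
 * code would start that fetch here). Returns false if the digest is
 * malformed or the queue is full. */
static bool upload_queue_defer(upload_queue_t *q, const char *dir_digest,
                               int rend_desc_id)
{
  assert(q != NULL && dir_digest != NULL);
  if (strlen(dir_digest) != HEX_DIGEST_LEN)
    return false;
  for (int i = 0; i < MAX_PENDING; i++) {
    pending_upload_t *p = &q->slots[i];
    if (!p->in_use) {
      p->in_use = true;
      memcpy(p->dir_digest, dir_digest, HEX_DIGEST_LEN + 1);
      p->rend_desc_id = rend_desc_id;
      q->fetches_requested++;
      return true;
    }
  }
  return false;
}

/* Called when the router descriptor for dir_digest has arrived. Removes
 * up to max_out matching entries from the queue, writes their ids to
 * out_ids, and returns how many uploads can now be retried. */
static int upload_queue_release(upload_queue_t *q, const char *dir_digest,
                                int *out_ids, int max_out)
{
  int n = 0;
  for (int i = 0; i < MAX_PENDING && n < max_out; i++) {
    pending_upload_t *p = &q->slots[i];
    if (p->in_use && strcmp(p->dir_digest, dir_digest) == 0) {
      out_ids[n++] = p->rend_desc_id;
      p->in_use = false;
    }
  }
  return n;
}
```

In this sketch the caller would invoke upload_queue_defer() at the point where the "Requested exit point ... is not known" warning is logged today, and upload_queue_release() from whatever code path handles newly arrived router descriptors. A real implementation would probably also want to expire queued entries after some time.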