I'm interested in the rate of churn and the stability of the introduction points (IPs) for hidden services. Changing IPs too often could cause performance issues when clients hold a descriptor containing no valid introduction points. I've collected 2857 unique HS descriptors for 5 hidden services over the past two weeks.
I've also collected some information about how many descriptor requests failed, which may give some indication of descriptor availability. I don't really have time right now to parse and analyze the raw data, but hopefully it's helpful to someone.
I'm still collecting descriptors for these services. I've got 4000 unique descriptors now. Let me know when you start working on this and I can give you the full data.
Here are some rather verbose logs from parsing these descriptors and some early graphs. Feel free to suggest more graphs or analyses here, and I'll try to implement them next week.
Another nice stat would be the amount of IPs used by each HS over the whole measurement period. Did the HSes expose themselves to 12 IPs or to 100 IPs?
Another useful figure would be the maximum amount of times we see an IP in sequential descriptors of a HS. That should be 24 hours which is the maximum time an HS should keep an IP, but I'm curious to see if it reflects reality.
I guess another graph that would be useful here (but not exactly related to IP health), is the number of descriptors per hour for each hidden service. I see that even the facebook HS has published 4 HS descriptors in a single hour sometimes. Hm, just noticed we have this graph already. I'm wondering why I don't see facebook ever having 3 descs per hour in this graph, even though this seems to have happened in publication-time 2015-03-19 22:00:00.
> Another nice stat would be the amount of IPs used by each HS over the whole measurement period. Did the HSes expose themselves to 12 IPs or to 100 IPs?
Added as fifth graph.
> Another useful figure would be the maximum amount of times we see an IP in sequential descriptors of a HS. That should be 24 hours which is the maximum time an HS should keep an IP, but I'm curious to see if it reflects reality.
That is (and has been) the first graph, "Lifetime of introduction points". Note that I fixed a bug in the analysis script that showed some lifetimes of over 40 hours. That was wrong. All lifetimes are below 25 hours, and that 25th hour may well be the result of truncating minutes of the hour in descriptor publication times. So, I think your assumption is valid.
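A minimal sketch of how such a lifetime could be computed, and of how hour truncation can turn a lifetime of just over 24 hours into a measured 25. The input format (pairs of publication time and a set of introduction point identifiers) and the function name are assumptions for illustration, not the actual analysis script.

```python
from datetime import datetime


def intro_point_lifetimes(descriptors):
    """Return {intro_point: lifetime in whole hours} for one service.

    descriptors: iterable of (publication_time, intro_points) pairs, where
    intro_points is any iterable of hashable identifiers (hypothetical format).
    """
    first_seen, last_seen = {}, {}
    for published, intro_points in descriptors:
        # Mimic the truncation of minutes in descriptor publication times.
        hour = published.replace(minute=0, second=0, microsecond=0)
        for ip in intro_points:
            first_seen[ip] = min(first_seen.get(ip, hour), hour)
            last_seen[ip] = max(last_seen.get(ip, hour), hour)
    return {
        ip: int((last_seen[ip] - first_seen[ip]).total_seconds() // 3600)
        for ip in first_seen
    }


# Intro point "A" is really seen for 24 hours and 2 minutes.
descs = [
    (datetime(2015, 3, 19, 21, 59), {"A"}),
    (datetime(2015, 3, 20, 22, 1), {"A", "B"}),
]
print(intro_point_lifetimes(descs))
```

In this made-up example "A" is reported with a lifetime of 25 hours, because 21:59 truncates to 21:00 and 22:01 on the next day truncates to 22:00.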
> I guess another graph that would be useful here (but not exactly related to IP health), is the number of descriptors per hour for each hidden service. I see that even the facebook HS has published 4 HS descriptors in a single hour sometimes. Hm, just noticed we have this graph already. I'm wondering why I don't see facebook ever having 3 descs per hour in this graph, even though this seems to have happened in publication-time 2015-03-19 22:00:00.
The old graphs contained daily averages. I changed them all to CDFs, because we care more about distributions than about trends over time.
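For readers who want to reproduce a CDF from per-hour counts, here is a small sketch of an empirical CDF in plain Python. The function name and the sample values are made up for illustration; the actual graphs were produced by a separate analysis script that is not shown in this thread.

```python
def empirical_cdf(values):
    """Return sorted (value, P(X <= value)) pairs for a list of observations."""
    ordered = sorted(values)
    n = len(ordered)
    cdf = {}
    for rank, value in enumerate(ordered, start=1):
        # Later duplicates overwrite earlier ones, leaving P(X <= value).
        cdf[value] = rank / n
    return sorted(cdf.items())


# Hypothetical descriptors-per-hour counts for one service.
print(empirical_cdf([2, 2, 2, 4]))
```

With these sample values, three of four hours have at most two descriptors, so the CDF reaches 0.75 at 2 and 1.0 at 4.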
> > Another nice stat would be the amount of IPs used by each HS over the whole measurement period. Did the HSes expose themselves to 12 IPs or to 100 IPs?
> Added as fifth graph.
The fifth one seems to be "number of intro points established on the same relay". That's not the same as "number of different intro points established over whole measurement period", right? Sorry if I did not explain it properly.
If the fifth graph is indeed that, then how can the hidden wiki only have seen ~5 different IPs? I think the number should be bigger, no? Am I reading it wrong?
Right, the fifth graph didn't show absolute numbers of relays. I changed it and made updated graphs available.
Here's a more detailed explanation of these five graphs:
Lifetime of introduction points: This graph shows the number of hours between first and last seeing an introduction point in a hidden-service descriptor published by one of five publicly known services. As expected, the maximum lifetime of introduction points is 24 hours, with a single exception of 25 hours which may be the result of truncating minutes of the hour in descriptor publication times. This graph shows how much the lifetime of introduction points depends on the service, with agorahoo using almost all of its introduction points for at most one hour.
Number of descriptors published per hour, including descriptor replicas: The second graph shows how many different descriptors a service publishes per hour. We didn't bother to filter out duplicates from having fetched both replicas of a descriptor, which is why most numbers in this graph are multiples of two. This graph shows that four out of five services published only two descriptors per hour in 90% of cases, with agorahoo publishing at least four in half of cases. It might even be that agorahoo was publishing more descriptors that we didn't fetch.
Number of distinct introduction points used per hour: This graph compares all descriptors published by a service in the same hour and counts how many distinct introduction points they contain. Three out of five services used only 3 introduction points in 90% of hours, whereas kpvz7kpm used at least 5 introduction points in 50% of hours. agorahoo is, again, the exception, with up to 25 different introduction points per hour in the extreme case.
Number of introduction points per descriptor: The fourth graph simply counts how many introduction points were contained in a descriptor. This number was consistently at 3 for the three services that didn't stand out above. kpvz7kpm did stand out here, using between 4 and 10 introduction points in half of its descriptors, as did agorahoo with up to 10 introduction points in 10% of descriptors. If additional introduction points are established as a result of higher load seen by the service, one might say that kpvz7kpm was under higher load half of the time whereas agorahoo was under heavy load for only 10% of the time. It would have been trivial to plot these heavy-load times per service at a granularity of one hour.
Number of introduction points established on the same relay (in the measurement period): This graph shows to what extent relays are being used for establishing introduction points. The agorahoo service serves as the best example here: while about half of the roughly 1300 relays have only been used a single time for establishing an introduction point, the other half was used more than once, up to almost 60 established introduction points on a single relay. This distribution looks plausible, given that tor's weighting algorithm favors relays based on their consensus weight. It's more difficult to analyze the other services, which have only established a tiny number of introduction points in the measurement period compared to agorahoo. On a related note, if services were to change their preferences for selecting introduction points, that would stand out very clearly in this graph.
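Two of the counts described above (distinct introduction points per hour, and introduction points established on the same relay) can be sketched as follows. This is an illustration under assumptions, not the script behind the graphs: each descriptor is assumed to be a pair of publication time and a list of (relay, service_key) tuples, and a re-established introduction point on the same relay is assumed to show up with a different service key.

```python
from collections import Counter, defaultdict
from datetime import datetime


def distinct_intro_points_per_hour(descriptors):
    """Map each publication hour to the number of distinct intro points seen."""
    per_hour = defaultdict(set)
    for published, intro_points in descriptors:
        hour = published.replace(minute=0, second=0, microsecond=0)
        per_hour[hour].update(intro_points)
    return {hour: len(points) for hour, points in per_hour.items()}


def intro_points_per_relay(descriptors):
    """Count distinct intro points (relay, service_key) established per relay."""
    seen = set()
    for _published, intro_points in descriptors:
        seen.update(intro_points)
    return Counter(relay for relay, _service_key in seen)


# Hypothetical sample data: relay names and keys are placeholders.
sample = [
    (datetime(2015, 3, 19, 22, 5), [("R1", "k1"), ("R2", "k2")]),
    (datetime(2015, 3, 19, 22, 40), [("R1", "k1"), ("R3", "k3")]),
    (datetime(2015, 3, 20, 23, 0), [("R1", "k9")]),
]
print(distinct_intro_points_per_hour(sample))
print(intro_points_per_relay(sample))
```

In the sample, relay R1 hosts two different introduction points over the period, which is the kind of reuse the fifth graph counts.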
Karsten, this is great, and this set of graphs is quite useful for deriving conclusions.
May I also ask you to make a second version of graph 5, but without the agorahoo hidden service? It's dominating the graph and it's hard to read the behavior of the other hidden services (even though it looks pretty normal).
I think after that, we are done with the graphs here and we can write an analysis paragraph or something.
Some correctness analysis (based on the newest graphs and comment:10):
It seems that introduction points have the intended lifetime, which is a random lifetime between 18 and 24 hours. Looking at the first graph we can see that about 75% of the intro points of the three normal hidden services indeed stay up for more than 18 hours.
We can see that the hidden wiki hidden service has lower lifetimes for its intro points. I think that's because of the intro point formula (#4862 (moved)). For example, say that the HS has 9 intro points and then it gets less traffic and wants to go down to 5 intro points; in that case, the HS will discard 4 IPs which makes it look like they have a very short lifetime.
Our measurement period was 25 days, and looking at the last graph, we see that the normal hidden services have cycled through about 100 relays as IPs. This appears to be normal assuming an average lifetime of 20 hours for each IP, and 3 IPs per HS. The hidden wiki saw about 300 relays as IPs because of #4862 (moved).
This means that about 100 relays had the chance to measure the popularity of those HSes.
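The "about 100 relays" figure can be sanity-checked with a back-of-the-envelope calculation. The sketch below only uses the numbers stated above (25 days, 20-hour average lifetime, 3 IPs per HS); the function name is made up, and the estimate ignores the chance of picking the same relay twice.

```python
def expected_relays(days, avg_lifetime_hours, intro_points_per_service):
    """Rough count of relays used as intro points over the measurement period."""
    rotations = days * 24 / avg_lifetime_hours  # how often each slot is replaced
    return rotations * intro_points_per_service


print(expected_relays(25, 20, 3))  # 30 rotations * 3 intro points = 90.0
```

That gives 90, which is in line with the roughly 100 relays read off the last graph.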
We can see that even normal HSes will publish more than 2 descriptors per hour about 15% of the time. I assume this happens because the descriptor was marked as dirty because an intro point died or something.
All in all, from this preliminary analysis, it seems like there are no huge flaws in our code, and the introduction point lifetime logic works OK.
Karsten, if you are still enjoying this problem, another graph that could be fun is a variant of graph 5, with time on the x axis and the "Number of relays used as intro points" on the y axis.
This might give us better visibility on when the hidden services change introduction points.
I imagine this is going to look fun on the hidden wiki service, since it jumps from 3 to 9 intro points, and then back to 3, making it rotate relays rapidly.
Very close, but not entirely. Sorry I was not clear.
By "Number of relays used as intro points" I meant the cumulative number of distinct relays. This should be a non-decreasing sequence.
So for example, the agorahoo hidden service will have 3 relays in t_0, then in t_1 it will have 6 relays used as intro points, then in t_2 it will have seen 9 relays, and so on. If the agorahoo hidden service ever repicks the same relay, we don't count it.
Similarly, ideally the normal hidden services would have 3 relays in t_0, and should keep that for the next 18-24 hours and in t_18 they would change to 6 relays, then after that many hours to 9 relays.
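To make the requested series concrete, here is a minimal sketch of the cumulative count of distinct relays over time. The input format (pairs of publication time and a set of relay identifiers) and all names are assumptions for illustration; a relay that is picked again does not increase the count.

```python
from datetime import datetime


def cumulative_distinct_relays(descriptors):
    """Return [(publication_time, distinct relays seen so far)] in time order."""
    seen = set()
    series = []
    for published, relays in sorted(descriptors, key=lambda d: d[0]):
        seen.update(relays)  # repicked relays are already in the set
        series.append((published, len(seen)))
    return series


# Hypothetical descriptors for one service; relay names are placeholders.
history = [
    (datetime(2015, 3, 20, 0), {"R1", "R4"}),
    (datetime(2015, 3, 19, 0), {"R1", "R2", "R3"}),
    (datetime(2015, 3, 20, 20), {"R4", "R5", "R6"}),
]
for when, count in cumulative_distinct_relays(history):
    print(when, count)
```

Plotting the second element of each pair against the first would give the non-decreasing curve described above, with steps wherever a service picks relays it has not used before.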