I happened to be looking at eugeni's mail.log for other debugging, and saw that approximately 25% of the lines in mail.log contain the string gettor.
(Yesterday, eugeni's postfix had 460k lines in it, and 101k of them said "gettor" in them. Today in the first hour or so, it's 7k out of 25k.)
Does gettor get into fights with external addresses, where it replies to the bounce, gets another bounce and replies to that, etc?
There are probably smart guidelines for avoiding mail loop wars, like not answering names that start with mailer-domain, checking for the presence of an X-Something-Something header, or rate limiting responses to a given address.
And this is a great case where unifying how bridgedb handles its email answers, and how gettor does it, will save a lot of headache.
Aha! Perhaps this is the reason we're getting so many repeat requests for Icelandic versions of Tor Browser? I thought someone wrote a bot, but this seems much more likely.
There are probably smart guidelines for avoiding mail loop wars, like not answering names that start with mailer-domain, checking for the presence of an X-Something-Something header, or rate limiting responses to a given address.
I couldn't find any best practices while poking around, so I went with ignoring all emails from mailer-daemon@ addresses. It will work for most cases for now.
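For what it's worth, RFC 3834 covers this case: automatic responders are advised not to reply to messages whose Auto-Submitted header is anything other than "no". Many responders also skip mail with a Precedence of bulk, junk, or list. A minimal sketch of a broader check, assuming the message has been parsed with Python's email module; `looks_automated` is a hypothetical helper, not GetTor's actual code:

```python
from email import message_from_string
from email.utils import parseaddr


def looks_automated(msg):
    """Return True if msg looks like a bounce or an auto-reply (hypothetical helper)."""
    # RFC 3834: any Auto-Submitted value other than "no" marks automatic mail.
    auto = (msg.get("Auto-Submitted") or "no").strip().lower()
    if auto != "no":
        return True
    # Widely used convention, not a strict standard: bulk/junk/list precedence.
    precedence = (msg.get("Precedence") or "").strip().lower()
    if precedence in ("bulk", "junk", "list"):
        return True
    # The check described above: ignore mailer-daemon@ senders (case-insensitive).
    _, addr = parseaddr(msg.get("From", ""))
    local_part = addr.split("@", 1)[0].lower()
    return local_part == "mailer-daemon"


bounce = message_from_string(
    "From: MAILER-DAEMON@mx1.riseup.net\n"
    "Subject: Undelivered Mail Returned to Sender\n"
    "\n"
    "osx en\n"
)
print(looks_automated(bounce))
```

Checking the headers as well as the sender would also catch vacation auto-replies, which do not come from mailer-daemon@ addresses.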
We have a rate limiter in place, but it only ensures that a single user doesn't request links too many times per minute. This means at least one of these auto-generated loop emails is still getting through every minute. I don't want to limit the total number of requests received from a given email address, because it's reasonable to expect someone to want to download Tor multiple times in their life (see also #33123 (moved)).
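To illustrate why a per-minute limit alone doesn't break a loop, here is a rough sketch of a sliding-window limiter keyed by address. The class name, the limit of one request per 60 seconds, and the injectable clock are all assumptions for illustration, not GetTor's real implementation:

```python
import time
from collections import defaultdict, deque


class PerAddressLimiter:
    """Allow at most max_requests per address within a sliding window (hypothetical)."""

    def __init__(self, max_requests=1, window=60.0, clock=time.monotonic):
        self.max_requests = max_requests  # assumed limit, not GetTor's real value
        self.window = window              # window length in seconds
        self.clock = clock                # injectable so the limiter can be tested
        self.seen = defaultdict(deque)    # address -> timestamps of accepted requests

    def allow(self, address):
        now = self.clock()
        stamps = self.seen[address.lower()]
        # Forget requests that have fallen out of the window.
        while stamps and now - stamps[0] >= self.window:
            stamps.popleft()
        if len(stamps) >= self.max_requests:
            return False
        stamps.append(now)
        return True
```

A bouncing address is only slowed down by this: once the window expires, the next bounce is accepted again, which matches the one-per-minute leak described above.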
And this is a great case where unifying how bridgedb handles its email answers, and how gettor does it, will save a lot of headache.
Yeah, from what I can tell it actually looks like bridgedb might handle this by rate limiting (I'll need to look into it further). We might not want to handle this in the same way since bridgedb's reasons for rate limiting (preventing enumeration) are different from gettor's.
But in general I agree that this is a case in which it is a pain to repeatedly solve problems for each system separately.
Trac: Status: assigned to needs_review; Reviewer: N/A to phw
On a slightly related note: I believe that an email's body is supposed to be separated by two (rather than one) newlines from its header. GetTor's unit tests are using only one (and mix \n with \r\n). Python's email module is also confused by this and thinks that the body is part of the To field:
In [1]: from email import message_from_string
In [3]: m=message_from_string("From: MAILER-DAEMON@mx1.riseup.net\nSubject: Undelivered Mail Returned to Sender\r\nTo: gettor@torproject.org\n osx en\n")
In [6]: m.items()
Out[6]: [('From', 'MAILER-DAEMON@mx1.riseup.net'), ('Subject', 'Undelivered Mail Returned to Sender'), ('To', 'gettor@torproject.org\n osx en')]
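For comparison, a sketch of the same message with consistent \r\n line endings and an empty line between the headers and the body. The reason the original is misparsed is that a line starting with whitespace is treated as a continuation of the previous header. The variable names here are illustrative only:

```python
from email import message_from_string

# Headers end at the first empty line; everything after it is the body.
raw = (
    "From: MAILER-DAEMON@mx1.riseup.net\r\n"
    "Subject: Undelivered Mail Returned to Sender\r\n"
    "To: gettor@torproject.org\r\n"
    "\r\n"
    "osx en\r\n"
)

msg = message_from_string(raw)
to_header = msg["To"]     # no longer swallows the body
body = msg.get_payload()  # the request text, parsed as the body

print(repr(to_header))
print(repr(body))
```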
Thanks, merged and deployed at 2020-05-23T14:21:52+0000.
When GetTor detects an autoresponder, it returns an empty request dictionary, {}. GetTor then calls parse_callback, which assumes that the given request has the "command" key, but that's not the case for an empty dictionary.
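One possible fix is to return early when the request is empty. A minimal sketch, using a simplified stand-in for parse_callback; the real function's signature and return value are likely different:

```python
def parse_callback(request):
    """Simplified stand-in for GetTor's callback (hypothetical signature)."""
    # An autoresponder yields {}, so there is no "command" key to act on.
    if not request or "command" not in request:
        return None
    return "handling {}".format(request["command"])


print(parse_callback({}))
print(parse_callback({"command": "links"}))
```

Using request.get("command") would work too, but the explicit early return makes it clear that ignored emails are an expected case rather than an error.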