gettor appears to be in an email loop war with a .sk address

added component::applications/gettor owner::cohosh priority::medium resolution::fixed reviewer::phw severity::normal status::closed type::defect labels

Aha! Perhaps this is the reason we're getting so many repeat requests for icelandic versions of Tor Browser? I thought someone wrote a bot but this seems much more likely.

Trac:
Status: new to assigned
Owner: N/A to cohosh

Here's a patch: https://gitlab.torproject.org/torproject/anti-censorship/gettor-project/gettor/-/merge_requests/10

This was a really good catch.

Replying to arma:

There are probably smart guidelines for avoiding mail loop wars, like not answering names that start with mailer-domain, checking for the presence of an X-Something-Something header, or rate limiting responses to a given address. I couldn't find any best practices while poking around, so I went with ignoring all emails from mailer-daemon@ addresses. It will work for most cases for now.

We have a rate-limiter in place, but it only ensures that a single user doesn't request links too many times per minute. These means at least one of these auto-generation loop emails is still getting in every minute. I don't want to limit the total requests received from a given email because it's reasonable to expect someone to want to download Tor multiple times in their life (see also #33123 (moved)).

And this is a great case where unifying how bridgedb handles its email answers, and how gettor does it, will save a lot of headache.

Yeah, from what I can tell it actually looks like bridgedb might handle this by rate limiting (I'll need to look into it further). We might not want to handle this in the same way since bridgedb's reasons for rate limiting (preventing enumeration) are different from gettor's.

But in general I agree that this a case in which it is a pain to repeatedly solve problems for each system separately.

Trac:
Status: assigned to needs_review
Reviewer: N/A to phw

Looks good to me!

On a slightly related note: I believe that an email's body is supposed to be separated by two (rather than one) newlines from its header. GetTor's unit tests are using only one (and mix \n with \r\n). Python's email module is also confused by this and thinks that the body is part of the To field:

In [1]: from email import message_from_string
In [3]: m=message_from_string("From: MAILER-DAEMON@mx1.riseup.net\nSubject: Undelivered Mail Returned to Sender\r\nTo: gettor@torproject.org\n osx en\n")
In [6]: m.items()
Out[6]: 
[('From', 'MAILER-DAEMON@mx1.riseup.net'),
 ('Subject', 'Undelivered Mail Returned to Sender'),
 ('To', 'gettor@torproject.org\n osx en')]

This seems like something we should fix.

Trac:
Status: needs_review to merge_ready

Replying to phw:

Looks good to me!

Thanks, merged and deployed at 2020-05-23T14:21:52+0000.

On a slightly related note: I believe that an email's body is supposed to be separated by two (rather than one) newlines from its header. GetTor's unit tests are using only one (and mix \n with \r\n). Python's email module is also confused by this and thinks that the body is part of the To field:

Created #34300 (moved) for this, thanks!

Okay I'm still getting the repeated requests so I looked at the emails in the database and found the following:

gettor@torproject.org|links|windows|is|email|20200523142434|ONHOLD
postmaster@[REDACTED].sk|links|windows|is|email|20200523142435|ONHOLD

So our list was incomplete :) Setting this back to assigned to add GetTor's own email and postmaster@ emails to the list we won't respond to.

Trac:
Status: merge_ready to assigned

Okay here's another patch that should be more complete: https://gitlab.torproject.org/torproject/anti-censorship/gettor-project/gettor/-/merge_requests/11

Trac:
Status: assigned to needs_review

Looks good to me! I left a nitpick as comment.

Trac:
Status: needs_review to merge_ready

Merged and deployed at 2020-05-29T19:04:49+0000.

Seems to be fixed now :)

Trac:
Status: merge_ready to closed
Resolution: N/A to fixed

I'm reopening this because we seem to have overlooked a bug. GetTor choked earlier today and I found the following exception in its log file:

2020-05-29 19:04:43+0000 [email parser] Error while parsing email content: [Failure instance: Traceback: <class 'KeyError'>: 'command'
	/usr/lib/python3/dist-packages/twisted/internet/defer.py:311:addCallbacks
	/usr/lib/python3/dist-packages/twisted/internet/defer.py:654:_runCallbacks
	/usr/lib/python3/dist-packages/twisted/internet/defer.py:1613:unwindGenerator
	/usr/lib/python3/dist-packages/twisted/internet/defer.py:1529:_cancellableInlineCallbacks
	--- <exception caught here> ---
	/usr/lib/python3/dist-packages/twisted/internet/defer.py:1418:_inlineCallbacks
	/srv/gettor.torproject.org/home/gettor/gettor/parse/email.py:256:parse_callback
	].

When GetTor detects an autoresponder, it returns an empty request dictionary, {}. GetTor then calls parse_callback, which assumes that the given request has the "command" key but that's not the case.

Here's a fix: tpo/anti-censorship/gettor-project/gettor!13 (merged)

Trac:
Status: closed to reopened
Resolution: fixed to N/A

lgtm. Thanks for catching this!

Trac:
Status: reopened to merge_ready

Merged and deployed at 2020-05-30T02:03:11+0000

Trac:
Resolution: N/A to fixed
Status: merge_ready to closed

closed

mentioned in issue #34300 (moved)

moved to tpo/anti-censorship/gettor-project/trac#34286 (closed)

mentioned in issue tpo/anti-censorship/gettor-project/trac#34300 (closed)

gettor appears to be in an email loop war with a .sk address

Child items 0

Activity