This extends nusenu's basic idea of using the stem library to
dynamically exclude nodes that are likely to be bad by putting them
on the ExcludeNodes or ExcludeExitNodes setting of a running Tor.
* https://github.com/nusenu/noContactInfo_Exit_Excluder
* https://github.com/TheSmashy/TorExitRelayExclude

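
As a rough illustration of that mechanism, the sketch below builds a comma-separated fingerprint list and hands it to a running Tor through stem. The helper names (`format_node_list`, `apply_excludes`) and the default socket path are assumptions made for this example, not code taken from the program; `Controller.from_socket_file`, `authenticate` and `set_conf` are regular stem calls.

```python
from typing import Iterable


def format_node_list(fingerprints: Iterable[str]) -> str:
    """Return a torrc-style value: sorted, de-duplicated, upper-case fingerprints."""
    cleaned = {fp.strip().lstrip("$").upper() for fp in fingerprints if fp.strip()}
    return ",".join(sorted(cleaned))


def apply_excludes(fingerprints: Iterable[str],
                   socket_path: str = "/run/tor/control",  # assumed location
                   option: str = "ExcludeNodes") -> None:
    """Push the list to a running Tor; needs stem and a reachable control socket."""
    # Imported lazily so format_node_list() stays usable without stem installed.
    from stem.control import Controller

    value = format_node_list(fingerprints)
    with Controller.from_socket_file(socket_path) as controller:
        controller.authenticate()
        # Passing None resets the option instead of setting an empty list.
        controller.set_conf(option, value or None)
```

Keeping the formatting separate from the controller call makes it possible to check the generated value without a running Tor.
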
The basic idea is to exclude Exit nodes that do not have ContactInfo:
* https://github.com/nusenu/ContactInfo-Information-Sharing-Specification

That can be extended to relays that do not have an email in the contact,
or to relays that do not have ContactInfo that is verified to include them.
But there's a problem, and your Tor notice.log will tell you about it:
you could exclude the relays needed to access hidden services or mirror
directories. So we need to add to the process the concept of a whitelist.
In addition, we may have our own blacklist of nodes we want to exclude,
or use these lists for other applications like selektor.

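
The interplay of the whitelist and blacklist described above can be sketched as plain set arithmetic. The function name and the rule that a whitelisted relay always wins over an exclusion are assumptions made for this illustration; the program's own precedence may differ.

```python
from typing import Iterable, Set


def nodes_to_exclude(suspect: Iterable[str],
                     own_blacklist: Iterable[str],
                     whitelist: Iterable[str]) -> Set[str]:
    """Exclude suspect and blacklisted relays, but never a whitelisted one."""
    return (set(suspect) | set(own_blacklist)) - set(whitelist)


if __name__ == "__main__":
    suspect = {"AAAA", "BBBB"}   # e.g. relays without ContactInfo
    own_blacklist = {"CCCC"}     # e.g. entries from a personal blacklist
    whitelist = {"BBBB"}         # e.g. an introduction point we still need
    print(sorted(nodes_to_exclude(suspect, own_blacklist, whitelist)))
```
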
So we make two files that are structured in YAML:
```
/etc/tor/yaml/torrc-goodnodes.yaml
---
GoodNodes:
  EntryNodes: []
  Relays:
    # ExitNodes will be overwritten by this program
    ExitNodes: []
    IntroductionPoints: []
  # use the Onions section to list onion services whose
  # Introduction Points you want whitelisted - these points may change daily
  # Look in tor's notice.log for 'Every introduction point for service'
  Onions: []
  # use the Services list to list relays you want whitelisted
  # Look in tor's notice.log for 'Wanted to contact directory mirror'
  Services: []
```

By default all sections of the goodnodes.yaml are used as a whitelist.

Use the GoodNodes/Onions list to list onion services whose
Introduction Points you want whitelisted; these points may change daily.
Look in tor's notice.log for warnings of 'Every introduction point for service'.

The option ```--hs_dir``` (```default='/var/lib/tor'```) will make the program
parse the files named ```hostname``` below this dir to find
Hidden Services to whitelist.

The Introduction Points can change during the day, so you may want to
rerun this program to freshen the list of Introduction Points. A full run
that processes all the relays from stem can take 30 minutes. To avoid that,
run with ```--saved_only```, which uses just the cached information
on the relays but still updates the Introduction Points from the Services.

```
/etc/tor/yaml/torrc-badnodes.yaml
BadNodes:
  # list the internet domains you know are bad so you don't
  # waste time trying to download contacts from them.
  ExcludeDomains: []
  ExcludeNodes:
    # BadExit will be overwritten by this program
    BadExit: []
    # list MyBadExit in --bad_sections if you want it used, to exclude nodes
    # or any others as a list separated by comma(,)
    MyBadExit: []
```

That part requires [PyYAML](https://pyyaml.org/wiki/PyYAML)
https://github.com/yaml/pyyaml/ or ```ruamel.yaml```: do
either ```pip3 install ruamel.yaml``` or ```pip3 install PyYAML```;
the advantage of ```ruamel.yaml``` is that it preserves comments.

(You may have to run this as the Tor user to get RW access to
/run/tor/control, in which case the directory for the YAML files must
be group Tor writeable, and its parent directories group Tor RX.)

Because you don't want to exclude the introduction points to any onion
you want to connect to, ```--white_onions``` should whitelist the
introduction points of a comma-separated list of onions; we fixed stem to do this:
* https://github.com/torproject/stem/issues/96
* https://gitlab.torproject.org/legacy/trac/-/issues/25417

Use the GoodNodes/Onions list in goodnodes.yaml to list onion services
whose Introduction Points you want whitelisted; these points may change daily.
Look in tor's notice.log for 'Every introduction point for service'.

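
One way to find local onion services for that list is the ```--hs_dir``` scan: walk the directory and read every file named ```hostname```. The sketch below is an illustration under assumptions, not the program's code: the function name is hypothetical, and it assumes the onion address is the first whitespace-separated field on a line.

```python
from pathlib import Path
from typing import List


def find_onion_hostnames(hs_dir: str = "/var/lib/tor") -> List[str]:
    """Collect .onion addresses from every file named 'hostname' below hs_dir."""
    onions = set()
    for hostname_file in Path(hs_dir).rglob("hostname"):
        if not hostname_file.is_file():
            continue
        try:
            lines = hostname_file.read_text(encoding="utf-8").splitlines()
        except (OSError, UnicodeDecodeError):
            continue  # unreadable or non-text file (e.g. wrong user); skip it
        for line in lines:
            fields = line.split()
            # Assumption: the address is the first field on the line.
            if fields and fields[0].endswith(".onion"):
                onions.add(fields[0])
    return sorted(onions)
```

Reading these files usually requires running as the Tor user, since the service directories are not world-readable.
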
The option ```--notice_log``` will parse the notice log for warnings about relays and
services that will then be whitelisted.

The option ```--torrc_output``` will write the torrc ExcludeNodes configuration to a file.

The option ```--good_contacts``` will write the contact info as a CIISS dictionary
to a YAML file. If the proof is uri-rsa, the well-known file of fingerprints
is downloaded and the fingerprints are added in an 'fps' field that we create
in that fingerprint's entry of the YAML dictionary. This file is read at the
beginning of the program to start with a trust database, and only new
contact info from new relays is added to the dictionary.

Now for the final part: we look up the Contact info of every relay
that is currently in our Tor, and check for the existence of the
well-known file that lists the fingerprints of the relays it runs.
If it fails to provide the well-known URL, we assume it's a bad
relay and add it to a list of nodes that goes on ```ExcludeNodes```
(not just ```ExcludeExitNodes```). If the Contact info is good, we add the
list of fingerprints to ```ExitNodes```, a whitelist of relays to use as exits.

With ```--bad_on``` we offer the users 3 levels of cleaning:
1. clean relays that have no contact: ```=Empty```
2. clean relays that don't have an email in the contact (implies 1): ```=Empty,NoEmail```
3. clean relays that don't have "good" contactinfo (implies 1): ```=Empty,NoEmail,NotGood```

The default is ```Empty,NoEmail,NotGood```; ```NoEmail``` is inherently imperfect
in that many of the contact-as-an-email are obfuscated, but we try anyway.

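
The three levels can be sketched as a small classifier. Everything here is illustrative: `contact_problem`, the `is_good` callback and the loose email pattern are assumptions for the example, and the pattern misses obfuscated addresses, which is exactly the imperfection noted above.

```python
import re
from typing import Callable, Iterable, Optional

# Assumption: a deliberately loose pattern; obfuscated addresses are missed.
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def contact_problem(contact: Optional[str],
                    bad_on: Iterable[str] = ("Empty", "NoEmail", "NotGood"),
                    is_good: Callable[[str], bool] = lambda contact: True
                    ) -> Optional[str]:
    """Return the first enabled level the contact fails, or None if it passes."""
    levels = set(bad_on)
    text = (contact or "").strip()
    if "Empty" in levels and not text:
        return "Empty"
    if "NoEmail" in levels and not EMAIL_RE.search(text):
        return "NoEmail"
    # is_good stands in for the well-known file verification.
    if "NotGood" in levels and not is_good(text):
        return "NotGood"
    return None
```

A relay whose contact returns anything other than `None` would be a candidate for exclusion at the chosen level.
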
To be "good" the ContactInfo must:
1. have a URL from which the well-known file can be fetched
2. must have a file that can be fetched at the URL
3. must support fetching the file with a valid SSL cert from a recognized authority
4. (not in the spec but added by Python) must use a TLS/SSL version greater than v1
5. must have a fingerprint list in the file
6. must have, in that fingerprint list, the FP of the relay that got us the contactinfo.

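
The checklist above can be sketched with the standard library alone. This is a minimal illustration under assumptions, not the program's implementation: the function names are hypothetical, the file is assumed to hold one 40-hex-digit fingerprint per line with `#` comments, the minimum TLS version is chosen for the example, and the download is made directly rather than through Tor.

```python
import http.client
import ssl
import urllib.request
from typing import Set

WELL_KNOWN_PATH = "/.well-known/tor-relay/rsa-fingerprint.txt"
HEX_DIGITS = set("0123456789ABCDEF")


def parse_fingerprints(text: str) -> Set[str]:
    """Extract 40-hex-digit fingerprints, one per line, ignoring '#' comments."""
    found = set()
    for line in text.splitlines():
        candidate = line.split("#", 1)[0].strip().upper()
        if len(candidate) == 40 and set(candidate) <= HEX_DIGITS:
            found.add(candidate)
    return found


def fetch_well_known(domain: str, timeout: float = 20.0) -> str:
    """Download the well-known file over HTTPS with certificate verification."""
    context = ssl.create_default_context()            # verifies against system CAs
    context.minimum_version = ssl.TLSVersion.TLSv1_2  # assumption: refuse TLS 1.0/1.1
    url = "https://" + domain + WELL_KNOWN_PATH
    with urllib.request.urlopen(url, timeout=timeout, context=context) as response:
        return response.read().decode("utf-8", errors="replace")


def contact_is_good(domain: str, relay_fp: str) -> bool:
    """True only if the file is retrievable and lists the relay's own fingerprint."""
    try:
        listed = parse_fingerprints(fetch_well_known(domain))
    except (OSError, ValueError, http.client.HTTPException):
        # URLError, SSL and socket errors are all OSError subclasses.
        return False
    return relay_fp.strip().upper() in listed
```

Steps 1 to 4 fail inside `fetch_well_known`, step 5 corresponds to an empty result from `parse_fingerprints`, and step 6 is the final membership test.
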
The option ```--wait_boot``` is the number of seconds to wait for Tor to bootstrap.

The option ```--wellknown_output``` will make the program write the well-known files
(```/.well-known/tor-relay/rsa-fingerprint.txt```) to a directory.

The option ```--torrc_output``` will write a file of the commands that it sends to
the Tor controller, so you can include it in a ```/etc/tor/torrc```.

The option ```--relays_output``` will write the downloaded relays in JSON to a file. The relays
are downloaded from https://onionoo.torproject.org/details

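
A minimal sketch of that download, assuming direct HTTPS access (the program may route it differently). The function names and the layout of the saved file are assumptions for illustration; the `relays`, `fingerprint` and `contact` keys are fields of the Onionoo details document.

```python
import json
import urllib.request
from typing import Dict

ONIONOO_DETAILS_URL = "https://onionoo.torproject.org/details"


def download_details(url: str = ONIONOO_DETAILS_URL, timeout: float = 60.0) -> dict:
    """Fetch the Onionoo details document (a large JSON object)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def contacts_by_fingerprint(details: dict) -> Dict[str, str]:
    """Map each relay fingerprint to its contact string ('' when absent)."""
    return {relay["fingerprint"]: relay.get("contact") or ""
            for relay in details.get("relays", [])
            if "fingerprint" in relay}


def save_relays(details: dict, path: str) -> None:
    """Write only the list of relays to a JSON file (assumed output layout)."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(details.get("relays", []), handle, indent=1)
```
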
For usage, do ```python3 exclude_badExits.py --help```

See [exclude_badExits.hlp](./exclude_badExits.hlp)
or there's a doctest file in [exclude_badExits.txt](./exclude_badExits.txt)