This extends nusenu's basic idea of using the stem library to
dynamically exclude nodes that are likely to be bad by putting them
on the ExcludeNodes or ExcludeExitNodes setting of a running Tor.
* https://github.com/nusenu/noContactInfo_Exit_Excluder
* https://github.com/TheSmashy/TorExitRelayExclude
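As a rough illustration of the mechanism (not this program's actual code), the sketch below joins a set of fingerprints into one comma-separated value and hands it to a running Tor through stem's ```Controller.set_conf```. The control socket path is the one mentioned later in this README; the helper names are made up for the example, and stem is imported lazily so the formatting helper can be tried without it.

```python
def exclude_value(fingerprints):
    """Join relay fingerprints into one comma-separated config value."""
    # Normalise, de-duplicate and sort so repeated runs give the same value.
    cleaned = {fp.strip().upper() for fp in fingerprints if fp.strip()}
    return ",".join(sorted(cleaned))


def apply_exclude_nodes(fingerprints, socket_path="/run/tor/control"):
    """Set ExcludeNodes on a running Tor (needs stem and a reachable control socket)."""
    from stem.control import Controller  # lazy import: only needed when talking to Tor

    with Controller.from_socket_file(path=socket_path) as controller:
        controller.authenticate()
        controller.set_conf("ExcludeNodes", exclude_value(fingerprints))
```

Calling ```apply_exclude_nodes([...])``` changes the setting only in the running Tor; it does not edit the torrc on disk.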
The basic idea is to exclude Exit nodes that do not have ContactInfo:
* https://github.com/nusenu/ContactInfo-Information-Sharing-Specification
That can be extended to relays that do not have an email in the contact,
or to relays that do not have ContactInfo that is verified to include them.
But there's a problem, and your Tor notice.log will tell you about it:
you could exclude the relays needed to access hidden services or directory
mirrors. So we need to add the concept of a whitelist to the process.
In addition, we may have our own blacklist of nodes we want to exclude,
or use these lists for other applications like selektor.
So we make two files that are structured in YAML:
```
/etc/tor/yaml/torrc-goodnodes.yaml
---
GoodNodes:
  EntryNodes: []
  Relays:
    # ExitNodes will be overwritten by this program
    ExitNodes: []
    IntroductionPoints: []
  # use the Onions section to list the onion services whose
  # Introduction Points you want whitelisted - these points may change daily
  # Look in tor's notice.log for 'Every introduction point for service'
  Onions: []
  # use the Services list to list the relays you want whitelisted
  # Look in tor's notice.log for 'Wanted to contact directory mirror'
  Services: []
```

By default all sections of the goodnodes.yaml are used as a whitelist.

Use the GoodNodes/Onions list to list the onion services whose
Introduction Points you want whitelisted; these points may change daily.
Look in tor's notice.log for warnings of 'Every introduction point for service'.

```--hs_dir``` (```default='/var/lib/tor'```) makes the program
parse the files named ```hostname``` below this directory to find
Hidden Services to whitelist.

The Introduction Points can change during the day, so you may want to
rerun this program to refresh the list of Introduction Points. A full run
that processes all the relays from stem can take 30 minutes;
```--saved_only``` runs the program with just the cached information
on the relays, but still updates the Introduction Points from the Services.

```
/etc/tor/yaml/torrc-badnodes.yaml
BadNodes:
  # list the internet domains you know are bad so you don't
  # waste time trying to download contacts from them.
  ExcludeDomains: []
  ExcludeNodes:
    # BadExit will be overwritten by this program
    BadExit: []
    # list MyBadExit in --bad_sections if you want it used, to exclude nodes
    # or any others as a list separated by comma(,)
    MyBadExit: []
```
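Once loaded (for example with ```yaml.safe_load```), each file is just a nested dictionary. Here is a minimal sketch of turning the two files into one set of relays to exclude, assuming that ```Relays``` and ```ExcludeNodes``` hold the sub-lists named in the files above. The ```flatten``` helper, the sample fingerprints and the rule that a whitelisted relay always wins are assumptions made for the example.

```python
def flatten(section):
    """Return every string found in the (possibly nested) lists of a section."""
    if isinstance(section, dict):
        return set().union(*(flatten(value) for value in section.values()))
    if isinstance(section, list):
        return {str(item) for item in section}
    return set()  # empty or missing entries contribute nothing


# Stand-ins for what yaml.safe_load() would return for the two files.
good = {"GoodNodes": {"EntryNodes": [],
                      "Relays": {"ExitNodes": ["A" * 40],
                                 "IntroductionPoints": ["B" * 40]},
                      "Onions": [],
                      "Services": ["C" * 40]}}
bad = {"BadNodes": {"ExcludeDomains": [],
                    "ExcludeNodes": {"BadExit": ["D" * 40, "A" * 40],
                                     "MyBadExit": ["E" * 40]}}}

# Onions holds onion addresses rather than relays, so it is left out here.
whitelist = flatten({k: v for k, v in good["GoodNodes"].items() if k != "Onions"})

# Only the sections named in bad_sections are used (mirrors --bad_sections).
bad_sections = ["BadExit"]
blacklist = flatten({name: bad["BadNodes"]["ExcludeNodes"][name] for name in bad_sections})

# Assumption: a whitelisted relay is never excluded, even if it is on a bad list.
to_exclude = blacklist - whitelist
```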
That part requires [PyYAML](https://pyyaml.org/wiki/PyYAML)
(https://github.com/yaml/pyyaml/) or ```ruamel.yaml```: do
```pip3 install ruamel.yaml``` or ```pip3 install PyYAML```;
the advantage of ```ruamel.yaml``` is that it preserves comments.
(You may have to run this as the Tor user to get RW access to
/run/tor/control, in which case the directory for the YAML files must
be group Tor writeable, and its parent directories group Tor RX.)
Because you don't want to exclude the introduction points to any onion
you want to connect to, ```--white_onions``` takes a comma-separated list
of onions and whitelists their introduction points; we fixed stem to do this:
* https://github.com/torproject/stem/issues/96
* https://gitlab.torproject.org/legacy/trac/-/issues/25417
Use the GoodNodes/Onions list in goodnodes.yaml to list the onion services
whose Introduction Points you want whitelisted; these points may change daily.
Look in tor's notice.log for 'Every introduction point for service'.

```notice_log``` will parse the notice log for warnings about relays and
services that will then be whitelisted.
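For locally hosted services, the onion addresses can also be collected from the ```hostname``` files, which is what ```--hs_dir``` is described as doing. A small sketch under that description; the function name is hypothetical and the real program may filter differently:

```python
import os


def find_onions(hs_dir="/var/lib/tor"):
    """Collect the onion addresses stored in files named 'hostname' below hs_dir."""
    onions = set()
    for root, _dirs, files in os.walk(hs_dir):
        if "hostname" not in files:
            continue
        with open(os.path.join(root, "hostname"), encoding="utf-8") as handle:
            for line in handle:
                tokens = line.split()
                # The address is the first token; anything after it is ignored.
                if tokens and tokens[0].endswith(".onion"):
                    onions.add(tokens[0])
    return sorted(onions)
```

```",".join(find_onions())``` then gives a comma-separated list in the form ```--white_onions``` expects. Reading ```/var/lib/tor``` usually needs the Tor user's permissions.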
```--torrc_output``` will write the torrc ExcludeNodes configuration to a file.
```--good_contacts``` will write the contact info as a CIISS dictionary
to a YAML file. If the proof is uri-rsa, the well-known file of fingerprints
is downloaded and the fingerprints are added to an 'fps' field that we create
in that fingerprint's entry of the YAML dictionary. This file is read at the
beginning of the program to start with a trust database, and only
contact info from new relays is added to the dictionary.
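To make the trust database idea concrete, here is a sketch of the two steps involved: splitting a ContactInfo string into its space-separated ```key:value``` tokens, and adding only relays that are not already in the dictionary. Both function names and the sample contact string are invented for illustration; the real program's data layout may differ.

```python
def parse_ciiss(contact):
    """Split a ContactInfo string into a dict of its 'key:value' tokens."""
    fields = {}
    for token in contact.split():
        key, sep, value = token.partition(":")  # split on the first colon only
        if sep and key and value:
            fields.setdefault(key, value)  # keep the first occurrence of a key
    return fields


def merge_new_contacts(trust_db, new_entries):
    """Add entries for relays not yet in trust_db; return how many were added."""
    added = 0
    for fingerprint, entry in new_entries.items():
        if fingerprint not in trust_db:
            trust_db[fingerprint] = entry
            added += 1
    return added


# Hypothetical contact string and fingerprints, for illustration only.
sample = "email:tor[]example.org url:https://example.org proof:uri-rsa ciissversion:2"
trust_db = {"A" * 40: {"proof": "uri-rsa"}}
added = merge_new_contacts(trust_db, {"A" * 40: {}, "B" * 40: parse_ciiss(sample)})
```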
Now for the final part: we look up the Contact info of every relay
that is currently in our Tor, and check it for the existence of the
well-known file that lists the fingerprints of the relays it runs.
If it fails to provide the well-known URL, we assume it's a bad
relay and add it to a list of nodes that goes on ```ExcludeNodes```
(not just ```ExcludeExitNodes```). If the Contact info is good, we add the
list of fingerprints to ```ExitNodes```, a whitelist of relays to use as exits.
```--bad_on``` We offer the users 3 levels of cleaning:
1. clean relays that have no contact: ```=Empty```
2. clean relays that don't have an email in the contact (implies 1): ```=Empty,NoEmail```
3. clean relays that don't have "good" ContactInfo (implies 1): ```=Empty,NoEmail,NotGood```

The default is ```Empty,NoEmail,NotGood```; ```NoEmail``` is inherently imperfect
in that many of the contact-as-an-email are obfuscated, but we try anyway.
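The levels can be read as labels attached to each relay's contact, with ```--bad_on``` choosing which labels lead to exclusion. A sketch of that reading follows; the function names and the deliberately simple email pattern are assumptions, and as noted above an obfuscated address will be reported as ```NoEmail```.

```python
import re

# Deliberately simple: anything shaped like user@host.tld counts as an email.
EMAIL_RE = re.compile(r"[^\s@]+@[^\s@]+\.[^\s@]+")


def contact_problems(contact, is_good=False):
    """Return the --bad_on style labels that apply to one ContactInfo string."""
    problems = set()
    if not contact or not contact.strip():
        problems.add("Empty")
    if not EMAIL_RE.search(contact or ""):
        problems.add("NoEmail")  # an empty contact has no email either
    if not is_good:
        problems.add("NotGood")
    return problems


def should_exclude(contact, bad_on="Empty,NoEmail,NotGood", is_good=False):
    """True if any label enabled in bad_on applies to this contact."""
    enabled = {label.strip() for label in bad_on.split(",") if label.strip()}
    return bool(contact_problems(contact, is_good) & enabled)
```

Here ```is_good``` stands for the outcome of the well-known file verification, which is a separate step.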
To be "good" the ContactInfo must:
1. have a URL from which the well-known file can be fetched
2. have a file that can be fetched at that URL
3. support fetching the file with a valid SSL cert from a recognized authority
4. (not in the spec but added by Python) use a TLS/SSL version > v1
5. have a fingerprint list in the file
6. have the FP that got us the ContactInfo in the fingerprint list in the file.
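The checks above can be sketched with the standard library alone. This is an illustration rather than the program's code: it fetches directly instead of going through Tor, it reads item 4 as "TLS 1.2 or newer" (an assumption), and it assumes the well-known file holds one 40-character hex fingerprint per line with ```#``` starting a comment. All function names are hypothetical.

```python
import ssl
import urllib.request

WELL_KNOWN_PATH = "/.well-known/tor-relay/rsa-fingerprint.txt"
HEX_DIGITS = set("0123456789ABCDEF")


def parse_fingerprints(text):
    """Extract 40-hex-digit fingerprints, ignoring comments and other lines."""
    fingerprints = set()
    for line in text.splitlines():
        candidate = line.split("#", 1)[0].strip().upper()
        if len(candidate) == 40 and set(candidate) <= HEX_DIGITS:
            fingerprints.add(candidate)
    return fingerprints


def fetch_fingerprints(base_url, timeout=30):
    """Download and parse the well-known file over verified HTTPS."""
    url = base_url.rstrip("/") + WELL_KNOWN_PATH
    if not url.lower().startswith("https://"):
        raise ValueError("well-known file must be served over https")
    context = ssl.create_default_context()  # verifies the cert against system CAs
    context.minimum_version = ssl.TLSVersion.TLSv1_2  # assumption for item 4
    with urllib.request.urlopen(url, context=context, timeout=timeout) as response:
        return parse_fingerprints(response.read().decode("utf-8", errors="replace"))


def contact_is_good(relay_fp, base_url):
    """True only if the relay's own fingerprint is listed in the fetched file."""
    try:
        return relay_fp.strip().upper() in fetch_fingerprints(base_url)
    except (OSError, ValueError):  # network, TLS, HTTP and bad-URL failures
        return False
```

Any failure along the way (no URL, no file, bad certificate, missing fingerprint) ends in the same answer, which matches treating the relay as not "good".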
```--wait_boot``` is the number of seconds to wait for Tor to bootstrap.

```--wellknown_output``` will make the program write the well-known files
(```/.well-known/tor-relay/rsa-fingerprint.txt```) to a directory.

```--torrc_output``` will write a file of the commands that it sends to
the Tor controller, so you can include it in a ```/etc/tor/torrc```.

```--relays_output``` will write the downloaded relays as JSON to a file. The relays
are downloaded from https://onionoo.torproject.org/details
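A file written by ```--relays_output``` can be reused offline. The sketch below assumes the saved JSON keeps the shape of an Onionoo details document (a top-level ```relays``` list whose entries carry ```fingerprint``` and, optionally, ```contact```); the function names and sample data are made up.

```python
import json


def save_relays(details, path):
    """Write a details document to a JSON file."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(details, handle, indent=2)


def contacts_by_fingerprint(path):
    """Map each relay fingerprint in a saved details file to its contact string."""
    with open(path, encoding="utf-8") as handle:
        details = json.load(handle)
    return {
        relay["fingerprint"]: relay.get("contact", "")  # missing contact -> empty
        for relay in details.get("relays", [])
        if "fingerprint" in relay
    }
```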
For usage, do ```python3 exclude_badExits.py --help```
See [exclude_badExits.hlp](./exclude_badExits.hlp)
or there's a doctest file in [exclude_badExits.txt](./exclude_badExits.txt)