Exercise 1.3: Bulk Data Queries
Sending too many queries for individual IP addresses will tax our servers and will also be quite slow. Instead, use our "Daily Sources" feed, which summarizes all data received the prior day in an easy-to-parse, tab-delimited file.
The most recent file can be found at https://isc.sans.edu/feeds/daily_sources
The file is quite large (50-100 MBytes). Please download it only once a day. For this exercise, we will start with:
curl https://isc.sans.edu/feeds/daily_sources > /tmp/sources.txt
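To keep to the once-a-day limit when re-running the exercise, the download can be guarded by a check on the age of the local copy. The following is only a sketch: the helper names needs_refresh and fetch_if_stale are hypothetical, and it assumes a find that supports the POSIX -mtime test.

```shell
# needs_refresh FILE: succeed (exit 0) if FILE is missing or was last
# modified 24 hours ago or more; fail (exit 1) if it is fresher than that.
needs_refresh() {
    [ ! -f "$1" ] || [ -z "$(find "$1" -mtime -1)" ]
}

# fetch_if_stale FILE: download the daily sources feed into FILE, but only
# when the local copy is stale. The download goes to a temporary name first
# so a failed transfer does not replace an existing copy.
fetch_if_stale() {
    if needs_refresh "$1"; then
        curl -sSf -o "$1.tmp" https://isc.sans.edu/feeds/daily_sources &&
            mv "$1.tmp" "$1"
    fi
}
```

Calling fetch_if_stale /tmp/sources.txt then takes the place of the plain curl command and does nothing if the file was already fetched within the last day.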
Let's answer a simple question: What are the top 10 /24 networks, based on the number of IP addresses listed in the file?
First we need to remove comments:
grep -v '^#' /tmp/sources.txt
Next, we "cut" the first column, the source IPs.
grep -v '^#' /tmp/sources.txt | cut -f1
We need to count the number of distinct IPs. So we remove duplicates.
grep -v '^#' /tmp/sources.txt | cut -f1 | sort -u
But we are only interested in /24s. So we run another "cut" to only keep the first 3 octets.
grep -v '^#' /tmp/sources.txt | cut -f1 | sort -u | cut -f1-3 -d'.'
Finally, we sort, count unique network addresses and sort to see the most frequent /24s at the end.
grep -v '^#' /tmp/sources.txt | cut -f1 | sort -u | cut -f1-3 -d'.' | sort | uniq -c | sort -n
When I ran the command a couple days ago, I got this result for the top 10 (your result may be different):
126 063.088.023
132 042.115.009
136 205.251.193
139 045.083.066
148 045.083.065
148 045.083.067
149 045.083.064
165 103.131.071
185 071.006.233
212 192.035.168
In this example, 45.83/16 was interesting as it showed up 4 times. A whois lookup reveals that 22.214.171.124 - 126.96.36.199 is owned by yet another Internet Security Research project (Alpha Strike Labs) that is not yet fully listed in our feed but will be by the end of the week :).
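The pipeline built up step by step above can be wrapped in a small function so it is easy to re-run against a fresh download. This is a minimal sketch: the function name top24 and its optional second argument are hypothetical, and the tail at the end is an addition that trims the output to the busiest entries.

```shell
# top24 FILE [N]: print the N /24 prefixes (default 10) with the most
# distinct source IPs in FILE, busiest last.
top24() {
    grep -v '^#' "$1" |       # drop comment lines
        cut -f1 |             # keep the first column (source IP)
        sort -u |             # one line per distinct IP
        cut -d. -f1-3 |       # keep the first three octets
        sort | uniq -c |      # count distinct IPs per /24
        sort -n |             # smallest count first, largest last
        tail -n "${2:-10}"    # keep only the last N lines
}
```

Running top24 /tmp/sources.txt prints the ten busiest /24s; top24 /tmp/sources.txt 25 widens the list to 25.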
Use the "Daily Sources" feed (see above) and the API's "Miner" feed to find any IPs in the daily sources that are also in the miner feed. Let your command line Kung-Fu shine for this one!
Please try to hit the API only once and save the output to a file.
To remove the extra "0"s from the source IPs in the "daily sources" feed, use this sed command:
sed -E 's/(^|\.)0+([0-9])/\1\2/g' < /tmp/sources.txt
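Before running the zero stripping on the full file, it is worth checking it on a few hand-written addresses. The sketch below uses a hypothetical helper name, strip_zeros, and an expression written so that an all-zero octet such as 000 is kept as 0 rather than removed entirely.

```shell
# strip_zeros: read zero-padded dotted-quad IPs on stdin and remove the
# padding from every octet. At least one digit is always kept per octet.
strip_zeros() {
    sed -E 's/(^|\.)0+([0-9])/\1\2/g'
}

# Two sample addresses: one with ordinary padding, one with an all-zero octet.
printf '%s\n' 045.083.064.010 192.035.168.000 | strip_zeros
# prints:
# 45.83.64.10
# 192.35.168.0
```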
The API function you are looking for is
Probably the easiest way to get a list of IPs is:
curl 'https://dshield.org/api/threatlist/miner?json' | jq '.[].ipv4' | tr -d '"' > /tmp/miners
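Make sure that both the miner feed and the daily sources contain only unique IPs. For the daily sources, that means dropping the comments, keeping only the first column, and stripping the zero padding first:
sort -u /tmp/miners > /tmp/miners_uniq
grep -v '^#' /tmp/sources.txt | cut -f1 | sed -E 's/(^|\.)0+([0-9])/\1\2/g' | sort -u > /tmp/sources_uniq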
Combine the two feeds, and check which IP addresses show up twice, that is, in both feeds:
cat /tmp/miners_uniq /tmp/sources_uniq | sort | uniq -c | grep -E '^ *2 '
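Another way to find the overlap between two lists is comm, which compares two sorted files line by line. The sketch below is one possible variant, not the exercise's reference solution; the function name common_ips is hypothetical, and both inputs are re-sorted under LC_ALL=C so that sort and comm agree on the ordering.

```shell
# common_ips FILE_A FILE_B: print the lines (IPs) that occur in both files.
common_ips() {
    a=$(mktemp)
    b=$(mktemp)
    # comm needs both inputs sorted in the same collation order
    LC_ALL=C sort -u "$1" > "$a"
    LC_ALL=C sort -u "$2" > "$b"
    # -1 hides lines only in the first file, -2 hides lines only in the second
    LC_ALL=C comm -12 "$a" "$b"
    rm -f "$a" "$b"
}
```

With the files from the steps above, this would be called as common_ips /tmp/miners_uniq /tmp/sources_uniq. An empty result means no address appears in both feeds.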
You will likely get no duplicates. Mining pools are "passive" in that they will only accept connections. These IPs should not show up in our firewall log feed.