Top 1000 Websites Blocking VPN & TOR Users

One of the tips that security professionals love to give is to use a VPN on public wifi networks. This is great advice (I personally like PrivateInternetAccess and NordVPN). Recently I noticed a site blocking traffic from TOR and VPN providers:
[Image: Screen Shot 2016-07-06 at 6.36.19 AM]
That got me wondering what other websites were blocking traffic from these sources, so I decided to test the Alexa Top 1000 websites.
First I needed to get a list of the Top 1000 websites. To do this I used this bit of command line kung fu, which grabs a CSV of the top 1 million websites and puts the top 1000 in a urls.txt file:
curl -s -O ; unzip -q -o top-1m.csv ; head -1000 top-1m.csv | cut -d, -f2 | cut -d/ -f1 > urls.txt
Here is the output from this command.
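For anyone who would rather do the trimming step in Python, here is a minimal sketch of the same idea. It assumes the CSV has the `rank,domain` layout that the `cut -d, -f2` above implies; the function name `top_domains` and the sample rows are made up for illustration and are not part of the original one-liner.

```python
import csv
import io


def top_domains(csv_text, limit=1000):
    """Return up to `limit` hostnames from 'rank,domain' CSV text."""
    domains = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(domains) >= limit:
            break
        if len(row) < 2:
            continue  # skip blank or malformed lines
        # Same effect as `cut -d/ -f1`: drop anything after the hostname.
        domains.append(row[1].split("/", 1)[0])
    return domains


# Hypothetical sample rows, only to show the shape of the input.
sample = "1,google.com\n2,youtube.com\n3,example.com/path\n"
print(top_domains(sample, limit=2))  # ['google.com', 'youtube.com']
```

To build urls.txt from the real file you would read top-1m.csv, pass its contents to `top_domains`, and write one hostname per line.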
I now needed to automatically take a screenshot of 1000 websites. I had started to write my own terrible Python script using Selenium until Chris Truncer pointed me to his amazing project called EyeWitness.
The command I used was:
./ --web -f urls.txt
[Image: Screen Shot 2016-07-06 at 8.45.38 AM]
During my first test, using PrivateInternetAccess, I found that 11 of 1000* sites blocked access with a 401/404:
With,, and being the most impactful websites on that list:

I then ran the test again through TOR (using the TOR container I built) and found that 40 of 1000* sites blocked access with a 401/404:
With many more asking for a captcha before gaining access:
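I sorted the results by looking at the screenshots, but a rough first pass could be automated by status code. The sketch below is not what I ran: it assumes a 401 or 404 means "blocked" (the codes reported above), and it guesses at a captcha page by looking for the word "captcha" in the response body, which is a crude heuristic. The names `classify` and `check` are hypothetical.

```python
import http.client
import urllib.error
import urllib.request

# Status codes treated as a block, taken from the results above.
# Other codes (for example 403) could be added if you see them.
BLOCK_CODES = {401, 404}


def classify(status, body=""):
    """Label a response as 'blocked', 'captcha', or 'ok'."""
    if status in BLOCK_CODES:
        return "blocked"
    if "captcha" in body.lower():  # crude heuristic, an assumption
        return "captcha"
    return "ok"


def check(host, timeout=10):
    """Fetch http://<host>/ and classify the response."""
    url = "http://" + host  # urls.txt holds bare hostnames
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            # Only the first 64 KiB is inspected for the captcha hint.
            body = resp.read(65536).decode("utf-8", errors="replace")
            return classify(resp.status, body)
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx responses.
        return classify(err.code)
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        return "unreachable"


print(classify(404))                               # blocked
print(classify(200, "Please solve this CAPTCHA"))  # captcha
print(classify(200, "Welcome"))                    # ok
```

Running `check` over every line of urls.txt, once over the VPN and once over TOR, would give counts to compare against the screenshots. Expect differences: many sites serve a block page with a 200 status, which only a screenshot will catch.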
Epilogue: I play defense in my day job. I understand the need to stop malicious traffic from reaching your website. This isn’t an indictment, just an academic exercise, although if more and more websites take this approach, tools like TOR and commercial VPNs will become less useful.
Final Notes: 
I was surprised at how many porn websites are in the top 1000 overall websites.
It takes 1.8 gigs of storage to screenshot the top 1000 websites.
*Your results will vary based on exit node, VPN, the time you test, and what color shirt you have on.
