I’ve been researching ways to build an algorithm that can reliably detect whether a user is on a VPN. So far, I’m looking into traffic patterns, comparison against VPN IP lists, and time-zone/geolocation mismatches.
What else can I use? What other methods are there to detect a VPN?
The crackdown on remote workers begins!! To be honest, if the user is running WireGuard/Tailscale back to their home network, I don’t think you’re going to detect it with anything other than deep packet inspection. Latency, maybe, if they’re legitimately on the other side of the world.
Not perfectly reliable, but these can be indicators when combined with other metrics:
Latency, especially when combined with TTL, since it will be very different from non-VPN users.
Scan for common VPN access/control ports open on their IP. Useless for UDP-based protocols, and some users do host their own server at home, but the majority would be coming from a public VPN server.
You can look at TTL, RTT, and packet fragmentation, but all of those signals can be masked, and they can also produce false positives in other situations.
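To make the TTL + RTT idea a bit more concrete, here is a minimal Python sketch. It assumes the sender started from one of the usual initial TTLs (64, 128, or 255) and flags connections that look "close" in hops but "far" in latency. The function names and both thresholds are made up for illustration, and like everything else in this thread it is an indicator, not proof.

```python
COMMON_INITIAL_TTLS = (64, 128, 255)  # typical OS defaults; an assumption, not a guarantee


def estimate_hops(observed_ttl: int) -> int:
    """Guess the hop count from the TTL seen on an inbound packet.

    Assumes the sender used the smallest common initial TTL that is
    greater than or equal to the observed value.
    """
    if not 1 <= observed_ttl <= 255:
        raise ValueError("TTL must be between 1 and 255")
    initial = min(t for t in COMMON_INITIAL_TTLS if t >= observed_ttl)
    return initial - observed_ttl


def few_hops_but_slow(observed_ttl: int, rtt_ms: float,
                      max_hops: int = 8, min_rtt_ms: float = 150.0) -> bool:
    """Hypothetical heuristic: a short hop count with a long RTT can hint at a tunnel.

    Both thresholds are illustrative and would need tuning on real traffic.
    """
    return estimate_hops(observed_ttl) <= max_hops and rtt_ms >= min_rtt_ms
```

Anyone who rewrites TTLs on their side defeats this, so it only makes sense as one input among several.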
Really, the only generally useful way to do it is IP reputation services. They are the industry best practice, although they can be defeated by personal/self-hosted VPNs. There are also services that can tell you whether an address belongs to an ISP or to a server/hosting provider.
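If you already have a list of hosting or VPN address blocks (from a reputation provider or a cloud provider's published ranges), the lookup itself is simple. A rough Python sketch using only the standard library; `HOSTING_BLOCKS` is a placeholder filled with documentation-only ranges, not real data, and `is_hosting_ip` is a name I made up:

```python
import ipaddress

# Placeholder CIDR blocks (reserved documentation ranges).
# In practice these would come from a reputation feed or provider-published lists.
HOSTING_BLOCKS = [
    ipaddress.ip_network(cidr)
    for cidr in ("192.0.2.0/24", "198.51.100.0/24", "2001:db8::/32")
]


def is_hosting_ip(addr: str) -> bool:
    """Return True if addr falls inside any known hosting/VPN block."""
    ip = ipaddress.ip_address(addr)
    # Membership tests across IPv4/IPv6 simply evaluate to False.
    return any(ip in net for net in HOSTING_BLOCKS)
```

A linear scan is fine for a handful of blocks. With a large feed you would probably want a prefix tree or a database lookup instead.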
There are TONS of ways to detect the use of a VPN. The technique you use will depend on the environment.
If you are running a network and trying to detect outbound VPN connections, you can look for the negotiation and encryption algorithms commonly used by VPN clients and servers on your edge filtering solution. You can also look for other oddities in machine behavior (like no DNS requests).
If you are hosting a service on the internet, you can check whether the end user’s source IP address is in a block allocated to a data center (or AWS). You can also look at the segment size of the inbound traffic. If you never see a 1500-byte packet from an end user (the max might be 1430, for example), they are probably using a VPN.
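The segment-size check can be sketched like this in Python, assuming you can read the TCP MSS option from the client's SYN and that the connection is IPv4 without IP options. The helper names and the 1492 cutoff (chosen so that plain PPPoE links are not flagged) are my assumptions:

```python
IPV4_TCP_HEADER_BYTES = 40  # 20-byte IPv4 header + 20-byte TCP header, no options


def implied_mtu(mss: int) -> int:
    """Path MTU implied by the MSS the client advertised in its SYN."""
    return mss + IPV4_TCP_HEADER_BYTES


def mss_suggests_tunnel(mss: int, min_plain_mtu: int = 1492) -> bool:
    """Flag clients whose implied MTU is below what a plain link usually gives.

    1500 is standard Ethernet and 1492 is PPPoE; anything lower hints at
    encapsulation overhead, which includes VPNs but also other tunnels.
    """
    return implied_mtu(mss) < min_plain_mtu
```

For example, an MSS of 1460 implies a 1500-byte MTU and is not flagged, while an MSS of 1380 implies 1420 and is. Mobile carriers and some ISPs also lower the MTU, so expect false positives.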
You can also look at communication delays and packet timing. You can hide from a lot of things by using a VPN, but you cannot hide from physics. The JA4 folks use this. They call it “light distance locality”. Their work does not directly pertain to detecting VPNs, but the “light distance locality” idea applies to them. In short, greater physical distance means longer RTTs. An RTT much greater than what can be explained by the physical distance between the source and destination IP could indicate the use of a VPN.
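Here is a back-of-the-envelope version of that physics check in Python. It assumes you already have coordinates for both ends (for example from an IP geolocation database) and that signals travel at roughly 200 km per millisecond in fiber. The slack factor and fixed allowance are arbitrary illustrative values, and this is my own sketch, not how JA4 implements it.

```python
import math

EARTH_RADIUS_KM = 6371.0
FIBER_KM_PER_MS = 200.0  # roughly two thirds of the speed of light in vacuum


def great_circle_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Haversine distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def min_rtt_ms(distance_km: float) -> float:
    """Lower bound on the round trip: there and back at fiber speed."""
    return 2 * distance_km / FIBER_KM_PER_MS


def rtt_is_suspicious(measured_rtt_ms: float, distance_km: float,
                      slack_factor: float = 3.0, allowance_ms: float = 30.0) -> bool:
    """True when the RTT is far above what the claimed distance can explain.

    slack_factor covers indirect routing; allowance_ms covers last-mile and
    queuing delay. Both are guesses that need tuning.
    """
    return measured_rtt_ms > slack_factor * min_rtt_ms(distance_km) + allowance_ms
```

So a client that geolocates 100 km away but shows a 250 ms RTT gets flagged, while 40 ms at 1000 km does not. Congested Wi-Fi, satellite links, and bad geolocation data will all trip it too.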
You’ve got some of the items there. It depends on your environment and ability/willingness to make life harder for people.
If you’re in a corporate environment, you can deploy agents on workstations that look for VPN software packages. You can “restrict” the ability to use VPNs by limiting outbound ports to, say, 80 and 443. Then, if you’ve got your own CAs deployed, you can do traffic analysis with HTTPS decryption, and if something going out on 443 isn’t actual HTTP/S traffic, you kill it (or flag it as “possible VPN traffic”).
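As a rough illustration of the "is it actually TLS on 443" check, here is a small Python sketch that looks at the first bytes of a new connection and tests whether they look like a TLS ClientHello record. The function name is hypothetical, and a real inspection box does far more than this:

```python
def looks_like_tls_client_hello(payload: bytes) -> bool:
    """Check whether the first bytes of a TCP stream resemble a TLS ClientHello.

    Record layout: content type (1 byte), version (2 bytes), length (2 bytes),
    followed by the handshake message type.
    """
    if len(payload) < 6:
        return False
    content_type, major, minor = payload[0], payload[1], payload[2]
    record_len = int.from_bytes(payload[3:5], "big")
    handshake_type = payload[5]
    return (
        content_type == 0x16             # handshake record
        and major == 0x03                # SSL 3.0 / TLS 1.x record version
        and minor <= 0x04
        and 0 < record_len <= 16384      # maximum plaintext record size
        and handshake_type == 0x01       # ClientHello
    )
```

Anything on 443 that fails this check is a candidate for the "possible VPN traffic" flag. VPNs that tunnel inside real TLS will pass it, which is where the decryption step comes in.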
I’m sure smarter people than I have existing tools and packages for this.
You need more info on the IP itself, like a history of its activity, because there is no way to reliably detect a VPN from a single connection. You could make a guess based on the IP address: if someone is coming from a cloud, VPS, or hosting company IP, there is a good chance it’s a VPN. If you have enough history on the IP (attacks, fraud, card testing, CAPTCHA passes and failures), you can also make a determination from that, which is what hCaptcha, Cloudflare, and Google are doing.
Reading through all the responses, some of them are valid points, and all can have varying levels of success. But what are you trying to *prevent*?
Can you describe what the value of the information and/or prevention will be?
What do you hope to achieve with this information? There may be less whack-a-mole options.
As others mentioned, IP datasets are a great way to detect VPN use with greater confidence. Pangea has an IP Intel API that gives you access to VPN detection based on IP datasets powered by Digital Element. You can try it out at https://pangea.cloud/services/ip-intel/vpn/
We’re working on an ASM (attack surface monitoring) tool. One feature lets organizations add a piece of code to their domains and see in the tool how many of their users are on a VPN. It will also generate deeper analytics, like whether someone was behaving suspiciously (e.g. had the inspector open), and much more that I can’t disclose here.
But blocking ports 80 and 443, how would that block only VPN traffic?
I’d say use existing tools to blacklist known VPN/proxy IPs. That won’t catch someone whose traffic exits from their own ISP because they’re using their home network as a VPN, but that’s obviously less common.