Overview
The choice between a transparent proxy and a non-transparent proxy is often made based on convenience. Since a non-transparent proxy requires proxy settings on client devices, the easiest way to get web filtering working is to rely on a transparent proxy.
Both methods have their pros and cons, but the choice isn't either/or: both can be used at the same time. In general, we recommend that you use proxy settings for managed devices and transparent proxy for non-managed devices.
Understanding client behavior when using transparent and non-transparent proxies
When a client is set to use a proxy, the client sends all requests directly to the proxy. The proxy takes care of the DNS resolution and retrieves the resource on the client's behalf. Because the client knows that it is using a proxy, the software, browser or otherwise, can include additional information in the request, such as the destination host name and SNI, to help the proxy understand the request better.
SNI is short for "Server Name Indication". It is an unencrypted part of the TLS handshake that starts an HTTPS connection, and it contains the host and domain name of the destination.
When a client is behind a transparent proxy, the client performs its own DNS resolution and sends the request to an IP address. This is the state in which the transparent proxy receives the request, so the transparent proxy needs to take additional steps to validate it. This is mainly an issue with HTTPS requests. For HTTP requests, all the information is available to the transparent proxy because nothing is encrypted, and the proxy can see the full URL by inspecting the HTTP headers. When the request uses HTTPS, the proxy can initially only see a request going to an IP address. If the request includes SNI, the transparent proxy can also see the host and domain name. Browsers add SNI to HTTPS requests, but other types of applications generally don't.
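To illustrate why a transparent proxy can read the host name only when SNI is present, the following sketch uses Python's standard ssl module to build the first handshake message a client would send, without opening a network connection. The function name and the example host name are assumptions made for this illustration and have nothing to do with Smoothwall's own implementation.

```python
import ssl


def client_hello_bytes(server_hostname=None):
    """Return the raw first TLS message (ClientHello) a client would send."""
    ctx = ssl.create_default_context()
    if server_hostname is None:
        # Without a host name there is nothing to check the certificate
        # against, so verification is disabled for this offline demo only.
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
    incoming = ssl.MemoryBIO()  # bytes from the server (none in this sketch)
    outgoing = ssl.MemoryBIO()  # bytes the client wants to send
    tls = ctx.wrap_bio(incoming, outgoing, server_hostname=server_hostname)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        # Expected: the ClientHello has been written and the client now
        # waits for a server reply that never arrives in this sketch.
        pass
    return outgoing.read()


with_sni = client_hello_bytes("www.example.com")
without_sni = client_hello_bytes()

# A proxy that only sees these bytes can search them for the host name.
print(b"www.example.com" in with_sni)
print(b"www.example.com" in without_sni)
```

The host name should be readable in plain text in the first message and absent from the second, which is the situation a transparent proxy faces with applications that don't send SNI.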
Without the SNI information the transparent proxy needs to take additional steps to validate the HTTPS traffic. On a Smoothwall there are two checks that can be performed to validate that HTTPS traffic is legitimate. This validation is needed to prevent other types of applications, like VPN or peer-to-peer traffic, from trying to use HTTPS to bypass web filtering. The two checks are:
- Comparing the destination IP address to the category called "Transparent HTTPS incompatible sites". This category contains the target IP addresses used by legitimate applications known to not include SNI information by default. These include some Office 365 applications, software updaters, and anti-virus applications.
- Validating certificate information. HTTPS traffic always uses certificates to set up encryption. The Smoothwall can validate these certificates to ensure that they are trusted before allowing the traffic to use the proxy.
These checks only decide whether the HTTPS traffic is allowed to use the proxy. Passing them doesn't automatically let the traffic through the web filter. For example, a request for https://www.playboy.com through the transparent proxy has a valid certificate and is therefore allowed to use the proxy, but the web filter can still block the request on a Smoothwall where the categories "Pornography" or "Adult content" are set to be blocked.
When setting up a transparent proxy on a Smoothwall, the options for HTTPS traffic are:
- Block if SNI isn't present.
- Allow "Transparent HTTPS incompatible sites".
- Validate certificate.
- Allow "Transparent HTTPS incompatible sites" and validate certificate on other requests.
Blocking requests that don't include SNI is the most restrictive option. We recommend that you use the combined method: allow the "Transparent HTTPS incompatible sites" category and validate all other requests by certificate.
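The four options can be read as a small decision table. The sketch below is one possible interpretation of them in Python, not Smoothwall's actual logic: the enum names, the function names, and the 203.0.113.0/24 network standing in for the "Transparent HTTPS incompatible sites" category are all hypothetical.

```python
import ipaddress
from enum import Enum


class HttpsMode(Enum):
    BLOCK_WITHOUT_SNI = 1
    ALLOW_INCOMPATIBLE = 2
    VALIDATE_CERTIFICATE = 3
    ALLOW_INCOMPATIBLE_AND_VALIDATE = 4


# Hypothetical stand-in for the "Transparent HTTPS incompatible sites" category.
INCOMPATIBLE_NETS = [ipaddress.ip_network("203.0.113.0/24")]


def in_incompatible_category(dest_ip):
    """True if the destination IP belongs to the stand-in category."""
    addr = ipaddress.ip_address(dest_ip)
    return any(addr in net for net in INCOMPATIBLE_NETS)


def admit_to_proxy(mode, has_sni, dest_ip, cert_trusted):
    """Decide whether an HTTPS request may use the transparent proxy.

    Assumption: requests that carry SNI need no extra validation.
    Being admitted does not mean the web filter allows the request.
    """
    if has_sni:
        return True
    if mode is HttpsMode.BLOCK_WITHOUT_SNI:
        return False
    if mode is HttpsMode.ALLOW_INCOMPATIBLE:
        return in_incompatible_category(dest_ip)
    if mode is HttpsMode.VALIDATE_CERTIFICATE:
        return cert_trusted
    # Combined mode: known incompatible sites pass, others need a trusted cert.
    return in_incompatible_category(dest_ip) or cert_trusted


# A request without SNI to an unknown IP with a trusted certificate.
for mode in HttpsMode:
    print(mode.name, admit_to_proxy(mode, False, "198.51.100.7", True))
```

Under these assumptions, the combined mode admits the widest range of legitimate traffic that lacks SNI, which matches the recommendation above.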
Using proxy settings
The main stumbling block for using a non-transparent proxy is the proxy settings. Setting these statically on devices can cause issues with portable devices, like laptops that users bring home or take on the road. A good way to avoid those issues is to use the "Automatic proxy discovery" option, which is available in most desktop browsers, and to set up the infrastructure it depends on. Clients need to be able to resolve the host name "wpad" on their network. If that host is available, the clients request the "wpad.dat" file from it. This file can contain proxy settings, just like a proxy.pac file.
The Smoothwall can create and host these files. In the "Web proxy » Web proxy » Automatic configuration" section, the configuration for proxy exceptions can be added as well. The exceptions list should include the following:
- Private IP subnets (192.168.0.0/16, 172.16.0.0/12 and 10.0.0.0/8)
- Local Active Directory domains
- Any external IP or domains that shouldn't be accessed via the proxy.
Note: You might need to add external proxy exceptions as IP addresses in the "Guardian » Web filter » Exceptions" destination exceptions input field to make sure the traffic isn't intercepted by the transparent proxy.
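For readers who want to see what such a file can look like, here is a minimal Python sketch that generates the text of a PAC file with the exceptions listed above. The Smoothwall creates and hosts its own file, so this is only an illustration. The helper name build_pac, the proxy address utm.training.local:800 (including the port), and the domain training.local are assumptions; the generated JavaScript uses only the standard PAC helpers isInNet and dnsDomainIs.

```python
import ipaddress


def build_pac(proxy, direct_networks, direct_domains):
    """Return the text of a PAC file (usable as proxy.pac or wpad.dat)."""
    lines = ["function FindProxyForURL(url, host) {"]
    for cidr in direct_networks:
        net = ipaddress.ip_network(cidr)
        # isInNet takes a network address and a dotted netmask, not CIDR.
        lines.append(
            f'  if (isInNet(host, "{net.network_address}", "{net.netmask}")) '
            'return "DIRECT";'
        )
    for domain in direct_domains:
        # Match the domain itself and every host below it.
        lines.append(
            f'  if (host == "{domain}" || dnsDomainIs(host, ".{domain}")) '
            'return "DIRECT";'
        )
    # Everything else goes through the proxy.
    lines.append(f'  return "PROXY {proxy}";')
    lines.append("}")
    return "\n".join(lines) + "\n"


pac_text = build_pac(
    proxy="utm.training.local:800",  # assumed host name and port
    direct_networks=["192.168.0.0/16", "172.16.0.0/12", "10.0.0.0/8"],
    direct_domains=["training.local"],  # assumed Active Directory domain
)
print(pac_text)
```

External IP addresses or domains that shouldn't use the proxy can be passed in through the same two lists.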
Once you have set up the automatic configuration, you need to add a DNS alias named wpad that points to the Smoothwall.
In the example used here, the domain is named training.local and the Smoothwall host name is utm, so the alias wpad points to utm.training.local.
For the final configuration step, you need to remove the wpad name from the global query block list, which contains it by default on a Microsoft DNS server. The block list exists to prevent dynamic updates from registering a wpad host name. Because you have created the wpad record statically as described above, that risk no longer exists.
Use the DNSCMD options on your DNS servers, specifically the /config /globalqueryblocklist setting, to remove wpad from the list.
Now you need to enable the automatic proxy discovery option in client browsers. This is the default setting in Internet Explorer. When users are inside the domain network, their devices can resolve the wpad host name and get the proxy settings. When users are outside the network, their devices can't resolve the wpad host name, so the browser won't use a proxy.
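The behavior inside and outside the network can be sketched in a few lines of Python. This is a simplified model of DNS-based discovery only, and real browsers may try other discovery methods as well. The function name and the two stub resolvers are hypothetical; they simulate an internal DNS server that knows the wpad name and a public one that doesn't, so the example runs without network access.

```python
import socket


def discover_pac_url(dns_domain, resolve=socket.gethostbyname):
    """Return the wpad.dat URL if the wpad host resolves, otherwise None."""
    host = f"wpad.{dns_domain}"
    try:
        resolve(host)
    except OSError:
        # socket.gaierror is a subclass of OSError: the name did not resolve,
        # so the browser would fall back to a direct connection.
        return None
    return f"http://{host}/wpad.dat"


def inside_network(host):
    # Stub: pretend the internal DNS server answered with the alias target.
    return "192.0.2.10"


def outside_network(host):
    # Stub: pretend a public DNS server has no record for the name.
    raise socket.gaierror("name not found")


print(discover_pac_url("training.local", resolve=inside_network))
print(discover_pac_url("training.local", resolve=outside_network))
```

Calling discover_pac_url without the resolve argument performs a real DNS lookup, which is a quick way to check from a client whether the wpad alias is reachable.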