Switching from Squid To Apache
I've run a Squid cache on my home network for a long time. Setting it up was one of my very first Linux projects, motivated by the allure of being able to filter out ads and annoyances with AdZapper.
Squid is the kind of thing that runs by itself and almost never needs any looking after. But it's also capable of far more than a home network would ever need.
I'm much more familiar with Apache than I am with Squid. When I read that Apache could serve websites and act as a proxy server, a kill-two-birds-with-one-server project was born.
It seemed pretty straightforward. A note on the AdZapper homepage mentioned compatibility with Apache, and there are sample configurations available from the Apache documentation for mod_proxy and mod_cache, the two modules that make the proxy magic happen. In other words, cake as far as the eye can see!
But the cake was unexpectedly lumpy.
Lump #1 was the realization that AdZapper might be overkill, just like Squid. AdZapper consists of over 2,000 filter rules, which raised the question of whether I really needed all of them. For that matter, were they all still even valid? There was also the problem of getting AdZapper + Apache working in the first place. AdZapper requires a file called flush.pl, which has to do with Perl 4 compatibility and wasn't available with the default, Debian-stable installation of Perl that I had. It was a workable problem in that I could have just copied the file from another machine, but suddenly the idea of attaching a Perl-oriented script to what could/should otherwise be an Apache-only solution didn't seem like such a hot idea.
Lump #2 came from trying to convert AdZapper's rules into mod_rewrite-compatible syntax. AdZapper uses shell-style patterns rather than PCRE patterns. I wrote a script to negotiate the differences, but while trying to find the stray rule that was somehow preventing stylesheets from loading properly, I decided I'd be better off starting from scratch.
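For illustration, here's a minimal sketch of that kind of conversion. It is not the script mentioned above. It assumes the patterns use only * and ? as wildcards, which is a simplification of AdZapper's real syntax. The function names glob_to_regex and glob_to_rewrite_rule are made up for this example, and the ^proxy: prefix simply follows the rewrite rules in the configuration at the end of this post.

```shell
#!/bin/sh
# Sketch only: turn a simple shell-style pattern into a regex for mod_rewrite.
# Assumption: patterns contain only "*" and "?" as wildcards.

# glob_to_regex PATTERN
# Escapes regex metacharacters first, then expands the two wildcards.
glob_to_regex() {
    printf '%s\n' "$1" | sed \
        -e 's/[][\\.^$+(){}|]/\\&/g' \
        -e 's/\*/.*/g' \
        -e 's/?/./g'
}

# glob_to_rewrite_rule PATTERN
# Wraps the converted pattern in a blocking RewriteRule (hypothetical helper).
glob_to_rewrite_rule() {
    printf 'RewriteRule ^proxy:%s - [F,L]\n' "$(glob_to_regex "$1")"
}

glob_to_rewrite_rule 'http://*.doubleclick.net/ad*'
# prints: RewriteRule ^proxy:http://.*\.doubleclick\.net/ad.* - [F,L]
```

Escaping has to happen before the wildcards are expanded, otherwise the dots introduced by ".*" would get escaped too.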
Lump #3 was the mysterious appearance of exiting PIDs and floating point exceptions in the Apache error log. One site would get proxied just fine, then another would fail to load at all. Worse, requests for pages on the same site would occasionally hang for no apparent reason. I initially suspected mod_cache, but the culprit seems to have been DNS-based.
Finally, lump #4 was realizing the value of both memory- and disk-based caching. Although mod_mem_cache may have a speed advantage over mod_disk_cache, in that reading from memory is faster than reading from disk, a memory-based cache won't persist between restarts and reboots. I certainly can't tell the difference between the two when browsing. With a disk-based cache, you need to run the htcacheclean utility provided with Apache to prune out the old junk and keep the overall size of the cache within reasonable bounds.
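As a rough sketch of what a one-off cleanup run might look like: the cache path below matches the CacheRoot in the configuration at the end of this post, but the 100M limit is an arbitrary value picked for the example, and the script assumes htcacheclean is on the PATH.

```shell
#!/bin/sh
# Sketch only: prune the disk cache once, e.g. from a cron job.
# Assumptions: CACHE_ROOT matches CacheRoot in httpd.conf;
# CACHE_LIMIT is an example value, not a recommendation.
CACHE_ROOT=/var/proxycache
CACHE_LIMIT=100M

# -n  be nice: run slowly so other disk activity isn't starved
# -t  also delete empty directories
# -p  path to the cache root
# -l  upper limit for the total cache size
if command -v htcacheclean >/dev/null 2>&1; then
    htcacheclean -n -t -p"$CACHE_ROOT" -l"$CACHE_LIMIT"
else
    echo "htcacheclean not found in PATH; skipping cache cleanup" >&2
fi
```

htcacheclean can also run as a daemon with -d followed by an interval in minutes, which avoids the need for cron altogether.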
In spite of these problems, I think the switch was ultimately worthwhile. Ad filtering is working ok, and I'm gradually building out my own set of blocking rules. Caching is ok too-- pages load right quick, although comparisons to Squid would likely be like splitting hairs.
Here are the details of how I compiled Apache 2.2 for all this:
    ./configure --prefix=/usr/local/apache22 \
        --enable-proxy \
        --enable-proxy-connect \
        --enable-proxy-http \
        --enable-shared=max \
        --enable-mem-cache \
        --enable-disk-cache \
        --enable-cache \
        --enable-rewrite \
        --with-mpm=worker \
        --disable-userdir \
        --disable-authz-groupfile \
        --disable-include
The worker MPM is used here because I'm not going to use this Apache server for PHP or any other duties, and because it offers lower resource usage than the prefork MPM according to what I've read. Compiled modules are used instead of dynamic modules as a speed optimization.
Here's the proxy-related portion of httpd.conf. Nothing terribly special here; it's basically taken from the Apache documentation:
    ProxyRequests On
    ProxyVia On
    ProxyTimeout 3
    AllowCONNECT 443
    ProxyReceiveBufferSize 2048

    ErrorDocument 403 /zaps/403.html
    ErrorDocument 502 /zaps/502.html

    <IfModule mod_cache.c>
        CacheEnable mem /
        MCacheSize 10240
        MCacheMaxObjectCount 100
        MCacheMinObjectSize 1
        MCacheMaxObjectSize 2048

        CacheRoot /var/proxycache
        CacheEnable disk /
        CacheDirLevels 5
        CacheDirLength 3
    </IfModule>

    <Proxy *>
        Order deny,allow
        Deny from all
        Allow from <authorized ips>

        RewriteEngine On
        RewriteRule ^proxy:.*eyewonder.com$ - [F,L]
        RewriteRule ^proxy:.*doubleclick.net/ad.* - [F,L]
        ... further rewrite rules
    </Proxy>
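Since a stray rule can quietly block things it shouldn't (stylesheets, in my case), it can help to try a pattern against a few sample URLs before adding it. Here's a minimal sketch of that idea. It assumes, as the rules above do, that the string being matched is the requested URL prefixed with proxy:. It uses grep -E, which speaks POSIX extended regex rather than PCRE, so it's only a fair stand-in for simple patterns like these. The helper names rule_blocks and check, and the sample URLs, are made up for the example.

```shell
#!/bin/sh
# Sketch only: check which sample URLs a rewrite pattern would block.
# Assumption: mod_rewrite sees "proxy:" followed by the full URL.

# rule_blocks PATTERN URL: succeeds if PATTERN matches "proxy:URL".
rule_blocks() {
    printf 'proxy:%s\n' "$2" | grep -Eq -e "$1"
}

# check PATTERN URL: print a one-line verdict for the URL.
check() {
    if rule_blocks "$1" "$2"; then
        echo "BLOCKED  $2"
    else
        echo "allowed  $2"
    fi
}

check '^proxy:.*doubleclick.net/ad.*' 'http://ad.doubleclick.net/ad/banner.gif'
check '^proxy:.*doubleclick.net/ad.*' 'http://example.com/style.css'
```

The first URL should come back as blocked and the second as allowed. A stylesheet URL that shows up as blocked would point straight at the offending pattern.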