This is a report on how I implemented two different types of HTTP
accelerators with Squid (2.4.STABLE2). A more detailed write-up will
appear in the upcoming issue of the German computer magazine iX
(http://www.heise.de/ix/).


Setup 1: Simple accelerator with virtual hosts served by one Apache
-------------------------------------------------------------------

Usually Apache listens on port 80 (www) of the interfaces lo
(127.0.0.1) and eth0. This document describes the modifications to
httpd.conf that make Apache listen on the loopback interface only.
Squid is then configured to listen on the ethernet interface and to
forward all requests to the loopback interface (to Apache).

Before:

  Apache <-+-----------------------------> eth0:80 <--> Internet
           `-> localhost:80

After:

  Apache <-+                           ,-> eth0:80 <--> Internet
           `-> localhost:80 <-- Squid -'

First, all virtual host definitions in Apache's httpd.conf have to be
rewritten for the loopback interface. If your NameVirtualHost was set
to 192.168.1.1 before, it has to be changed to 127.0.0.1, and all the
virtual host addresses have to be set to 127.0.0.1, too. If you have
not done so already, make a backup of your original httpd.conf.

Next, all port statements in httpd.conf are removed and replaced with
the statement "Listen 127.0.0.1:80".

Because Apache now sees only a small fraction of the original requests
(those Squid cannot answer from its cache), change the log file names
to reflect that. Example:

  CustomLog /opt/apache/logs/access-cached.log cached

The only client Apache sees is Squid, so the usual log file format
definitions will log 127.0.0.1 as the IP address of all requests. The
IP addresses of the original clients are more meaningful. Squid
transmits that information in the header named X-Forwarded-For. Use
the following format to log it:

  LogFormat "%{X-Forwarded-For}i %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" cached

To reduce the resources needed by Apache, switch off keep-alives
(KeepAlive Off).
Usually you gain more by saving the memory of 50 Apache processes that
just wait in keep-alive state and increasing cache_mem for Squid
instead.

The following statements in squid.conf make Squid listen on
192.168.1.1:80 and send requests to 127.0.0.1:80:

  http_port 192.168.1.1:80
  httpd_accel_host 127.0.0.1
  httpd_accel_port 80
  httpd_accel_single_host on

More descriptive names would have been httpd_accel_destination_host
and httpd_accel_destination_port (that would settle the question of
whether "accel" belongs to "host" or to "httpd").

To have virtual hosts, the original Host headers of HTTP/1.1 must be
sent to Apache:

  httpd_accel_uses_host_header on
  redirect_rewrites_host_header off

It is safe to switch httpd_accel_uses_host_header on, because the
destination host is already fixed by "single_host" (127.0.0.1).

For compatibility with the extended log file format (anybody *not*
using it?) Squid has to be patched. The patch is available from

  http://www.idle.com/~roy/patch.elf                2.2.STABLE4
  http://wt.xpilot.org/projects/squid/http_accel/   2.4.STABLE2

The second one is just a tiny port of the first to 2.4. The patch
introduces a new option called emulate_extended_httpd_log, which is
switched on like this:

  cache_access_log /opt/apache/logs/access.log
  emulate_extended_httpd_log on

However, the logging format is not exactly what Apache defines as
"combined". Squid logs the full URL, not only the path. The patch does
not change that behaviour, so check that your analysis software can
read it. Besides referer and user agent, the patch also adds one more
field with Squid-internal data. Example:

  x.75.182.232 - - [15/Sep/2001:03:58:29 +0200] "GET http://www.x.de/politik/img/x.jpg HTTP/1.1" 200 2168 "http://www.x.de/politik/index.html" "Mozilla/4.0 (compatible)" TCP_HIT:NONE

I use a small script, split_squid_log.pl, to clean up the format
before feeding it to the statistics software.
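
The kind of clean-up mentioned above can be sketched in a few lines of
Perl. This is not split_squid_log.pl itself, only a minimal
illustration under two assumptions taken from the example line: the
extra Squid field is a single unquoted token at the end of the line,
and the request field is the first quoted field. The function name
to_combined is made up for this sketch.

```perl
use strict;
use warnings;

# Convert one line of the patched Squid log into something closer to
# Apache's "combined" format. Assumptions (from the example line only):
#  - the Squid-internal field is one unquoted token at the end
#  - the request is the first quoted field on the line
sub to_combined {
    my ($line) = @_;
    chomp $line;

    # Drop the trailing unquoted token, e.g. ' TCP_HIT:NONE'.
    $line =~ s/"\s+[^"\s]+\s*$/"/;

    # In the request field only, reduce 'scheme://host/path' to '/path'.
    # A URL without any path becomes '/'.
    $line =~ s{^([^"]*"\S+ )\w+://[^/\s"]+/?}{$1/};

    return "$line\n";
}

# Demo with the example line from the text.
my $sample = 'x.75.182.232 - - [15/Sep/2001:03:58:29 +0200] '
           . '"GET http://www.x.de/politik/img/x.jpg HTTP/1.1" 200 2168 '
           . '"http://www.x.de/politik/index.html" '
           . '"Mozilla/4.0 (compatible)" TCP_HIT:NONE';
print to_combined($sample);
```

To use it as a filter, replace the demo with a loop such as
"print to_combined($_) while <STDIN>;". Note that the referer keeps its
full URL; only the request field is shortened.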
For production use, you may also consider the following settings:

  request_body_max_size 100 MB
  acl QUERY urlpath_regex /edit-cgi/ \.mp3$
  no_cache deny QUERY

A big request_body_max_size is required for HTTP uploads of large
files. The no_cache definition prevents certain URLs from being
cached. With "refresh_pattern" you can force caching for some types of
CGI scripts.


Setup 2: Simple accelerator for one server process (with virtual hosts)
-----------------------------------------------------------------------

Apache's support for executing scripts under different UIDs is limited
to what suexec can do. The separation into an external set-uid
application may sound like a good idea at first, but after thinking
about it, I noticed there are only two cases for suexec: insecure or
unusable (read: either you let your 20 users share the same account
[after all, this is what Unix is for, isn't it?] or suexec will refuse
to work). If only Roxen Challenger 2.1 (http://www.roxen.com/) were a
little more stable under Linux 2.4...

OK, so what I did was to start five Apaches with different User and
Group settings. They listen on different IP addresses on the loopback
interface. Squid forwards the requests from the main port to them.

Basically the setup is like the first one described above, but with
the following additions:

  httpd_accel_host virtual
  redirect_program /etc/squid-redirect.pl
  redirect_children 5

The application /etc/squid-redirect.pl rewrites the requested URLs.
Example:

  http://www.x.de (Squid) -> http://www_x_de.loopnet (Apache)
  http://www.y.de (Squid) -> http://www_y_de.loopnet (Apache)

Yes, this requires some modification to your DNS server, but it is the
most flexible approach (it still gives you virtual hosts in HTTP/1.1
style; however, remember to define them in httpd.conf for the loopback
interface).

Here is a prototype of such a script (mine is too complex to serve as
an example):

  #!/usr/bin/perl -w
  use strict;

  $| = 1;    # unbuffered output: Squid waits for each answer

  # Hosts to rewrite; $1 captures the part that differs (x or y).
  my $vhosts = qr{http://www\.(x|y)\.de/}i;

  # Squid sends one request per line, with the URL first.
  while (defined(my $line = <STDIN>)) {
      unless ($line =~ s,^$vhosts,http://www_${1}_de.loopnet/,) {
          # Unknown host: fall back to the default virtual host.
          $line = "http://www_x_de.loopnet/\n";
      }
      print $line;
  }

Solutions written in C are available from here:

  http://ivs.cs.uni-magdeburg.de/~elkner/webtools/jesred/
  http://squirm.foote.com.au/
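
As a complement to the prototype above, the rewrite rule can be put
into a function and checked from the command line before Squid ever
calls it. The following is only a sketch under stated assumptions: the
backend name is derived from any host name by replacing dots with
underscores and appending ".loopnet" (the text only shows this mapping
for www.x.de and www.y.de), and the names rewrite_url and handle_line
are made up for this example. Unlike the prototype, it answers with an
empty line for URLs it does not understand, which tells Squid to leave
the URL unchanged.

```perl
use strict;
use warnings;

# Map 'http://www.x.de/path' to 'http://www_x_de.loopnet/path'.
# Returns undef when the URL is not a plain http:// URL.
# Assumption: every dot in the host name becomes an underscore.
sub rewrite_url {
    my ($url) = @_;
    my ($host, $rest) = $url =~ m{^http://([A-Za-z0-9.-]+)(?::80)?(/.*)?$}
        or return;
    (my $backend = lc $host) =~ tr/./_/;
    return "http://$backend.loopnet" . (defined $rest ? $rest : '/');
}

# Handle one redirector input line. Squid puts the URL first, followed
# by further whitespace-separated fields that are ignored here.
sub handle_line {
    my ($line) = @_;
    my ($url) = split ' ', $line;
    my $new = defined $url ? rewrite_url($url) : undef;
    # An empty reply line means "no change" to Squid.
    return defined $new ? "$new\n" : "\n";
}

# Demo: two lines as Squid might send them (client fields invented).
print handle_line("http://www.x.de/index.html 10.0.0.1/- - GET\n");
print handle_line("http://www.y.de/ 10.0.0.2/- - GET\n");
```

In a real redirector the demo lines would be replaced by "$| = 1;"
followed by "print handle_line($_) while <STDIN>;", since Squid waits
for each answer before sending the next request to that child.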