HTTP Proxy Servlet

Update: License Added (2010-06-05)

I have received a number of emails regarding the license under which this little piece of code is released, and I’ve been meaning to answer those requests for some time. Now I’m finally doing it. And the winner is: Apache License, Version 2.0. So yeah, I hereby release this software under the Apache license, yada, yada, yada. If this license is too restrictive for you, drop me a line and we can discuss other possible licensing options. Oh, and by the way: since originally writing this post, I have realized that Apache HTTPD’s mod_proxy is none too shabby, so I have actually started using that instead of this little proxy servlet. But if this helps you to accomplish your mission, then more power to ya!

I’m hoping to get around to hosting this on Google Code soon so that I can leverage their source control, issue tracker, wiki, etc. When I do, I’ll be sure to post a link to it here :)


So here’s the deal:

I have some applications running in an Apache server, and I have other applications running in a Tomcat server.
I like the fact that these applications are running in their respective servers, and I definitely don’t want to have to choose between using only Apache or only Tomcat.
The problem is, I can only bind one application to port 80 (the standard port for the HTTP protocol) at a time. But I like having my servers on port 80, because when I type a URL into my web browser, I don’t have to bother with the whole :<port> business to specify the port.
I really don’t want to have to remember which application is running in which server on what port when I enter a URL. For example, if I want to go to my wiki, I don’t want to find myself asking “OK, was the wiki running under Apache or Tomcat?”

Solutions:

The mod_jk connector solves this problem by configuring Apache and Tomcat to play nicely with each other. For most people, this is probably the best solution, but I don’t like it. My reasons for not liking the mod_jk solution are pretty arbitrary, but I find my justification in the desire for Loose Coupling of software systems, which has been ingrained into my very core through years of computer science education.
So, if we want to have loosely coupled webservers both appearing to operate on port 80, we are going to have to perform some magic with some kind of Transparent Proxy.

There are two possibilities for a proxy server configuration:

  1. Use a third-party proxy application such as Squid to accept all requests on port 80 and proxy them to the appropriate webserver.
  2. Configure one of the webservers to listen on port 80 and automagically proxy certain requests to the other webserver.

I’m sure that the former option is possible, but I don’t want to install Squid on my server. I’m running everything on a server that is cobbled together from spare parts, so it’s already over-taxed without me installing Squid on it.

I am therefore left only with the choice of which webserver will handle the proxying. Apache has some built-in proxying capabilities, and this guide explains how to set up Apache to proxy over to Tomcat. However, as far as I can tell, this requires that you update Apache’s httpd.conf every time a new webapp is installed in Tomcat. Alternatively, all webapps in Tomcat could have a certain path prefix (for example, you could configure Apache such that all requests to http://example.com/tomcat/* are proxied to Tomcat).
I am not looking to add any more administrative burden for myself when I deploy my applications, so this option is out for me.

Given the power of Tomcat for deploying applications, I figured it should be no trouble for a servlet to solve the proxying problem. Therefore, I have a small HTTP Proxy Servlet running in the root context of the Tomcat server (the so-called ROOT context), mapped to the URL pattern ‘/*’. This way, any request made to the Tomcat server that does not map to a certain web application gets handled by the HTTP Proxy Servlet sitting in the ROOT web application context.

Proxy Servlet Options:

In my search for a solution, I ran across a number of HTTP Proxy Servlets, but none that quite suited my fancy:

When I tried the servlet from Coldbeans Software, I was unable to get it to start right away, so I pretty much gave up on it right then and there.

The most full-featured servlet was Noodle, fully supporting custom filters and streaming. However, the software depends on a seemingly archaic HTTP client library called (creatively enough) HTTPClient. From my cursory investigation, it seems like the HTTPClient project died soon after it was integrated into Noodle. At the time of this writing (October 2007), Noodle ships with version 0.3-2 of the HTTPClient library, and the most recent version available is 0.3-3. Additionally, the HTTPClient website currently shows a date of “6. May 2001” for most of the pages (including the “Bugs fixed in V0.3-3” page), so it seems safe to say that the project is fairly well dead. I would never have known or cared about any of this info about the HTTPClient library except for the fact that it choked on a simple HTTP redirect, which the library claims to support.
Support for HTTP redirects is a crucial feature for what I want to accomplish, and I was not interested in gutting the Noodle source code of its current HTTPClient library usage and sticking a different HTTP Client library in its place.

The servlet from Frank’s Internet Playground worked out very well for me for a while. I did have to make a modification to the source code, as the two lines that handled redirects were commented out. I simply uncommented those two lines, and everything was hunky-dory. Unfortunately, I was trying to log in to a PHP-based web application through the proxy servlet one day, and to my chagrin, I discovered that the proxy servlet that I knew and loved did not properly pass “Set-Cookie” headers back to the client.
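I don’t know exactly why that servlet dropped the cookies, but one common way for a proxy to lose “Set-Cookie” headers is to keep only one value per header name when a backend sends several. Here is a minimal sketch of copying headers without collapsing them. This is an illustration only, not code from any of the servlets mentioned here; the class and method names (HeaderCopy, copyHeaders) are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, for illustration only.
public class HeaderCopy {

    // Flattens backend response headers into (name, value) pairs,
    // emitting one pair per value so that repeated headers such as
    // Set-Cookie are all passed along instead of being overwritten.
    public static List<String[]> copyHeaders(Map<String, List<String>> backendHeaders) {
        List<String[]> out = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : backendHeaders.entrySet()) {
            String name = entry.getKey();
            if (name == null) {
                continue; // some HTTP clients store the status line under a null key
            }
            for (String value : entry.getValue()) {
                out.add(new String[] {name, value});
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, List<String>> headers = new LinkedHashMap<>();
        headers.put("Content-Type", Arrays.asList("text/html"));
        headers.put("Set-Cookie", Arrays.asList("PHPSESSID=abc123; path=/", "lang=en; path=/"));

        // Each pair would be added (not set) on the response sent to the client.
        for (String[] header : copyHeaders(headers)) {
            System.out.println(header[0] + ": " + header[1]);
        }
    }
}
```

In a servlet, each pair would go to the client with HttpServletResponse.addHeader rather than setHeader, since setHeader replaces any earlier value with the same name.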

It was at this point that I decided to write my own proxy servlet, borrowing a good deal of code from Noodle, and using the Jakarta Commons HttpClient library, which seems to be under more active development than the HTTPClient library used by Noodle.

Implementation:

I have attached the proxy servlet as a WAR file. You can find the link at the bottom of the page.

Source code

I have also attached the source code for the proxy servlet separately. You can find that link at the bottom of the page as well.

Dependencies

Currently, all of the libraries I use are from the Jakarta Commons.

  Library              Version Used
  Commons HttpClient   3.1
  Commons Logging      1.1
  Commons Codec        1.3
  Commons FileUpload   1.2
  Commons IO           1.3.2

Servlet Configuration

To configure the servlet, put this in your web.xml and customize the init-param values to suit your needs:

  <servlet>
    <servlet-name>ProxyServlet</servlet-name>
    <servlet-class>net.edwardstx.ProxyServlet</servlet-class>
    <init-param>
      <param-name>proxyHost</param-name>
      <param-value>localhost</param-value>
    </init-param>
    <init-param>
      <param-name>proxyPort</param-name>
      <param-value>80</param-value>
    </init-param>
    <init-param>
      <param-name>proxyPath</param-name>
      <param-value></param-value>
    </init-param>
    <init-param>
      <param-name>maxFileUploadSize</param-name>
      <param-value></param-value>
    </init-param>
  </servlet>

...

  <servlet-mapping>
    <servlet-name>ProxyServlet</servlet-name>
    <url-pattern>/*</url-pattern>
  </servlet-mapping>
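
To give a rough idea of how the proxyHost, proxyPort, and proxyPath values above could combine with an incoming request to form the URL that gets fetched from the other server, here is a minimal sketch. It is not lifted from ProxyServlet.java: the class and method names (ProxyTarget, buildTargetUrl) are hypothetical, and the assumption that proxyPath is simply prepended to the request path is mine.

```java
// Hypothetical illustration of how the init-param values might be used.
public class ProxyTarget {
    private final String proxyHost;
    private final int proxyPort;
    private final String proxyPath;

    public ProxyTarget(String proxyHost, int proxyPort, String proxyPath) {
        this.proxyHost = proxyHost;
        this.proxyPort = proxyPort;
        // An empty <param-value> is treated as "no path prefix" (assumption).
        this.proxyPath = (proxyPath == null) ? "" : proxyPath;
    }

    // Builds the URL on the target server for a given request path and
    // optional query string (pass null when there is none).
    public String buildTargetUrl(String requestPath, String queryString) {
        StringBuilder url = new StringBuilder("http://").append(proxyHost);
        if (proxyPort != 80) {
            url.append(':').append(proxyPort); // omit the default HTTP port
        }
        url.append(proxyPath).append(requestPath);
        if (queryString != null && !queryString.isEmpty()) {
            url.append('?').append(queryString);
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // Same values as the web.xml above: localhost, port 80, empty path.
        ProxyTarget target = new ProxyTarget("localhost", 80, "");
        // prints http://localhost/wiki/index.php?title=Main_Page
        System.out.println(target.buildTargetUrl("/wiki/index.php", "title=Main_Page"));
    }
}
```

With the values shown in the web.xml, a request to Tomcat for /wiki/index.php would be passed along to the server listening on port 80 of the same machine.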

Attachments:

ROOT.war
ProxyServlet.java
