2006/01/19
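Compiling Apache for Microsoft Windows
There are many important points to be aware of before you begin compiling Apache. Before you start, read "Using Apache with Microsoft Windows".
Requirements
Compiling Apache requires the following environment to be properly installed:
Disk space
Make sure you have at least 50 MB of free disk space available. After installation Apache requires approximately 10 MB of disk space, plus space for log and cache files, which can grow rapidly. The actual disk space requirements vary considerably with your chosen configuration and any third-party modules or libraries.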
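Microsoft Visual C++ 5.0 or later
Apache can be built with the command-line tools or from within the Visual Studio IDE. The command-line build requires the environment to contain the PATH, INCLUDE, LIB and other variables, which can be set with the vcvars32 batch file:
"c:\Program Files\DevStudio\VC\Bin\vcvars32.bat"
The Windows Platform SDK
Visual C++ 5.0 builds require an updated Microsoft Windows Platform SDK to enable some Apache features. For command-line builds, the environment is prepared with the setenv batch file:
"c:\Program Files\Platform SDK\setenv.bat"
The Platform SDK files distributed with Visual C++ 6.0 and later are sufficient, so users of later versions may skip this requirement.
Note that an updated Windows Platform SDK is required to enable all supported mod_isapi features. Without a recent SDK, building Apache under MSVC++ 5.0 produces warnings that some mod_isapi features will be disabled. An updated Microsoft Windows Platform SDK is available at http://msdn.microsoft.com/downloads/sdks/platform/platform.asp.
The awk utility (awk, gawk or similar)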
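To install Apache within the build system, several files are modified using the awk.exe utility. awk was chosen because it is a very small download (compared with Perl or WSH/VB) and accomplishes the task of generating files. Brian Kernighan's site at http://cm.bell-labs.com/cm/cs/who/bwk/ has a compiled native Win32 binary, http://cm.bell-labs.com/cm/cs/who/bwk/awk95.exe, which you must save under the name awk.exe rather than awk95.exe.
Note that the Developer Studio IDE only finds awk.exe through the executable search path listed on the Directories tab of the Tools - Options menu (the Projects - VC++ Directories pane in Developer Studio 7.0). Add the path to awk.exe to this list, and to your system PATH environment variable as needed.
If you are using Cygwin (http://www.cygwin.com/), note that its awk utility is named gawk.exe and that the file awk.exe is really a symlink to gawk.exe. The Windows command shell does not recognize symlinks, so building the binary installation (the InstallBin project) fails. A workaround is to delete awk.exe from the Cygwin installation and rename gawk.exe to awk.exe.
[Optional] OpenSSL libraries (for mod_ssl and ab.exe with SSL support)
Caution: there are significant restrictions and prohibitions on the use and distribution of strong cryptography and patented intellectual property throughout the world. OpenSSL includes strong cryptography controlled by export regulations and domestic law in the United States and elsewhere, as well as intellectual property protected by patent. Neither the Apache Software Foundation nor the OpenSSL project can provide legal advice regarding possession, use, or distribution of the code provided by the OpenSSL project. Consult your own legal counsel; you are responsible for your own actions.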
To build mod_ssl or the abs project (ab.exe with SSL support), OpenSSL must be installed in a subdirectory of srclib named openssl. OpenSSL can be obtained from http://www.openssl.org/source/. To prepare OpenSSL for both release and debug builds, and to disable the patent-protected features in version 0.9.6, use the following build commands:
perl util\mkfiles.pl >MINFO
perl util\mk1mf.pl dll no-asm no-mdc2 no-rc5 no-idea VC-WIN32 >makefile
perl util\mk1mf.pl dll debug no-asm no-mdc2 no-rc5 no-idea VC-WIN32 >makefile.dbg
perl util\mkdef.pl 32 libeay no-asm no-mdc2 no-rc5 no-idea >ms\libeay32.def
perl util\mkdef.pl 32 ssleay no-asm no-mdc2 no-rc5 no-idea >ms\ssleay32.def
nmake
nmake -f makefile.dbg
[Optional] zlib sources (for mod_deflate)
Zlib must be installed in a subdirectory of srclib named zlib, but you do not need to compile those sources. The build system compiles the compression sources directly into the mod_deflate module. Zlib can be obtained from http://www.gzip.org/zlib/; mod_deflate is confirmed to build correctly with version 1.1.4.
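Command-Line Build
First, unpack the Apache source distribution into an appropriate directory. Open a command prompt window and cd to that directory.
The master Apache makefile instructions are contained in the file Makefile.win. To compile Apache on Windows NT, use one of the following commands to build the release or the debug version, respectively:
nmake /f Makefile.win _apacher
nmake /f Makefile.win _apached
Either command compiles Apache. The latter includes debugging information in the resulting files, which makes it easier to find bugs and track down problems.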
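Developer Studio Workspace IDE Build
Apache can also be compiled in VC++'s Visual Studio development environment. To simplify the process, a Visual Studio workspace file, Apache.dsw, is provided. It lists every .dsp project required for a complete Apache binary release, and it records the dependencies between projects so that they are built in the appropriate order.
Open the Apache.dsw workspace and select InstallBin (Release or Debug build, as desired) as the active project. InstallBin causes all related projects to be built, then invokes Makefile.win to move the compiled executables and DLLs. To customize the INSTDIR= choice, change the InstallBin project settings by editing the Build Command line entry on the General tab. INSTDIR defaults to the /Apache2 directory. If you only want a test compile (without installing), build the BuildBin project instead.
The .dsp project files are distributed in Visual C++ 6.0 format. Visual C++ 5.0 (97) also recognizes this format. Visual C++ 7.0 (.net) must convert Apache.dsw and the .dsp files into an Apache.sln and .vcproj files. If any source .dsp file changes, you must reconvert the corresponding .vcproj file. This is easy: just reopen Apache.dsw in the VC++ 7.0 IDE.
Visual C++ 7.0 (.net) users should also use the Configuration Manager dialog on the Build menu to uncheck the abs, mod_ssl and mod_deflate modules for both the Debug and Release configurations. These modules are built, either by invoking nmake or by explicitly using the BuildBin target from the IDE, only if the srclib directory contains an openssl and/or zlib subdirectory.
The exported .mak files are more of a hassle, but Visual C++ 5.0 users need them to build mod_ssl, abs (ab with SSL support) and mod_deflate. VC++ 7.0 (.net) users also benefit, since nmake builds are faster than binenv builds. Build the entire project from within the VC++ 5.0 or 6.0 IDE, then use Project menu - Export to export all makefiles. You must build the projects first so that all dynamically generated targets exist and the dependencies between them can be parsed correctly. Then run the following command to fix the paths so that the build works from any location:
perl srclib\apr\build\fixwin32mak.pl
You must run this command from the top-level directory of the httpd source tree. Every .mak and .dep project file in the current directory and below is corrected, and the timestamps are adjusted to match the .dsp files.
If you contribute a patch that revises project files, we must commit the project files in Visual Studio 6.0 format. Changes should be simple, with minimal compilation and linkage flags, so that every VC++ environment from 5.0 through 7.0 can recognize them.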
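Project Components
The Apache.dsw workspace and the Makefile.win nmake script both build the Apache server's .dsp projects in the following order:
srclib\apr\apr.dsp
srclib\apr\libapr.dsp
srclib\apr-util\uri\gen_uri_delims.dsp
srclib\apr-util\xml\expat\lib\xml.dsp
srclib\apr-util\aprutil.dsp
srclib\apr-util\libaprutil.dsp
srclib\pcre\dftables.dsp
srclib\pcre\pcre.dsp
srclib\pcre\pcreposix.dsp
server\gen_test_char.dsp
libhttpd.dsp
Apache.dsp
In addition, the modules\ subdirectory tree contains project files for most of the modules.
The support\ directory contains project files for additional programs that are not part of the Apache runtime, but that administrators use to test Apache and to maintain password and log files. Windows-specific support projects are in the support\win32\ directory.
support\ab.dsp
support\htdigest.dsp
support\htpasswd.dsp
support\logresolve.dsp
support\rotatelogs.dsp
support\win32\ApacheMonitor.dsp
support\win32\wintty.dsp
Once Apache has been compiled, it needs to be installed in its server root directory. The default is the \Apache2 directory on the same drive.
To build and install all the files into the desired directory dir automatically, use one of the following nmake commands:
nmake /f Makefile.win installr INSTDIR=dir
nmake /f Makefile.win installd INSTDIR=dir
The dir argument to INSTDIR gives the installation directory; it can be omitted if Apache is to be installed into \Apache2.
This installs the following:
dir\bin\Apache.exe - Apache executable
dir\bin\ApacheMonitor.exe - Service monitor taskbar icon utility
dir\bin\htdigest.exe - Digest auth password file utility
dir\bin\htdbm.exe - SDBM auth database password file utility
dir\bin\htpasswd.exe - Basic auth password file utility
dir\bin\logresolve.exe - Log file DNS name lookup utility
dir\bin\rotatelogs.exe - Log file rotation utility
dir\bin\wintty.exe - Console window utility
dir\bin\libapr.dll - Apache Portable Runtime shared library
dir\bin\libaprutil.dll - Apache Utility Runtime shared library
dir\bin\libhttpd.dll - Apache core library
dir\modules\mod_*.so - Loadable Apache modules
dir\conf - Configuration directory
dir\logs - Empty logging directory
dir\include - C language header files
dir\lib - Link library files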
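Warning about building Apache from the development tree
Between release builds, only the .dsp files are maintained. The .mak files are not regenerated, because doing so would waste an enormous amount of reviewers' time. Therefore, you cannot rely on the NMAKE commands above to build revised .dsp project files unless you export all .mak files from the project yourself. This is unnecessary if you build from within the Microsoft Developer Studio environment.
It is also very worthwhile to build the BuildBin target project (or the command-line _apacher or _apached target). Many files are generated automatically during the build process, and only a full build provides all the dependent files needed to build a proper dependency tree for correct build behavior.
To create .mak files for distribution, always check the Platform SDK and other header dependencies in the generated .mak (or .dep) files. The DevStudio\SharedIDE\bin\ (VC5) or DevStudio\Common\MSDev98\bin\ (VC6) directory contains the sysincl.dat file, which lists all exceptions, telling VC++ not to scan the listed files when it generates dependencies. Update this file to include those headers (listing both forward-slash and backslash paths, such as both sys/time.h and sys\time.h). A local install path in a distributed .mak file causes the build to fail completely, so don't forget to run srclib/apr/build/fixwin32mak.pl to fix the absolute paths in the .mak files.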
Posted by quake on 2006-01-19 15:16 | Views (731) | Comments (0) | Trackbacks (0) | Tech
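Getting Started with Apache2 Module Development on Win32
Apache modules are interesting things. O'Reilly published Writing Apache Modules with Perl and C, supposedly the ultimate reference on the subject. Unfortunately the book does not live up to its title: Perl takes up most of it and C is barely touched on (no surprise, since one of the authors wrote mod_perl). It was also written in 1999 against Apache 1.3, which differs in many ways from today's Apache2.
The tutorials available online all describe setup on Linux, whereas I need to test and develop with Apache on Win32, so I had to work it out myself.
I googled around and found no detailed material. Since others may need this kind of getting-started tutorial, here are the steps I followed today.
Install Apache 2.
Find the simplest Apache2 module source code you can (this post uses mod_tut1.c). The detailed explanation that accompanies that code discusses only Unix/Linux.
If you do not have Visual C++ 6.0 or Visual Studio .Net installed, you can install the Visual C++ Toolkit 2003, Microsoft's free C++ compiler.
Add the following at the top of mod_tut1.c:
#ifndef WIN32
#define WIN32
#endif
Add the Apache2 header and library directories to the INCLUDE and LIB environment variables. The following batch file can serve as a reference:
set LIB=%LIB%;E:\Progra~1\Apache~1\Apache2\lib
set INCLUDE=%INCLUDE%;E:\Progra~1\Apache~1\Apache2\include
Compile with cl /c mod_tut1.c to produce mod_tut1.obj.
Link with link /DLL mod_tut1.obj libhttpd.lib to produce mod_tut1.dll.
Rename mod_tut1.dll to mod_tut1.so and copy it into the modules subdirectory of the Apache2 installation.
Add this line to httpd.conf:
LoadModule tut1_module modules/mod_tut1.so
Restart Apache, visit http://localhost/ once, then open access.log and check whether a new "A request was made" entry has appeared. If it has, everything works. :)
Of course, the steps above are not the development process itself. The book mentioned earlier and the Apache documentation describe that in more detail, and it is the same on Unix/Linux and Win32, so I will not repeat it here.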
Posted by quake on 2006-01-19 14:35 | Views (989) | Comments (1) | Trackbacks (0) | Tech
Apache Module Development, Step 2: Getting User Input
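1. Understanding the apr_table_t structure
You can think of it as a hash table that supports getting and setting values. The most commonly used functions are:
apr_table_add
apr_table_set
apr_table_get
For example:
const char *slen = apr_table_get(r->headers_in, "Content-Length");
2. Understanding request_rec
This is the most important structure. It is defined at line 682 of httpd.h.
Note that it is the only parameter of the handler function:
static int mod_hello_method_handler (request_rec *r);
You can assume that everything you need can be reached through this structure.
Some important members:
apr_pool_t *pool;
/** The connection to the client */
conn_rec *connection;
/** The virtual host for this request */
server_rec *server;
int method_number; // request method, e.g. M_GET or M_POST
char *args; // query string of a GET request
apr_table_t *headers_in; // headers sent with the request
const char *handler; // name of the handler that should process the request
3. Reading HTTP headers
The headers are in r->headers_in, so:
const char *slen = apr_table_get(r->headers_in, "Content-Length");
4. Getting data passed with the GET method
The data is in r->args. Note that it has not been parsed, which means it is still URL-encoded. If you parse it yourself instead of using something like libapreq2, you must decode it yourself.
ap_log_rerror(APLOG_MARK, APLOG_ERR,0,r,"get query string:%s",r->args);
5. Getting data passed with the POST method
The data is in the buckets associated with the request_rec. Buckets are explained in the next step, so for now we simply read the data with ap_get_client_block, which itself uses bucket operations internally.
6. A simple example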
if (!r->handler || strcmp("hello-script", r->handler)) return DECLINED;
// get the command.
if (r->method_number == M_GET) {
    ap_log_rerror(APLOG_MARK, APLOG_ERR, 0, r, "get query string:%s",
                  r->args ? r->args : "(none)");
} else if (r->method_number == M_POST) {
    handle_post(r);
} else {
    return DECLINED;
}
The handle_post function:
void handle_post(request_rec *r)
{
    apr_size_t total_bytes = 0; // running count of body bytes read
    char cbuf[HUGE_STRING_LEN];
    long nbytes;
    // give up if the request body cannot be prepared for reading
    if (ap_setup_client_block(r, REQUEST_CHUNKED_DECHUNK) != OK) {
        return;
    }
    if (ap_should_client_block(r)) {
        // read at most sizeof(cbuf) - 1 bytes so the terminator always fits
        while ((nbytes = ap_get_client_block(r, cbuf, sizeof(cbuf) - 1)) > 0) {
            cbuf[nbytes] = '\0';
            ap_rputs(cbuf, r); // echo the body back to the client
            total_bytes += nbytes;
        }
    }
}
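7. Summary
We now have a basic understanding of modules: how to write one, where the data lives, and how to handle input and output. Next we need to learn how Apache works internally and which APR runtime functions are commonly used, and then use Apache's server framework to do more. Think of Apache as a socket server, not just a web server.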
Posted by quake on 2006-01-19 14:32 | Views (592) | Comments (0) | Trackbacks (0) | Tech
Apache 2.0 Module Development, Step 1
I. Goal
Write an Apache 2.0 module that reads its configuration and handles every request whose URL ends in .hello.
II. Steps
Create a file named mod_hello.c.
1. Declare the module.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "httpd.h"
#include "http_config.h"
#include "http_protocol.h" // ap_rputs
module AP_MODULE_DECLARE_DATA hello_module;
2. Define the module interface.
module AP_MODULE_DECLARE_DATA hello_module =
{
    STANDARD20_MODULE_STUFF,  // standard stuff; no need to mess with this.
    NULL,                     // create per-directory configuration structures - we do not.
    NULL,                     // merge per-directory - no need to merge if we are not creating anything.
    create_modhello_config,   // create per-server configuration structures.
    NULL,                     // merge per-server - the examples I have been reading don't bother with this for trivial cases.
    mod_hello_cmds,           // configuration directive handlers
    mod_hello_register_hooks, // registers the request handler
};
Notes:
create_modhello_config allocates space for the custom configuration structure. mod_hello_cmds defines the configuration directives and the functions that read them. mod_hello_register_hooks registers the request handler.
3. Initialize and read the configuration.
The configuration structure:
typedef struct {
    char *welcome;
    int max_process;
} modhello_config;
The directive definitions:
static const command_rec mod_hello_cmds[] =
{
    AP_INIT_TAKE1(
        "welcome",
        set_modhello_string,
        NULL,
        RSRC_CONF,
        "hello,apache"
    ),
    AP_INIT_TAKE1(
        "ModuleMaxProcess",
        set_modhello_string,
        NULL,
        RSRC_CONF,
        NULL
    ),
    {NULL}
};
Creating the configuration structure. Apache calls this function when it loads the module.
static void *create_modhello_config(apr_pool_t *p, server_rec *s)
{
    modhello_config *newcfg;
    // allocate space for the configuration structure from the provided pool p.
    newcfg = (modhello_config *) apr_pcalloc(p, sizeof(modhello_config));
    // return the new server configuration structure.
    return (void *) newcfg;
}
The function that reads the directives:
static const char *set_modhello_string(cmd_parms *parms, void *mconfig, const char *arg)
{
    modhello_config *s_cfg = ap_get_module_config(parms->server->module_config, &hello_module);
    // both directives share this function, so dispatch on the directive name
    if (!strcmp(parms->cmd->name, "welcome")) {
        s_cfg->welcome = (char *) arg;
    } else if (!strcmp(parms->cmd->name, "ModuleMaxProcess")) {
        s_cfg->max_process = atoi(arg);
    }
    // success
    return NULL;
}
4. Handle requests.
Registering the handler:
static void mod_hello_register_hooks (apr_pool_t *p)
{
    ap_hook_handler(mod_hello_method_handler, NULL, NULL, APR_HOOK_LAST);
}
The request handler:
static int mod_hello_method_handler (request_rec *r)
{
    modhello_config *s_cfg;
    // only handle requests mapped to our handler name
    if (!r->handler || strcmp("hello-script", r->handler)) return DECLINED;
    s_cfg = ap_get_module_config(r->server->module_config, &hello_module);
    // content_type may still be unset at this point, so guard against NULL
    fprintf(stderr, "%s,%s,%d\n",
            r->content_type ? r->content_type : "(null)",
            r->handler, s_cfg->max_process);
    ap_rputs("hello,world!", r);
    return OK;
}
III. Installation
1. Compile.
See the Makefile:
all: mod_hello.c
	gcc -g -I/home/wee/apache2/include/ -fPIC -o mod_hello.o -c mod_hello.c
	gcc -shared -I/home/wee/apache2/include/ -o libmodhello.so -lc mod_hello.o
	cp *.so /home/wee/apache2/modules/
clean:
	rm *.o *.so
2. Configure.
Edit httpd.conf.
Add the handler:
LoadModule hello_module modules/libmodhello.so
AddHandler hello-script .hello
Add the directives:
welcome "hello,world"
ModuleMaxProcess 5
3. Install.
gcc -v
gcc version 2.96 20000731 (Red Hat Linux 7.3 2.96-110)
make
IV. Testing
Visit http://xxx.xxx.xxx.xxx/a.hello. The page shows "hello,world!" and the message also appears in the log.
V. References
1. http://threebit.net/tutorials/apache2_modules/tut1/tutorial1.html
2. Writing Apache Modules with Perl and C (Lincoln Stein and Doug MacEachern)
3. http://www.apache.org/
Posted by quake on 2006-01-19 14:30 | Views (691) | Comments (0) | Trackbacks (0) | Tech
Two Servlet Filters Every Web Application Should Have 3
Caching Content Using a Servlet Filter
The second filter this article addresses is a cache filter. Caching is helpful because it saves time and processing power. The basic idea is that it takes time for a web application to generate content, and in many situations, the content won't change between different requests to a particular servlet or JSP. Therefore, if you simply save the exact output (e.g., HTML) that is produced for a given URI, you can recycle this content several times before having the web application generate it again. Assuming your cache is faster than the web application -- it almost always is -- the end result is that you save a large amount of the time and processing power required to generate a dynamic response. Currently, there is no official standard for caching web application content. However, building a simple, generic caching system is a straightforward process.
We will now begin to discuss building a simple cache filter. In general, caching at the filter level is most helpful, as it allows you to save the entire response any particular JSP or servlet generates. However, it is worth considering that you can certainly try to cache elsewhere; for instance, using a set of custom tags that auto-cache any content placed between them, or using a custom Java class to cache information retrieved from a database. Caching possibilities are endless, but for practical purposes we shall focus on implementing caching at the filter level.
Before seeing some code, let's make sure what I mean by "caching at the filter level" is clear. "Caching at the filter level" simply means using a standard servlet filter that will intercept all requests to a web application and attempt to intelligently use the cache. Should a valid cached copy of content exist in a cache, the filter will immediately respond to the request by sending a copy of the cache. However, if no cache exists, the filter will pass the request on to its intended endpoint, usually a servlet or JSP, and the response will be generated as it normally is. Once a response is successfully generated, it will also be cached, so that on future requests to the same resource, the cache may be used.
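To make the hit-or-miss flow just described concrete, here is a minimal sketch in plain Java. It is not the filter from the book: the class name ResponseCache, the in-memory map, and the Supplier standing in for "the servlet or JSP" are all assumptions made for illustration, and the real filter stores its copies on disk instead.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** Hypothetical in-memory stand-in for the filter's cache; not part of jspbook.jar. */
public class ResponseCache {
    private final Map<String, byte[]> entries = new ConcurrentHashMap<>();

    /** Returns the cached bytes for key, generating and storing them on a miss. */
    public byte[] getOrGenerate(String key, Supplier<byte[]> generator) {
        byte[] cached = entries.get(key);
        if (cached != null) {
            return cached;              // cache hit: skip the expensive work
        }
        byte[] fresh = generator.get(); // cache miss: run the "servlet"
        entries.put(key, fresh);        // keep the output for later requests
        return fresh;
    }

    public static void main(String[] args) {
        ResponseCache cache = new ResponseCache();
        int[] runs = {0};               // counts how often the page is generated
        Supplier<byte[]> page = () -> {
            runs[0]++;
            return "<html>hello</html>".getBytes(StandardCharsets.UTF_8);
        };
        cache.getOrGenerate("/TimeMonger.jsp", page);
        cache.getOrGenerate("/TimeMonger.jsp", page);
        System.out.println("generator ran " + runs[0] + " time(s)"); // prints "generator ran 1 time(s)"
    }
}
```

Two threads that miss at the same moment may both run the generator in this sketch; that is harmless here because either result is valid, but a production cache might want to coordinate them.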
Understand that as this filter is intended to be used on an entire web application, it can cache all of the various responses from different servlets and JSPs. Think about how this is possible: each servlet or JSP will likely produce a different response. The filter will need to be able to distinguish between different responses, store the appropriate content somewhere, and correctly match a cached copy of the content to an incoming request. Doing all of this is no problem at all -- different requests can almost always be distinguished by the requested URI, and the same information can be used to identify cached resources. Cached content can be stored either in memory, on the hard disk, or via any other method your server allows for -- usually, the hard disk is a great solution. With all of that said, here is the code for a filter that caches content in the web application's temporary directory. The full code is given below, and important parts of the code are highlighted after the listing.
package com.jspbook;
import java.io.*;
import javax.servlet.*;
import javax.servlet.http.*;
import java.util.Calendar;
public class CacheFilter implements Filter {
  ServletContext sc;
  FilterConfig fc;
  long cacheTimeout = Long.MAX_VALUE;
  public void doFilter(ServletRequest req,
                       ServletResponse res,
                       FilterChain chain)
      throws IOException, ServletException {
    HttpServletRequest request =
        (HttpServletRequest) req;
    HttpServletResponse response =
        (HttpServletResponse) res;
    // check if was a resource that shouldn't be cached.
    String r = sc.getRealPath("");
    String path =
        fc.getInitParameter(request.getRequestURI());
    if (path != null && path.equals("nocache")) {
      chain.doFilter(request, response);
      return;
    }
    // prefix the web application root only when a path was configured;
    // concatenating a null path would produce a string ending in "null"
    if (path != null) {
      path = r + path;
    }
    String id = request.getRequestURI() +
        request.getQueryString();
    File tempDir = (File) sc.getAttribute(
        "javax.servlet.context.tempdir");
    // get possible cache
    String temp = tempDir.getAbsolutePath();
    File file = new File(temp + id);
    // get current resource
    if (path == null) {
      path = sc.getRealPath(request.getRequestURI());
    }
    File current = new File(path);
    try {
      long now =
          Calendar.getInstance().getTimeInMillis();
      // set timestamp check
      if (!file.exists() || (file.exists() &&
          current.lastModified() > file.lastModified()) ||
          cacheTimeout < now - file.lastModified()) {
        String name = file.getAbsolutePath();
        name =
            name.substring(0, name.lastIndexOf("/"));
        new File(name).mkdirs();
        ByteArrayOutputStream baos =
            new ByteArrayOutputStream();
        CacheResponseWrapper wrappedResponse =
            new CacheResponseWrapper(response, baos);
        chain.doFilter(req, wrappedResponse);
        FileOutputStream fos = new FileOutputStream(file);
        fos.write(baos.toByteArray());
        fos.flush();
        fos.close();
      }
    } catch (ServletException e) {
      if (!file.exists()) {
        throw new ServletException(e);
      }
    }
    catch (IOException e) {
      if (!file.exists()) {
        throw e;
      }
    }
    // stream the cached copy to the client, always closing the file
    FileInputStream fis = new FileInputStream(file);
    try {
      String mt = sc.getMimeType(request.getRequestURI());
      response.setContentType(mt);
      ServletOutputStream sos = res.getOutputStream();
      for (int i = fis.read(); i != -1; i = fis.read()) {
        sos.write((byte) i);
      }
    } finally {
      fis.close();
    }
  }
  public void init(FilterConfig filterConfig) {
    this.fc = filterConfig;
    String ct =
        fc.getInitParameter("cacheTimeout");
    if (ct != null) {
      // the parameter is given in minutes; convert to milliseconds
      cacheTimeout = 60 * 1000 * Long.parseLong(ct);
    }
    this.sc = filterConfig.getServletContext();
  }
  public void destroy() {
    this.sc = null;
    this.fc = null;
  }
}
First note that the code is part of the com.jspbook package. This code is the Servlet-2.4-compliant cache filter that is detailed in the book. It is tested code that is used in several web applications, and is maintained at the book's support site, http://www.jspbook.com. This is no contrived example; it is serious code.
The next thing I'd like to draw attention to is how the servlet identifies caches and saves them to the local hard disk. As mentioned before the code, the filter uses the request URI and any parameters in the query string to generate a unique name for the cache.
String id = request.getRequestURI()+request.getQueryString();
Once the filter has this unique name, it uses the name to check if the resource exists in the web application's cache. If it does, the cached copy is sent and the filter does not pass the request and response down the filter chain. If no cache exists, the filter passes the request and response down the filter chain so that the desired JSP or servlet can generate a response. Once the response is made, the cache filter sends it to the client and makes a copy of the response in the web application's cache.
// use the web applications temporary work directory
File tempDir =
    (File) sc.getAttribute("javax.servlet.context.tempdir");
// look to see if a cached copy of the response exists
String temp = tempDir.getAbsolutePath();
File file = new File(temp + id);
// get a reference to the servlet/JSP
// responsible for this cache
if (path == null) {
  path = sc.getRealPath(request.getRequestURI());
}
File current = new File(path);
// check if the cache exists and is newer than the
// servlet or JSP responsible for making it.
try {
  long now = Calendar.getInstance().getTimeInMillis();
  // set timestamp check
  if (!file.exists() || (file.exists() &&
      current.lastModified() > file.lastModified()) ||
      cacheTimeout < now - file.lastModified()) {
    // if not, invoke chain.doFilter() and
    // cache the response
    String name = file.getAbsolutePath();
    name = name.substring(0, name.lastIndexOf("/"));
    new File(name).mkdirs();
    ByteArrayOutputStream baos =
        new ByteArrayOutputStream();
    CacheResponseWrapper wrappedResponse =
        new CacheResponseWrapper(response, baos);
    chain.doFilter(req, wrappedResponse);
    FileOutputStream fos = new FileOutputStream(file);
    fos.write(baos.toByteArray());
    fos.flush();
    fos.close();
  }
} catch (ServletException e) {
  if (!file.exists()) {
    throw new ServletException(e);
  }
}
catch (IOException e) {
  if (!file.exists()) {
    throw e;
  }
}
// return to the client the cached resource.
FileInputStream fis = new FileInputStream(file);
String mt = sc.getMimeType(request.getRequestURI());
response.setContentType(mt);
ServletOutputStream sos = res.getOutputStream();
for (int i = fis.read(); i != -1; i = fis.read()) {
  sos.write((byte) i);
}
And that is a basic cache filter. Two support classes are needed -- CacheResponseStream and CacheResponseWrapper -- but they are nothing more than implementations of the ServletOutputStream class and HttpServletResponseWrapper class that are appropriate for CacheFilter.java. The full source code for everything is given at the end of this article, but to keep things moving along, I'll have you use a JAR file that includes the compiled cache filter. If you didn't already for the compression filter, grab a copy of jspbook.jar and put it in the WEB-INF/lib directory of your favorite web application and deploy the filter to intercept all requests going to resources ending in .jsp, and reload the web application for the changes to take effect. Next we will make a simple JSP to test the code.
Here is the complete code for a simple JSP that tests the cache filter. The JSP wastes time and processing power by executing several loops. Save the following as TimeMonger.jsp somewhere in your web application.
<html>
<head>
<title>Cache Filter Test</title>
</head>
<body>
A test of the cache Filter.
<%
// mock time-consuming code
for (int i=0;i<100000;i++) {
for (int j=0;j<1000;j++) {
//noop
}
}
%>
</body>
</html>
Browse to TimeMonger.jsp for the ever-so-sophisticated cache test. Notice how long it takes to generate the page; it should take several seconds due to the embedded for loops. Now browse to the page once again; notice that it appears near-instantly. Continue browsing to the page and notice it will continue to appear near-instantly. This is the cache filter in action. After the page is generated once, a copy is saved in your web application's temporary work directory (on Tomcat, this is in a subdirectory of ./work), and on subsequent requests, this cache is used instead of executing the JSP. You can test this by deleting the cache file located in your web application's temporary directory, and browsing to the page. Once again it will take several seconds to load.
We can quantify the time difference by making a simple JSP that spoofs two HTTP requests and measuring the time it takes for each request to be answered. To ensure that the test works, we will have to delete the cache before running the JSP. This will force the first HTTP request to execute the JSP and allow the second request to hit the cache. Here is the code for the needed JSP.
<%@ page import="java.util.*,
java.net.*,
java.io.*" %>
<%
String url = request.getParameter("url");
long[] times = new long[2];
if (url != null) {
for (int i=0;i<2;i++) {
long start =
Calendar.getInstance().getTimeInMillis();
URL u = new URL(url);
HttpURLConnection huc =
(HttpURLConnection)u.openConnection();
huc.setRequestProperty("user-agent",
"Mozilla(MSIE)");
huc.connect();
ByteArrayOutputStream baos =
new ByteArrayOutputStream();
InputStream is = huc.getInputStream();
// call read() once per byte; a second call inside the loop would skip bytes
int b;
while ((b = is.read()) != -1) {
baos.write(b);
}
long stop =
Calendar.getInstance().getTimeInMillis();
times[i] = stop-start;
}
}
request.setAttribute("t1", new Long(times[0]));
request.setAttribute("t2", new Long(times[1]));
request.setAttribute("url", url);
%><html>
<head>
<title>Cache Test</title>
</head>
<body>
<h1>Cache Test Page</h1>
Enter a URL to test.
<form method="POST">
<input name="url" size="50">
<input type="submit" value="Check URL">
</form>
<p><b>Testing: ${url}</b></p>
Request 1: ${t1} milliseconds<br/>
Request 2: ${t2} milliseconds<br/>
Time saved: ${t1-t2} milliseconds<br/>
</body>
</html>
Save the above code in your web application. Next, delete the temporary directory of that web application in order to ensure that there is no cache. Now browse to the cache test page. Initially, a blank page appears with a simple HTML form, as shown here.

Figure 4. Blank cache test page
Just as with the compression test page, fill out the URL that the cache-testing JSP should check. Any value will do, but for this example let us test TimeMonger.jsp -- a JSP we know takes a relatively long amount of time to execute. Here is what the cache-testing JSP returns after testing TimeMonger.jsp.

Figure 5. Cache test page used on TimeMonger.jsp
Notice that TimeMonger.jsp normally takes about five seconds to execute, but when a cache is used, it takes a hundredth of the time. If you like, try the page again and notice that the cache will continue to be used; each response will take about 50 milliseconds. However, if you delete the cache and force the JSP to execute, you will once again see the page take about five seconds to execute before it is once again cached.
The point to see is that CacheFilter.java is saving a copy of the HTML it used in a response and reusing it instead of executing dynamic code. This results in time-consuming and processor-intensive code being skipped. In TimeMonger.jsp, the skipped code was a few for loops -- admittedly, a poor example. But understand that the dynamic code can be anything, such as a database query or an execution of any custom Java code. The time it takes to retrieve content from the cache will always be about the same; in this example, it was about 50 milliseconds. Therefore, you can increase the speed of just about any dynamic page to be roughly 50 milliseconds, no matter how time intensive the page is.
Cache Filter Summary and Good Practice Tips
Once again you have been presented with a filter that is incredibly helpful and near-trivial to use. Caching can save enormous amounts of your server's time and processing power, and caching is as easy to implement as putting a copy of jspbook.jar in your web application's WEB-INF/lib directory and deploying the filter to intercept requests to any resource you want to cache. I suggest you use a caching filter as much as possible in order to speed your web application up to peak performance.
While caching can save a web application a lot of time and processing power, and it can make even the most complex server-side code appear to execute unbelievably fast, caching is not suitable for everything. Some pages can't be cached because the page's content must be dynamically generated each time the page is viewed -- for instance, a web site that lists stock quotes. Often, though, resources that are supposedly always dynamic can really be cached for short periods of time. For example, consider news.google.com: content is cached for a few minutes at a time to save server-side resources, but the cache is updated quickly enough to make the site appear to be completely dynamic. In the given cache filter code, you can configure whether the filter caches a particular resource at all, and how long the filter uses a cache before updating it. Both of these are initial configuration elements.
<filter>
  <filter-name>CacheFilter</filter-name>
  <filter-class>com.jspbook.CacheFilter</filter-class>
  <init-param>
    <param-name>/timemonger.jsp</param-name>
    <param-value>nocache</param-value>
  </init-param>
  <init-param>
    <param-name>cacheTimeout</param-name>
    <param-value>1</param-value>
  </init-param>
</filter>
<filter-mapping>
  <filter-name>CacheFilter</filter-name>
  <url-pattern>*.jsp</url-pattern>
</filter-mapping>
Note that init-param elements belong to the filter declaration, not to the filter-mapping. To tell the cache filter that a resource shouldn't be cached, set an initial configuration element of the same name as the resource's request URI to have a value of nocache. Because the filter looks the name up with request.getRequestURI(), the name must match the request URI exactly, including any context path and the letter case. To configure how long the filter waits before updating cached content, change the cacheTimeout initial configuration parameter to a numerical value that represents the number of minutes a cache is valid. Both of these features are specific only to this cache filter. Feel free to examine CacheFilter.java to see exactly how they are implemented.
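The timeout rule described above can be isolated into a small, testable function. The sketch below is only an approximation of the filter's timestamp test under stated assumptions: the class name CacheFreshness and the method needsRefresh are hypothetical, timestamps are passed in as plain numbers instead of being read from File objects, and a cache timestamp of 0 is taken to mean "no cache file", which is what File.lastModified() returns for a missing file.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper mirroring the filter's timestamp test; names are illustrative. */
public class CacheFreshness {

    /**
     * Decides whether the cached copy must be rebuilt.
     * A cacheLastModified of 0 is treated as "no cache file exists".
     */
    public static boolean needsRefresh(long cacheLastModified,
                                       long sourceLastModified,
                                       long timeoutMinutes,
                                       long nowMillis) {
        if (cacheLastModified == 0L) {
            return true;                 // nothing cached yet
        }
        if (sourceLastModified > cacheLastModified) {
            return true;                 // the JSP changed after it was cached
        }
        // cacheTimeout is configured in minutes; compare in milliseconds
        long timeoutMillis = TimeUnit.MINUTES.toMillis(timeoutMinutes);
        return nowMillis - cacheLastModified > timeoutMillis;
    }

    public static void main(String[] args) {
        long cached = 60_000L;           // cache written one minute after time zero
        long source = 30_000L;           // JSP last edited before that
        // 30 seconds after caching, with a 1-minute timeout: still fresh
        System.out.println(needsRefresh(cached, source, 1, 90_000L));  // prints false
        // 2 minutes after caching: expired
        System.out.println(needsRefresh(cached, source, 1, 180_000L)); // prints true
    }
}
```

Passing the current time in as a parameter, instead of reading the clock inside the method, is a design choice made here so that the rule can be checked with fixed values.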
In general, a cache filter is a very powerful enhancement to add to a web application. Cached content can be served to users as fast as the server can read files from disk (or memory, if you keep the cache in RAM), which is almost always much faster than executing a servlet or JSP, especially complex, database-driven pages. However, caching must be done in an intelligent manner. Some pages simply can't be cached, or they can only be cached for a few minutes at a time. Make sure you cache as much of your web application's content for as long as you can, and be sure to configure the cache filter to appropriately handle pages that either shouldn't be cached or should only be cached for short periods of time.
Conclusion
Every web application should have a caching filter and a compression filter. These two filters optimize how quickly a web application generates content and how long it takes the content to be sent across the World Wide Web, both of which are arguably the most important tasks a web application performs. The code presented in this article provides a good implementation of each of these filters. The code is both free and open source. If you don't want to build your own caching and compression support, simply deploy the jspbook.jar with your web application and reap the rewards. If you do wish to develop your own caching and compression support, you have the full code to both of these filters, and you can get any updates to the code from the book's support site, www.jspbook.com. Take the code and go!
Links
- Servlets and JSP; the J2EE Web Tier Book Support Site. Check here for the latest code for both the cache and compression filter. The authors actively maintain the book's code, and attempt to fix any bugs that may be present. You can also find lots of other free code examples and excerpts from the book itself.
- jspbook.jar. A ready-to-use JAR with compiled versions of both the cache and compression filter.
- jspbook.zip. All source code for this article in one ZIP.
Jayson Falkner is a J2EE developer, student, and webmaster of JSP Insider.

