This page documents the attempt to verify this requirement.
I wrote a sample page for this test. It expands four real numbers into continued fractions. I created three versions:
<include>
s chain-frac-1.
The reason for creating chain-frac-2.adp
is that in
this way, the script chain-frac-1.tcl
is handled
inside the templating system, and hence loaded once and cached.
There is hope that this might be faster.
Normally, the templating system re-reads a file whenever the
version on disk is out of date. ADP pages are compiled to TCL,
and both ADP and TCL pages are cached as TCL procs. The
parameter RefreshCache
in section
template
can be set to always
or
never
to affect the cacheing strategy; the latter
may be useful for a production site. All timing is carried out
for the three settings always
, normal
,
and never
; the associated variable is called
check
.
I created a script in e-Tester that requests the three pages
from my development server on dev0103-001. One timing of
requesting a page isn't very expressive. A lot of factors
affect the page load time. To offset these and get better
results, I repeatedly request the pages. For the timing, I have
e-Tester iterate this script 200 times. To compesate for
varying load on the machine, i ran the iteration twice for each
setting of RefreshCache
at different times of the
day.
The timing information is taken from the error log file entries
that the request processor produces with parameter
LogDebugP=1
. For finer granularity I changed
rp_debug to divide the clock clicks (microsecond) difference by
1000.0 instead of 1000. Delivering a page gives us 15 log file
entries with timestamps. I treat the first one (? ms) as 0.
There must be no other page requests from this AOLserver during
the measurement. I note the length of the error log before and
after one run of the script. Afterwards I cut out the error log
sections indicated by these positions into files
never
, normal
, and
always
.
The following steps extract the relevant information and bring it in a form suitable for gnuplot.
foreach i ( never normal always ) fgrep '] Notice: RP (' $i > $i-0 echo > $i-1 foreach conn (`cut -d- -f2 always-0 | sort -u`) echo "$i '$conn'" fgrep "[-$conn-]" $i-0 | cut -d: -f5- >> $i-1 end cut -d" " -f3 $i-1| cut -c2-| tr \? 0 > $i-2 endThe three resulting files, ending in -2, contain 18000 numbers, for 2 runs * 200 tries * 3 pages * 15 stages.
dev0103-001:/home/brech/prog/tcl/timing.tcl
,
which generates the directories stage00
to
stage14
and the ten files
{0
,1
,2
}-{never
,normal
,always
}, and
t_max
in each of them. Each of the former files
contains the 2*200 samples, ordered for graphing.
The different experiments are color coded as follows.
never | normal | always | |
---|---|---|---|
chain-frac-0 | |||
chain-frac-1 | |||
chain-frac-2 |
A number of statistic measures can summarize an ensemble of data, in our case, the 400 timings. The average is affected by outliers; the median is much more robust. Maybe some dispersion measure would be of interest, too. Some people plot histograms, but that approach is frought with its own problems. I chose to use all the data and plot estimated distribution functions. Our random variable being time T, its distribution function is defined as
The curve always increases monotonically from 0 to 1. In case of a normal distribution, you get the erf shape; for a uniform distribution it is a ramp. This form of presentation throws away no information, and it shows all about the distribution of a single variable. I am pretty sure the times that different stages of one request take are statistically dependent, but I don't consider that for the time being. The median is the abscissa t at which the ordinate F(t)=1/2.
The curves for all nine experiments are drawn in the same graph for comparison. Note that the abscissa is scaled very differently for various stages.
#! /bin/csh foreach i (stage[01][0-9]) (cd $i; gnuplot ../distrib.gplt > $i.gif) echo $i done. end
Currently, the template system doesn't meet the performance requirement.
Earlier on dev0103-001, templated pages loaded fast enough. Although the processing stage seems to a lot be more than 10% slower, the overall performance is rather increased than slowed by templating.
On reusia.boston, we had a much better performance of the request processor. The processing times of the pages proper (stage 10 before, 11 now) didn't change much; we just got clearer results. The total processing time of TCL-only pages is around 155ms, while templated pages take around 220ms, that is 42% longer (or TCL-only pages take 30% less).
Relative times depend on the other components of the pipeline. The difference of 65ms is a large percentage of a total serving time of 155ms; when other parts of the system (e.g., the request processor) were slower, this wasn't that noticeable.
For ACS 3.4, TCL-only chain-frac-0 pages take 115ms, where the templated versisons are much slower, 320ms for chain-frac-1 and 340 for -2.
Tune templating in ACS 4.0.