This page documents the attempt to verify this requirement.
- 110.0 Performance
The Templating System must not cause any performance problems to a site. It must be fast and efficient, and it must not slow down page load speed by more than 10% versus a Tcl page with inline HTML.
I wrote a sample page for this test. It expands four real numbers into continued fractions. I created three versions:
<include>
s chain-frac-1.The reason for creating chain-frac-2.adp
is that in
this way, the script chain-frac-1.tcl
is handled
inside the templating system, and hence loaded once and cached.
There is hope that this might be faster.
Normally, the templating system re-reads a file whenever the
version on disk is out of date. ADP pages are compiled to TCL, and
both ADP and Tcl pages are cached as Tcl procs. The parameter
RefreshCache
in section template
can be
set to always
or never
to affect the
caching strategy; the latter may be useful for a production site.
All timing is carried out for the three settings
always
, normal
, and never
;
the associated variable is called check
.
I created a script in e-Tester that requests the three pages
from my development server on dev0103-001. One timing of requesting
a page isn't very expressive. A lot of factors affect the page
load time. To offset these and get better results, I repeatedly
request the pages. For the timing, I have e-Tester iterate this
script 200 times. To compesate for varying load on the machine, i
ran the iteration twice for each setting of
RefreshCache
at different times of the day.
The timing information is taken from the error log file entries
that the request processor produces with parameter
LogDebugP=1
. For finer granularity I changed rp_debug
to divide the clock clicks (microsecond) difference by 1000.0
instead of 1000. Delivering a page gives us 15 log file entries
with timestamps. I treat the first one (? ms) as 0. There must be
no other page requests from this AOLserver during the measurement.
I note the length of the error log before and after one run of the
script. Afterwards I cut out the error log sections indicated by
these positions into files never
, normal
,
and always
.
The following steps extract the relevant information and bring it in a form suitable for gnuplot.
foreach i ( never normal always ) fgrep '] Notice: RP (' $i > $i-0 echo > $i-1 foreach conn (`cut -d- -f2 always-0 | sort -u`) echo "$i '$conn'" fgrep "[-$conn-]" $i-0 | cut -d: -f5- >> $i-1 end cut -d" " -f3 $i-1| cut -c2-| tr \? 0 > $i-2 endThe three resulting files, ending in -2, contain 18000 numbers, for 2 runs * 200 tries * 3 pages * 15 stages.
dev0103-001:/home/brech/prog/tcl/timing.tcl
, which
generates the directories stage00
to
stage14
and the ten files
{0
,1
,2
}-{never
,normal
,always
},
and t_max
in each of them. Each of the former files
contains the 2*200 samples, ordered for graphing.
The different experiments are color coded as follows.
never | normal | always | |
---|---|---|---|
chain-frac-0 | |||
chain-frac-1 | |||
chain-frac-2 |
A number of statistic measures can summarize an ensemble of data, in our case, the 400 timings. The average is affected by outliers; the median is much more robust. Maybe some dispersion measure would be of interest, too. Some people plot histograms, but that approach is frought with its own problems. I chose to use all the data and plot estimated distribution functions. Our random variable being time T, its distribution function is defined as
The curve always increases monotonically from 0 to 1. In case of a normal distribution, you get the erf shape; for a uniform distribution it is a ramp. This form of presentation throws away no information, and it shows all about the distribution of a single variable. I am pretty sure the times that different stages of one request take are statistically dependent, but I don't consider that for the time being. The median is the abscissa t at which the ordinate F(t)=1/2.
The curves for all nine experiments are drawn in the same graph for comparison. Note that the abscissa is scaled very differently for various stages.
#! /bin/csh foreach i (stage[01][0-9]) (cd $i; gnuplot ../distrib.gplt > $i.gif) echo $i done. end
Currently, the template system doesn't meet the performance requirement.
Earlier on dev0103-001, templated pages loaded fast enough. Although the processing stage seems to a lot be more than 10% slower, the overall performance is rather increased than slowed by templating.
On reusia.boston, we had a much better performance of the request processor. The processing times of the pages proper (stage 10 before, 11 now) didn't change much; we just got clearer results. The total processing time of Tcl-only pages is around 155ms, while templated pages take around 220ms, that is 42% longer (or Tcl-only pages take 30% less).
Relative times depend on the other components of the pipeline. The difference of 65ms is a large percentage of a total serving time of 155ms; when other parts of the system (e.g., the request processor) were slower, this wasn't that noticeable.
For ACS 3.4, Tcl-only chain-frac-0 pages take 115ms, where the templated versions are much slower, 320ms for chain-frac-1 and 340 for -2.
Tune templating in ACS 4.0.