log_latency

Usage:
 -l log file
      default: /var/log/aerospike/aerospike.log
 -h histogram name
      MANDATORY - NO DEFAULT
      e.g. 'reads', 'writes_master', 'proxy'
 -t analysis slice interval
      default: 10
      other e.g. 3600 or 1:00:00
 -f log time from which to analyze
      default: tail
      other e.g. head or 'Sep 22 2011 22:40:14' or -3600 or -1:00:00
 -d maximum duration for which to analyze
      default: not set
      e.g. 3600 or 1:00:00
 -n number of buckets to display
      default: 3
 -e show 0-th then every n-th bucket
      default: 3
 -r (roll until user hits enter key)
      default: set if -f tail, otherwise not set
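
The -t, -f and -d options accept either plain seconds or a colon-separated
form.  As a rough illustration of how such values can be read, here is a minimal
Python sketch.  The helper name parse_seconds is hypothetical and the script's
own parser may differ; treating '2:00' as 2 minutes is an assumption based on
the "-d 2:00" example further down.

```python
def parse_seconds(text):
    """Convert '3600', '1:00:00', '2:00' or '-1:00:00' to signed seconds.

    ASSUMPTION: colon-separated fields are read base 60, right-aligned,
    so '2:00' means 2 minutes and '1:00:00' means 1 hour.
    """
    sign = -1 if text.startswith("-") else 1
    total = 0
    for field in text.lstrip("-").split(":"):
        total = total * 60 + int(field)
    return sign * total


for value in ("3600", "1:00:00", "2:00", "-1:00:00"):
    print(value, "->", parse_seconds(value))
```

A negative result corresponds to an offset back from the end of the log, as in
"-f -3600".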

log_latency.py is a Python script that analyzes latency histograms in Citrusleaf
server log files.  The script analyzes a histogram by looking at latencies
during successive time slices and calculating what percentage of operations
in each time slice exceeded various latency thresholds.  By default, the tool
uses a time slice of 10 seconds.
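
The per-slice calculation can be sketched as follows.  This is an illustration
only, not the script's actual code: the function name slice_stats and the sample
counts are made up, and it assumes that each logged histogram holds cumulative
counts in which bucket i counts operations that finished in under 2**i
milliseconds.

```python
def slice_stats(older, newer, seconds, thresholds_ms=(1, 8, 64)):
    """Return ([percent over each threshold], ops/sec) for one time slice.

    older, newer: cumulative per-bucket counts at the slice start and end.
    ASSUMPTION: bucket i counts operations that took less than 2**i ms
    (and at least 2**(i-1) ms for i > 0).
    """
    # Operations that happened during the slice, per bucket.
    delta = [n - o for o, n in zip(older, newer)]
    total = sum(delta)
    percents = []
    for ms in thresholds_ms:
        bucket = ms.bit_length() - 1      # 1 -> 0, 8 -> 3, 64 -> 6
        over = sum(delta[bucket + 1:])    # everything slower than `ms`
        percents.append(100.0 * over / total if total else 0.0)
    return percents, total / seconds


# Made-up snapshots taken 10 seconds apart.
older = [1000, 50, 10, 5, 2, 0, 0, 0]
newer = [1950, 80, 20, 10, 7, 0, 0, 0]
percents, ops_per_sec = slice_stats(older, newer, seconds=10)
print(percents, ops_per_sec)
```

With these sample counts, 1000 operations fall in the slice, 50 of them slower
than 1 ms and 5 slower than 8 ms.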

To use time slices that are not multiples of 10 seconds, modify the server
configuration dynamically by entering the following commands:

	telnet localhost 3003
	set-config:context=service;ticker-interval=1;

The above set of commands will set the interval to 1 second.
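
If telnet is not available, the same text command can in principle be sent over
a plain TCP connection.  The following Python sketch assumes that the port
accepts a newline-terminated command and replies with a short text response;
that behaviour, and the helper names build_set_config and send_command, are
assumptions for illustration rather than documented interfaces.

```python
import socket


def build_set_config(context, name, value):
    """Build the text command shown above, e.g. for ticker-interval=1."""
    return "set-config:context={};{}={};".format(context, name, value)


def send_command(command, host="localhost", port=3003, timeout=5.0):
    """Send one command line and return whatever the server answers.

    ASSUMPTION: the server reads a newline-terminated line and replies
    with a short text response that fits in a single recv() call.
    """
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall((command + "\n").encode("ascii"))
        return conn.recv(4096).decode("ascii", "replace")


command = build_set_config("service", "ticker-interval", 1)
print(command)
# To actually apply it on a running server:
# print(send_command(command))
```

The send_command call is left commented out so that running the sketch does not
change any server settings by accident.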

The most common histograms monitored are:

	writes_master
	reads
	proxy

The script is invoked from the command line with various option parameters.  
If the script is invoked with no option parameters, it will display the list of options.

One typical use mode is to run the script "realtime", on a "live" log file:

	./log_latency.py -h writes_master -t 1
	writes_master
	Oct 20 2011 00:04:51
	           	   % > (ms)
	slice-to (sec)      1      8     64  ops/sec
	-------------- ------ ------ ------ --------
	00:04:52     1   1.79   0.00   0.00   4761.0
	00:04:53     1   1.85   0.00   0.00   4875.0
	00:04:54     1   0.33   0.00   0.00   4813.0
	00:04:55     1   0.78   0.00   0.00   4644.0
	00:04:56     1   4.73   2.86   0.00   4756.0
	00:04:57     1   0.36   0.00   0.00   4672.0
	00:04:58     1   0.65   0.00   0.00   4785.0
	00:04:59     1   0.70   0.00   0.00   4855.0
	00:05:00     1   0.44   0.00   0.00   4760.0
	00:05:01     1   1.71   1.04   0.00   4724.0
	00:05:02     1   0.38   0.00   0.00   4721.0
	00:05:03     1   0.71   0.00   0.00   4773.0
	...

(Press enter to stop the script and display averages & maximums.)  

In the above example, we specified -t 1, meaning analyze 1-second slices 
(the smallest possible).  The row for 00:05:01 is interpreted as follows: 

    in the 1-second slice ending at 00:05:01 GMT, 1.71% of all "writes_master" 
    operations took longer than 1 millisecond, 1.04% of all "writes_master" operations 
    took longer than 8 milliseconds, none took longer than 64 milliseconds, and there 
    were 4724 "writes_master" operations during this second.
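
The 1, 8 and 64 millisecond columns are consistent with the defaults -n 3 and
-e 3 if bucket thresholds are powers of two.  A minimal sketch of that mapping,
assuming the k-th bucket threshold is 2**k milliseconds (the function name
column_thresholds is hypothetical):

```python
def column_thresholds(num_buckets=3, every_nth=3):
    """Thresholds (ms) for the displayed columns.

    ASSUMPTION: the k-th histogram bucket has a threshold of 2**k ms, and
    the display shows bucket 0 followed by every `every_nth`-th bucket.
    """
    return [2 ** (i * every_nth) for i in range(num_buckets)]


print(column_thresholds())        # defaults: -n 3 -e 3
print(column_thresholds(4, 2))    # e.g. -n 4 -e 2
```

Under this assumption the defaults give 1, 8 and 64 ms, and -n 4 -e 2 would give
1, 4, 16 and 64 ms.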

In another typical usage mode, we examine a period in the past:

	./log_latency.py -h reads -f -12:00:00 -d 2:00
	reads
	Oct 17 2011 08:58:35
	           	   % > (ms)
	slice-to (sec)      1      8     64  ops/sec
	-------------- ------ ------ ------ --------
	08:58:45    10  10.28   5.76   0.00   3470.9
	08:58:55    10   7.89   3.37   0.00   3419.9
	08:59:05    10   3.54   1.29   0.00   3475.2
	08:59:15    10   4.27   2.23   0.00   3454.5
	08:59:25    10   7.72   3.41   0.00   3129.6
	08:59:35    10  11.65   5.49   0.00   3458.5
	08:59:45    10   4.50   2.15   0.00   3452.4
	08:59:55    10  12.49   6.47   0.00   3470.5
	09:00:05    10   6.56   3.37   0.00   3474.7
	09:00:15    10   9.12   4.78   0.00   3442.3
	09:00:25    10  11.43   5.89   0.00   3490.3
	09:00:35    10   5.30   2.67   0.00   3455.9
	-------------- ------ ------ ------ --------
	   avg           7.90   3.91   0.00   3432.0
	   max          12.49   6.47   0.00   3490.3

In the above example, we looked back at a period 12 hours before the end of the 
log file (now) for 2 minutes, analyzing 10-second slices (the default).

Caution: don't specify a time period that bridges a server restart -- this will 
confuse the analysis.
