blob: 5f939a5159fb4050f253dc421be668ea8b169876 [file] [log] [blame]
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html><head><title>Python: module telemetry.util.statistics</title>
<meta charset="utf-8">
</head><body bgcolor="#f0f0f8">
<table width="100%" cellspacing=0 cellpadding=2 border=0 summary="heading">
<tr bgcolor="#7799ee">
<td valign=bottom>&nbsp;<br>
<font color="#ffffff" face="helvetica, arial">&nbsp;<br><big><big><strong><a href="telemetry.html"><font color="#ffffff">telemetry</font></a>.<a href="telemetry.util.html"><font color="#ffffff">util</font></a>.statistics</strong></big></big></font></td
><td align=right valign=bottom
><font color="#ffffff" face="helvetica, arial"><a href=".">index</a><br><a href="../telemetry/util/statistics.py">telemetry/util/statistics.py</a></font></td></tr></table>
<p><tt>A&nbsp;collection&nbsp;of&nbsp;statistical&nbsp;utility&nbsp;functions&nbsp;to&nbsp;be&nbsp;used&nbsp;by&nbsp;metrics.</tt></p>
<p>
<table width="100%" cellspacing=0 cellpadding=2 border=0 summary="section">
<tr bgcolor="#aa55cc">
<td colspan=3 valign=bottom>&nbsp;<br>
<font color="#ffffff" face="helvetica, arial"><big><strong>Modules</strong></big></font></td></tr>
<tr><td bgcolor="#aa55cc"><tt>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</tt></td><td>&nbsp;</td>
<td width="100%"><table width="100%" summary="list"><tr><td width="25%" valign=top><a href="math.html">math</a><br>
</td><td width="25%" valign=top></td><td width="25%" valign=top></td><td width="25%" valign=top></td></tr></table></td></tr></table><p>
<table width="100%" cellspacing=0 cellpadding=2 border=0 summary="section">
<tr bgcolor="#eeaa77">
<td colspan=3 valign=bottom>&nbsp;<br>
<font color="#ffffff" face="helvetica, arial"><big><strong>Functions</strong></big></font></td></tr>
<tr><td bgcolor="#eeaa77"><tt>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</tt></td><td>&nbsp;</td>
<td width="100%"><dl><dt><a name="-ArithmeticMean"><strong>ArithmeticMean</strong></a>(data)</dt><dd><tt>Calculates&nbsp;arithmetic&nbsp;mean.<br>
&nbsp;<br>
Args:<br>
&nbsp;&nbsp;data:&nbsp;A&nbsp;list&nbsp;of&nbsp;samples.<br>
&nbsp;<br>
Returns:<br>
&nbsp;&nbsp;The&nbsp;arithmetic&nbsp;mean&nbsp;value,&nbsp;or&nbsp;0&nbsp;if&nbsp;the&nbsp;list&nbsp;is&nbsp;empty.</tt></dd></dl>
<dl><dt><a name="-Clamp"><strong>Clamp</strong></a>(value, low<font color="#909090">=0.0</font>, high<font color="#909090">=1.0</font>)</dt><dd><tt>Clamp&nbsp;a&nbsp;value&nbsp;between&nbsp;some&nbsp;low&nbsp;and&nbsp;high&nbsp;value.</tt></dd></dl>
<dl><dt><a name="-Discrepancy"><strong>Discrepancy</strong></a>(samples, location_count<font color="#909090">=None</font>)</dt><dd><tt>Computes&nbsp;the&nbsp;discrepancy&nbsp;of&nbsp;a&nbsp;set&nbsp;of&nbsp;1D&nbsp;samples&nbsp;from&nbsp;the&nbsp;interval&nbsp;[0,1].<br>
&nbsp;<br>
The&nbsp;samples&nbsp;must&nbsp;be&nbsp;sorted.&nbsp;We&nbsp;define&nbsp;the&nbsp;discrepancy&nbsp;of&nbsp;an&nbsp;empty&nbsp;set<br>
of&nbsp;samples&nbsp;to&nbsp;be&nbsp;zero.<br>
&nbsp;<br>
<a href="http://en.wikipedia.org/wiki/Low-discrepancy_sequence">http://en.wikipedia.org/wiki/Low-discrepancy_sequence</a><br>
<a href="http://mathworld.wolfram.com/Discrepancy.html">http://mathworld.wolfram.com/Discrepancy.html</a></tt></dd></dl>
<dl><dt><a name="-DivideIfPossibleOrZero"><strong>DivideIfPossibleOrZero</strong></a>(numerator, denominator)</dt><dd><tt>Returns&nbsp;the&nbsp;quotient,&nbsp;or&nbsp;zero&nbsp;if&nbsp;the&nbsp;denominator&nbsp;is&nbsp;zero.</tt></dd></dl>
<dl><dt><a name="-DurationsDiscrepancy"><strong>DurationsDiscrepancy</strong></a>(durations, absolute<font color="#909090">=True</font>, location_count<font color="#909090">=None</font>)</dt><dd><tt>A&nbsp;discrepancy&nbsp;based&nbsp;metric&nbsp;for&nbsp;measuring&nbsp;duration&nbsp;jank.<br>
&nbsp;<br>
DurationsDiscrepancy&nbsp;computes&nbsp;a&nbsp;jank&nbsp;metric&nbsp;which&nbsp;measures&nbsp;how&nbsp;irregular&nbsp;a<br>
given&nbsp;sequence&nbsp;of&nbsp;intervals&nbsp;is.&nbsp;In&nbsp;order&nbsp;to&nbsp;minimize&nbsp;jank,&nbsp;each&nbsp;duration<br>
should&nbsp;be&nbsp;equally&nbsp;long.&nbsp;This&nbsp;is&nbsp;similar&nbsp;to&nbsp;how&nbsp;timestamp&nbsp;jank&nbsp;works,<br>
and&nbsp;we&nbsp;therefore&nbsp;reuse&nbsp;the&nbsp;timestamp&nbsp;discrepancy&nbsp;function&nbsp;above&nbsp;to&nbsp;compute&nbsp;a<br>
similar&nbsp;duration&nbsp;discrepancy&nbsp;number.<br>
&nbsp;<br>
Because&nbsp;timestamp&nbsp;discrepancy&nbsp;is&nbsp;defined&nbsp;in&nbsp;terms&nbsp;of&nbsp;timestamps,&nbsp;we&nbsp;first<br>
convert&nbsp;the&nbsp;list&nbsp;of&nbsp;durations&nbsp;to&nbsp;monotonically&nbsp;increasing&nbsp;timestamps.<br>
&nbsp;<br>
Args:<br>
&nbsp;&nbsp;durations:&nbsp;List&nbsp;of&nbsp;interval&nbsp;lengths&nbsp;in&nbsp;milliseconds.<br>
&nbsp;&nbsp;absolute:&nbsp;See&nbsp;TimestampsDiscrepancy.<br>
&nbsp;&nbsp;interval_multiplier:&nbsp;See&nbsp;TimestampsDiscrepancy.</tt></dd></dl>
<dl><dt><a name="-GeneralizedMean"><strong>GeneralizedMean</strong></a>(values, exponent)</dt><dd><tt>See&nbsp;<a href="http://en.wikipedia.org/wiki/Generalized_mean">http://en.wikipedia.org/wiki/Generalized_mean</a></tt></dd></dl>
<dl><dt><a name="-GeometricMean"><strong>GeometricMean</strong></a>(values)</dt><dd><tt>Compute&nbsp;a&nbsp;rounded&nbsp;geometric&nbsp;mean&nbsp;from&nbsp;an&nbsp;array&nbsp;of&nbsp;values.</tt></dd></dl>
<dl><dt><a name="-Median"><strong>Median</strong></a>(values)</dt><dd><tt>Gets&nbsp;the&nbsp;median&nbsp;of&nbsp;a&nbsp;list&nbsp;of&nbsp;values.</tt></dd></dl>
<dl><dt><a name="-NormalizeSamples"><strong>NormalizeSamples</strong></a>(samples)</dt><dd><tt>Sorts&nbsp;the&nbsp;samples,&nbsp;and&nbsp;map&nbsp;them&nbsp;linearly&nbsp;to&nbsp;the&nbsp;range&nbsp;[0,1].<br>
&nbsp;<br>
They're&nbsp;mapped&nbsp;such&nbsp;that&nbsp;for&nbsp;the&nbsp;N&nbsp;samples,&nbsp;the&nbsp;first&nbsp;sample&nbsp;is&nbsp;0.5/N&nbsp;and&nbsp;the<br>
last&nbsp;sample&nbsp;is&nbsp;(N-0.5)/N.<br>
&nbsp;<br>
Background:&nbsp;The&nbsp;discrepancy&nbsp;of&nbsp;the&nbsp;sample&nbsp;set&nbsp;i/(N-1);&nbsp;i=0,&nbsp;...,&nbsp;N-1&nbsp;is&nbsp;2/N,<br>
twice&nbsp;the&nbsp;discrepancy&nbsp;of&nbsp;the&nbsp;sample&nbsp;set&nbsp;(i+1/2)/N;&nbsp;i=0,&nbsp;...,&nbsp;N-1.&nbsp;In&nbsp;our&nbsp;case<br>
we&nbsp;don't&nbsp;want&nbsp;to&nbsp;distinguish&nbsp;between&nbsp;these&nbsp;two&nbsp;cases,&nbsp;as&nbsp;our&nbsp;original&nbsp;domain<br>
is&nbsp;not&nbsp;bounded&nbsp;(it&nbsp;is&nbsp;for&nbsp;Monte&nbsp;Carlo&nbsp;integration,&nbsp;where&nbsp;discrepancy&nbsp;was<br>
first&nbsp;used).</tt></dd></dl>
<dl><dt><a name="-Percentile"><strong>Percentile</strong></a>(values, percentile)</dt><dd><tt>Calculates&nbsp;the&nbsp;value&nbsp;below&nbsp;which&nbsp;a&nbsp;given&nbsp;percentage&nbsp;of&nbsp;values&nbsp;fall.<br>
&nbsp;<br>
For&nbsp;example,&nbsp;if&nbsp;17%&nbsp;of&nbsp;the&nbsp;values&nbsp;are&nbsp;less&nbsp;than&nbsp;5.0,&nbsp;then&nbsp;5.0&nbsp;is&nbsp;the&nbsp;17th<br>
percentile&nbsp;for&nbsp;this&nbsp;set&nbsp;of&nbsp;values.&nbsp;When&nbsp;the&nbsp;percentage&nbsp;doesn't&nbsp;exactly<br>
match&nbsp;a&nbsp;rank&nbsp;in&nbsp;the&nbsp;list&nbsp;of&nbsp;values,&nbsp;the&nbsp;percentile&nbsp;is&nbsp;computed&nbsp;using&nbsp;linear<br>
interpolation&nbsp;between&nbsp;closest&nbsp;ranks.<br>
&nbsp;<br>
Args:<br>
&nbsp;&nbsp;values:&nbsp;A&nbsp;list&nbsp;of&nbsp;numerical&nbsp;values.<br>
&nbsp;&nbsp;percentile:&nbsp;A&nbsp;number&nbsp;between&nbsp;0&nbsp;and&nbsp;100.<br>
&nbsp;<br>
Returns:<br>
&nbsp;&nbsp;The&nbsp;Nth&nbsp;percentile&nbsp;for&nbsp;the&nbsp;list&nbsp;of&nbsp;values,&nbsp;where&nbsp;N&nbsp;is&nbsp;the&nbsp;given&nbsp;percentage.</tt></dd></dl>
<dl><dt><a name="-StandardDeviation"><strong>StandardDeviation</strong></a>(data)</dt><dd><tt>Calculates&nbsp;the&nbsp;standard&nbsp;deviation.<br>
&nbsp;<br>
Args:<br>
&nbsp;&nbsp;data:&nbsp;A&nbsp;list&nbsp;of&nbsp;samples.<br>
&nbsp;<br>
Returns:<br>
&nbsp;&nbsp;The&nbsp;standard&nbsp;deviation&nbsp;of&nbsp;the&nbsp;samples&nbsp;provided.</tt></dd></dl>
<dl><dt><a name="-TimestampsDiscrepancy"><strong>TimestampsDiscrepancy</strong></a>(timestamps, absolute<font color="#909090">=True</font>, location_count<font color="#909090">=None</font>)</dt><dd><tt>A&nbsp;discrepancy&nbsp;based&nbsp;metric&nbsp;for&nbsp;measuring&nbsp;timestamp&nbsp;jank.<br>
&nbsp;<br>
TimestampsDiscrepancy&nbsp;quantifies&nbsp;the&nbsp;largest&nbsp;area&nbsp;of&nbsp;jank&nbsp;observed&nbsp;in&nbsp;a&nbsp;series<br>
of&nbsp;timestamps.&nbsp;&nbsp;Note&nbsp;that&nbsp;this&nbsp;is&nbsp;different&nbsp;from&nbsp;metrics&nbsp;based&nbsp;on&nbsp;the<br>
max_time_interval.&nbsp;For&nbsp;example,&nbsp;the&nbsp;time&nbsp;stamp&nbsp;series&nbsp;A&nbsp;=&nbsp;[0,1,2,3,5,6]&nbsp;and<br>
B&nbsp;=&nbsp;[0,1,2,3,5,7]&nbsp;have&nbsp;the&nbsp;same&nbsp;max_time_interval&nbsp;=&nbsp;2,&nbsp;but<br>
<a href="#-Discrepancy">Discrepancy</a>(B)&nbsp;&gt;&nbsp;<a href="#-Discrepancy">Discrepancy</a>(A).<br>
&nbsp;<br>
Two&nbsp;variants&nbsp;of&nbsp;discrepancy&nbsp;can&nbsp;be&nbsp;computed:<br>
&nbsp;<br>
Relative&nbsp;discrepancy&nbsp;is&nbsp;following&nbsp;the&nbsp;original&nbsp;definition&nbsp;of<br>
discrepancy.&nbsp;It&nbsp;characterized&nbsp;the&nbsp;largest&nbsp;area&nbsp;of&nbsp;jank,&nbsp;relative&nbsp;to&nbsp;the<br>
duration&nbsp;of&nbsp;the&nbsp;entire&nbsp;time&nbsp;stamp&nbsp;series.&nbsp;&nbsp;We&nbsp;normalize&nbsp;the&nbsp;raw&nbsp;results,<br>
because&nbsp;the&nbsp;best&nbsp;case&nbsp;discrepancy&nbsp;for&nbsp;a&nbsp;set&nbsp;of&nbsp;N&nbsp;samples&nbsp;is&nbsp;1/N&nbsp;(for<br>
equally&nbsp;spaced&nbsp;samples),&nbsp;and&nbsp;we&nbsp;want&nbsp;our&nbsp;metric&nbsp;to&nbsp;report&nbsp;0.0&nbsp;in&nbsp;that<br>
case.<br>
&nbsp;<br>
Absolute&nbsp;discrepancy&nbsp;also&nbsp;characterizes&nbsp;the&nbsp;largest&nbsp;area&nbsp;of&nbsp;jank,&nbsp;but&nbsp;its<br>
value&nbsp;wouldn't&nbsp;change&nbsp;(except&nbsp;for&nbsp;imprecisions&nbsp;due&nbsp;to&nbsp;a&nbsp;low<br>
|interval_multiplier|)&nbsp;if&nbsp;additional&nbsp;'good'&nbsp;intervals&nbsp;were&nbsp;added&nbsp;to&nbsp;an<br>
exisiting&nbsp;list&nbsp;of&nbsp;time&nbsp;stamps.&nbsp;&nbsp;Its&nbsp;range&nbsp;is&nbsp;[0,inf]&nbsp;and&nbsp;the&nbsp;unit&nbsp;is<br>
milliseconds.<br>
&nbsp;<br>
The&nbsp;time&nbsp;stamp&nbsp;series&nbsp;C&nbsp;=&nbsp;[0,2,3,4]&nbsp;and&nbsp;D&nbsp;=&nbsp;[0,2,3,4,5]&nbsp;have&nbsp;the&nbsp;same<br>
absolute&nbsp;discrepancy,&nbsp;but&nbsp;D&nbsp;has&nbsp;lower&nbsp;relative&nbsp;discrepancy&nbsp;than&nbsp;C.<br>
&nbsp;<br>
|timestamps|&nbsp;may&nbsp;be&nbsp;a&nbsp;list&nbsp;of&nbsp;lists&nbsp;S&nbsp;=&nbsp;[S_1,&nbsp;S_2,&nbsp;...,&nbsp;S_N],&nbsp;where&nbsp;each<br>
S_i&nbsp;is&nbsp;a&nbsp;time&nbsp;stamp&nbsp;series.&nbsp;In&nbsp;that&nbsp;case,&nbsp;the&nbsp;discrepancy&nbsp;D(S)&nbsp;is:<br>
D(S)&nbsp;=&nbsp;max(D(S_1),&nbsp;D(S_2),&nbsp;...,&nbsp;D(S_N))</tt></dd></dl>
<dl><dt><a name="-Total"><strong>Total</strong></a>(data)</dt><dd><tt>Returns&nbsp;the&nbsp;float&nbsp;value&nbsp;of&nbsp;a&nbsp;number&nbsp;or&nbsp;the&nbsp;sum&nbsp;of&nbsp;a&nbsp;list.</tt></dd></dl>
<dl><dt><a name="-TrapezoidalRule"><strong>TrapezoidalRule</strong></a>(data, dx)</dt><dd><tt>Calculate&nbsp;the&nbsp;integral&nbsp;according&nbsp;to&nbsp;the&nbsp;trapezoidal&nbsp;rule<br>
&nbsp;<br>
TrapezoidalRule&nbsp;approximates&nbsp;the&nbsp;definite&nbsp;integral&nbsp;of&nbsp;f&nbsp;from&nbsp;a&nbsp;to&nbsp;b&nbsp;by<br>
the&nbsp;composite&nbsp;trapezoidal&nbsp;rule,&nbsp;using&nbsp;n&nbsp;subintervals.<br>
<a href="http://en.wikipedia.org/wiki/Trapezoidal_rule#Uniform_grid">http://en.wikipedia.org/wiki/Trapezoidal_rule#Uniform_grid</a><br>
&nbsp;<br>
Args:<br>
&nbsp;&nbsp;data:&nbsp;A&nbsp;list&nbsp;of&nbsp;samples<br>
&nbsp;&nbsp;dx:&nbsp;The&nbsp;uniform&nbsp;distance&nbsp;along&nbsp;the&nbsp;x&nbsp;axis&nbsp;between&nbsp;any&nbsp;two&nbsp;samples<br>
&nbsp;<br>
Returns:<br>
&nbsp;&nbsp;The&nbsp;area&nbsp;under&nbsp;the&nbsp;curve&nbsp;defined&nbsp;by&nbsp;the&nbsp;samples&nbsp;and&nbsp;the&nbsp;uniform&nbsp;distance<br>
&nbsp;&nbsp;according&nbsp;to&nbsp;the&nbsp;trapezoidal&nbsp;rule.</tt></dd></dl>
</td></tr></table>
</body></html>