Manual for Package pgfplots
2D/3D Plots in LaTeX, Version 1.18.1
http://sourceforge.net/projects/pgfplots

Related Libraries

5.12 Statistics

  • \usepgfplotslibrary{statistics} % LaTeX and plain TeX

  • \usepgfplotslibrary[statistics] % ConTeXt

  • \usetikzlibrary{pgfplots.statistics} % LaTeX and plain TeX

  • \usetikzlibrary[pgfplots.statistics] % ConTeXt

  • A library which provides plot handlers for statistics.

5.12.1 Box Plots

Box plots are visualizations for one-dimensional distributions. They provide a quick overview of the characteristics of the distribution. Box plots are inherently one-dimensional; they only use a second axis to place multiple box plots next to each other.

pgfplots supports two related plot handlers: boxplot and boxplot prepared. The boxplot handler takes a one-dimensional sample as input, computes the median, the lower quartile, the upper quartile, the lower whisker and the upper whisker, and visualizes the result using the boxplot prepared handler. The boxplot prepared handler expects all required values on input and visualizes them.

5.12.1.1 Prepared Box Plots and Common Options

The boxplot prepared handler is discussed first; all its customizations apply to boxplot as well.

  • /pgfplots/boxplot prepared={⟨options with boxplot/ prefix⟩}

If you place multiple plots with handler boxplot prepared into the same axis, they will automatically be placed next to each other by means of the default value of draw position.
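A minimal sketch of this side-by-side placement. The statistics below are made-up numbers for illustration, and the empty coordinate lists simply mean that no outliers are supplied:

```latex
\documentclass{standalone}
\usepackage{pgfplots}
\pgfplotsset{width=7cm,compat=1.18}
\usepgfplotslibrary{statistics}
\begin{document}
\begin{tikzpicture}
\begin{axis}
  % First box plot: all five statistics are given explicitly.
  \addplot+ [
    boxplot prepared={
      lower whisker=5, lower quartile=7, median=8.5,
      upper quartile=9.5, upper whisker=10,
    },
  ] coordinates {};
  % Second box plot: placed next to the first one automatically.
  \addplot+ [
    boxplot prepared={
      lower whisker=2, lower quartile=4, median=6,
      upper quartile=8, upper whisker=9,
    },
  ] coordinates {};
\end{axis}
\end{tikzpicture}
\end{document}
```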

The preceding examples read their outlier data streams from the \(y\) coordinate of the input streams: for \addplot table, we have explicitly said y index=0 and for \addplot coordinates, we have used (0,35) (0,55) where the \(x\) components are ignored. This default can be changed using the boxplot/data key.

  • /pgfplots/boxplot/data={⟨expression⟩} (initially y)

  • Tells boxplot how to get its data. The common idea is to provide a mathematical ⟨expression⟩ which depends on data supplied by the \addplot statement. For example, if you have \addplot expression, the ⟨expression⟩ may depend upon x, y or z. In case of an \addplot table input routine, the ⟨expression⟩ can employ \thisrow{⟨colname⟩} to access the currently active table row in the designated column.

    It is also possible to avoid invocations of the math parser. Use boxplot/data value={⟨value⟩} instead to do so. Here, ⟨value⟩ should be a numeric constant.

    The initial configuration employs what would usually become the final y coordinate as input (to be more precise, the initial value is data value=\pgfkeysvalueof{/data point/y}).
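As an illustration of a table-based ⟨expression⟩, the following sketch feeds the row-wise difference of two columns into boxplot. The column names before and after and the numbers are hypothetical; only \thisrow and the data key are taken from the description above:

```latex
\documentclass{standalone}
\usepackage{pgfplots}
\pgfplotsset{width=7cm,compat=1.18}
\usepgfplotslibrary{statistics}
\begin{document}
\begin{tikzpicture}
\begin{axis}[y=1cm]
  % boxplot/data is evaluated once per table row; the sample analyzed by
  % boxplot is "after - before" instead of the default y coordinate.
  \addplot+ [boxplot={data=\thisrow{after}-\thisrow{before}}]
    table [row sep=\\,x=before,y=after] {
      before after\\
      1 3\\ 2 3\\ 2 6\\ 4 5\\ 3 8\\ 5 6\\ 4 9\\
    };
\end{axis}
\end{tikzpicture}
\end{document}
```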

  • /pgfplots/boxplot/lower whisker={⟨value⟩} (initially auto)

  • /pgfplots/boxplot/lower quartile={⟨value⟩} (initially auto)

  • /pgfplots/boxplot/median={⟨value⟩} (initially auto)

  • /pgfplots/boxplot/upper quartile={⟨value⟩} (initially auto)

  • /pgfplots/boxplot/upper whisker={⟨value⟩} (initially auto)

  • /pgfplots/boxplot/average={⟨value⟩} (initially empty)

  • These keys constitute the supported statistics. Typically, a box plot uses each of them except for average.

    Any numeric value for ⟨value⟩ will be used as is. This holds for both boxplot prepared and boxplot.

    An empty ⟨value⟩ disables the respective key: its associated visualization will be omitted. This is the default for average.

    The value auto tells pgfplots to include the statistics in the automatic computation applied by boxplot. It is irrelevant for boxplot prepared (where it is essentially the same as an empty ⟨value⟩).

    The definition of the values is as follows. Assume that we have a given sample of a distribution, say \(x_1,\dotsc ,x_N\), and assume that the values are sorted, \(x_1 < \dotsb < x_N\) (which is not a requirement for boxplot, by the way). For any real number \(p\) with \(0\le p\le 1\), the “\(p\)-quantile” (or \(p\)–percentage) is defined as

    \[ x_p := \begin{cases} x_{N \cdot p} & \text{if $N \cdot p$ is an integer,}\\ \frac{1}{2} (x_{\lfloor N \cdot p \rfloor} + x_{\lceil N \cdot p \rceil}) & \text{if $N \cdot p$ is not an integer.} \end{cases} \]

    median is the \(0.5\)-quantile of the input data: half of the points are less and half of the points are larger than the median.

    lower quartile is the \(0.25\)-quantile of the input data.

    upper quartile is the \(0.75\)-quantile of the input data.

    lower whisker is the smallest data value which is larger than lower quartile \(-1.5 \cdot \text {IQR}\) where \(\text {IQR}\) is the “interquartile range”, i.e. the difference between upper quartile and lower quartile.

    upper whisker is the largest data value which is smaller than upper quartile \(+1.5 \cdot \text {IQR}\).

    average is the sample average. It is omitted by boxplot in its default configuration. Set it to auto to enable its auto-computation.
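A short worked example may help to read these definitions. It applies the \(p\)-quantile formula above literally to the hypothetical sample \(x_i = i\), \(i = 1,\dotsc,10\) (so \(N = 10\)); the values which boxplot actually computes depend on the selected boxplot/estimator and may differ:

```latex
\[
\begin{aligned}
x_{0.25} &= \tfrac{1}{2}\,(x_{\lfloor 2.5 \rfloor} + x_{\lceil 2.5 \rceil})
          = \tfrac{1}{2}\,(x_2 + x_3) = 2.5, \\
x_{0.5}  &= x_{5} = 5 \qquad (N \cdot p = 5 \text{ is an integer}), \\
x_{0.75} &= \tfrac{1}{2}\,(x_{\lfloor 7.5 \rfloor} + x_{\lceil 7.5 \rceil})
          = \tfrac{1}{2}\,(x_7 + x_8) = 7.5, \\
\text{IQR} &= x_{0.75} - x_{0.25} = 5, \\
x_{0.25} - 1.5 \cdot \text{IQR} &= -5
  \quad\Rightarrow\quad \text{lower whisker} = x_1 = 1, \\
x_{0.75} + 1.5 \cdot \text{IQR} &= 15
  \quad\Rightarrow\quad \text{upper whisker} = x_{10} = 10, \\
\text{average} &= \tfrac{1}{10} \sum_{i=1}^{10} x_i = 5.5 .
\end{aligned}
\]
```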

  • /pgfplots/boxplot/sample size={⟨number⟩} (initially auto)

  • The number of samples used to derive the statistics. This number is used if variable width=true.

    The value auto means to “use it whenever it can be acquired somewhere”. For a boxplot, it means that the size of the input sample is taken as is. For a boxplot prepared, it means that the data is unavailable.

    The empty string means that the value is unavailable.

    Otherwise, a number is expected.

  • /pgfplots/boxplot/variable width expr={⟨math expression⟩} (initially sqrt(#1))

  • A math expression which is used to evaluate the scaling factors of variable width. The argument is the current value of sample size. This key is used to implement common (nonlinear) transformations which are to be applied to the sample size before the result is used to scale down box sizes.

    Typically, the argument should be a monotonically increasing function.

  • /pgfplots/boxplot/sample size min={⟨min sample size of group⟩} (initially empty)

  • /pgfplots/boxplot/sample size max={⟨max sample size of group⟩} (initially empty)

  • This is part of the variable width scaling: it is used to determine the box extend relative to all other box plots of the same group. It fixes the range.

  • /pgfplots/boxplot/variable width min target={⟨factor for the box width minimal size⟩} (initially 0.2)

  • Used for the variable width feature to determine the size for the box plot with smallest value of sample size. The argument is interpreted to be a scaling factor in the range \([0,1]\).

    It is to be understood as percentage of box extend: a value of \(1\) means \(100\%\) of box extend. The initial configuration is 0.2, meaning \(20\%\) of box extend.

    The box plot with largest value of sample size has \(100\%\) of box extend.
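The interplay of sample size and the variable width keys can be sketched as follows. The statistics and sample sizes are made up, and sample size min and sample size max are set explicitly here as a precaution; this section does not state whether that is required:

```latex
\documentclass{standalone}
\usepackage{pgfplots}
\pgfplotsset{width=7cm,compat=1.18}
\usepgfplotslibrary{statistics}
\begin{document}
\begin{tikzpicture}
\begin{axis}[
  boxplot/variable width=true,
  boxplot/variable width min target=0.3, % smallest box: 30% of box extend
  boxplot/sample size min=10,            % assumption: fix the range by hand
  boxplot/sample size max=160,
]
  % Smallest sample: expected to get the narrowest box.
  \addplot+ [boxplot prepared={sample size=10,
    lower whisker=2, lower quartile=4, median=6,
    upper quartile=8, upper whisker=9}] coordinates {};
  \addplot+ [boxplot prepared={sample size=40,
    lower whisker=3, lower quartile=5, median=6.5,
    upper quartile=8, upper whisker=10}] coordinates {};
  % Largest sample: expected to get 100% of box extend.
  \addplot+ [boxplot prepared={sample size=160,
    lower whisker=1, lower quartile=4, median=5,
    upper quartile=7, upper whisker=9}] coordinates {};
\end{axis}
\end{tikzpicture}
\end{document}
```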

  • /pgfplots/boxplot/whisker extend={⟨axis unit for whisker extension⟩} (initially \pgfkeysvalueof{/pgfplots/boxplot/box extend}*0.8)

  • A parameter which configures how large whisker lines are with respect to the non-data axis.

    It is used in the same way as box extend, and it also affects axis limits.

    The initial configuration couples its value to box extend (it is \(80\%\) of box extend, to be more precise).
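A small sketch, with made-up statistics, which contrasts the coupled default with an explicit whisker extend:

```latex
\documentclass{standalone}
\usepackage{pgfplots}
\pgfplotsset{width=7cm,compat=1.18}
\usepgfplotslibrary{statistics}
\begin{document}
\begin{tikzpicture}
\begin{axis}[y=1cm]
  % Default: whisker extend is 80% of box extend.
  \addplot+ [boxplot prepared={
    lower whisker=2, lower quartile=4, median=6,
    upper quartile=8, upper whisker=9}] coordinates {};
  % Explicit value, given in units of the non-data axis.
  \addplot+ [boxplot prepared={whisker extend=0.3,
    lower whisker=2, lower quartile=4, median=6,
    upper quartile=8, upper whisker=9}] coordinates {};
\end{axis}
\end{tikzpicture}
\end{document}
```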

5.12.1.2 Analyzing Samples Automatically

  • /pgfplots/boxplot={⟨options with boxplot/ prefix⟩}

Attention:

Computing the statistics automatically is considerably faster if you use compat=1.12 combined with lualatex: this library has a special lua backend which allows scalability, speed, and accuracy beyond TeX's capabilities.

(-tikz- diagram)

% Preamble: \pgfplotsset{width=7cm,compat=1.18}\usepgfplotslibrary{statistics}
\begin{tikzpicture}
\begin{axis}[y=1cm]
  \addplot+ [boxplot] table [row sep=\\,y index=0] {
    data\\
    1\\ 2\\ 1\\ 5\\ 4\\ 10\\ 7\\ 10\\ 9\\ 8\\ 9\\ 9\\
  };
\end{axis}
\end{tikzpicture}

The values do not need to be sorted. However, if they are sorted in ascending order, pgfplots might need less time to analyze them.

Data points can be given by means of any supported input stream, although the most useful ones are probably \addplot table and \addplot coordinates. In any case, boxplot acquires only one-dimensional data. To this end, it uses the current value of the boxplot/data key to see which input coordinate is to be used. In the default configuration, this is the \(y\)-coordinate of the input stream. All other input items are ignored (except for point meta, which is handed down to the outlier stream).
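The same sample can also be supplied through \addplot coordinates, in which case the \(x\) components are ignored in the default configuration. The sketch below additionally sets average=auto, which is an optional choice made here for illustration:

```latex
\documentclass{standalone}
\usepackage{pgfplots}
\pgfplotsset{width=7cm,compat=1.18}
\usepgfplotslibrary{statistics}
\begin{document}
\begin{tikzpicture}
\begin{axis}[y=1cm]
  % Only the y components enter the statistics; x is a dummy value.
  \addplot+ [boxplot={average=auto}] coordinates {
    (0,1) (0,2) (0,1) (0,5) (0,4) (0,10)
    (0,7) (0,10) (0,9) (0,8) (0,9) (0,9)
  };
\end{axis}
\end{tikzpicture}
\end{document}
```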

  • /pgfplots/boxplot/estimator=value (initially Excel)

  • Selects one of 10 available boxplot value estimators.

    The default estimator is R7 alias Excel if pgfplots is configured to use compat=1.12 or higher. For all older compatibility levels, it is legacy.

    The choice R1 resembles the estimator type 1 used by R. It has aliases SAS3 and Maple1. This choice is currently limited to the lua backend.

    The choice R2 resembles the estimator type 2 used by R. It has aliases SAS5 and Maple2. This choice is currently limited to the lua backend.

    The choice R3 resembles the estimator type 3 used by R. It has the alias SAS2.

    The choice R4 resembles the estimator type 4 used by R. It has aliases SAS1, SciPy0-1, and Maple3.

    The choice R5 resembles the estimator type 5 used by R. It has aliases SciPy12-12 and Maple4.

    The choice R6 resembles the estimator type 6 used by R. It has aliases SAS4, SciPy0-0, and Maple5.

    The choice R7 resembles the estimator type 7 used by R. It has aliases Excel, SciPy1-1 and Maple6.

    The choice R8 resembles the estimator type 8 used by R. It has aliases SciPy13-13 and Maple7.

    The choice R9 resembles the estimator type 9 used by R. It has aliases SciPy38-38 and Maple8.

    The choice legacy is a minimally repaired variant of the estimator which was shipped with the first version of the statistics library. It is merely kept for reasons of backwards compatibility.[98]
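To compare estimators visually, the same sample can be analyzed twice in one axis. This is only a sketch: the choice of R7 and R6 is arbitrary, and both were picked because the text above does not restrict them to the lua backend:

```latex
\documentclass{standalone}
\usepackage{pgfplots}
\pgfplotsset{width=7cm,compat=1.18}
\usepgfplotslibrary{statistics}
\begin{document}
\begin{tikzpicture}
\begin{axis}[y=1cm]
  % Default estimator (R7 alias Excel for compat=1.12 or higher).
  \addplot+ [boxplot={estimator=R7}] table [row sep=\\,y index=0] {
    data\\ 1\\ 2\\ 1\\ 5\\ 4\\ 10\\ 7\\ 10\\ 9\\ 8\\ 9\\ 9\\
  };
  % Same data, different quantile estimator.
  \addplot+ [boxplot={estimator=R6}] table [row sep=\\,y index=0] {
    data\\ 1\\ 2\\ 1\\ 5\\ 4\\ 10\\ 7\\ 10\\ 9\\ 8\\ 9\\ 9\\
  };
\end{axis}
\end{tikzpicture}
\end{document}
```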

  • /pgfplots/boxplot/whisker range={⟨number⟩} (initially 1.5)

  • Defines how to determine lower whisker and upper whisker. In the default configuration, the lower whisker is placed at the smallest data point which is larger than lower quartile \(- 1.5 \cdot \text {IQR}\). The upper whisker is placed at the largest data point which is smaller than upper quartile \(+ 1.5 \cdot \text {IQR}\). Here, \(\text {IQR}\) is the interquartile range, defined as

    \(\text {IQR} := \) upper quartile \(-\) lower quartile.

    Everything outside of the whisker range is supposed to be an outlier.
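The effect of whisker range can be sketched with a sample that contains one large value. The data is hypothetical; with the default R7 estimator, the value 20 should be classified as an outlier for the initial range 1.5 and should be covered by the upper whisker for whisker range=3:

```latex
\documentclass{standalone}
\usepackage{pgfplots}
\pgfplotsset{width=7cm,compat=1.18}
\usepgfplotslibrary{statistics}
\begin{document}
\begin{tikzpicture}
\begin{axis}[y=1cm]
  % Initial configuration: whiskers reach at most 1.5 * IQR beyond the box.
  \addplot+ [boxplot] table [row sep=\\,y index=0] {
    data\\ 1\\ 2\\ 1\\ 5\\ 4\\ 10\\ 7\\ 10\\ 9\\ 8\\ 9\\ 9\\ 20\\
  };
  % Wider range: whiskers may reach up to 3 * IQR beyond the box.
  \addplot+ [boxplot={whisker range=3}] table [row sep=\\,y index=0] {
    data\\ 1\\ 2\\ 1\\ 5\\ 4\\ 10\\ 7\\ 10\\ 9\\ 8\\ 9\\ 9\\ 20\\
  };
\end{axis}
\end{tikzpicture}
\end{document}
```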

[98] There is also an estimator called legacy*. This is the original one shipped with the first version of this library. It is discouraged but kept in case someone really needs it.

5.12.1.3 Styles

  • /pgfplots/boxplot/every boxplot (style, no value)

  • A style which is immediately installed whenever boxplot or boxplot prepared are set.

    The initial value is empty.

  • /pgfplots/boxplot/every whisker (style, no value)

  • A style which is installed whenever a whisker is drawn. It is empty initially.

  • /pgfplots/boxplot/every box (style, no value)

  • A style which is installed whenever a box is drawn. It is empty initially. Note that this does not apply to the path for the median.

  • /pgfplots/boxplot/every median (style, no value)

  • A style which is installed whenever a median is drawn. It is empty initially.
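A minimal sketch which redefines three of these styles for a single prepared box plot. The statistics and the concrete drawing options are arbitrary choices for illustration:

```latex
\documentclass{standalone}
\usepackage{pgfplots}
\pgfplotsset{width=7cm,compat=1.18}
\usepgfplotslibrary{statistics}
\begin{document}
\begin{tikzpicture}
\begin{axis}[y=1cm]
  \addplot+ [
    boxplot prepared={
      lower whisker=2, lower quartile=4, median=6,
      upper quartile=8, upper whisker=9,
      % every box does not affect the median path.
      every box/.style={draw=black, fill=yellow!40},
      every whisker/.style={red, thick},
      every median/.style={blue, ultra thick},
    },
  ] coordinates {};
\end{axis}
\end{tikzpicture}
\end{document}
```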

5.12.1.4 Placing Annotations

  • \pgfplotsboxplotvalue{⟨key name⟩}

  • Same as

    \pgfkeysvalueof{/pgfplots/boxplot/⟨key name⟩}.

  • Coordinate system boxplot whisker

  • A coordinate system which is almost the same as boxplot box cs, except that it aligns at whisker extend instead of box extend.

    The boxplot whisker cs accepts two arguments of the form boxplot whisker cs=(⟨data coordinate, whisker-relative offset⟩) where the first is a value of the box plot’s data (it is expressed in the same space as median or upper whisker).

    The second argument is an offset expressed as signed multiple of whisker extend. An offset of \(0\) means to place the point exactly on the lower end of the whisker line. An offset of \(1\) places the point on the upper end of the whisker line. An offset of \(0.5\) places the point in the middle of the whisker line.

    (-tikz- diagram)

    % Preamble: \pgfplotsset{width=7cm,compat=1.18}\usepgfplotslibrary{statistics}
    \begin{tikzpicture}
    \begin{axis}[y=1.5cm, ymax=2]
      \addplot+[boxplot] table[row sep=\\,y index=0] {
        data\\
        1\\ 2\\ 1\\ 5\\ 4\\ 10\\ 7\\ 10\\ 9\\ 8\\ 9\\ 9\\
      }
      [above]
      node at (boxplot whisker cs:\boxplotvalue{lower whisker},1)
        {\pgfmathprintnumber{\boxplotvalue{lower whisker}}}
      node at (boxplot box cs: \boxplotvalue{median},1)
        {\pgfmathprintnumber{\boxplotvalue{median}}}
      node at (boxplot whisker cs:\boxplotvalue{upper whisker},1)
        {\pgfmathprintnumber{\boxplotvalue{upper whisker}}}
      ;
    \end{axis}
    \end{tikzpicture}

5.12.1.5 Customizing Visualization Paths

The following keys are of interest if you want to redefine the shape of a box, of a median, or of the whiskers.

Note that you should customize styles like boxplot/every box if you merely wish to change fill colors.

5.12.2 Histograms

  • /pgfplots/hist={⟨options with hist/ prefix⟩}

Attention:

Do not use hist/data=x or other symbolic values as input when you have symbolic coords. Rather than symbolic values, you need to provide expandable values like \pgfkeysvalueof{/data point/x} (which has the same effect, but directly expands to the correct value).

Please refer to the documentation of symbolic x coords for further details about symbolic coordinates.