# Grouping data and descriptive statistics

## How to set statistical aggregators for a single column

Aggregators provide summary statistics for grouped rows.

Command | Operation |
---|---|

`+` aggregator |
adds aggregator to current column |

`z+` aggregator |
displays result of aggregator over values in selected rows for current column |

The following statistical aggregators are available:

Aggregator | Description |
---|---|

`min` |
smallest value in the group |

`max` |
largest value in the group |

`avg` /`mean` |
average value of the group |

`mode` |
most frequently appearing value in group |

`median` |
median value in the group |

`q3/q4/q5/q10` |
add quantile aggregators to group (e.g. q4 adds p25, p50, p75) |

`sum` |
total summation of all numbers in the group |

`distinct` |
number of distinct values in the group |

`count` |
number of values in the group |

`keymax` |
key of the row with the largest value in the group |

`list` |
gathers values in column into a list |

`stdev` |
standard deviation of values |

The follow howtos will have examples of workflows involving grouping of data and statistical aggregation.

## How to create a pivot table

This example uses the file sample.tsv.

- Press
`!`

on the independent variable to set it as a key column. - Press
`+`

on a second column to add an*aggregrate*. This configures which column is to be composed in relation to others. - Optional: Set the type for the column being aggregated by pressing
`~`

(string),`#`

(int),`%`

(float),`$`

(currency), or`@`

(date). - Press
`Shift+W`

on the dependent categorical variable to pivot on it.

## How to create a frequency chart

The following examples use the file sample.tsv.

###### How to make a simple histogram

**Question** How many of each **Item** were sold?

- Move the cursor to the
**Item**column. - Press
`Shift+F`

to open the**Frequency table**.

###### How to use the Frequency table to view the results of statistical aggregation

**Question** What was the monthly revenue?

- On the
**OrderDate**column, type`;`

followed by`(\d+-\d+)`

to create a column with only the year and the month using regex capture groups. - Press
`-`

to hide the**OrderDate**column. - On the **OrderDate_re0** column, type
`^`

followed by`OrderDate`

to rename it. - On the
**Total**column, press`$`

to set its type to currency. - Type
`+`

followed by`sum`

to add a statistical aggregator to**Total**. - On the
**OrderDate**column type`Shift+F`

to open the**Frequency table**. - On the
**OrderDate**column, press`[`

to sort the table in chrononological order. - On the **sum_Total** column, type
`^`

followed by`Revenue`

to rename the column.

## How to calculate some descriptive statistics

- Press
`Shift+I`

to open the**Describe sheet**.

## How to filter for grouped or described rows

###### How to filter for grouped rows

- Press
`Shift+F`

to open the**Frequency table**. - Press
`s`

or`t`

on the groups you are interested in to select those entries in the source sheet. - Press
`q`

or`Ctrl+^`

to return to the source sheet. - Press
`"`

to open a duplicate sheet with selected rows.

**or**

- Press
`Shift+F`

to open the**Frequency table**. - Press
`Enter`

on the grouping you are interested in to open a sheet of the source rows that are part of that group.

###### How to filter for described rows

- Press
`Shift+I`

to open the**Describe sheet**. - Use
`zs`

to select rows on source sheet which are being described in the cells of interest. - Press
`q`

or`Ctrl+^`

to return to the source sheet. - Press
`"`

to open a duplicate sheet with selected rows.

**or**

- Press
`Shift+I`

to open the**Describe sheet**. - Press
`zEnter`

open copy of source sheet with rows being described in the current cell of interest.

###### How to filter for the rows aggregated in a pivot table

`Enter`

/`zEnter`

can both be used in the **Pivot table** to open a sheet of the source rows which are aggregated in the current pivot row/cell.

## How to combine cells in one for column for matching rows in another

This tutorial shows you how to go from this dataset:

field | value |
---|---|

A | 1 |

A | 2 |

B | 3 |

C | 4 |

C | 5 |

to

field | value |
---|---|

A | 1;2 |

B | 3 |

C | 4;5 |

in VisiData.

- On the
**value**column, type`+`

followed by`list`

to add to it a list aggregator. - Press
`Shift+F`

on the**field**column to open the**Frequency table**. - Type
`=`

followed by`';'.join(value_list)`

to add an expression column with the joined values.