# DATA ANALYSIS

**Data analysis does not have to be complicated to be informative!** Your analysis can be as simple as looking at the difference in averages between two time points.
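
As a rough illustration of that simplest case, the sketch below compares the average of an indicator at two time points. The scores and variable names are hypothetical and made up for the example; they do not come from any real data set.

```python
from statistics import mean

# Hypothetical indicator scores for the same program at two time points.
baseline_scores = [12, 15, 11, 14, 13]
followup_scores = [16, 18, 15, 17, 14]

baseline_mean = mean(baseline_scores)
followup_mean = mean(followup_scores)

# A positive difference means the average rose between the two time points.
difference = followup_mean - baseline_mean

print(f"Baseline mean: {baseline_mean}")
print(f"Follow-up mean: {followup_mean}")
print(f"Difference in means: {difference}")
```

A difference like this describes only the people measured. Whether it reflects a change in the wider population is a question for inferential statistics.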

Although we cannot cover data analysis in depth in this Toolkit, we can introduce you to topics relevant to analyzing your indicator data and direct you towards further resources. Since your indicator analysis will likely be quantitative, this section will focus on quantitative rather than qualitative methods of analysis.

Please see the Building Capacity section for training recommendations.

Much of the content for this page draws from the Statistics Canada Introduction to Basic Statistics course. Information on how to register for this course is in the Building Capacity section.

There are two branches of statistics: descriptive statistics and inferential statistics. Descriptive statistics make definite statements about known quantities. Stating the mean, standard deviation, and interquartile range of a set of numbers is an example of descriptive statistics. By comparison, inferential statistics make educated guesses, or inferences, about quantities that we cannot measure directly. These unknown quantities, which describe the whole population, are called parameters.

This page does not explain how to use computer software to perform these analyses. Refer to the Building Capacity section for courses on how to use different software.

### Data Processing

Carefully document any data processing steps taken before analysis, so that your analysis is reproducible. If possible, avoid manipulating or changing record-level data manually. If you must, document every change so that people know how the data has been modified since its collection.

Data processing includes steps such as:

- Checking for errors and missing data.
- Generating new data points from existing data points. For example, generating “age” from date of birth and current date.
- Linking different data sets.
- Reformatting or reorganizing the data. For example, when you export the data from the data collection software, the data may not be in quite the right format to facilitate analysis.
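
Two of the steps above, checking for missing data and generating "age" from date of birth, could be sketched as follows. The records, field names, and reference date are hypothetical. The sketch builds a new list rather than editing the original records, so the collected data stays untouched and the processing can be rerun.

```python
from datetime import date


def age_in_years(date_of_birth, on_date):
    """Return the number of completed years between two dates."""
    had_birthday = (on_date.month, on_date.day) >= (
        date_of_birth.month,
        date_of_birth.day,
    )
    return on_date.year - date_of_birth.year - (0 if had_birthday else 1)


# Hypothetical record-level data; None marks a missing date of birth.
records = [
    {"id": 1, "date_of_birth": date(1990, 6, 15)},
    {"id": 2, "date_of_birth": None},
    {"id": 3, "date_of_birth": date(2001, 12, 1)},
]
reference_date = date(2024, 3, 1)  # assumed "current date" for the example

processed = []    # new list: the original records are left unchanged
missing_ids = []  # records flagged for follow-up

for record in records:
    dob = record["date_of_birth"]
    if dob is None:
        missing_ids.append(record["id"])
        age = None
    else:
        age = age_in_years(dob, reference_date)
    processed.append({**record, "age": age})

print("Records with missing date of birth:", missing_ids)
print("Ages:", [row["age"] for row in processed])
```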

### Descriptive Statistics

Descriptive statistics are used to summarize and organize your data. The product of your descriptive analysis might be enough to meet your needs, or you might choose to do more advanced analyses.

This document describes how to use descriptive statistics to analyze your data, including:

- Frequency and percent distributions (e.g., frequency tables, bar charts, histograms).
- Mean, median, and mode.
- Measures of dispersion (e.g., range, interquartile range, standard deviation).
- Sub-population groupings.
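
To make the listed measures concrete, here is a minimal sketch using only the Python standard library. The records are hypothetical (a region label and an indicator value), and the quartiles rely on the default method of `statistics.quantiles`; other software may use a slightly different quartile definition and return slightly different values.

```python
from collections import Counter
from statistics import mean, median, mode, quantiles, stdev

# Hypothetical records: (region, indicator value). Not real data.
records = [
    ("North", 2), ("North", 4), ("North", 9),
    ("South", 4), ("South", 5), ("South", 7), ("South", 11),
]
values = [value for _, value in records]

# Frequency and percent distribution of records by region.
region_counts = Counter(region for region, _ in records)
region_percents = {
    region: 100 * count / len(records) for region, count in region_counts.items()
}

# Mean, median, and mode.
overall_mean = mean(values)
overall_median = median(values)
overall_mode = mode(values)

# Measures of dispersion.
value_range = max(values) - min(values)
q1, _q2, q3 = quantiles(values, n=4)  # quartile cut points, default method
iqr = q3 - q1
sample_sd = stdev(values)  # sample standard deviation (divides by n - 1)

# Sub-population grouping: mean indicator value within each region.
values_by_region = {}
for region, value in records:
    values_by_region.setdefault(region, []).append(value)
mean_by_region = {region: mean(vals) for region, vals in values_by_region.items()}

print("Counts by region:", dict(region_counts))
print("Mean, median, mode:", overall_mean, overall_median, overall_mode)
print("Range and IQR:", value_range, iqr)
print("Mean by region:", mean_by_region)
```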

### Inferential Statistics

**Inferential statistics** allow us to draw conclusions about the total population based on a sample of that population. In other words, inferential statistics allow us to generalize the results we have collected from a sample of individuals to the larger population. In many practical cases, the population is what we really want information about, but the sample is what we can get.

This document introduces the topic of inferential statistics.
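
One common inferential step is to estimate a population mean from a sample and attach a confidence interval to the estimate. The sketch below is illustrative only: the sample values are hypothetical, and it assumes a simple random sample and uses a normal approximation for the 95% interval. For a sample this small, a t-distribution would usually be preferred and would give a somewhat wider interval.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Hypothetical sample of indicator values drawn from a larger population.
sample = [52, 48, 55, 61, 47, 50, 58, 53, 49, 57]

n = len(sample)
sample_mean = mean(sample)               # point estimate of the population mean
standard_error = stdev(sample) / sqrt(n)

# Critical value for a two-sided 95% interval under the normal approximation.
z = NormalDist().inv_cdf(0.975)

ci_low = sample_mean - z * standard_error
ci_high = sample_mean + z * standard_error

print(f"Sample mean: {sample_mean}")
print(f"Approximate 95% confidence interval: ({ci_low:.1f}, {ci_high:.1f})")
```

The interval expresses the uncertainty that comes from observing a sample rather than the whole population: a larger sample or less variable data gives a narrower interval.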

### Relationships Between Variables

If you have a large enough data set, you may be interested in doing more complex analyses looking at the relationships between different variables.

This document discusses some common relationships between variables, such as association, independence, correlation, and causation.
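
As one example of measuring a relationship, the sketch below computes the Pearson correlation coefficient between two variables. The variable names and values are hypothetical, and the helper function `pearson_r` is written out here for illustration rather than taken from any particular package.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x, mean_y = mean(x), mean(y)
    # Sum of cross-products and sums of squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical paired observations, one pair per participant.
sessions_attended = [1, 2, 3, 4, 5]
wellbeing_score = [2, 4, 5, 4, 5]

r = pearson_r(sessions_attended, wellbeing_score)
print(f"Pearson r = {r:.2f}")
```

A value of r near 1 or -1 indicates a strong linear association and a value near 0 indicates a weak one. A strong correlation on its own does not show that one variable causes the other.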

The webinar below is a presentation by Hannes Edinger and Rebecca Wortzman (Big River Analytics) on acquiring and working with data. For more webinars covering content in this Toolkit, click here.