datasets

Managing Data

Managing sample/population data

  • Class DataSet

    • contains the data for a single sample or population

  • Class DataSets

    • base class for managing multiple data sets

final class boring_math.probability_distributions.datasets.DataSet

Bases: object

Class containing a finite set of data

Data can be a sample or the population.

  • all data internally stored as floats (even integer data)

  • data sorted smallest to largest

  • methods provided to

    • read in data from a file

    • compute data statistics

    • add or remove data
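
The storage rules listed above can be pictured with a minimal sketch. This is not the library's implementation: `normalize` is a hypothetical helper that only mirrors the documented behaviour, namely that every value is converted to a float and the values are kept in ascending order.

```python
def normalize(*data: float) -> list[float]:
    """Convert every value to float and sort ascending (hypothetical helper)."""
    return sorted(float(x) for x in data)


values = normalize(3, 1.5, 2)
print(values)  # [1.5, 2.0, 3.0]
```

Integer inputs such as `3` and `2` come back as `3.0` and `2.0`, which matches the note that even integer data is stored as floats.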

classmethod read_data_from_file(file_name: str, sample: bool = False) Self

Create data set from data file

Read in data from a text file, calculate some statistics, and return a DataSet object. Fail fast if there is a problem with the data file.

The text file should have one number (float) per line. Blank lines and lines beginning with ‘#’ are ignored.

The sample flag selects which statistics are calculated:

  • if sample is true, calculate sample stats

  • if sample is false (default), calculate population stats

Parameters:
  • file_name – Path to file from which to read in data.

  • sample – If true, treat the data as a sample; otherwise treat it as the population.

Returns:

A DataSet object containing the data from the file.
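
To make the expected file format concrete, here is a minimal sketch of a parser that follows the rules described above. It is an illustration under stated assumptions, not the code behind `read_data_from_file`: `parse_data_lines` is a hypothetical name, and stripping surrounding whitespace before checking for ‘#’ is an assumption the documentation does not confirm.

```python
def parse_data_lines(lines):
    """Return the floats found in an iterable of text lines.

    Hypothetical helper: blank lines and lines starting with '#' are
    skipped; any other line must parse as a float.
    """
    values = []
    for line_number, raw in enumerate(lines, start=1):
        text = raw.strip()  # assumption: surrounding whitespace is insignificant
        if not text or text.startswith("#"):
            continue
        try:
            values.append(float(text))
        except ValueError as err:
            # Fail fast: stop at the first malformed line.
            raise ValueError(f"line {line_number}: not a number: {text!r}") from err
    return values


sample_file = """\
# heights in cm
170.2

165
181.5
"""
print(parse_data_lines(sample_file.splitlines()))  # [170.2, 165.0, 181.5]
```

Raising on the first bad line, rather than skipping it, is one way to read "fail fast": a malformed file never yields a partially populated data set.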

__init__(*data: int | float, sample: bool = False) None

_calculate_stats() None

Calculate data statistics

_calculate_mean() MayBe[float]

Calculate the mean of the data set, if it exists.

_calculate_stdev() MayBe[float]

Calculate and return the standard deviation of the data set, if it exists.

  • If sample is True, calculate a sample standard deviation

  • If sample is False, calculate a population standard deviation
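
The difference between the two cases can be sketched with Python's standard `statistics` module. This is an illustration only, not the library's code: `stdev_or_none` is a hypothetical function, `None` stands in for an empty `MayBe`, and the minimum sizes (two points for a sample, one for a population) are the usual mathematical requirements rather than behaviour confirmed by this documentation.

```python
import statistics
from typing import Optional


def stdev_or_none(data: list[float], sample: bool = False) -> Optional[float]:
    """Return the standard deviation, or None when it is undefined."""
    # Assumption: a sample needs at least two points, a population at least one.
    minimum = 2 if sample else 1
    if len(data) < minimum:
        return None
    # stdev divides by n - 1 (sample); pstdev divides by n (population).
    return statistics.stdev(data) if sample else statistics.pstdev(data)


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
population_sd = stdev_or_none(data)
sample_sd = stdev_or_none(data, sample=True)
print(population_sd, sample_sd)
```

For this data the squared deviations from the mean (5.0) sum to 32, so the population value is the square root of 32/8 and the sample value is the square root of 32/7, which is slightly larger.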

_calculate_quartiles() MayBe[tuple[float, float, float]]

Calculate the first, second (median), and third quartiles

The quartiles are computed using the “trimmed mid-range” of the data.

class boring_math.probability_distributions.datasets.DataSets

Bases: object

Class to manage data sets.

Base class for managing DataSet objects.

  • data sets can be samples or populations

  • child classes should provide methods to

    • add or remove data sets

    • plot data sets

  • how data sets relate to each other is up to the child class

__init__() None