
Stats

Statistics

The mathematical science that deals with the collection, analysis, and presentation of data, which can then be used as a basis for inference and induction.

  • infer a conclusion

Data and Information

| Data | Information |
| --- | --- |
| Values with no meaning on their own (raw facts, measurements of interest) | Data that has been processed so it can be used to make a decision or serve a specific purpose |

Data Set and Database

| Data Set | Database |
| --- | --- |
| Collection of data points | Rows / columns |

Sources of Data

| Primary Data | Secondary Data |
| --- | --- |
| Collected by the person who is using the data | Collected by someone else |
| Expensive, time-consuming | Quick, readily available |
| Generally reliable | May be less reliable |
| Sole control over the data | Published in magazines, newspapers, or other published sources |

Collection Methods (Primary Data)

  • Experiments
  • Direct Observation
  • Surveys / Questionnaires

Biased Surveys: a forced-choice prompt such as "Choose one from below" makes the respondent pick an option even when none fits, which can lead to ambiguous answers.

Two Types of Data

| Qualitative | Quantitative |
| --- | --- |
| Classified by descriptive terms, for example: | Described by numerical values |
| Marital Status | |
| Political Party | |

Within Quantitative Data:

| Counted | Measured |
| --- | --- |
| Number of Children | Weight |
| Defects per hour (Counted items) | Voltage (Measured Characteristics) |

Data by Level of Measurement

| Level | Description | Example |
| --- | --- | --- |
| Nominal | No ranking allowed | Postal Codes |
| Ordinal | Ranking allowed, but the difference between values has no measurable meaning | Education Level (PhD, Masters, Bachelors) |
| Interval | Differences are meaningful, but there is no true zero point | Calendar Year (2018, 2019) |
| Ratio | Has a true zero point to reference from | Income ($80,000) |

Time Series vs. Cross-Sectional Data

| Time Series | Cross-Sectional Data |
| --- | --- |
| Data collected over multiple years (2010-2020) | Data for TX, AL, NY, CA, MN within 2010 |
| Lets us observe trends over time | Lets us compare data at one particular point in time |
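The distinction can be sketched by slicing one small data set two ways. This is only an illustration: the `records` dictionary, its state/year keys, and every number in it are made up for the example.

```python
# Hypothetical measurements keyed by (state, year). All values are made up.
records = {
    ("TX", 2010): 5.1, ("TX", 2011): 5.4, ("TX", 2012): 5.9,
    ("NY", 2010): 4.2, ("NY", 2011): 4.1, ("NY", 2012): 4.6,
    ("CA", 2010): 6.3, ("CA", 2011): 6.8, ("CA", 2012): 7.0,
}

# Time series: one subject (TX) followed across several years.
time_series_tx = {
    year: value
    for (state, year), value in sorted(records.items())
    if state == "TX"
}

# Cross-sectional: several subjects compared at one point in time (2010).
cross_section_2010 = {
    state: value
    for (state, year), value in records.items()
    if year == 2010
}

print("Time series (TX):", time_series_tx)
print("Cross-section (2010):", cross_section_2010)
```

The first slice holds the subject fixed and varies time, which is what makes a trend visible; the second holds time fixed and varies the subject.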

Population vs. Sample

| Population | Sample |
| --- | --- |
| All possible subjects | A portion of the population (represents the population) |

Parameter vs. Statistic

| Parameter | Statistic |
| --- | --- |
| Value calculated from the population | Value computed from a sample |
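A minimal sketch of the difference, assuming a made-up population of 1,000 ages generated with a fixed seed. The population, the sample size of 50, and the variable names are all assumptions for illustration.

```python
import random
import statistics

# Hypothetical population: 1,000 ages, generated only for illustration.
rng = random.Random(42)
population = [rng.randint(18, 90) for _ in range(1000)]

# Parameter: a value calculated from every member of the population.
population_mean = statistics.mean(population)

# Statistic: a value computed from a sample (a portion of the population).
sample = rng.sample(population, 50)
sample_mean = statistics.mean(sample)

print(f"Parameter (population mean): {population_mean:.2f}")
print(f"Statistic (sample mean):     {sample_mean:.2f}")
```

The two numbers are usually close but not equal, which is why a statistic is treated as an estimate of the parameter rather than the parameter itself.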

Inferential Statistics

  • Biased Samples
    • a sample that does not represent the population

Ways to Misuse Statistics

  • Changing the graph scale
  • Choosing biased samples

Branches of Statistics:

  • Descriptive
    • collecting, summarizing and displaying data
  • Inferential
    • make conclusions/claims based on the sample data
  • Predictive
    • take data from the past and predict the future values and make decisions

Chebyshev’s Theorem

  • not commonly used.
  • we work mostly with Normal Distributions

The average (mean) is a good way to represent the entire group when there are no outliers.
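To see why outliers matter, here is a small sketch with made-up salary figures (in thousands of dollars). The median is brought in only as a point of comparison; it is not covered in these notes.

```python
import statistics

# Hypothetical salaries in thousands of dollars; values are made up.
salaries = [48, 50, 52, 55, 57]
salaries_with_outlier = salaries + [400]  # one extreme value added

mean_without = statistics.mean(salaries)
mean_with = statistics.mean(salaries_with_outlier)
median_with = statistics.median(salaries_with_outlier)

print(f"Mean without outlier: {mean_without:.1f}")
print(f"Mean with outlier:    {mean_with:.1f}")
print(f"Median with outlier:  {median_with:.1f}")
```

A single extreme value pulls the mean well away from where most of the group sits, while the middle value barely moves.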

Probability

  • Numerical value ranging from 0 to 1.
  • 0 means the event has no chance of occurring; 1 means the event is 100% certain to occur.

Experiment

Sample Space

  • All the possible outcomes.

Event

  • One or more of the outcomes of an experiment.
  • An event is a subset of the sample space.

Simple Event

  • A single outcome: the most basic form of event, which cannot be simplified further.

Three Methods of Assigning Probability

  • Classical
  • Empirical
  • Subjective

Classical Probability

P(A) = Number of outcomes that make up Event A / Total number of possible outcomes

Experiment: Roll a die once

Sample space = {1, 2, 3, 4, 5, 6}

Event A: rolling one particular number (for example, a 4)

P(A) = 1/6 = 0.167, or a 16.7% probability.
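The die example can be sketched in a few lines. The helper name `classical_probability` is hypothetical, Event A is assumed to be "rolling a 4", and the formula only holds when every outcome is equally likely.

```python
from fractions import Fraction

# Sample space for one roll of a six-sided die.
sample_space = {1, 2, 3, 4, 5, 6}

def classical_probability(event, space):
    """Outcomes in the event divided by total outcomes.

    Assumes every outcome in the sample space is equally likely.
    """
    favorable = event & space  # ignore anything outside the sample space
    return Fraction(len(favorable), len(space))

event_a = {4}  # assumed event: rolling a 4
p_a = classical_probability(event_a, sample_space)
print(p_a, round(float(p_a), 3))  # 1/6 0.167

# The same helper works for events made of several outcomes.
p_even = classical_probability({2, 4, 6}, sample_space)
print(p_even)  # 1/2
```

Using `Fraction` keeps the result exact, so 1/6 is only rounded to 0.167 at the moment it is displayed.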

Empirical Probability

  • Conducting the experiment to observe the frequency with which an event occurs.

P(A) = Frequency with which Event A occurs / Total number of observations
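A rough sketch of the empirical approach, simulating die rolls instead of rolling a real die. The function name `empirical_probability`, the fixed seed, and the trial counts are assumptions chosen for the example.

```python
import random

def empirical_probability(event, trials, seed=0):
    """Estimate P(event) as observed frequency / total observations.

    Each trial simulates one roll of a fair six-sided die.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs give the same result
    hits = sum(1 for _ in range(trials) if rng.randint(1, 6) in event)
    return hits / trials

# Assumed event: rolling a 4. The classical probability is 1/6.
for trials in (10, 1_000, 100_000):
    estimate = empirical_probability({4}, trials)
    print(f"{trials:>7} rolls -> estimated P(A) = {estimate:.4f}")
```

With only a handful of rolls the estimate can be far from 1/6; with many rolls it tends to settle close to the classical value.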

Law of Large Numbers Whenever the experiment is done more than

Subjective Probability

  • Used when classical and empirical probabilities are not available.
  • Example: the probability that inflation will be more than 4% next year.

Five Basic Properties of Probability

  • If P(A) = 1, Event A must occur.
  • If P(A) = 0, Event A will not occur.
  • P(A) must range from 0 to 1.
  • The sum of all the probabilities for the simple events in the sample space must be equal to 1.
  • The complement to Event A, written A’, is defined as all of the outcomes in the sample space that are not part of Event A.

Formula for the complement rule: P(A) + P(A’) = 1
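The complement rule and the sum-to-1 property can be checked on the die example. This is a sketch under the assumption of equally likely outcomes, with Event A arbitrarily chosen as "rolling a 1 or a 2".

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}
event_a = {1, 2}  # assumed event: rolling a 1 or a 2

# Complement A': every outcome in the sample space that is not in A.
complement_a = sample_space - event_a

p_a = Fraction(len(event_a), len(sample_space))
p_not_a = Fraction(len(complement_a), len(sample_space))

print("P(A)  =", p_a)        # 1/3
print("P(A') =", p_not_a)    # 2/3
print("Sum   =", p_a + p_not_a)  # 1

# Probabilities of all simple events in the sample space also sum to 1.
simple_event_total = sum(Fraction(1, len(sample_space)) for _ in sample_space)
print("Sum over simple events =", simple_event_total)  # 1
```

In practice the rule is most useful in the rearranged form P(A’) = 1 - P(A), when the complement is easier to count than the event itself.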

Bayes’ Theorem

Qualitative and Quantitative

Nominal

  • deals with qualitative data

Ordinal

This post is licensed under CC BY 4.0 by the author.