Do you know about Statistics?

Statistics is Important

  • Statistics is a prerequisite in most courses and books on applied machine learning.
  • Statistical methods are used at each step in an applied machine learning project.
  • Statistical learning is the applied statistics equivalent of predictive modeling in machine learning.

Statistics is a collection of tools that you can use to get answers to important questions about data.

Use descriptive statistical methods to transform raw observations into information that you can understand and share.

Use inferential statistical methods to reason from small samples of data to whole domains.

After reading this blog, you will know:

  • Statistics is generally considered a prerequisite to the field of applied machine learning.
  • We need statistics to help transform observations into information and to answer questions about samples of observations.
  • Statistics is a collection of tools developed over hundreds of years for summarizing data and quantifying properties of a domain given a sample of observations.

1.1 Statistics is a Required Prerequisite

Machine learning and statistics are two tightly related fields of study, so much so that statisticians often refer to machine learning as applied statistics or statistical learning rather than by the computer-science-centric name.

1.2 Why Learn Statistics?

Statistical methods are required to find answers to the questions that we have about data. They are needed both to understand the data used to train a machine learning model and to interpret the results of testing different machine learning models.

1.3 What is Statistics?

Statistics is a subfield of mathematics. It refers to a collection of methods for working with data and using data to answer questions.

We can divide the field of statistics into two large groups of methods: descriptive statistics for summarizing data and inferential statistics for drawing conclusions from samples of data.

1.3.1 Descriptive Statistics

Descriptive statistics refer to methods for summarizing raw observations into information that we can understand and share.

  • They include the calculation of statistical values that summarize a sample of data.
  • They also cover graphical methods that can be used to visualize a sample of data.
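
As a minimal sketch of both ideas, the snippet below summarizes a small sample with Python's standard statistics module and then draws a crude text histogram. The observations list is made-up data used only for illustration, and the choice of summary values (mean, median, sample standard deviation) is an assumption, not a prescribed set.

```python
import statistics
from collections import Counter

# Hypothetical raw observations (made up for illustration).
observations = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# Statistical values that condense the sample into a few numbers.
summary = {
    "count": len(observations),
    "mean": statistics.mean(observations),
    "median": statistics.median(observations),
    "stdev": statistics.stdev(observations),  # sample standard deviation
    "min": min(observations),
    "max": max(observations),
}

for name, value in summary.items():
    print(f"{name:>6}: {value}")

# A very simple graphical view: one '#' per occurrence of each value.
counts = Counter(observations)
for value in sorted(counts):
    print(f"{value:>4} | {'#' * counts[value]}")
```

In practice a plotting library would normally replace the text histogram; the point here is only that a handful of numbers and a picture already make the raw list easier to understand and share.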

1.3.2 Inferential Statistics

Inferential statistics is a fancy name for methods that aid in quantifying properties of the domain or population from a smaller set of obtained observations called a sample.
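
To make this concrete, here is a rough sketch that estimates a population mean from a sample and attaches an approximate confidence interval. It assumes a normal approximation for the sample mean, which is a simplification for small samples, and the function name mean_confidence_interval and the simulated population are hypothetical choices made for this example.

```python
import random
import statistics

def mean_confidence_interval(sample, confidence=0.95):
    """Return (mean, low, high) using a normal approximation (an assumption)."""
    n = len(sample)
    mean = statistics.mean(sample)
    # Standard error of the mean, from the sample standard deviation.
    std_error = statistics.stdev(sample) / n ** 0.5
    # z value for a two-sided interval, e.g. about 1.96 for 95%.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * std_error
    return mean, mean - margin, mean + margin

# Simulated "population" that we pretend we cannot observe in full.
rng = random.Random(42)
population = [rng.gauss(100.0, 15.0) for _ in range(100_000)]

# The small set of observations we actually obtained: the sample.
sample = rng.sample(population, 50)

sample_mean, low, high = mean_confidence_interval(sample)
print(f"sample mean: {sample_mean:.2f}")
print(f"approx. 95% interval for the population mean: [{low:.2f}, {high:.2f}]")
```

The interval is the inferential part: it is a statement about the whole population, made while looking at only 50 of its 100,000 values.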

More sophisticated statistical inference tools can be used to quantify the likelihood of observing data samples given an assumption. These are often referred to as tools for statistical hypothesis testing, where the base assumption of a test is called the null hypothesis.
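
One way to see a hypothesis test in action is a permutation test, sketched below under stated assumptions. The null hypothesis here is that two groups of scores come from the same distribution, so the group labels are interchangeable. The function permutation_test, its parameters, and the scores for the two models are all hypothetical and exist only for illustration; this is one of many possible tests, not the definitive one.

```python
import random

def permutation_test(group_a, group_b, n_permutations=5000, seed=0):
    """Return (observed difference in means, two-sided p-value estimate)."""
    rng = random.Random(seed)
    n_a = len(group_a)
    n_b = len(group_b)
    observed = sum(group_a) / n_a - sum(group_b) / n_b

    pooled = list(group_a) + list(group_b)
    extreme = 0
    for _ in range(n_permutations):
        # Under the null hypothesis the labels do not matter, so shuffle them.
        rng.shuffle(pooled)
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b
        # Count shuffles at least as extreme as what we actually observed.
        if abs(diff) >= abs(observed):
            extreme += 1
    return observed, extreme / n_permutations

# Hypothetical accuracy scores for two models across five test runs.
model_a = [0.81, 0.79, 0.84, 0.80, 0.83]
model_b = [0.76, 0.78, 0.75, 0.79, 0.77]

observed_diff, p_value = permutation_test(model_a, model_b)
print(f"observed difference in mean score: {observed_diff:.3f}")
print(f"estimated p-value under the null hypothesis: {p_value:.4f}")
```

A small p-value means a difference this large would rarely appear if the null hypothesis were true, which is the sense in which the test quantifies the likelihood of the observed data given an assumption.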