A histogram is a tool of data representation that uses adjacent vertical bars to display the distribution of a given data set. The data are grouped into intervals (bins), and the height of each bar shows how many observations fall into that interval. It is appropriate for summarizing because the shape, center, and spread of the data can be assessed at a glance from the heights of the bars, which represent occurrence frequencies.
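The binning behind a histogram can be sketched in a few lines of Python. This is only an illustration: the function name histogram_counts, the bin width of 10, and the exam scores are all assumptions made up for the example, not data from the text.

```python
from collections import Counter


def histogram_counts(values, bin_width, start=0):
    """Count how many values fall into each bin [edge, edge + bin_width)."""
    counts = Counter()
    for v in values:
        # Index of the bin that contains v, counted from `start`.
        k = int((v - start) // bin_width)
        counts[start + k * bin_width] += 1
    # Return the bins ordered by their left edge.
    return dict(sorted(counts.items()))


# Hypothetical exam scores, invented for illustration only.
scores = [52, 55, 61, 64, 67, 68, 71, 73, 75, 78, 82, 88, 91]
counts = histogram_counts(scores, bin_width=10, start=50)

# Print a text histogram: one '#' per observation in the bin.
for left_edge, n in counts.items():
    print(f"{left_edge}-{left_edge + 9}: {'#' * n}")
```

Each printed row plays the role of one bar, so the longest rows (the 60s and 70s here) mark where the scores are concentrated.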
A z-score is a standardization tool that is used to put data on a common scale before making comparisons. The z-score for a given data point (x) is calculated by subtracting the mean from x and then dividing the result by the standard deviation, so it expresses how many standard deviations x lies above or below the mean. As a rule of thumb, the z-score is used when the sample to be analyzed has more than twenty-nine units (that is, at least thirty) and the population standard deviation is known.
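As a minimal sketch of the calculation, the function below computes z = (x - mean) / standard deviation. The function name z_score and the two test scores are hypothetical and chosen only to show how values from different scales become comparable.

```python
def z_score(x, mean, std_dev):
    """Return how many standard deviations x lies from the mean."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / std_dev


# Hypothetical values: a score of 85 on a test with mean 70 and
# standard deviation 10.
z_test_a = z_score(85, mean=70, std_dev=10)

# A score of 620 on a different test with mean 500 and standard
# deviation 100.
z_test_b = z_score(620, mean=500, std_dev=100)

print(z_test_a, z_test_b)
```

Under these assumed numbers the first score sits 1.5 standard deviations above its mean and the second 1.2, so the first is the stronger result even though the raw scores are on different scales.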
Normality is an assumption in regression that the error term follows a normal distribution with a mean of zero. If the errors are not normally distributed, for example because they are strongly skewed, the significance tests and confidence intervals built on the model can be unreliable. Normality is most important in small samples (fewer than 200 observations), since such samples are too small to rely on the central limit theorem to make the sampling distributions approximately normal.
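One rough, informal way to look at this assumption is to compute the mean and the skewness of the residuals. The sketch below is not a formal normality test; the helper name skewness and both residual lists are assumptions invented for illustration.

```python
def skewness(values):
    """Population skewness: third central moment over variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical residuals (observed minus predicted values).
symmetric_residuals = [-2, -1, 0, 1, 2]
skewed_residuals = [-1, -1, -1, 0, 3]

for name, residuals in [("symmetric", symmetric_residuals),
                        ("skewed", skewed_residuals)]:
    mean = sum(residuals) / len(residuals)
    print(name, "mean:", mean, "skewness:", round(skewness(residuals), 3))
```

Both lists have a mean of zero, but only the first is symmetric. A skewness far from zero, as in the second list, is a hint that the normality assumption deserves a closer look.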
A normal distribution is a continuous probability distribution that is symmetrical about its mean. Its density function yields a bell-shaped curve that has only one peak. The peak is located at the mean of the distribution, while the spread of the curve around the mean is determined by the standard deviation.
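The bell shape can be checked numerically with the standard normal density formula. The sketch below is illustrative only: the function name normal_pdf and the mean of 100 with standard deviation 15 are assumed values, not figures from the text.

```python
import math


def normal_pdf(x, mean=0.0, std_dev=1.0):
    """Density of the normal distribution at x."""
    coeff = 1.0 / (std_dev * math.sqrt(2.0 * math.pi))
    exponent = -((x - mean) ** 2) / (2.0 * std_dev ** 2)
    return coeff * math.exp(exponent)


# Assumed parameters for illustration.
mean, std_dev = 100.0, 15.0

peak = normal_pdf(mean, mean, std_dev)             # height at the mean
left = normal_pdf(mean - std_dev, mean, std_dev)   # one SD below the mean
right = normal_pdf(mean + std_dev, mean, std_dev)  # one SD above the mean

# Same mean, larger standard deviation: a flatter, wider curve.
wider_peak = normal_pdf(mean, mean, 2 * std_dev)

print(peak, left, right, wider_peak)
```

The density is highest at the mean, equal at points the same distance on either side of it, and lower at the center when the standard deviation is doubled, which matches the single peak, the symmetry, and the role of the standard deviation described above.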
Y = a + bX is a representation of a simple regression line. Y represents the dependent variable, while the independent one is represented by the variable X. The parameter ‘a’ is the y-intercept and the parameter ‘b’ is the slope of the regression line. A one-unit change in X changes the value of Y by ‘b’ units. For example, if Y = 2 + 4X and X = 8, then Y = 34. If the value of X increases to 10, then Y = 42, so an increase of 2 in X raises Y by 4 × 2 = 8.
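The arithmetic of the example can be reproduced with a short function. The name predict is a hypothetical helper introduced for this sketch; the intercept 2 and slope 4 come from the example line Y = 2 + 4X.

```python
def predict(x, intercept, slope):
    """Value of Y on the line Y = intercept + slope * X."""
    return intercept + slope * x


a, b = 2, 4  # intercept and slope from the example Y = 2 + 4X

y_at_8 = predict(8, a, b)
y_at_10 = predict(10, a, b)

# The change in Y equals the slope times the change in X.
change_in_y = y_at_10 - y_at_8

print(y_at_8, y_at_10, change_in_y)
```

The printed values are 34, 42, and 8, and the last one equals the slope (4) multiplied by the change in X (2).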