Inference and Hypothesis Testing

Sampling, confidence intervals, test decisions, p-values, and independence logic for Level I.

Inference is what turns a sample into a decision. Level I is usually not asking whether you can memorize a long test catalog. It is asking whether you understand the decision structure: what is being tested, what evidence you observed, and what conclusion the evidence actually supports.

From Sample To Statement

A sample mean is not the population mean. It is an estimate, and the exam wants you to respect the uncertainty around that estimate.

A common building block is the standard error of the sample mean, where s is the sample standard deviation and n is the sample size:

$$ SE = \frac{s}{\sqrt{n}} $$

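The interpretation matters more than the formula itself. Because n sits under a square root in the denominator, larger samples shrink sampling uncertainty, but with diminishing returns: for a given s, quadrupling the sample size only halves the standard error. More uncertainty means a wider confidence interval and a weaker basis for sharp conclusions.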
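The link between sample size, standard error, and interval width can be sketched in a few lines of Python. This is a minimal illustration, not a curriculum formula sheet: the return figures are hypothetical, the function names are invented for this example, and the interval uses a normal (z) critical value as a large-sample assumption, where a small sample would call for a wider t-based interval.

```python
import math
import statistics
from statistics import NormalDist


def standard_error(sample):
    """Standard error of the sample mean: s / sqrt(n)."""
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)
    return s / math.sqrt(len(sample))


def confidence_interval(sample, confidence=0.95):
    """Two-sided interval for the mean using a normal critical value.

    Assumption: large-sample approximation. With few observations,
    a t critical value would give a wider interval.
    """
    mean = statistics.fmean(sample)
    se = standard_error(sample)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * se, mean + z * se


# Hypothetical monthly excess returns in percent (illustrative only).
returns = [1.2, -0.4, 0.8, 2.1, -1.3, 0.5, 1.7, -0.2]

se_small = standard_error(returns)
# Same values repeated four times: a crude way to mimic a larger sample.
se_large = standard_error(returns * 4)
low, high = confidence_interval(returns)

print(f"SE with n=8:  {se_small:.3f}")
print(f"SE with n=32: {se_large:.3f}")
print(f"95% interval for the mean: ({low:.3f}, {high:.3f})")
```

With four times as many observations the standard error falls to roughly half, and the interval around the sample mean narrows accordingly.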

Hypothesis Testing Is A Decision Framework

| Piece | What it means | Frequent trap |
| --- | --- | --- |
| Null hypothesis | The baseline claim being challenged | Treating it as the claim you want to prove true |
| Alternative hypothesis | The competing claim | Forgetting the direction in one-tailed tests |
| Significance level | The tolerated Type I error rate | Calling it the probability the null is true |
| p-value | The probability, assuming the null is true, of a result at least as extreme as the one observed | Reading it as the probability the null is correct |
| Type I error | Rejecting a true null | Confusing it with power |
| Type II error | Failing to reject a false null | Treating “fail to reject” as “accept” |

The test statistic logic is conceptually simple:

$$ \text{test statistic} = \frac{\text{estimate} - \text{hypothesized value}}{SE} $$

You compare observed evidence against a threshold. If the evidence is strong enough, you reject the null. If it is not, you fail to reject. That wording matters. Level I often penalizes candidates who overstate the conclusion.
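The decision rule above can be sketched as a small function. This is an illustrative sketch under stated assumptions: the function name and sample data are hypothetical, the test is one-tailed (mean greater than a hypothesized value), and the p-value is read from a standard normal distribution as a large-sample stand-in for the t-distribution.

```python
import math
import statistics
from statistics import NormalDist


def one_tailed_test(sample, hypothesized_mean=0.0, alpha=0.05):
    """Test H0: mean <= hypothesized_mean against Ha: mean > hypothesized_mean.

    Assumption: the p-value uses a standard normal distribution,
    which approximates the t-distribution only for larger samples.
    alpha is fixed before the sample is examined.
    """
    n = len(sample)
    se = statistics.stdev(sample) / math.sqrt(n)
    stat = (statistics.fmean(sample) - hypothesized_mean) / se
    p_value = 1.0 - NormalDist().cdf(stat)  # upper-tail probability
    # The wording is deliberate: the null is never "accepted".
    decision = "reject the null" if p_value < alpha else "fail to reject the null"
    return stat, p_value, decision


# Hypothetical samples (illustrative only).
strong_evidence = [1.0, 1.1, 0.9, 1.2, 0.8, 1.0]
no_evidence = [-1.0, 1.0, -1.0, 1.0]

for label, sample in [("strong", strong_evidence), ("none", no_evidence)]:
    stat, p_value, decision = one_tailed_test(sample)
    print(f"{label}: statistic={stat:.2f}, p-value={p_value:.4f}, {decision}")
```

The function returns only two possible decisions, "reject" or "fail to reject", which mirrors the wording the exam expects.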

Tests Of Independence

When the curriculum moves into parametric and non-parametric tests of independence, the exam is usually asking one of two things: whether two variables move together in a statistically meaningful way, or whether category membership appears related across groups.

The important habit is to identify the data structure first:

  • numeric variables often lead toward correlation-style thinking
  • categorical groupings often lead toward contingency-table logic
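The two data structures lead to two different statistics. The sketch below computes a correlation-based t-statistic for paired numeric data and a chi-square statistic for a contingency table. The function names and data are hypothetical, and the code stops at the test statistic: the critical value still has to come from the appropriate table or software.

```python
import math


def correlation_t_stat(x, y):
    """Pearson correlation r and the statistic t = r * sqrt(n - 2) / sqrt(1 - r^2).

    The statistic has n - 2 degrees of freedom. It is undefined when
    r is exactly +1 or -1 (division by zero).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    r = sxy / math.sqrt(sxx * syy)
    t = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
    return r, t


def chi_square_stat(table):
    """Chi-square statistic and degrees of freedom for a contingency table.

    Expected count in each cell = row total * column total / grand total.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Hypothetical paired numeric data (illustrative only).
r, t = correlation_t_stat([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
print(f"r={r:.2f}, t={t:.3f}, df=3")

# Hypothetical 2x2 table of counts, e.g. fund style by rating bucket.
stat, dof = chi_square_stat([[20, 10], [10, 20]])
# For 1 degree of freedom the 5% chi-square critical value is 3.841.
print(f"chi-square={stat:.3f}, df={dof}")
```

Identifying the data structure first tells you which of the two functions applies; the decision step afterwards is the same compare-to-threshold logic as in any other test.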

How CFA-Style Questions Usually Test This

  • by mixing up economic importance and statistical significance
  • by asking what “fail to reject the null” actually means
  • by reversing the meaning of significance level and p-value
  • by giving a sampling setup and asking which conclusion is too strong

Mini-Case

Suppose an analyst tests whether a manager’s mean excess return is greater than zero. The p-value is larger than the chosen significance level. A weak candidate says the manager has no skill. A stronger candidate says the sample did not provide enough evidence to reject the null at that significance level.

That is the Level I habit: draw the conclusion the evidence supports, not the conclusion you wish the test had proven.
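One way to see why "no skill" overstates the result is to ask how often the test would reject if the manager did have a small amount of skill. The sketch below computes that probability (the power of the test) for a one-tailed z-test. All inputs are hypothetical, and the calculation assumes normally distributed returns with a known standard deviation, which is a simplification.

```python
import math
from statistics import NormalDist


def power_one_tailed(true_mean, sigma, n, alpha=0.05):
    """Probability of rejecting H0: mean <= 0 when the true mean is true_mean.

    Assumption: normal returns with known standard deviation sigma,
    so a one-tailed z-test applies.
    """
    z_crit = NormalDist().inv_cdf(1 - alpha)
    shift = true_mean * math.sqrt(n) / sigma  # true mean in standard-error units
    return 1 - NormalDist().cdf(z_crit - shift)


# Hypothetical manager: true mean excess return of 0.2% per month,
# monthly standard deviation of 2%, and 36 months of data.
power = power_one_tailed(true_mean=0.2, sigma=2.0, n=36)
type_ii_rate = 1 - power  # chance of failing to reject a false null

print(f"Power: {power:.3f}")
print(f"Type II error rate: {type_ii_rate:.3f}")
```

Under these assumed inputs the test fails to reject far more often than it rejects, even though the null is false. A large p-value is therefore consistent with a skilled manager and too little data, not only with an unskilled one.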

Common Traps

  • saying the null hypothesis has been proved
  • confusing a lower p-value with a lower probability of error in every sense
  • forgetting that significance level is chosen before seeing the sample
  • interpreting statistical insignificance as proof of no effect

Sample CFA-Style Question

A test result is not statistically significant at the 5% level. Which interpretation is best supported by the evidence?

Best answer: The sample does not provide enough evidence to reject the null hypothesis at the 5% significance level.

Why: Level I routinely tests the language of inference. “Fail to reject” is deliberately weaker than “accept.”
