Suppose we are given the set of data points

(x₁, y₁), (x₂, y₂), ..., (xₙ, yₙ)
and that we are interested in finding a straight line that "best" fits that data. We will begin with the linear model
Yᵢ = α + β(xᵢ - x̄) + εᵢ
where we assume that, for a particular value of x, the value of Y differs from its mean by some random amount ε, and that the distribution of ε is N(0, σ). It can be shown that the estimate for α is
α̂ = ȳ
and that the estimate for β is
β̂ = ∑ᵢ yᵢ(xᵢ - x̄) / ∑ᵢ (xᵢ - x̄)²
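
As a rough illustration, the two estimates can be computed directly from their definitions. The sketch below is only one way to do it: the function name fit_centered_line and the sample data are hypothetical and chosen for illustration, and it assumes the model has the slope term added to α, so that the fitted line is y = α̂ + β̂(x - x̄).

```python
def fit_centered_line(xs, ys):
    """Least-squares estimates for the model Y = alpha + beta * (x - x_bar) + eps.

    Returns (alpha_hat, beta_hat, x_bar). The name and signature are
    hypothetical, chosen only for this illustration.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")

    x_bar = sum(xs) / n
    y_bar = sum(ys) / n

    # Sum of squared deviations of x from its mean (the denominator of beta_hat).
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")

    alpha_hat = y_bar
    beta_hat = sum(y * (x - x_bar) for x, y in zip(xs, ys)) / sxx
    return alpha_hat, beta_hat, x_bar


# Made-up data lying exactly on the line y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]

alpha_hat, beta_hat, x_bar = fit_centered_line(xs, ys)
print(alpha_hat, beta_hat)  # 7.0 2.0

# Predicted value at x = 4 from the fitted line alpha_hat + beta_hat * (x - x_bar).
print(alpha_hat + beta_hat * (4.0 - x_bar))  # 9.0
```

Because the model is centered at x̄, the intercept estimate is simply the mean of the y values. The intercept of the line in the more familiar form y = a + bx would be α̂ - β̂x̄, which is 1 for the data above.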
