DISCLAIMER: Expressed views on this blog are my own.

Anyone ever think about why the residuals need to be independent and identically normally distributed around mu = 0 with constant variance? Well, I thought about it for a while and figured that if you look at what linear regression actually does, you'll see why we like normally distributed residuals. Linear regression attempts to fit a line to the data while minimizing the distance to each point. Any point not on the line is considered to have an error or residual (what remains), so a reasonable assumption is that the residuals should be as close to 0 as possible. A normal distribution with mean 0 accomplishes this because ideally the errors cancel each other out. You know, with e1 = -1, e2 = 0, and e3 = 1, add them up and you get 0. Besides, most of the errors would be concentrated around the mean of 0, which is good because it keeps the prediction error small. Variance is 1? Not quite: the assumption is really that the variance is constant (the same sigma^2 for every observation); variance 1 only shows up once you standardize the residuals. Still, it's worth picturing what variance does: decrease it and the curve stays bell shaped but gets more pointy, as you could say, so errors cluster tightly around 0, whereas increasing the variance makes for a flatter bell curve where big errors are more likely. (It's easy to imagine this.)
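You can see the "errors cancel out" idea in a few lines of code. A quick sketch (the data, true line, and noise scale here are made up for illustration): simulate points around a line with normal noise, fit by least squares, and check that the residuals center on 0 with a spread matching the noise.

```python
import numpy as np

# Hypothetical setup: a true line y = 2 + 3x plus N(0, 1) noise.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 200)
y = 2.0 + 3.0 * x + rng.normal(loc=0.0, scale=1.0, size=x.size)

# Ordinary least-squares fit of a degree-1 polynomial (a line).
slope, intercept = np.polyfit(x, y, deg=1)
residuals = y - (intercept + slope * x)

# With an intercept in the model, the residuals sum to (numerically) zero,
# and their spread is close to the noise standard deviation we put in.
print(residuals.mean())  # essentially 0
print(residuals.std())   # close to 1, the noise scale
```

Nothing special about these numbers; any line plus symmetric noise gives the same picture: the positive and negative residuals balance out.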

At this point, you could imagine assuming different distributions for the error terms, such as chi-square or gamma, to see why a skewed distribution would screw things up. With a gamma distribution you'd end up with a right-skewed error curve, meaning crazy big errors in one direction could happen and they wouldn't be balanced by matching errors on the other side, or let's say the error terms wouldn't be balanced.
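Here's a small sketch of that imbalance (the shape and scale parameters are arbitrary choices for illustration): draw normal errors and gamma errors, shift the gamma draws so both have mean 0, and compare the tails. Even with the same mean, the gamma errors are lopsided.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Symmetric errors: N(0, 1).
normal_err = rng.normal(loc=0.0, scale=1.0, size=n)

# Skewed errors: Gamma(shape=2, scale=1), shifted by its mean (shape * scale = 2)
# so that the errors are also centered at 0.
gamma_err = rng.gamma(shape=2.0, scale=1.0, size=n) - 2.0

# Both means are ~0, but the gamma errors are not balanced:
# the median sits below 0, and huge positive errors occur while
# equally huge negative errors are impossible (gamma is bounded below).
print(np.median(normal_err))   # roughly 0
print(np.median(gamma_err))    # noticeably below 0
print((gamma_err > 3).mean())  # a few percent of big positive errors
print((gamma_err < -3).mean()) # exactly 0: no matching negative errors
```

So even after centering, a skewed error distribution can't "cancel out" the way symmetric normal errors do, which is exactly why it breaks the story above.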