Thursday, 19 November 2015

Negative diagonal elements in the covariance matrix returned by numpy.polyfit

I ran into a strange issue fitting a line to a small number of data points using numpy.polyfit that I thought was worth documenting.

I ran a command of the form:

p, cov = np.polyfit(x, y, 1, w=w, cov=True)

where x, y and w were arrays of length 3. (Note that w has to be passed by keyword; the fourth positional argument of np.polyfit is rcond.)

The command returned the correct slope and y-intercept values; however, the covariance matrix, cov, had strictly negative diagonal terms. This is apparently because numpy scales the covariance matrix, as described here.
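Here is a minimal way to reproduce the symptom (the data values are stand-ins I have made up for illustration, not the original measurements; the negative diagonal appears on the numpy releases current in late 2015):

import numpy as np

x = np.array([0.0, 1.0, 2.0])  # made-up stand-ins for the real data
y = np.array([0.2, 1.1, 1.9])
w = np.array([1.0, 1.0, 1.0])

# w must be passed by keyword, since the fourth positional argument is rcond
p, cov = np.polyfit(x, y, 1, w=w, cov=True)

print(p)             # slope and intercept come out sensibly
print(np.diag(cov))  # both diagonal terms come out negative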

The scaling applied is a factor such that

factor = resids / (len(x) - order - 2.0)

where resids is the sum of squared residuals of the fit and, in numpy's source, order is the polynomial degree plus one.

If, like me, you are making a first-order polynomial fit to a dataset of 3 values, the denominator is 3 - 2 - 2 = -1, which has the effect of multiplying the expected matrix by -1. If I were unlucky enough to have 4 points, the denominator would be zero and the covariance values would blow up entirely.
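You can see both failure modes directly from the denominator (a quick sketch, with order = deg + 1 as in numpy's source):

deg = 1
order = deg + 1
for n in (3, 4, 5):
    print(n, n - order - 2.0)  # 3 -> -1.0 (sign flip), 4 -> 0.0 (division by zero), 5 -> 1.0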

In my case, looking at the results here, I could recover the correct values just by multiplying the matrix by minus one. This is a strange scaling to apply to a small dataset; I assume it makes sense when you have many points, and the developers wanted to keep numpy.polyfit's behaviour consistent.
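A minimal sketch of that workaround (it assumes the only problem is the flipped sign, which is the case for a first-order fit to exactly 3 points):

p, cov = np.polyfit(x, y, 1, w=w, cov=True)
if np.all(np.diag(cov) < 0):  # symptom of the sign-flipped scaling factor
    cov = -cov
errors = np.sqrt(np.diag(cov))  # one-sigma uncertainties on slope and intercept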

1 comment:

  1. I have to say this. How f****** ridiculous is that. Not only is there no explanation of this on the scipy web page, but from what you are writing, estimating errors is borderline miraculous. Just write a function that uses chi^2 for fitting. At least we will have errors. What's the use of a value with no error???!!!

