On Tue, 13 Nov 2007 17:11:36 -0800 Stephen Hemminger shemminger@linux-foundation.org wrote:
On Tue, 13 Nov 2007 19:52:17 -0500 Chuck Ebbert cebbert@redhat.com wrote:
On 11/13/2007 04:12 PM, Alan Cox wrote:
Bug fixing is not about finding someone to blame, it's about getting the bug fixed.
Partly - its also about understanding why the bug occurred and making it not happen again.
Very few people think about that part.
Why does the kernel have very few useful tests?
Tests would of course be nice, but they aren't very useful(!)
Looking at this list which Natalie has generated I see around thirty which are dependent on the right hardware and ten which are not. This ratio is typical, I think. In fact I'd say that more than 75% of reported bugs are dependent on hardware.
So the best test of all for the kernel is "run it on a different machine". This is why we are sooooo dependent upon our volunteer testers/reporters to be able to do kernel development.
Lack of interest? resources? expertise? Ideally each new feature would just be a small add on to an existing test.
Sure. For system-call-visible features it would be good to do that.
But this tends not to be where bugs get exposed. Because the original developer can 100% exercise such code. That isn't the case with driver/arch/platform changes.
Unlike developing new features which seems to grow well with more developers. Bug fixing also seems to be a scarcity process. There often seems to be a very few people that understand the problem well enough or have the necessary hardware to reproduce and fix the problem.
We're 100% dead if "having the hardware" is a prerequisite to fixing a bug. The terminal state there is that the kernel runs on about 200 machines worldwide. We have to work with reporters via email to fix these sorts of things. As we of course do.
Recent changes like tickless and scheduler rework were well thought out and caused very little impact to 90% of the users. The problem is the 10% who do have problems. Worse, the developers often only hear about the a small sample of those.
Yes. An unknown number of people just shrug and go back to an old kernel.