[alsa-devel] [BUG] New Kernel Bugs
This is the listing of the open bugs that are relatively new, around 2.6.22 and up. They are vaguely classified by specific area. (not a full list, there are more :)
The good part is that reporters of the bugs below are still around and haven't dissipated, or disposed of their hardware, so it is a good time to get the bugs. Those bugzillas that have been started as regressions on Rafael's list are not mentioned here so far, since they are being tracked as new regressions already.
It would be appreciated if the corresponding maintenance team could take a look, close off any which are fixed and see if they can fix any which aren't.
NOTE: when replying to this email, please add the bug number to the Subject in the form [Bug 1234] so that bugzilla will capture the discussion. Thanks.
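A minimal sketch of the subject-line convention in the note above. The thread does not say how bugzilla parses the Subject, so the regular expression and both helper names (`tag_subject`, `bug_ids`) are assumptions made for illustration.

```python
import re

# Assumed tag format, taken from the "[Bug 1234]" example in the note above.
BUG_TAG = re.compile(r"\[Bug (\d+)\]")


def tag_subject(subject: str, bug_id: int) -> str:
    """Prefix the subject with a [Bug N] tag unless one is already present."""
    if BUG_TAG.search(subject):
        return subject
    return f"[Bug {bug_id}] {subject}"


def bug_ids(subject: str) -> list[int]:
    """Return every bug number found in [Bug N] tags in the subject."""
    return [int(number) for number in BUG_TAG.findall(subject)]


if __name__ == "__main__":
    tagged = tag_subject("Re: [BUG] New Kernel Bugs", 9358)
    print(tagged)
    print(bug_ids(tagged))
```

The generic "[BUG]" prefix of this thread is deliberately not treated as a bug tag here, since it carries no number.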
ACPI====================================================================
System does not load without acpi=off ide=nodma noapic http://bugzilla.kernel.org/show_bug.cgi?id=9358 Kernel: 2.6.23.1
ACPI Error attaching device data http://bugzilla.kernel.org/show_bug.cgi?id=9354 Kernel: 2.6.24-rc2
/proc/acpi/battery displays Incorrect voltages http://bugzilla.kernel.org/show_bug.cgi?id=9341 Kernel: 2.6.23.1
PATA scan: ACPI Exception AE_AML_PACKAGE_LIMIT... is beyond end of object http://bugzilla.kernel.org/show_bug.cgi?id=9320 Kernel: 2.6.24-rc2 (Tejun: calling _GTF without calling _STM first. _GTM doesn't have any prerequisite (it can't). Can someone familiar with ACPI tell me why the method is failing? At any rate, libata should work fine regardless of ACPI failures. Maybe it's time to start blacklist to skip ATA-ACPI for some boards to avoid those annoying messages during boot)
ACPI Battery Info in /sys but not /proc/acpi http://bugzilla.kernel.org/show_bug.cgi?id=9183 Kernel: 2.6.23-rc8-mm2
When using ACPI on a Compaq Presario V6221EU the laptop goes into deadlock after a random amount of time http://bugzilla.kernel.org/show_bug.cgi?id=9118 Kernel: 2.6.23-rc6
ACPI video driver should validate brightness level before setting it via _BCM http://bugzilla.kernel.org/show_bug.cgi?id=9277 Kernel: 2.6.23
VIDEO/DVB=============================================================
dvb driver reboot system http://bugzilla.kernel.org/show_bug.cgi?id=9357 Kernel: 2.6.21.5
PLATFORM===============================================================
xipImage is built so that uBoot cant run it (ARM) http://bugzilla.kernel.org/show_bug.cgi?id=9356 Kernel: 2.6.21
Samsung R20 - ACPI: PCI Root Bridge [PCI0] (0000:00) http://bugzilla.kernel.org/show_bug.cgi?id=9339 Kernel: 2.6.24 (boot is very long ..MP-BIOS bug: 8254 timer not connected to IO-APIC then the boot stop at : ACPI: PCI Root Bridge [PCI0] (0000:00) (during 3 minutes, and boot continue)
system_64.h: switch_to inline asm should be more robbust wrt optimizations http://bugzilla.kernel.org/show_bug.cgi?id=9302 Kernel: 2.6.24-rc1
with CONFIG_NO_HZ and/or CONFIG_HPET_TIMER set kernel 2.6.23 doesn't boot (ARM, Timer) http://bugzilla.kernel.org/show_bug.cgi?id=9229 Kernel: 2.6.23
NETWORKING===========================================================
RTNLGRP_ND_USEROPT does not report ifindex (IPv6) http://bugzilla.kernel.org/show_bug.cgi?id=9349 Kernel: 2.6.24+
a kernel error happend in the func: __skb_dequeue when using in pfifo_fast_dequeue http://bugzilla.kernel.org/show_bug.cgi?id=9342 Kernel: 2.6.11.1 - reporter asked to try recent kernel
e100 does not work after boot http://bugzilla.kernel.org/show_bug.cgi?id=9336 Kernel: 2.6.23.1
2.6.23.1-smp kernel panic (network-related) http://bugzilla.kernel.org/show_bug.cgi?id=9318 Kernel: 2.6.23.1 Infiniband panic
sundance -> 4port D-Link System Inc DFE-580TX -> Log errors http://bugzilla.kernel.org/show_bug.cgi?id=9311 Kernel: 2.6.22.9
via-rhine driver stalls with: PHY status 786d, resetting... http://bugzilla.kernel.org/show_bug.cgi?id=9300 Kernel: 2.6.23+
Weird network problems with 2.6.23-rc2 http://bugzilla.kernel.org/show_bug.cgi?id=9080 http://lkml.org/lkml/2007/8/11/40 - description
rt2500pci: low TCP throughput (wireless) http://bugzilla.kernel.org/show_bug.cgi?id=9273 Kernel: 2.6.24-rc1 This is a regression
Unable to build wifi network between zd1201 and b43 http://bugzilla.kernel.org/show_bug.cgi?id=9237 Kernel: 2.6.24-rc1
Crash after module unload in b43 (wireless) http://bugzilla.kernel.org/show_bug.cgi?id=9233 Kernel: 2.6.24-rc1
(net typhoon) "no descs for cmd, had (needed) 0 (1) cmd, 31 (7) resp" http://bugzilla.kernel.org/show_bug.cgi?id=9225 Kernel: 2.6.23.1
IDE/SATA=========================================================
pata_pdc202xx_old excessive ATA bus errors http://bugzilla.kernel.org/show_bug.cgi?id=9337 Kernel: 2.6.24-rc2
Drive seagate ST380011AS needs to be blacklisted http://bugzilla.kernel.org/show_bug.cgi?id=9309 Kernel: 2.6.22.X
DVD-RAM umount and disk free bug http://bugzilla.kernel.org/show_bug.cgi?id=9265 Kernel: 2.6.15 (asked to try current kernel)
FILE SYSTEMS=======================================================
ext4: delalloc space accounting problem drops data http://bugzilla.kernel.org/show_bug.cgi?id=9329 Kernel: 2.6.24-rc1
POSIX Access Control Lists cause bogus file system check errors http://bugzilla.kernel.org/show_bug.cgi?id=9241 Kernel: 2.6.23.1
MEMORY MANAGEMENT================================================
My system hangs when it has no more free memory to allocate via malloc() http://bugzilla.kernel.org/show_bug.cgi?id=9316 Kernel: 2.6.23 User program, "My system hangs when it has no more free memory to allocate via malloc()"
BUG: unable to handle kernel paging request at virtual address 26121228/kswapd0[231] exited with preempt_count 1 http://bugzilla.kernel.org/show_bug.cgi?id=9305 EIP is at free_block+0x6d/0xe4 Kernel: 2.6.22.6
POWER MANAGEMENT==================================================
IBM X41 looses time after Suspend2Disk http://bugzilla.kernel.org/show_bug.cgi?id=9314 Kernel: 2.6.23
Suspend to RAM resume hangs on a tickless (NO_HZ) kernel http://bugzilla.kernel.org/show_bug.cgi?id=9275 Kernel: 2.6.23 This is HP notebook nc6320 T2400 945GM
VIDEO DRIVERS========================================================
No text consoles with FRAMEBUFFER_CONSOLE_DETECT_PRIMARY http://bugzilla.kernel.org/show_bug.cgi?id=9310 Kernel: 2.6.24-rc1 This is a regression
PARALLEL PORT========================================================
LPC IT8705 POST port making noise on parallel port http://bugzilla.kernel.org/show_bug.cgi?id=9306 Kernel: 2.6.16+
I/O STORAGE===========================================================
kernel bug from pktcdvd http://bugzilla.kernel.org/show_bug.cgi?id=9294 Kernel: 2.6.23
After pci-e video card was installed, pci add-on usb card & firewire card fail http://bugzilla.kernel.org/show_bug.cgi?id=9223 Kernel: 2.6.20 (testing of latest kernel requested)
SCSI==================================================================
qla2xxx: driver initialization does not complete when booting with Port connected http://bugzilla.kernel.org/show_bug.cgi?id=9267 Kernel: 2.6.23.1
SOUND ALSA============================================================
Unable to load snd-hda-intel module: Unknown symbol in module, or unknown parameter http://bugzilla.kernel.org/show_bug.cgi?id=9242 Kernel: 2.6.24-rc1
usbaudio microphone: regular sound distortion on several Logitech Webcams http://bugzilla.kernel.org/show_bug.cgi?id=9230 Kernel: 2.6.22.9
HID====================================================================
Kernel NULL pointer dereference at :usbhid:hiddev_ioctl+0x2f/0xabc http://bugzilla.kernel.org/show_bug.cgi?id=9216 Kernel: 2.6.23.1 Looks like this is a regression
On Mon, 12 Nov 2007 22:42:32 -0800 "Natalie Protasevich" protasnb@gmail.com wrote:
This is the listing of the open bugs that are relatively new, around 2.6.22 and up. They are vaguely classified by specific area. (not a full list, there are more :)
The good part is that reporters of the bugs below are still around and haven't dissipated, or disposed of their hardware, so it is a good time to get the bugs. Those bugzillas that have been started as regressions on Rafael's list are not mentioned here so far, since they are being tracked as new regressions already.
Thanks.
It would be appreciated if the corresponding maintenance team could take a look, close off any which are fixed and see if they can fix any which aren't.
NOTE: when replying to this email, please add the bug number to the Subject in the form [Bug 1234] so that bugzilla will capture the discussion. Thanks.
You're optimistic.
ACPI====================================================================
System does not load without acpi=off ide=nodma noapic http://bugzilla.kernel.org/show_bug.cgi?id=9358 Kernel: 2.6.23.1
One response from a developer
ACPI Error attaching device data http://bugzilla.kernel.org/show_bug.cgi?id=9354 Kernel: 2.6.24-rc2
Zero responses from developers
/proc/acpi/battery displays Incorrect voltages http://bugzilla.kernel.org/show_bug.cgi?id=9341 Kernel: 2.6.23.1
Zero responses from developers
PATA scan: ACPI Exception AE_AML_PACKAGE_LIMIT... is beyond end of object http://bugzilla.kernel.org/show_bug.cgi?id=9320 Kernel: 2.6.24-rc2 (Tejun: calling _GTF without calling _STM first. _GTM doesn't have any prerequisite (it can't). Can someone familiar with ACPI tell me why the method is failing? At any rate, libata should work fine regardless of ACPI failures. Maybe it's time to start blacklist to skip ATA-ACPI for some boards to avoid those annoying messages during boot)
Tejun doing stuff
ACPI Battery Info in /sys but not /proc/acpi http://bugzilla.kernel.org/show_bug.cgi?id=9183 Kernel: 2.6.23-rc8-mm2
Marked as a duplicate of an already-resolved bug.
When using ACPI on a Compaq Presario V6221EU the laptop goes into deadlock after a random amount of time http://bugzilla.kernel.org/show_bug.cgi?id=9118 Kernel: 2.6.23-rc6
Someone called Jike Song is trying to help out. Regular developers awol.
ACPI video driver should validate brightness level before setting it via _BCM http://bugzilla.kernel.org/show_bug.cgi?id=9277 Kernel: 2.6.23
Zero responses from developers
VIDEO/DVB
dvb driver reboot system http://bugzilla.kernel.org/show_bug.cgi?id=9357 Kernel: 2.6.21.5
Mauro thinks it might be bad hardware
PLATFORM===============================================================
xipImage is built so that uBoot cant run it (ARM) http://bugzilla.kernel.org/show_bug.cgi?id=9356 Kernel: 2.6.21
Zero responses from developers
Samsung R20 - ACPI: PCI Root Bridge [PCI0] (0000:00) http://bugzilla.kernel.org/show_bug.cgi?id=9339 Kernel: 2.6.24 (boot is very long ..MP-BIOS bug: 8254 timer not connected to IO-APIC then the boot stop at : ACPI: PCI Root Bridge [PCI0] (0000:00) (during 3 minutes, and boot continue)
No response from developers
system_64.h: switch_to inline asm should be more robbust wrt optimizations http://bugzilla.kernel.org/show_bug.cgi?id=9302 Kernel: 2.6.24-rc1
Not really a bug.
with CONFIG_NO_HZ and/or CONFIG_HPET_TIMER set kernel 2.6.23 doesn't boot (ARM, Timer) http://bugzilla.kernel.org/show_bug.cgi?id=9229 Kernel: 2.6.23
No response from developers
NETWORKING===========================================================
RTNLGRP_ND_USEROPT does not report ifindex (IPv6) http://bugzilla.kernel.org/show_bug.cgi?id=9349 Kernel: 2.6.24+
No response from developers
a kernel error happend in the func: __skb_dequeue when using in pfifo_fast_dequeue http://bugzilla.kernel.org/show_bug.cgi?id=9342 Kernel: 2.6.11.1 - reporter asked to try recent kernel
No response from developers
e100 does not work after boot http://bugzilla.kernel.org/show_bug.cgi?id=9336 Kernel: 2.6.23.1
No response from developers
2.6.23.1-smp kernel panic (network-related) http://bugzilla.kernel.org/show_bug.cgi?id=9318 Kernel: 2.6.23.1 Infiniband panic
No response from developers
sundance -> 4port D-Link System Inc DFE-580TX -> Log errors http://bugzilla.kernel.org/show_bug.cgi?id=9311 Kernel: 2.6.22.9
No response from developers
via-rhine driver stalls with: PHY status 786d, resetting... http://bugzilla.kernel.org/show_bug.cgi?id=9300 Kernel: 2.6.23+
No response from developers
Weird network problems with 2.6.23-rc2 http://bugzilla.kernel.org/show_bug.cgi?id=9080 http://lkml.org/lkml/2007/8/11/40 - description
No response from developers
rt2500pci: low TCP throughput (wireless) http://bugzilla.kernel.org/show_bug.cgi?id=9273 Kernel: 2.6.24-rc1 This is a regression
No response from developers
Unable to build wifi network between zd1201 and b43 http://bugzilla.kernel.org/show_bug.cgi?id=9237 Kernel: 2.6.24-rc1
No response from developers
Crash after module unload in b43 (wireless) http://bugzilla.kernel.org/show_bug.cgi?id=9233 Kernel: 2.6.24-rc1
No response from developers
(net typhoon) "no descs for cmd, had (needed) 0 (1) cmd, 31 (7) resp" http://bugzilla.kernel.org/show_bug.cgi?id=9225 Kernel: 2.6.23.1
No response from developers
IDE/SATA=========================================================
pata_pdc202xx_old excessive ATA bus errors http://bugzilla.kernel.org/show_bug.cgi?id=9337 Kernel: 2.6.24-rc2
No response from developers
Drive seagate ST380011AS needs to be blacklisted http://bugzilla.kernel.org/show_bug.cgi?id=9309 Kernel: 2.6.22.X
Jeff and Tejun did some stuff.
DVD-RAM umount and disk free bug http://bugzilla.kernel.org/show_bug.cgi?id=9265 Kernel: 2.6.15 (asked to try current kernel)
No response from developers
FILE SYSTEMS=======================================================
ext4: delalloc space accounting problem drops data http://bugzilla.kernel.org/show_bug.cgi?id=9329 Kernel: 2.6.24-rc1
No response from developers
POSIX Access Control Lists cause bogus file system check errors http://bugzilla.kernel.org/show_bug.cgi?id=9241 Kernel: 2.6.23.1
Andreas did some work, seemed to lose interest.
MEMORY MANAGEMENT================================================
My system hangs when it has no more free memory to allocate via malloc() http://bugzilla.kernel.org/show_bug.cgi?id=9316 Kernel: 2.6.23 User program, "My system hangs when it has no more free memory to allocate via malloc()"
Rafael poked Thomas a week ago, to no effect. Thomas has been travelling.
BUG: unable to handle kernel paging request at virtual address 26121228/kswapd0[231] exited with preempt_count 1 http://bugzilla.kernel.org/show_bug.cgi?id=9305 EIP is at free_block+0x6d/0xe4 Kernel: 2.6.22.6
No response from developers
POWER MANAGEMENT==================================================
IBM X41 looses time after Suspend2Disk http://bugzilla.kernel.org/show_bug.cgi?id=9314 Kernel: 2.6.23
Rafael poked Thomas a week ago, to no effect. Thomas has been travelling.
Suspend to RAM resume hangs on a tickless (NO_HZ) kernel http://bugzilla.kernel.org/show_bug.cgi?id=9275 Kernel: 2.6.23 This is HP notebook nc6320 T2400 945GM
No response from developers
VIDEO DRIVERS========================================================
No text consoles with FRAMEBUFFER_CONSOLE_DETECT_PRIMARY http://bugzilla.kernel.org/show_bug.cgi?id=9310 Kernel: 2.6.24-rc1 This is a regression
No response from developers
PARALLEL PORT========================================================
LPC IT8705 POST port making noise on parallel port http://bugzilla.kernel.org/show_bug.cgi?id=9306 Kernel: 2.6.16+
No response from developers
I/O STORAGE===========================================================
kernel bug from pktcdvd http://bugzilla.kernel.org/show_bug.cgi?id=9294 Kernel: 2.6.23
I think we might have fixed this.
After pci-e video card was installed, pci add-on usb card & firewire card fail http://bugzilla.kernel.org/show_bug.cgi?id=9223 Kernel: 2.6.20 (testing of latest kernel requested)
No response from developers
SCSI==================================================================
qla2xxx: driver initialization does not complete when booting with Port connected http://bugzilla.kernel.org/show_bug.cgi?id=9267 Kernel: 2.6.23.1
No response from developers
SOUND ALSA============================================================
Unable to load snd-hda-intel module: Unknown symbol in module, or unknown parameter http://bugzilla.kernel.org/show_bug.cgi?id=9242 Kernel: 2.6.24-rc1
Takashi has responded
usbaudio microphone: regular sound distortion on several Logitech Webcams http://bugzilla.kernel.org/show_bug.cgi?id=9230 Kernel: 2.6.22.9
Clemens responded
HID====================================================================
Kernel NULL pointer dereference at :usbhid:hiddev_ioctl+0x2f/0xabc http://bugzilla.kernel.org/show_bug.cgi?id=9216 Kernel: 2.6.23.1 Looks like this is a regression
No response from developers
So I count around seven reports which people are doing something with and twenty-seven which have just been ignored.
Three of these reports have been identified as regressions. All three of those remain unresponded to.
On Tue, Nov 13 2007, Andrew Morton wrote:
I/O STORAGE===========================================================
kernel bug from pktcdvd http://bugzilla.kernel.org/show_bug.cgi?id=9294 Kernel: 2.6.23
I think we might have fixed this.
It's fixed and merged, I just forgot to close the bugzilla. Did so now.
From: Andrew Morton akpm@linux-foundation.org Date: Tue, 13 Nov 2007 03:15:53 -0800
NETWORKING===========================================================
RTNLGRP_ND_USEROPT does not report ifindex (IPv6) http://bugzilla.kernel.org/show_bug.cgi?id=9349 Kernel: 2.6.24+
No response from developers
That's funny, then how come there was a proper patch fix posted and it's now in my tree ready to go to Linus?
I think you like just saying "No response from developers" over and over again to make some sort of point about how developers are ignoring lots of bugs. That's fine, but at least be accurate about it :-)
On Tue, 13 Nov 2007 03:39:46 -0800 (PST) David Miller davem@davemloft.net wrote:
From: Andrew Morton akpm@linux-foundation.org Date: Tue, 13 Nov 2007 03:15:53 -0800
NETWORKING===========================================================
RTNLGRP_ND_USEROPT does not report ifindex (IPv6) http://bugzilla.kernel.org/show_bug.cgi?id=9349 Kernel: 2.6.24+
No response from developers
That's funny, then how come there was a proper patch fix posted and it's now in my tree ready to go to Linus?
I think you like just saying "No response from developers" over and over again to make some sort of point about how developers are ignoring lots of bugs. That's fine, but at least be accurate about it :-)
Do you believe that our response to bug reports is adequate?
From: Andrew Morton akpm@linux-foundation.org Date: Tue, 13 Nov 2007 03:49:16 -0800
Do you believe that our response to bug reports is adequate?
Do you feel that making us feel and look like shit helps?
I guess I'm just masturbating here all night long with the 46 bug fixes I've reviewed fully and queued up into my tree. Along with all the 10 or so -stable submissions I did tonight as well.
When someone like me is bug fixing full time, I take massive offense to the impression you're trying to give especially when it's directed at the networking.
So turn it down a notch Andrew.
I bet if you did things like list explicitly by name every single person who adds a bug fix (however trivial) to an -mm release instead of a new feature, you'll better achieve your goal than what you're doing here.
On Tue, 13 Nov 2007 03:58:24 -0800 (PST) David Miller davem@davemloft.net wrote:
From: Andrew Morton akpm@linux-foundation.org Date: Tue, 13 Nov 2007 03:49:16 -0800
Do you believe that our response to bug reports is adequate?
Do you feel that making us feel and look like shit helps?
That doesn't answer my question.
See, first we need to work out whether we have a problem. If we do this, then we can then have a think about what to do about it.
I tried to convince the 2006 KS attendees that we have a problem and I resoundingly failed. People seemed to think that we're doing OK.
But it appears that data such as this contradicts that belief.
This is not a minor matter. If the kernel _is_ slowly deteriorating then this won't become readily apparent until it has been happening for a number of years. By that stage there will be so much work to do to get us back to an acceptable level that it will take a huge effort. And it will take a long time after that for the kernel to get its reputation back.
So it is important that we catch deterioration *early* if it is happening.
From: Andrew Morton akpm@linux-foundation.org Date: Tue, 13 Nov 2007 04:12:59 -0800
On Tue, 13 Nov 2007 03:58:24 -0800 (PST) David Miller davem@davemloft.net wrote:
From: Andrew Morton akpm@linux-foundation.org Date: Tue, 13 Nov 2007 03:49:16 -0800
Do you believe that our response to bug reports is adequate?
Do you feel that making us feel and look like shit helps?
That doesn't answer my question.
See, first we need to work out whether we have a problem. If we do this, then we can then have a think about what to do about it.
I tried to convince the 2006 KS attendees that we have a problem and I resoundingly failed. People seemed to think that we're doing OK.
But it appears that data such as this contradicts that belief.
This is not a minor matter. If the kernel _is_ slowly deteriorating then this won't become readily apparent until it has been happening for a number of years. By that stage there will be so much work to do to get us back to an acceptable level that it will take a huge effort. And it will take a long time after that for the kernel to get its reputation back.
So it is important that we catch deterioration *early* if it is happening.
You tell me what I should spend my time working on, and I promise to do it OK? :-)
For example, if I have a choice between a TCP crash just about anyone can hit and some obscure issue only reported with some device nearly nobody has, which one should I analyze and work on?
That's the problem. All of us prioritize and it means the chaff collects at the bottom. You cannot fix that except by getting more bug fixers so that the chaff pile has a chance to get smaller.
Luckily if the report being ignored isn't chaff, it will show up again (and again and again) and this triggers a reprioritization because not only is the bug no longer chaff, it also now got a lot of information tagged to it so it's a double worthwhile investment to work on the problem.
I think a lot of bugs that "aren't getting looked at" are simply sitting in some early stage of this process.
On Tue, 13 Nov 2007 04:32:07 -0800 (PST) David Miller davem@davemloft.net wrote:
From: Andrew Morton akpm@linux-foundation.org Date: Tue, 13 Nov 2007 04:12:59 -0800
On Tue, 13 Nov 2007 03:58:24 -0800 (PST) David Miller davem@davemloft.net wrote:
From: Andrew Morton akpm@linux-foundation.org Date: Tue, 13 Nov 2007 03:49:16 -0800
Do you believe that our response to bug reports is adequate?
Do you feel that making us feel and look like shit helps?
That doesn't answer my question.
See, first we need to work out whether we have a problem. If we do this, then we can then have a think about what to do about it.
I tried to convince the 2006 KS attendees that we have a problem and I resoundingly failed. People seemed to think that we're doing OK.
But it appears that data such as this contradicts that belief.
This is not a minor matter. If the kernel _is_ slowly deteriorating then this won't become readily apparent until it has been happening for a number of years. By that stage there will be so much work to do to get us back to an acceptable level that it will take a huge effort. And it will take a long time after that for the kernel to get its reputation back.
So it is important that we catch deterioration *early* if it is happening.
You tell me what I should spend my time working on, and I promise to do it OK? :-)
My suggestion: regressions.
If we're really active in chasing down the regressions then I think we can be confident that the kernel isn't deteriorating. Probably it will be improving as we also fix some always-been-there bugs.
I think that we're fairly good about working the regressions in Adrian/Michal/Rafael's lists but once Linus releases 2.6.x we tend to let the unsolved ones slide, and we don't pay as much attention to the regressions which 2.6.x testers report.
For example, if I have a choice between a TCP crash just about anyone can hit and some obscure issue only reported with some device nearly nobody has, which one should I analyze and work on?
That's the problem. All of us prioritize and it means the chaff collects at the bottom. You cannot fix that except by getting more bug fixers so that the chaff pile has a chance to get smaller.
Luckily if the report being ignored isn't chaff, it will show up again (and again and again) and this triggers a reprioritization because not only is the bug no longer chaff, it also now got a lot of information tagged to it so it's a double worthwhile investment to work on the problem.
I think a lot of bugs that "aren't getting looked at" are simply sitting in some early stage of this process.
Yes, that's a useful technique. If multiple people are being hurt a lot by a bug then that's a more important one to fix than the single-person minor-irritant bug.
otoh that doesn't work very well with driver/platform bugs. Often these are regressions which only a single person can reproduce within the time window which we have in which we can fix it. If we don't fix it in that window it'll go out to distros and presumably some more people will hit it.
So I don't see much alternative here to the traditional work-with-the-originator way of resolving it.
git bisection should really help us with these regressions but it doesn't appear that people are using it as much as one would like. I'm hoping that the very good http://www.kernel.org/doc/local/git-quick.html will help us out here. Thanks to the mystery person who prepared that.
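For readers unfamiliar with the idea behind bisection, here is a rough sketch of the binary search it performs over an ordered commit history. This is not git's own implementation; the function name `first_bad` and the `is_bad` callback (standing in for "build and test this kernel") are hypothetical, and the sketch assumes a linear history whose first entry is known good and whose last entry is known bad.

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first bad commit in a history ordered oldest to newest.

    Assumes commits[0] is known good, commits[-1] is known bad, and every
    commit after the first bad one is also bad.
    """
    if len(commits) < 2:
        raise ValueError("need at least one good and one bad commit")
    good, bad = 0, len(commits) - 1
    # Invariant: commits[good] is good and commits[bad] is bad.
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid
        else:
            good = mid
    return commits[bad]


if __name__ == "__main__":
    history = [f"c{i}" for i in range(16)]
    tested = []

    def check(commit: str) -> bool:
        tested.append(commit)
        return int(commit[1:]) >= 11  # pretend the regression landed in c11

    print(first_bad(history, check), "found after", len(tested), "tests")
```

Each test halves the remaining range, which is why a reporter can narrow a regression across thousands of commits with only a dozen or so kernel builds.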
On Tue, Nov 13, 2007 at 04:32:07AM -0800, David Miller wrote:
Luckily if the report being ignored isn't chaff, it will show up again (and again and again) and this triggers a reprioritization because not only is the bug no longer chaff, it also now got a lot of information tagged to it so it's a double worthwhile investment to work on the problem.
Strongly agree.
This is exactly what happened to that ARM NO_HZ bug report. The report in bugzilla was rather lacking (and wrong) in ways that have already been described. HPET on ARM? 8)
Then on the morning of 6th November, someone reported on the mailing list that "pxa270 doesn't work with oneshot timer" and that was the trigger to getting the bug resolved - because it was a narrowly defined bug report.
Since it was a narrowly defined bug report, it became very easy to investigate and resolve. About half an hour of time for an initial patch.
There's another issue I want to raise concerning bugzilla. We have the classic case of "not enough people reading bugzilla bugs" - which is one of the biggest problems with bugzilla. Virtually no one in the ARM community looks for ARM bugs in bugzilla.
Let's not forget that it would be a waste of time for people to manually check bugzilla for ARM bugs. There's soo few people reporting ARM bugs into bugzilla that a weekly manual check by every maintainer would just return the same old boring results for months and months at a time.
It would be far more productive if the ARM category was deleted from bugzilla and the few people who use bugzilla reported their bugs on the mailing list. We've a couple of thousand people on the ARM kernel mailing list at the moment - that's 3 orders of magnitude more of eyes than look at bugzilla.
(I'm not saying that the ARM NO_HZ bug, had it been reported on the correct mailing list instead of in bugzilla, would've been solved earlier; I doubt there'd be much difference. However, the probability of a question being asked of the reporter would've been much higher, and _that_ might have led to an earlier resolution.)
On Tue, Nov 13, 2007 at 07:32:19PM +0000, Russell King wrote:
... There's another issue I want to raise concerning bugzilla. We have the classic case of "not enough people reading bugzilla bugs" - which is one of the biggest problems with bugzilla. Virtually no one in the ARM community looks for ARM bugs in bugzilla.
Let's not forget that it would be a waste of time for people to manually check bugzilla for ARM bugs. There's soo few people reporting ARM bugs into bugzilla that a weekly manual check by every maintainer would just return the same old boring results for months and months at a time. ...
What about having all ARM bugs in Bugzilla by default assigned to linux-arm-kernel@lists.arm.linux.org.uk? [1]
cu Adrian
[1] Either directly or through a pseudo address, but that's just a technical detail.
On Tue, Nov 13, 2007 at 09:13:19PM +0100, Adrian Bunk wrote:
On Tue, Nov 13, 2007 at 07:32:19PM +0000, Russell King wrote:
... There's another issue I want to raise concerning bugzilla. We have the classic case of "not enough people reading bugzilla bugs" - which is one of the biggest problems with bugzilla. Virtually no one in the ARM community looks for ARM bugs in bugzilla.
Let's not forget that it would be a waste of time for people to manually check bugzilla for ARM bugs. There's soo few people reporting ARM bugs into bugzilla that a weekly manual check by every maintainer would just return the same old boring results for months and months at a time. ...
What about having all ARM bugs in Bugzilla by default assigned to linux-arm-kernel@lists.arm.linux.org.uk? [1]
That would also work, probably much better than setting up yet another list.
My experience of trying to get mbligh to do this when I stopped looking after PCMCIA stuff was *extremely* painful. Wonder if it's become any easier of late?
On Tue, 13 Nov 2007 23:29:54 +0000 Russell King rmk+lkml@arm.linux.org.uk wrote:
On Tue, Nov 13, 2007 at 09:13:19PM +0100, Adrian Bunk wrote:
On Tue, Nov 13, 2007 at 07:32:19PM +0000, Russell King wrote:
... There's another issue I want to raise concerning bugzilla. We have the classic case of "not enough people reading bugzilla bugs" - which is one of the biggest problems with bugzilla. Virtually no one in the ARM community looks for ARM bugs in bugzilla.
Let's not forget that it would be a waste of time for people to manually check bugzilla for ARM bugs. There's soo few people reporting ARM bugs into bugzilla that a weekly manual check by every maintainer would just return the same old boring results for months and months at a time. ...
What about having all ARM bugs in Bugzilla by default assigned to linux-arm-kernel@lists.arm.linux.org.uk? [1]
That would also work, probably much better than setting up yet another list.
cpufreq (at least) does it this way. I don't know how well it is turning out in practice.
It's useful if the initial report makes it clear (ie; to me) that the report has already gone to a mailing list so I don't go and forward a duplicate.
But there are so few arm reports in bugzilla that this is all rather moot.
My experience of trying to get mbligh to do this when I stopped looking after PCMCIA stuff was *extremely* painful. Wonder if it's become any easier of late?
He's a bad, bad man ;)
But he's been turning these things around pretty rapidly lately.
On Tue, 13 Nov 2007 19:32:19 +0000 Russell King rmk+lkml@arm.linux.org.uk wrote:
There's another issue I want to raise concerning bugzilla. We have the classic case of "not enough people reading bugzilla bugs" - which is one of the biggest problems with bugzilla. Virtually no one in the ARM community looks for ARM bugs in bugzilla.
Nor should they.
Let's not forget that it would be a waste of time for people to manually check bugzilla for ARM bugs. There's soo few people reporting ARM bugs into bugzilla that a weekly manual check by every maintainer would just return the same old boring results for months and months at a time.
I screen all bugzilla reports. 100% of them.
- I'll try to establish whether it is a regression
- I'll solicit any extra information which I believe the developer will need
- I'll ensure that an appropriate developer has seen the report
And yes, the number of arm-specific reports in there is very small.
It would be far more productive if the ARM category was deleted from bugzilla and the few people who use bugzilla reported their bugs on the mailing list. We've a couple of thousand people on the ARM kernel mailing list at the moment - that's 3 orders of magnitude more of eyes than look at bugzilla.
Is that linux-arm-kernel@lists.arm.linux.org.uk?
If so, MAINTAINERS claims that it is subscribers-only. That would cause some bug reporters to give up and go away.
On Tue, Nov 13, 2007 at 12:52:22PM -0800, Andrew Morton wrote:
On Tue, 13 Nov 2007 19:32:19 +0000 Russell King rmk+lkml@arm.linux.org.uk wrote:
There's another issue I want to raise concerning bugzilla. We have the classic case of "not enough people reading bugzilla bugs" - which is one of the biggest problems with bugzilla. Virtually no one in the ARM community looks for ARM bugs in bugzilla.
Nor should they.
So what you're saying is...
Let's not forget that it would be a waste of time for people to manually check bugzilla for ARM bugs. There's so few people reporting ARM bugs into bugzilla that a weekly manual check by every maintainer would just return the same old boring results for months and months at a time.
I screen all bugzilla reports. 100% of them.
I'll try to establish whether it is a regression
I'll solicit any extra information which I believe the developer will need
I'll ensure that an appropriate developer has seen the report
And yes, the number of arm-specific reports in there is very small.
that just because you do this everyone in a select clique, who you include me in, should be doing this as well.
No. Thank. You.
It would be far more productive if the ARM category was deleted from bugzilla and the few people who use bugzilla reported their bugs on the mailing list. We've a couple of thousand people on the ARM kernel mailing list at the moment - that's 3 orders of magnitude more of eyes than look at bugzilla.
Is that linux-arm-kernel@lists.arm.linux.org.uk?
Yes.
If so, MAINTAINERS claims that it is subscribers-only. That would cause some bug reporters to give up and go away.
Find some other mailing list; I'm not hosting *nor* am I willing to run a non-subscribers only mailing list. Period. Not negotiable, so don't even try to change my mind.
On Tue, 13 Nov 2007 22:18:01 +0000 Russell King rmk+lkml@arm.linux.org.uk wrote:
On Tue, Nov 13, 2007 at 12:52:22PM -0800, Andrew Morton wrote:
On Tue, 13 Nov 2007 19:32:19 +0000 Russell King rmk+lkml@arm.linux.org.uk wrote:
There's another issue I want to raise concerning bugzilla. We have the classic case of "not enough people reading bugzilla bugs" - which is one of the biggest problems with bugzilla. Virtually no one in the ARM community looks for ARM bugs in bugzilla.
Nor should they.
So what you're saying is...
Let's not forget that it would be a waste of time for people to manually check bugzilla for ARM bugs. There's so few people reporting ARM bugs into bugzilla that a weekly manual check by every maintainer would just return the same old boring results for months and months at a time.
I screen all bugzilla reports. 100% of them.
I'll try to establish whether it is a regression
I'll solicit any extra information which I believe the developer will need
I'll ensure that an appropriate developer has seen the report
And yes, the number of arm-specific reports in there is very small.
that just because you do this everyone in a select clique, who you include me in, should be doing this as well.
No. Thank. You.
No, I don't mean that at all and this was very plainly obvious from my very clearly written email. Let me try again.
No, no subsystem developer needs to monitor new bugzilla reports. This is because *I do it for them*. I will actively make them aware of new reports which I believe are legitimate and which contain sufficient information for them to be able to take further action.
It would be far more productive if the ARM category was deleted from bugzilla and the few people who use bugzilla reported their bugs on the mailing list. We've a couple of thousand people on the ARM kernel mailing list at the moment - that's 3 orders of magnitude more of eyes than look at bugzilla.
Is that linux-arm-kernel@lists.arm.linux.org.uk?
Yes.
If so, MAINTAINERS claims that it is subscribers-only. That would cause some bug reporters to give up and go away.
Find some other mailing list; I'm not hosting *nor* am I willing to run a non-subscribers only mailing list. Period. Not negotiable, so don't even try to change my mind.
Making a list subscribers-only will cause some bug reports to be lost.
Tradeoffs are involved, against which decisions must be made. You have made yours.
On Tue, Nov 13, 2007 at 02:32:01PM -0800, Andrew Morton wrote:
On Tue, 13 Nov 2007 22:18:01 +0000 Russell King rmk+lkml@arm.linux.org.uk wrote:
On Tue, Nov 13, 2007 at 12:52:22PM -0800, Andrew Morton wrote:
On Tue, 13 Nov 2007 19:32:19 +0000 Russell King rmk+lkml@arm.linux.org.uk wrote:
There's another issue I want to raise concerning bugzilla. We have the classic case of "not enough people reading bugzilla bugs" - which is one of the biggest problems with bugzilla. Virtually no one in the ARM community looks for ARM bugs in bugzilla.
Nor should they.
So what you're saying is...
Let's not forget that it would be a waste of time for people to manually check bugzilla for ARM bugs. There's so few people reporting ARM bugs into bugzilla that a weekly manual check by every maintainer would just return the same old boring results for months and months at a time.
I screen all bugzilla reports. 100% of them.
I'll try to establish whether it is a regression
I'll solicit any extra information which I believe the developer will need
I'll ensure that an appropriate developer has seen the report
And yes, the number of arm-specific reports in there is very small.
that just because you do this everyone in a select clique, who you include me in, should be doing this as well.
No. Thank. You.
No, I don't mean that at all and this was very plainly obvious from my very clearly written email. Let me try again.
If you screen all bugzilla reports then you'll know that bug #9356 arrived at about 1400 GMT yesterday. It's hardly surprising then that your utterly crappy responses to Natalie's message (which, incidentally, wasn't copied to me) sent within 24 hours of that report cause *great* annoyance.
No, no subsystem developer needs to monitor new bugzilla reports. This is because *I do it for them*. I will actively make them aware of new reports which I believe are legitimate and which contain sufficient information for them to be able to take further action.
On the whole you do an excellent job with feeding the bug reports to people, and while I recognise that you're only human, things do occasionally go wrong. For instance, sending clearly marked Samsung S3C bugs to me rather than Ben Dooks (who's in MAINTAINERS for those platforms.)
It would be far more productive if the ARM category was deleted from bugzilla and the few people who use bugzilla reported their bugs on the mailing list. We've a couple of thousand people on the ARM kernel mailing list at the moment - that's 3 orders of magnitude more of eyes than look at bugzilla.
Is that linux-arm-kernel@lists.arm.linux.org.uk?
Yes.
If so, MAINTAINERS claims that it is subscribers-only. That would cause some bug reporters to give up and go away.
Find some other mailing list; I'm not hosting *nor* am I willing to run a non-subscribers only mailing list. Period. Not negotiable, so don't even try to change my mind.
Making a list subscribers-only will cause some bug reports to be lost.
Tradeoffs are involved, against which decisions must be made. You have made yours.
So how are they lost when they're held in a moderation queue and are either accepted, given a useful response to the original poster, or forwarded to someone who can deal with the issue?
I don't think "subscribers only" describes my lists - we don't devnull stuff just because the poster is not a subscriber.
On Tue, 13 Nov 2007 23:09:37 +0000 Russell King rmk+lkml@arm.linux.org.uk wrote:
On Tue, Nov 13, 2007 at 02:32:01PM -0800, Andrew Morton wrote:
On Tue, 13 Nov 2007 22:18:01 +0000 Russell King rmk+lkml@arm.linux.org.uk wrote:
On Tue, Nov 13, 2007 at 12:52:22PM -0800, Andrew Morton wrote:
On Tue, 13 Nov 2007 19:32:19 +0000 Russell King rmk+lkml@arm.linux.org.uk wrote:
No, I don't mean that at all and this was very plainly obvious from my very clearly written email. Let me try again.
If you screen all bugzilla reports then you'll know that bug #9356 arrived at about 1400 GMT yesterday. It's hardly surprising then that your utterly crappy responses to Natalie's message (which, incidentally, wasn't copied to me) sent within 24 hours of that report cause *great* annoyance.
No, no subsystem developer needs to monitor new bugzilla reports. This is because *I do it for them*. I will actively make them aware of new reports which I believe are legitimate and which contain sufficient information for them to be able to take further action.
On the whole you do an excellent job with feeding the bug reports to people, and while I recognise that you're only human, things do occasionally go wrong. For instance, sending clearly marked Samsung S3C bugs to me rather than Ben Dooks (who's in MAINTAINERS for those platforms.)
Well whatever, sorry. But this is in the noise floor. Point is: many bug reports aren't being attended to.
It would be far more productive if the ARM category was deleted from bugzilla and the few people who use bugzilla reported their bugs on the mailing list. We've a couple of thousand people on the ARM kernel mailing list at the moment - that's 3 orders of magnitude more of eyes than look at bugzilla.
Is that linux-arm-kernel@lists.arm.linux.org.uk?
Yes.
If so, MAINTAINERS claims that it is subscribers-only. That would cause some bug reporters to give up and go away.
Find some other mailing list; I'm not hosting *nor* am I willing to run a non-subscribers only mailing list. Period. Not negotiable, so don't even try to change my mind.
Making a list subscribers-only will cause some bug reports to be lost.
Tradeoffs are involved, against which decisions must be made. You have made yours.
So how are they lost when they're held in a moderation queue and are either accepted, given a useful response to the original poster, or forwarded to someone who can deal with the issue?
I don't think "subscribers only" describes my lists - we don't devnull stuff just because the poster is not a subscriber.
Oh, OK, as long as there really is a human paying attention to those things then that's fine. When one is on the sending end of these things one never knows how long it will take, nor whether it will even happen.
From: Andrew Morton akpm@linux-foundation.org Date: Tue, 13 Nov 2007 14:32:01 -0800
On Tue, 13 Nov 2007 22:18:01 +0000 Russell King rmk+lkml@arm.linux.org.uk wrote:
Find some other mailing list; I'm not hosting *nor* am I willing to run a non-subscribers only mailing list. Period. Not negotiable, so don't even try to change my mind.
Making a list subscribers-only will cause some bug reports to be lost.
Tradeoffs are involved, against which decisions must be made. You have made yours.
Russell doesn't have to worry any more, he doesn't have to host it, and he doesn't have to be willing to run a non-subscribers-only mailing list.
Because I am.
I've created linux-arm@vger.kernel.org
Enjoy.
On Tue, 13 Nov 2007 17:55:51 -0800 (PST) David Miller davem@davemloft.net wrote:
I've created linux-arm@vger.kernel.org
Let me just say - I'm astonished at how little spam gets through the vger lists. Considering how many times those email addresses must have been added to spam databases.
It must be a lot of work, and whoever is doing it does it well.
I don't even know. Is it Matti? You?
<contemplates linux-kernel@lists.sourceforge.net. Shudders.>
From: Andrew Morton akpm@linux-foundation.org Date: Tue, 13 Nov 2007 18:27:00 -0800
Let me just say - I'm astonished at how little spam gets through the vger lists. Considering how many times those email addresses must have been added to spam databases.
It must be a lot of work, and whoever is doing it does it well.
I don't even know. Is it Matti? You?
Matti gets all the credit for setting up the bayesian et al. filters we have and training it as needed.
<contemplates linux-kernel@lists.sourceforge.net. Shudders.>
Yes, sourceforge is a complete joke.
On Tue, Nov 13, 2007 at 06:27:00PM -0800, Andrew Morton wrote:
On Tue, 13 Nov 2007 17:55:51 -0800 (PST) David Miller davem@davemloft.net wrote:
I've created linux-arm@vger.kernel.org
Let me just say - I'm astonished at how little spam gets through the vger lists. Considering how many times those email addresses must have been added to spam databases.
It must be a lot of work, and whoever is doing it does it well.
I don't even know. Is it Matti? You?
<contemplates linux-kernel@lists.sourceforge.net. Shudders.>
Martin's changed the owner for ARM bugs last night to the mailing list so the whole issue is now redundant.
On Tue, Nov 13, 2007 at 05:55:51PM -0800, David Miller wrote:
From: Andrew Morton akpm@linux-foundation.org Date: Tue, 13 Nov 2007 14:32:01 -0800
On Tue, 13 Nov 2007 22:18:01 +0000 Russell King rmk+lkml@arm.linux.org.uk wrote:
Find some other mailing list; I'm not hosting *nor* am I willing to run a non-subscribers only mailing list. Period. Not negotiable, so don't even try to change my mind.
Making a list subscribers-only will cause some bug reports to be lost.
Tradeoffs are involved, against which decisions must be made. You have made yours.
Russell doesn't have to worry any more, he doesn't have to host it, and he doesn't have to be willing to run a non-subscribers-only mailing list.
Because I am.
I've created linux-arm@vger.kernel.org
By doing so you've just said (implicitly) that you can not tolerate someone having a different opinion from your own.
While I accept *your* right to run *your* lists how you please, you are unable to accept *my* right to run *my* lists how I see fit.
Time will tell which lists will survive. Whatever, I suspect that by doing what you've just done, you're going to create more confusion and problems. Instead of having one focused place for discussions and bug reports, they're going to be spread more thinly, meaning less people looking at such things, meaning more bugs get ignored. Thus making the issue worse.
So, when are you creating a replacement alsa-devel mailing list on vger? That's also subscribers-only.
From: Russell King rmk+lkml@arm.linux.org.uk Date: Wed, 14 Nov 2007 09:55:07 +0000
On Tue, Nov 13, 2007 at 05:55:51PM -0800, David Miller wrote:
I've created linux-arm@vger.kernel.org
By doing so you've just said (implicitly) that you can not tolerate someone having a different opinion from your own.
I created a mailing list on a machine where I provide such services.
People can choose to use or not use the new list, it is their choice.
While I accept *your* right to run *your* lists how you please, you are unable to accept *my* right to run *my* lists how I see fit.
I didn't tell you to take your list down or to run it in some other way. I didn't tell you to unsubscribe everyone and move them over to the new list either.
I've provided an alternative, and people can pick and choose how they see fit. I'm letting natural selection run its course. Are you able to cope with the fact that people might not want to use your list any longer? Perhaps that is what bugs you so much about my giving people an alternative choice.
So, when are you creating a replacement alsa-devel mailing list on vger? That's also subscribers-only.
The operative term is "alternative" rather than "replacement". Perhaps this misunderstanding is what you're so upset about.
And yes, that alsa list bugs the crap out of me too. I'm more than happy to provide an alternative for that one as well.
In fact, *poof*, there it is, linux-alsa@vger.kernel.org is there and available for anyone who wants to use it.
Have a nice day Russell.
On 14-11-07 11:07, David Miller wrote:
Added Jaroslav and Takashi to the already extensive CC....
From: Russell King rmk+lkml@arm.linux.org.uk
So, when are you creating a replacement alsa-devel mailing list on vger? That's also subscribers-only.
The operative term is "alternative" rather than "replacement". Perhaps this misunderstanding is what you're so upset about.
And yes, that alsa list bugs the crap out of me too. I'm more than happy to provide an alternative for that one as well.
alsa-devel@alsa-project.org is not subscriber-only. Same as that arm list, it's _moderated_ for non-subscribers and given that I and other moderators have been doing our best to moderate quickly (I tend to stay logged in to the moderation interface all day for example) what specifically bugged the crap out of you? It's not something a poster needs to concern himself with.
Also for alsa-devel the moderators tend to add any valid non-subscribers to a whitelist after they land in the queue the first time, meaning even a delay is normally just a one-time thing. So what's the trouble? Basically, no one need even notice...
In fact, *poof*, there it is, linux-alsa@vger.kernel.org is there and available for anyone who wants to use it.
Not that I think that moving alsa-devel over to vger wouldn't be a good idea mind you; when the list moved from sourceforge, asking you to host it was my preferred option. I do somewhat suspect that Jaroslav would like to keep the alsa-devel@ name (and I'd like to ask you to then also host alsa-user@) and would then rewrite mail to those lists @alsa-project.org to vger.
But what is the problem you speak of with the alsa-devel list? While I would not mind losing it, moderation hasn't been overly laborious and I'm not aware of any serious problems.
Rene.
From: Rene Herman rene.herman@keyaccess.nl Date: Wed, 14 Nov 2007 12:46:24 +0100
alsa-devel@alsa-project.org is not subscriber-only. Same as that arm list, it's _moderated_ for non-subscribers and given that I and other moderators have been doing our best to moderate quickly (I tend to stay logged in to the moderation interface all day for example) what specifically bugged the crap out of you? It's not something a poster needs to concern himself with.
The fact that it farts at me every time I post to this thread. That's rude and annoying.
Also for alsa-devel the moderators tend to add any valid non-subscribers to a whitelist after they land in the queue the first time, meaning even a delay is normally just a one-time thing. So what's the trouble? Basically, no one need even notice...
That sucks for new people taking part in the conversation.
There is no reason for moderation at all, it isn't necessary for spam prevention and it does nothing but annoy new posters and make work for the moderator.
From: David Miller davem@davemloft.net Date: Wed, 14 Nov 2007 03:56:57 -0800 (PST)
The fact that it farts at me every time I post to this thread.
See? I got another one and I have received at least 10 of the following over the past 2 days.
That's ridiculous.
And because a human adds the whitelist this is always going to happen to someone when they start posting to the alsa list for the first time.
/me gets ready for the 11th copy in response to this one...
--------------------
Subject: Your message to Alsa-devel awaits moderator approval
From: alsa-devel-bounces@alsa-project.org
To: davem@davemloft.net
Date: Wed, 14 Nov 2007 12:57:06 +0100
Sender: alsa-devel-bounces@alsa-project.org
Your mail to 'Alsa-devel' with the subject
Re: [alsa-devel] [BUG] New Kernel Bugs
Is being held until the list moderator can review it for approval.
The reason it is being held:
Too many recipients to the message
Either the message will get posted to the list, or you will receive notification of the moderator's decision. If you would like to cancel this posting, please visit the following URL:
http://mailman.alsa-project.org/mailman/confirm/alsa-devel/12dd3bd077bbf9cd1...
At Wed, 14 Nov 2007 04:01:31 -0800 (PST), David Miller wrote:
From: David Miller davem@davemloft.net Date: Wed, 14 Nov 2007 03:56:57 -0800 (PST)
The fact that it farts at me every time I post to this thread.
See? I got another one and I have received at least 10 of the following over the past 2 days.
That's ridiculous.
And because a human adds the whitelist this is always going to happen to someone when they start posting to the alsa list for the first time.
... if you give too many recipients in your post. That is often a really annoying thing to me, together with keeping the unrelated subject line ;)
I personally don't care whether it's a moderated or open list. We chose moderation simply because of the poor S/N ratio at that time. So, if the current list annoys you or many others and the list management on vger is so good, it'd basically be a good move, of course. I'd appreciate it.
The only confusion would be the change of ML address, but we can do it slowly, too.
Takashi
On 14-11-07 09:25, Takashi Iwai wrote:
At Wed, 14 Nov 2007 04:01:31 -0800 (PST), David Miller wrote:
From: David Miller davem@davemloft.net Date: Wed, 14 Nov 2007 03:56:57 -0800 (PST)
The fact that it farts at me every time I post to this thread.
See? I got another one and I have received at least 10 of the following over the past 2 days.
That's ridiculous.
And because a human adds the whitelist this is always going to happen to someone when they start posting to the alsa list for the first time.
... if you give too many recipients in your post. That is often a really annoying thing to me, together with keeping the unrelated subject line ;)
I personally don't care whether it's a moderated or open list. We chose moderation simply because of the poor S/N ratio at that time. So, if the current list annoys you or many others and the list management on vger is so good, it'd basically be a good move, of course. I'd appreciate it.
The only confusion would be the change of ML address, but we can do it slowly, too.
I'd love the lists at vger. Amazing spam-filtering. I'd like to request the name alsa-devel@vger.kernel.org (and alsa-user@vger.kernel.org if at all possible so we can open that one up as well) though.
There wouldn't need to be a forced ML address change if Jaroslav would then just rewrite alsa-{devel,user}@alsa-project.org to vger.kernel.org same as he did for alsa-devel and does for alsa-user to @lists.sf.net.
Rene.
At Wed, 14 Nov 2007 13:21:30 +0100, Rene Herman wrote:
On 14-11-07 09:25, Takashi Iwai wrote:
At Wed, 14 Nov 2007 04:01:31 -0800 (PST), David Miller wrote:
From: David Miller davem@davemloft.net Date: Wed, 14 Nov 2007 03:56:57 -0800 (PST)
The fact that it farts at me every time I post to this thread.
See? I got another one and I have received at least 10 of the following over the past 2 days.
That's ridiculous.
And because a human adds the whitelist this is always going to happen to someone when they start posting to the alsa list for the first time.
... if you give too many recipients in your post. That is often a really annoying thing to me, together with keeping the unrelated subject line ;)
I personally don't care whether it's a moderated or open list. We chose moderation simply because of the poor S/N ratio at that time. So, if the current list annoys you or many others and the list management on vger is so good, it'd basically be a good move, of course. I'd appreciate it.
The only confusion would be the change of ML address, but we can do it slowly, too.
I'd love the lists at vger. Amazing spam-filtering. I'd like to request the name alsa-devel@vger.kernel.org (and alsa-user@vger.kernel.org if at all possible so we can open that one up as well) though.
I think alsa-user can stay as is. It's no place for dragging many other addresses like alsa-devel.
BTW, I also prefer keeping the name alsa-devel@. It's been so.
There wouldn't need to be a forced ML address change if Jaroslav would then just rewrite alsa-{devel,user}@alsa-project.org to vger.kernel.org same as he did for alsa-devel and does for alsa-user to @lists.sf.net.
If it works, then I'm for it, too.
thanks,
Takashi
From: Takashi Iwai tiwai@suse.de Date: Wed, 14 Nov 2007 10:47:39 +0100
I think alsa-user can stay as is. It's no place for dragging many other addresses like alsa-devel.
BTW, I also prefer keeping the name alsa-devel@. It's been so.
That's fine with me, I've changed it to alsa-devel@vger.kernel.org
On 15-11-07 00:23, David Miller wrote:
From: Takashi Iwai tiwai@suse.de
BTW, I also prefer keeping the name alsa-devel@. It's been so.
That's fine with me, I've changed it to alsa-devel@vger.kernel.org
Great, thanks. Jaroslav -- given that this list won't need moderation I'd consider it the main/only alsa-devel. The alsa-devel subscriber database was cleansed only a couple of months ago when moving from sourceforge so it should now be okay to just transfer all subscriptions.
Or maybe you're already moving things; mailman.alsa-project.org seems to be down at least....
Rene.
[Unrelated lists cut from Cc]
At Wed, 14 Nov 2007 15:23:15 -0800 (PST), David Miller wrote:
From: Takashi Iwai tiwai@suse.de Date: Wed, 14 Nov 2007 10:47:39 +0100
I think alsa-user can stay as is. It's no place for dragging many other addresses like alsa-devel.
BTW, I also prefer keeping the name alsa-devel@. It's been so.
That's fine with me, I've changed it to alsa-devel@vger.kernel.org
Great, thanks David!
I believe it'd really be better to move alsa-devel to vger. The spam filters on vger obviously do a better (faster) job than a human being. We'll be relieved of some labor at least.
Jaroslav, let us know your opinion. alsa-devel ML is being hosted on your machine right now, so it's your decision...
thanks,
Takashi
On Thu, 15 Nov 2007, Takashi Iwai wrote:
[Unrelated lists cut from Cc]
At Wed, 14 Nov 2007 15:23:15 -0800 (PST), David Miller wrote:
From: Takashi Iwai tiwai@suse.de Date: Wed, 14 Nov 2007 10:47:39 +0100
I think alsa-user can stay as is. It's no place for dragging many other addresses like alsa-devel.
BTW, I also prefer keeping the name alsa-devel@. It's been so.
That's fine with me, I've changed it to alsa-devel@vger.kernel.org
Great, thanks David!
I believe it'd really be better to move alsa-devel to vger. The spam filters on vger obviously do a better (faster) job than a human being. We'll be relieved of some labor at least.
Jaroslav, let us know your opinion. alsa-devel ML is being hosted on your machine right now, so it's your decision...
I would like to keep mailing list history in mailman, so I subscribed alsa-devel-archive@alsa-project.org to alsa-devel@vger.kernel.org mailing list.
David, could I send you the list of subscribers on the alsa-devel list (in a private e-mail) to move mailing list to vger without bothering all users to resubscribe again (we did it several months ago, so the subscriber list is quite clean)? I will redirect alsa-devel@alsa-project.org address then to alsa-devel@vger.kernel.org (if it does not break any vger rules or spam filtering on vger) to preserve alsa-devel list e-mail.
Thanks, Jaroslav
----- Jaroslav Kysela perex@perex.cz Linux Kernel Sound Maintainer ALSA Project
From: Jaroslav Kysela perex@perex.cz Date: Thu, 15 Nov 2007 14:56:38 +0100 (CET)
David, could I send you the list of subscribers on the alsa-devel list (in a private e-mail) to move mailing list to vger without bothering all users to resubscribe again (we did it several months ago, so the subscriber list is quite clean)? I will redirect alsa-devel@alsa-project.org address then to alsa-devel@vger.kernel.org (if it does not break any vger rules or spam filtering on vger) to preserve alsa-devel list e-mail.
We don't do this because:
1) If the user doesn't do the subscribe, they won't know how to unsubscribe.
If you think this isn't an issue, after 10 years of co-running vger I can tell you it's a huge issue as they just spam the list owner, with requests to be removed.
2) Users generally get upset when you subscribe them to a site they did not get asked to be subscribed to. And yes this is true even if it's the same mailing list just at a new location.
I apply this rule unilaterally to all new mailing lists hosted at vger, I'm not just singling out this case, believe me :-)
So please ask the folks on the existing list to subscribe to the new one.
Thank you.
Not trying to sound like a jerk or an idiot, but if you require us to resubscribe, why don't you post the link in your email? It would make our lives that much easier.
Thanks,
Tobin
On Thu, 2007-11-15 at 13:34 -0800, David Miller wrote:
From: Jaroslav Kysela perex@perex.cz Date: Thu, 15 Nov 2007 14:56:38 +0100 (CET)
David, could I send you the list of subscribers on the alsa-devel list (in a private e-mail) to move mailing list to vger without bothering all users to resubscribe again (we did it several months ago, so the subscriber list is quite clean)? I will redirect alsa-devel@alsa-project.org address then to alsa-devel@vger.kernel.org (if it does not break any vger rules or spam filtering on vger) to preserve alsa-devel list e-mail.
We don't do this because:
If the user doesn't do the subscribe, they won't know how to unsubscribe.
If you think this isn't an issue, after 10 years of co-running vger I can tell you it's a huge issue as they just spam the list owner, with requests to be removed.
Users generally get upset when you subscribe them to a site they did not get asked to be subscribed to. And yes this is true even if it's the same mailing list just at a new location.
I apply this rule unilaterally to all new mailing lists hosted at vger, I'm not just singling out this case, believe me :-)
So please ask the folks on the existing list to subscribe to the new one.
Thank you.
From: Tobin Davis tdavis@dsl-only.net Date: Thu, 15 Nov 2007 13:57:18 -0800
Not trying to sound like a jerk or an idiot, but if you require us to resubscribe, why don't you post the link in your email? It would make our lives that much easier.
Sure thing:
http://vger.kernel.org/majordomo-info.html#subscription
Enjoy.
On 15-11-07 23:28, David Miller wrote:
From: Tobin Davis tdavis@dsl-only.net Date: Thu, 15 Nov 2007 13:57:18 -0800
Not trying to sound like a jerk or an idiot, but if you require us to resubscribe, why don't you post the link in your email? It would make our lives that much easier.
Sure thing:
That is:
mailto:majordomo@vger.kernel.org?body=subscribe alsa-devel
and then follow instructions in (ie, reply to) the confirmation mail you'll be sent.
Rene.
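The subscription request described above can also be composed programmatically. What follows is only a minimal sketch, assuming Python's standard email library; the helper name and the sender address are hypothetical, and the message is built but not sent. The majordomo address and the "subscribe alsa-devel" body are the ones given in this thread.

```python
from email.message import EmailMessage


def build_subscribe_request(sender: str, list_name: str = "alsa-devel") -> EmailMessage:
    """Build (but do not send) a majordomo subscription request.

    `sender` is a placeholder for the address that should be subscribed.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = "majordomo@vger.kernel.org"
    # The command goes in the message body, as in the mailto: link above.
    msg.set_content(f"subscribe {list_name}")
    return msg


request = build_subscribe_request("you@example.org")
print(request["To"])
print(request.get_content().strip())
```

Sending the message (for example through a local mail client or SMTP server) is left out on purpose; a confirmation mail should then arrive and needs a reply, as noted above.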
On 14-11-07 13:01, David Miller wrote:
From: David Miller davem@davemloft.net Date: Wed, 14 Nov 2007 03:56:57 -0800 (PST)
The fact that it farts at me every time I post to this thread.
See? I got another one and I have received at least 10 of the following over the past 2 days.
Nah, in this case you are not even getting them due to being a non-subscriber but due to too many CCs. I got one as well. That just needs to be disabled; it does not have anything to do with non-subscribers (and you're in the whitelist) but is just a poor bit of list configuration...
(no, I can't personally change it, needs Jaroslav Kysela)
Rene.
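To make the distinction concrete, here is a rough sketch of the two separate hold rules being discussed: a recipient-count check that applies to everyone, and a non-subscriber check that the whitelist bypasses. This is not the list's actual configuration; the function name, the threshold of 10 and the second reason string are assumptions for illustration only.

```python
from typing import Optional, Sequence, Set


def hold_reason(
    sender: str,
    recipients: Sequence[str],
    subscribers: Set[str],
    whitelist: Set[str],
    max_recipients: int = 10,  # assumed threshold, not taken from the thread
) -> Optional[str]:
    """Return why a post would be held for moderation, or None if it passes."""
    # The recipient-count rule ignores subscription and whitelist status.
    if len(recipients) >= max_recipients:
        return "Too many recipients to the message"
    # Non-subscribers are held unless a moderator has whitelisted them.
    if sender not in subscribers and sender not in whitelist:
        return "Post by non-subscriber"
    return None


# A whitelisted poster on a heavily CC'd thread is still held.
many_cc = [f"person{i}@example.org" for i in range(12)]
print(hold_reason("davem@example.org", many_cc, set(), {"davem@example.org"}))
```

With a short recipient list the same whitelisted sender would pass, which is why disabling only the recipient-count rule would stop the notices in this thread.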
On 14-11-07 12:56, David Miller wrote:
From: Rene Herman rene.herman@keyaccess.nl Date: Wed, 14 Nov 2007 12:46:24 +0100
alsa-devel@alsa-project.org is not subscriber-only. Same as that arm list, it's _moderated_ for non-subscribers and given that I and other moderators have been doing our best to moderate quickly (I tend to stay logged in to the moderation interface all day for example) what specifically bugged the crap out of you? It's not something a poster needs to concern himself with.
The fact that it farts at me every time I post to this thread. That's rude and annoying.
It certainly is. I only experienced that now due to the "too many recipients to message" moderation notice that I got from my own message.
Jaroslav -- please disable that junk or if possible, make it a "at most once per address per month" thing or somesuch. This is complete crap.
Also for alsa-devel the moderators tend to add any valid non-subscribers to a whitelist after landing in the queue the first time, meaning even a delay is normally just a one-time thing. So what's the trouble? Basically, no one need even notice...
That sucks for new people taking part in the conversation.
There is no reason for moderation at all, it isn't necessary for spam prevention and it does nothing but annoy new posters and make work for the moderator.
Yes there is. It's necessary for lists that do not have the human and other resources behind them that vger does. alsa-devel was drowning in spam and dying as a result back when it was at sourceforge. Upon moving, my preference was to ask the lists to be hosted at vger but given that (it seems) Jaroslav wanted to keep them locally, moderation was very necessary. I moderate out quite a bit of spam every day.
vger is doing an amazing job at spam filtering -- if it's an option to move to vger, then sure, no need. But otherwise, the "no need" needs a list admin with enough bandwidth and skill.
As to the "new people": it's not optimal, but (up to this thread I'll admit -- I woke up to a huge number of posts in the queue) it's not been a _real_ problem. alsa-devel is not high-volume enough for it to be.
Rene.
On Wed, Nov 14, 2007 at 12:46:24PM +0100, Rene Herman wrote:
On 14-11-07 11:07, David Miller wrote:
Added Jaroslav and Takashi to the already extensive CC....
From: Russell King rmk+lkml@arm.linux.org.uk
So, when are you creating a replacement alsa-devel mailing list on vger? That's also subscribers-only.
The operative term is "alternative" rather than "replacement". Perhaps this misunderstanding is what you're so upset about. And yes, that alsa list bugs the crap out of me too. I'm more than happy to provide an alternative for that one as well.
alsa-devel@alsa-project.org is not subscriber-only. Same as that arm list, it's _moderated_ for non-subscribers and given that I and other moderators have been doing our best to moderate quickly (I tend to stay logged in to the moderation interface all day for example) what specifically bugged the crap out of you? It's not something a poster needs to concern himself with.
Totally unrelated - I sent something to the kolab mailing list a couple of days ago (it's moderated for non subscribers) informing them that I had found the cause of some Cyrus bugs that they had problems with in the past and providing a link to my post to the cyrus list with the patches attached.
It sat in the moderation queue and then was rejected with "non subscriber post to subscription only list". Not only was the response a day later when I had moved on to other things, but it got me really pissed off that I had put some effort into providing a good quality post that outlined the specific issues and how they applied to their project, and had been summarily dismissed, probably without the effort being put in.
There's no way for a non-subscriber to know in advance if the list they are trying to post to will do that to them, completely negating the effort put in to writing something worthwhile to inform that community. It's insular, and it sucks.
So yeah, my attitude now is that the Kolab folks can go screw themselves and track down the fix on their own or wait until I've convinced upstream to accept the fixes (likely) and they have moved to the new version (unlikely for a long time, and meanwhile they're missing out on the performance increases that having a more stable skiplist library would give them)
I'm sure if I had something that I considered worth informing the ALSA project of, I'd be wary of spending the same effort writing a good post knowing it may be dropped by a list moderator just selecting all and bouncing them.
Bron.
On 15-11-07 05:16, Bron Gondwana wrote:
Totally unrelated - I sent something to the kolab mailing list a couple
[ ... ]
I'm sure if I had something that I considered worth informing the ALSA project of, I'd be wary of spending the same effort writing a good post knowing it may be dropped by a list moderator just selecting all and bouncing them.
Totally unrelated indeed so why are you spouting crap? If the kolab list has a problem take it up with them but keep ALSA out of it. alsa-devel has only ever moderated out spam -- nothing else.
Rene
On Thu, Nov 15, 2007 at 06:59:34AM +0100, Rene Herman wrote:
On 15-11-07 05:16, Bron Gondwana wrote:
Totally unrelated - I sent something to the kolab mailing list a couple
[ ... ]
I'm sure if I had something that I considered worth informing the ALSA project of, I'd be wary of spending the same effort writing a good post knowing it may be dropped by a list moderator just selecting all and bouncing them.
Totally unrelated indeed so why are you spouting crap? If the kolab list has a problem take it up with them but keep ALSA out of it. alsa-devel has only ever moderated out spam -- nothing else.
As an outsider to the list, how do I know what your policy will be other than "I've been rejected out of hand by someone else's list, so my experience is that member only lists aren't willing to listen to something I have to say unless I make the effort to sign up and have yet another folder accumulating unread messages". I don't.
Well, ok - maybe I do here since I've let myself be dragged in to the debate. Oops.
I get the same information from both project websites: "moderated for non-members, public archives" - no way of knowing that ALSA will accept me informing them of something they would be interested in without committing to reading or bit-bucketing their list.
The alternative is to subscribe just long enough to send something and then unsubscribe again.... or cold-email a member and ask them to pass a message along. Or post and hope it doesn't get rejected, not even knowing for a day or so.
Bron.
On 15-11-07 13:02, Bron Gondwana wrote:
I get the same information from both project websites: "moderated for non-members, public archives" - no way of knowing that ALSA will accept me informing them of something they would be interested in without committing to reading or bit-bucketing their list.
Can you please just shelve this crap? You have a way of knowing that "ALSA will accept you" and that is knowing or assuming that the ALSA project doesn't consist of drooling retards.
When a project list goes to the difficulty of moderating non-subscribers it has made the explicit choice to _not_ become subscriber only. Then refusing valid non-subscribers after all makes no sense whatsoever. I'm sorry you got your feelings hurt by that other list but it was no doubt an accident; take it up with them.
Rene.
On Thu, 15 November 2007 13:26:51 +0100, Rene Herman wrote:
Can you please just shelve this crap? You have a way of knowing that "ALSA will accept you" and that is knowing or assuming that the ALSA project doesn't consist of drooling retards.
Well, my experience with moderation has been that moderated mails are stuck in some queue for weeks. Two separate lists, neither of them was alsa. If alsa is doing a better job, great. But it still has to live with the general reputation of non-subscriber moderation.
When a project list goes to the difficulty of moderating non-subscribers it has made the explicit choice to _not_ become subscriber only. Then refusing valid non-subscribers after all makes no sense whatsoever. I'm sorry you got your feelings hurt by that other list but it was no doubt an accident; take it up with them.
Been there, done that. In spite of people not being drooling retards, the amount of time and effort they invest into either moderation or improving the ruleset is quite limited. Problems persist.
And even without mails being held hostage for weeks, every single moderation mail is annoying. Like the one I'm sure to receive after sending this out.
Jörn
On 15-11-07 14:00, Jörn Engel wrote:
And even without mails being held hostage for weeks, every single moderation mail is annoying. Like the one I'm sure to receive after sending this out.
Certainly. Up to this thread I wasn't actually aware the list was doing that. While it might be informative once, getting it each time quickly gets old. Don't know if mailman can do anything like it but I'd suggest anyone running a non-subscriber-moderation list configure it to send such messages at most once a <time-period> per address or some such. And just disable the message if it cannot do that.
Fortunately, alsa-devel is (almost) no longer such a list anyway as it's moving to vger. Hurrah. David -- thanks.
Rene.
On Thu, Nov 15, 2007 at 06:59:34AM +0100, Rene Herman wrote:
Totally unrelated indeed so why are you spouting crap? If the kolab list has a problem take it up with them but keep ALSA out of it. alsa-devel has only ever moderated out spam -- nothing else.
That is incorrect. Hopefully it is the case now though, since my experience of the subject was years ago.
OG.
At Thu, 15 Nov 2007 14:17:27 +0100, Olivier Galibert wrote:
On Thu, Nov 15, 2007 at 06:59:34AM +0100, Rene Herman wrote:
Totally unrelated indeed so why are you spouting crap? If the kolab list has a problem take it up with them but keep ALSA out of it. alsa-devel has only ever moderated out spam -- nothing else.
That is incorrect. Hopefully it is the case now though, since my experience of the subject was years ago.
Yeah, it was really years ago that we once switched to the open list. Funny that people never forget such a thing :)
Takashi
On Wed, Nov 14, 2007 at 02:07:06AM -0800, David Miller wrote:
From: Russell King rmk+lkml@arm.linux.org.uk Date: Wed, 14 Nov 2007 09:55:07 +0000
On Tue, Nov 13, 2007 at 05:55:51PM -0800, David Miller wrote:
I've created linux-arm@vger.kernel.org
By doing so you've just said (implicitly) that you can not tolerate someone having a different opinion from your own.
I created a mailing list on a machine where I provide such services.
People can choose to use or not use the new list, it is their choice.
While I accept *your* right to run *your* lists how you please, you are unable to accept *my* right to run *my* lists how I see fit.
I didn't tell you to take your list down or to run it in some other way. I didn't tell you to unsubscribe everyone and move them over to the new list either.
I didn't say that you were.
I've provided an alternative, and people can pick and choose how they see fit. I'm letting natural selection run its course. Are you able to cope with the fact that people might not want to use your list any longer? Perhaps that is what bugs you so much about my giving people an alternative choice.
Absolutely, and if you'd have read my message you'd have seen that I'd said effectively the same thing that you're saying here.
Having been flamed for not reading emails properly by AKPM shall I flame you for not reading my emails properly? Oh no, it's merely human to occasionally have such misunderstandings. Unless you're rmk.
If so, MAINTAINERS claims that it is subscribers-only. That would cause some bug reporters to give up and go away.
Find some other mailing list; I'm not hosting *nor* am I willing to run a non-subscribers only mailing list. Period. Not negotiable, so don't even try to change my mind.
The postmasters at vger are pretty good at running mailing lists. For linux-kbuild my effort so far has been to request it. That's not a big deal.
So if they accept it you could have linux-arm@vger.kernel.org for zero overhead for you.
Sam
On Wed, Nov 14, 2007 at 06:56:06AM +0100, Sam Ravnborg wrote:
If so, MAINTAINERS claims that it is subscribers-only. That would cause some bug reporters to give up and go away.
Find some other mailing list; I'm not hosting *nor* am I willing to run a non-subscribers only mailing list. Period. Not negotiable, so don't even try to change my mind.
The postmasters at vger are pretty good at running mailing lists. For linux-kbuild my effort so far has been to request it. That's not a big deal.
So if they accept it you could have linux-arm@vger.kernel.org for zero overhead for you.
And in a later mail I saw davem already created it.
Sam
From: Sam Ravnborg sam@ravnborg.org Date: Wed, 14 Nov 2007 06:56:06 +0100
If so, MAINTAINERS claims that it is subscribers-only. That would cause some bug reporters to give up and go away.
Find some other mailing list; I'm not hosting *nor* am I willing to run a non-subscribers only mailing list. Period. Not negotiable, so don't even try to change my mind.
The postmasters at vger are pretty good at running mailing lists. For linux-kbuild my effort so far has been to request it. That's not a big deal.
So if they accept it you could have linux-arm@vger.kernel.org for zero overhead for you.
I already did, get a little deeper in your mailbox before replying :-)
* Andrew Morton akpm@linux-foundation.org wrote:
Do you believe that our response to bug reports is adequate?
Do you feel that making us feel and look like shit helps?
That doesn't answer my question.
See, first we need to work out whether we have a problem. If we do this, then we can then have a think about what to do about it.
I tried to convince the 2006 KS attendees that we have a problem and I resoundingly failed. People seemed to think that we're doing OK.
But it appears that data such as this contradicts that belief.
This is not a minor matter. If the kernel _is_ slowly deteriorating then this won't become readily apparent until it has been happening for a number of years. By that stage there will be so much work to do to get us back to an acceptable level that it will take a huge effort. And it will take a long time after that for the kernel to get its reputation back.
So it is important that we catch deterioration *early* if it is happening.
yes, yes, yes, and i agree with you that there is a problem. I tried to make this point at the 2007 KS: not only is degradation in quality not apparent for years, slow degradation in quality can give kernel developers the exact _opposite_ perception! (Fewer testers means fewer bugreports and that results in apparent "improved" quality and fewer reported regressions - while exactly the opposite is happening and testers are leaving us without giving us any indication that this is happening. We just don't notice.)
I'm not moaning about bugs that slip through - those are unavoidable facts of a high flux codebase. I'm moaning about recurring, avoidable bugs, i'm moaning about hostility towards testers, i'm moaning about hostility towards automated testing, i'm moaning about unnecessary hoops a willing (but unskilled) tester has to go through to help us out.
I tried to make the point that the only good approach is to remove our current subjective bias from quality metrics and to at least realize what a cavalier attitude we still have to QA. The moment we are able to _measure_ how bad we are, kernel developers will adopt in a second and will improve those metrics. Let's use more debug tools, both static and dynamic ones. Let's measure tester base and we need to measure _lost_ early adopters and the reasons why they are lost. Regression metrics are a very important first step too and i'm very happy about the increasing effort that is being spent on this.
This is all QA-101 that _cannot be argued against on a rational basis_, it's just that these sorts of things have been largely ignored for years, in favor of the all-too-easy "open source means many eyeballs and that is our QA" answer, which is a _good_ answer but by far not the most intelligent answer! Today "many eyeballs" is simply not good enough and nature (and other OS projects) will route us around if we don't change.
We kernel developers have been spoiled by years of abundance in testing resources. We squander tons of resources in this area, and we could be so much more economic about this without hindering our development model in any way. We could be so much better about QA and everyone would benefit without having to compromise on the incoming flux of changes - it's so much easier to write new features for a high quality kernel.
My current guesstimation is that we are utilizing our current testing resources at around 10% efficiency. (i.e. if we did an 'ideal' job we could fix 10 times as many bugs with the same size of tester effort!) It used to be around 5%. (and i mainly attribute the increase from 5% to 10% to Andrew and the many other people who do kernel QA - kudos!) 10% is still awful and we very much suck.
Paradoxically, the "end product" is still considerably good quality in absolute terms because other pieces of our infrastructure are so good and powerful, but QA is still a 'weak link' of our path to the user that reduces the quality of the end result. We could _really_ be so much better without any compromises that hurt.
(and this is in no way directed at the networking folks - it holds for all of us. I have one main complaint about networking: the separate netdev list is a bad idea - networking regressions should be discussed and fixed on lkml, like most other subsystems are. Any artificial split of the lk discussion space is bad.)
Ingo
Ingo Molnar wrote: ..
This is all QA-101 that _cannot be argued against on a rational basis_, it's just that these sorts of things have been largely ignored for years, in favor of the all-too-easy "open source means many eyeballs and that is our QA" answer, which is a _good_ answer but by far not the most intelligent answer! Today "many eyeballs" is simply not good enough and nature (and other OS projects) will route us around if we don't change.
..
QA-101 and "many eyeballs" are not at all in opposition. The latter is how we find out about bugs on uncommon hardware, and the former is what we need to track them and overall quality.
A HUGE problem I have with current "efforts", is that once someone reports a bug, the onus seems to be 99% on the *reporter* to find the exact line of code or commit. Ghad what a repressive method.
And if the "developer" who broke the damn thing, or who at least "claims" to be supporting that code, cannot "reproduce" the bug, they drop it completely.
Contrast that flawed approach with how Linus does things.. he thinks through the symptoms, matches them to the code, and figures out what the few possibilities might be, and feeds back some trial balloon patches for the bug reporter to try.
MUCH better.
Linus also asks for a git bisect, but doesn't insist upon the reporter learning an entire new (poorly documented) toolset just to report a bug.
Blah!
And remember, *I'm* an old-time Linux kernel developer.. just think about the people reporting bugs who haven't been around here since 1992..
-ml
Mark Lord wrote:
Ingo Molnar wrote: ..
This is all QA-101 that _cannot be argued against on a rational basis_, it's just that these sorts of things have been largely ignored for years, in favor of the all-too-easy "open source means many eyeballs and that is our QA" answer, which is a _good_ answer but by far not the most intelligent answer! Today "many eyeballs" is simply not good enough and nature (and other OS projects) will route us around if we don't change.
..
QA-101 and "many eyeballs" are not at all in opposition. The latter is how we find out about bugs on uncommon hardware, and the former is what we need to track them and overall quality.
A HUGE problem I have with current "efforts", is that once someone reports a bug, the onus seems to be 99% on the *reporter* to find the exact line of code or commit. Ghad what a repressive method.
As a long time kernel tester, I see some problems with the newer "new development model". In the short merge windows, after too much time, there are too many patches. So there are problems bisecting bugs and getting the attention of developers. My impression is that in a week there are many more messages in lkml and too many bugs to be handled in these few days.
I have two proposals:
- better patch quality. I would like every commit to compile. An automatic commit test and public blame could increase the quality of first commits. [bisecting across non-compilable points is not a trivial task]
- a slowdown of patch inclusion in the merge window (i.e. not too many big changes in the first days). As a tester I would prefer that some big changes be included in a "secondary window" (pre or rc release), in a period other than the big patch rush.
ciao cate
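The "automatic commit test" proposed above could be sketched as a small shell filter that runs a check command against each commit in a range and reports the ones that fail. The function name check_each and the example check command are illustrative assumptions, not an existing tool; a real setup would also need a clean tree and a sensible config.

```shell
#!/bin/sh
# check_each CHECK_CMD
# Reads one commit id per line on stdin, runs CHECK_CMD with the id as $1,
# and prints every id for which the check fails.
# (check_each is a hypothetical helper name.)
check_each() {
    check_cmd=$1
    while read -r commit; do
        # </dev/null keeps the check from consuming the commit list on stdin
        if ! sh -c "$check_cmd" check "$commit" </dev/null >/dev/null 2>&1; then
            echo "$commit"
        fi
    done
}

# Illustrative use inside a kernel tree (range and config target are
# examples only):
#   git rev-list --reverse v2.6.20..v2.6.21 |
#       check_each 'git checkout -q "$1" && make defconfig && make'

# Stand-in demo without git: the "check" fails only for the id "b".
printf '%s\n' a b c | check_each '[ "$1" != b ]'
```

Any commit id printed by such a run is a point where a bisect would get stuck on a build failure.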
On Nov 13, 2007 7:24 AM, Giacomo A. Catenazzi cate@cateee.net wrote:
As a long time kernel tester, I see some problems with the newer "new development model". In the short merge windows, after too much time, there are too many patches.
I think the root issue there is that it's hard to get all testers to run a bisect, but easy to ask them to test snapshots. Right now the snapshots are generated nightly, but I think it would make more sense if they were generated every N patches, for some value of N...
Of course, for that to really work, we have to ensure that the result is always compilable, which has been getting better, but not perfect.
Ray
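The "every N patches" idea can be sketched as a filter over the commit list that git already produces. The helper name every_nth and the value 500 are illustrative assumptions; nothing in the thread fixes N.

```shell
#!/bin/sh
# every_nth N: print every Nth line of stdin (hypothetical helper name).
every_nth() {
    awk -v n="$1" 'NR % n == 0'
}

# Illustrative use in a kernel tree: candidate snapshot points, oldest
# first, one every 500 commits (500 is an arbitrary example):
#   git rev-list --reverse v2.6.20..v2.6.21 | every_nth 500

# Stand-in demo without git: prints c3 and c6.
printf '%s\n' c1 c2 c3 c4 c5 c6 c7 | every_nth 3
```

Testing such snapshots narrows a bug to a window of N commits, not to a single commit.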
On Tue, Nov 13, 2007 at 07:57:54AM -0800, Ray Lee wrote:
On Nov 13, 2007 7:24 AM, Giacomo A. Catenazzi cate@cateee.net wrote:
As a long time kernel tester, I see some problems with the newer "new development model". In the short merge windows, after too much time, there are too many patches.
I think the root issue there is that it's hard to get all testers to run a bisect, but easy to ask them to test snapshots. Right now the snapshots are generated nightly, but I think it would make more sense if they were generated every N patches, for some value of N... ...
I don't see a point in doing that - that would be a more manual bisecting, and the result would not be one guilty commit.
Testers are not expected to be able to hack a kernel, but it's reasonable to expect testers to be able to build their own kernels (and your proposal wouldn't change that).
The small instruction below is enough for everyone who is able to build his own kernel to do a git bisect.
Ray
cu Adrian
<-- snip -->
# install git

# clone Linus' tree:
git clone \
  git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git

# start bisecting:
cd linux-2.6
git bisect start
git bisect bad v2.6.21
git bisect good v2.6.20
cp /path/to/.config .

# start a round
make oldconfig
make
# install kernel, check whether it's good or bad, then:
git bisect [bad|good]
# start next round
After about 10-15 reboots you'll have found the guilty commit ("... is first bad commit").
More information on git bisecting: man git-bisect
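The "10-15 reboots" estimate follows from bisection roughly halving the candidate range each round, so a range of N commits needs about ceil(log2(N)) rounds. A minimal sketch of that arithmetic, with the function name bisect_rounds and the commit counts chosen purely for illustration:

```shell
#!/bin/sh
# bisect_rounds N: rough number of bisect rounds for a range of N commits,
# computed as ceil(log2(N)). (bisect_rounds is a hypothetical helper name.)
bisect_rounds() {
    n=$1
    rounds=0
    span=1
    # Double the covered span until it reaches N; each doubling is one round.
    while [ "$span" -lt "$n" ]; do
        span=$((span * 2))
        rounds=$((rounds + 1))
    done
    echo "$rounds"
}

bisect_rounds 1000     # prints 10
bisect_rounds 30000    # prints 15
```

Skipped or untestable commits add rounds, so the real count can be somewhat higher.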
I jump in this discussion hoping to have some more insight on git and to report my experience as a tester. I consider myself as half-literate in this (I am here since 1991, more or less, and I am able to compile a kernel and even hand-apply a patch, although I am in no way a kernel programmer).
On Tue, 2007-11-13 at 18:01 +0100, Adrian Bunk wrote:
The small instruction below is enough for everyone who is able to build his own kernel to do a git bisect.
# start bisecting:
cd linux-2.6
git bisect start
git bisect bad v2.6.21
git bisect good v2.6.20
cp /path/to/.config .
This was what I did in my (in the end almost successful) bisecting when trying to find the mmc problem (see the thread named "2.6.24-rc1 eat my SD card"). This is true in theory, but it has some problems. The "this commit does not compile" case is the easiest, and man git-bisect explains how to solve it. The changes in .config options, added or removed, are another problem when jumping back and forth between versions (I was bitten by the gazillions of new options added to the hda-intel alsa driver, but well, that is solvable with a bit of attention).
The main problem I had, and the one that stopped me from arriving at a definite result, is this situation:
j version-bad
i
h
g unrelated (but similar) bug corrected
f
e
d unrelated (but similar) bug introduced
c
b
a version-good
(d was the series to change drivers to use sg helpers, and g was a "fix fallout from sg helpers" patch). Now I have a series of kernels (d, e, f) that did not work at all and so I cannot mark them good or bad. With the number of patches added in the free-for-all week, this is a very probable scenario. Is there a way out of this using bisect?
Romano
PS as a suggestion, I think that adding a "Reported-by", "Tested-by", or "Debugged-by" attribution in the repository, as happened in the MMC case, is a nice and welcome reward for the effort.
Romano Giannetti wrote:
This was what I did in my (in the end almost successful) bisecting when trying to find the mmc problem (see the thread named "2.6.24-rc1 eat my SD card"). This is true in theory, but it has some problems. The "this commit does not compile" case is the easiest, and man git-bisect explains how to solve it. The changes in .config options, added or removed, are another problem when jumping back and forth between versions.
The main problem I had, and the one that stopped me from arriving at a definite result, is this situation:
[...]
(d was the series to change drivers to use sg helpers, and g was a "fix fallout from sg helpers" patch). Now I have a series of kernels (d, e, f) that did not work at all and so I cannot mark them good or bad. With the number of patches added in the free-for-all week, this is a very probable scenario. Is there a way out of this using bisect?
I think there are three strategies you can use in this case:
- create a kernel config that is as simple as possible, but still supports your hardware and reproduces your problem; a simpler config will often avoid compilation issues in parts of the kernel that you're not using anyway and has the benefit of speeding up the compiles too
- if you know/suspect in what part of the tree the bug is, first limit the bisection to that; you will have to verify that you did indeed find the correct (broken) change by doing a compile for the "last good commit + 1"
- if you find a broken commit, use 'git-reset --hard' to try to jump past the bad set of commits, but of course that does not help in the case:
g version-bad
f unrelated bug corrected
e
d the broken commit that caused your problem
c
b unrelated bug that breaks compilation or system introduced
a version-good
in that case the best you can reasonably be expected to do is report that you narrowed it down to "between a and g" and leave the rest to the developers
Cheers, FJP
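For the "cannot mark them good or bad" case, newer git versions offer 'git bisect skip', which makes bisect pick a different nearby commit, and 'git bisect run' treats exit status 125 from its script the same way. The sketch below assumes a git recent enough to have both; bisect_step is a hypothetical helper, and since testing a kernel usually needs a reboot, in practice often only the build check can be automated like this.

```shell
#!/bin/sh
# bisect_step BUILD_CMD TEST_CMD
# Maps a build-and-test attempt onto the exit codes 'git bisect run' expects:
#   125 = cannot test this commit (skip), 0 = good, 1 = bad.
# (bisect_step is a hypothetical helper name.)
bisect_step() {
    build_cmd=$1
    test_cmd=$2
    if ! sh -c "$build_cmd" >/dev/null 2>&1; then
        return 125      # does not build: neither good nor bad
    fi
    if sh -c "$test_cmd" >/dev/null 2>&1; then
        return 0        # builds and passes: good
    fi
    return 1            # builds but fails: bad
}

# Manual equivalents inside a kernel tree (tags and path are examples only):
#   git bisect start v2.6.21 v2.6.20 -- drivers/mmc   # limit to one subtree
#   git bisect skip                                   # current commit is untestable

# Stand-in demo: a "build" that succeeds and a "test" that passes.
bisect_step true true
echo "exit status: $?"
```

Skipping cannot separate the commits inside the untestable stretch; if the culprit is among them, bisect can only report the remaining candidate range.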
On Nov 13, 2007 3:08 PM, Mark Lord liml@rtr.ca wrote:
Ingo Molnar wrote: ..
This is all QA-101 that _cannot be argued against on a rational basis_, it's just that these sorts of things have been largely ignored for years, in favor of the all-too-easy "open source means many eyeballs and that is our QA" answer, which is a _good_ answer but by far not the most intelligent answer! Today "many eyeballs" is simply not good enough and nature (and other OS projects) will route us around if we don't change.
..
QA-101 and "many eyeballs" are not at all in opposition. The latter is how we find out about bugs on uncommon hardware, and the former is what we need to track them and overall quality.
A HUGE problem I have with current "efforts", is that once someone reports a bug, the onus seems to be 99% on the *reporter* to find the exact line of code or commit. Ghad what a repressive method.
Btw, I used to test every -mm kernel. But since I've switched distros (gentoo->ubuntu) and I have less time, I feel it's harder to test -rc or -mm kernels (I know this isn't a lkml problem but more a distro problem, but I would love having an ubuntu blessed repo with current dev kernel for the latest stable ubuntu release).
For debugging, maybe it's time someone does an amazon ec2+s3 service to automate the bisecting and create .deb/.rpm from git, I don't know how much it would cost though.
regards,
Benoit
* Benoit Boissinot bboissin@gmail.com wrote:
For debugging, maybe it's time someone does an amazon ec2+s3 service to automate the bisecting and create .deb/.rpm from git, I don't know how much it would cost though.
a few months ago i estimated the costs of this and it's just a few terabytes so within arm's reach. As long as the .deb/.rpm's are built by tracking -git in a rolling fashion CPU time should not be a big issue. The only limit is download bandwidth - but even that problem might be solvable via a huge git repository of ready-to-boot .o's that are linked together on the tester's machine.
Ingo
On Tue, Nov 13, 2007 at 04:52:32PM +0100, Benoit Boissinot wrote:
Btw, I used to test every -mm kernel. But since I've switched distros (gentoo->ubuntu) and I have less time, I feel it's harder to test -rc or -mm kernels (I know this isn't a lkml problem but more a distro problem, but I would love having an ubuntu blessed repo with current dev kernel for the latest stable ubuntu release).
There are two parts to this. One is a Ubuntu development kernel which we can give to large numbers of people to expand our testing pool. But if we don't do a better job of responding to bug reports that would be generated by expanded testing this won't necessarily help us.
The other is an automated set of standard pre-built bisection points so that testers can more easily localize a bug down to a few hundred commits without needing to learn how to use "git bisect" (think Ubuntu users).
So for the first, I've actually been playing with some plans to put together an unofficial kernel that is basically "what Ted is using on his laptop". It generally has emergency bug fixes that haven't made it into mainline, plus some other trees where I've been more aggressive since I want the latest in wireless and powersaving technology, etc. It has the property that "if it breaks, you get to keep both pieces --- and I've helpfully included the git ID in the package name so you can do the bisection yourself". If you want to try it, the first such kernel is here:
http://www.kernel.org/~tytso/tbek
I wasn't planning on talking about it until it was more fully baked, but if people want something vaguely stable based on 2.6.24-rc2, this might be interesting.
As for the second, I was just talking to Arjan over pizza and beer last night, and we reached the same conclusion as Ingo, which is this really isn't that hard. It wouldn't be that hard to set up infrastructure to do this, and it's just a matter of getting the disk space and the network bandwidth together in the right place, plus a relatively small amount of programming at least for the simplest iteration of the idea. (As is quite common when doing designs over beer, we talked about some more grandiose web-based schemes to do custom built kernels that were tied to the kernel bugzilla, but first things first. :-)
- Ted
The other is an automated set of standard pre-built bisection points so that testers can more easily localize a bug down to a few hundred commits without needing to learn how to use "git bisect" (think Ubuntu users).
Before that you want a flowchart or instruction list of boot options to try. A lot of errors can be localised simply by asking the reporter to boot with things like "iommu=off", "pci=routeirq", "acpi=off" etc
That takes a lot less time to run through and can be very informative.
Alan
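The "instruction list of boot options" could start as something as simple as a generated checklist. This is only a sketch: the function name boot_option_checklist is hypothetical, and the three options are the ones named above, not a complete or recommended set.

```shell
#!/bin/sh
# boot_option_checklist OPTION...
# Prints one numbered step per kernel boot option for a reporter to try.
# (boot_option_checklist is a hypothetical helper name.)
boot_option_checklist() {
    step=1
    for opt in "$@"; do
        echo "$step. boot with '$opt' and note whether the bug still occurs"
        step=$((step + 1))
    done
}

boot_option_checklist iommu=off pci=routeirq acpi=off
```

The options are tried one at a time so that a change in behaviour can be attributed to a single setting.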
Theodore Tso wrote:
On Tue, Nov 13, 2007 at 04:52:32PM +0100, Benoit Boissinot wrote:
Btw, I used to test every -mm kernel. But since I've switched distros (gentoo->ubuntu) and I have less time, I feel it's harder to test -rc or -mm kernels (I know this isn't a lkml problem but more a distro problem, but I would love having an ubuntu blessed repo with current dev kernel for the latest stable ubuntu release).
There are two parts to this. One is a Ubuntu development kernel which we can give to large numbers of people to expand our testing pool. But if we don't do a better job of responding to bug reports that would be generated by expanded testing this won't necessarily help us.
I'm very encouraged to read of your expanded testing efforts. As a bcm43xx developer, Ubuntu has been our problem distro, mostly because your standard kernels have debugging turned off for bcm43xx. When a Ubuntu user reports a problem and we ask for the relevant output from dmesg, they have no information. I ask two things of all distros: (1) Turn on debugging - we don't spam the logs that badly, and (2) forward any bugs found by your testing to the maintainer, and/or the bcm43xx mailing list.
Thanks,
Larry
On Tue, Nov 13, 2007 at 11:33:44AM -0600, Larry Finger wrote:
I'm very encouraged to read of your expanded testing efforts. As a bcm43xx developer, Ubuntu has been our problem distro, mostly because your standard kernels have debugging turned off for bcm43xx. When a Ubuntu user reports a problem and we ask for the relevant output from dmesg, they have no information. I ask two things of all distros: (1) Turn on debugging - we don't spam the logs that badly, and (2) forward any bugs found by your testing to the maintainer, and/or the bcm43xx mailing list.
Heh. I hadn't enabled CONFIG_BCM43XX_DEBUG myself, but I just changed it for my next kernel build. This is a slightly different issue, which is that sometimes _DEBUG options shouldn't be turned on by default (because they really trash performance and bloat log size), and sometimes they are painless to turn on and don't cost much.
If that is the case, I'd suggest removing the option and just making it compiled in by default with a run-time option to enable it.
- Ted
Theodore Tso wrote:
Heh. I hadn't enabled CONFIG_BCM43XX_DEBUG myself, but I just changed it for my next kernel build. This is a slightly different issue, which is that sometimes _DEBUG options shouldn't be turned on by default (because they really trash performance and bloat log size), and sometimes they are painless to turn on and don't cost much.
If that is the case, I'd suggest removing the option and just making it compiled in by default with a run-time option to enable it.
I am taking your suggestion and will produce the necessary patches for ssb, b43 and b43legacy. As bcm43xx is likely to be removed from 2.6.25, which is the earliest such a non-bug fix patch would be accepted, I hope that your future distribution and testing kernels will include the debug option.
Thanks,
Larry
On Tue, Nov 13, 2007 at 12:13:56PM -0500, Theodore Tso wrote:
On Tue, Nov 13, 2007 at 04:52:32PM +0100, Benoit Boissinot wrote:
Btw, I used to test every -mm kernel. But since I've switched distros (gentoo->ubuntu) and I have less time, I feel it's harder to test -rc or -mm kernels (I know this isn't a lkml problem but more a distro problem, but I would love having an ubuntu blessed repo with current dev kernel for the latest stable ubuntu release).
There are two parts to this. One is a Ubuntu development kernel which we can give to large numbers of people to expand our testing pool. But if we don't do a better job of responding to bug reports that would be generated by expanded testing this won't necessarily help us. ...
The main problem isn't missing testers [1] - we already have relatively experienced people testing kernels and/or reporting bugs, and we slowly scare them away due to the many bug reports without any reaction.
The main problem is finding experienced developers who spend time on looking into bug reports.
Getting many relatively inexperienced users (who need more guidance for debugging issues) as additional testers is therefore IMHO not necessarily a good idea.
- Ted
cu Adrian
[1] and e.g. when Greg says he has a few hundred people who want to write drivers it would most likely be possible to find a few dozen additional -rc testers among them
Adrian Bunk wrote:
On Tue, Nov 13, 2007 at 12:13:56PM -0500, Theodore Tso wrote:
On Tue, Nov 13, 2007 at 04:52:32PM +0100, Benoit Boissinot wrote:
Btw, I used to test every -mm kernel. But since I've switched distros (gentoo->ubuntu) and I have less time, I feel it's harder to test -rc or -mm kernels (I know this isn't a lkml problem but more a distro problem, but I would love having an ubuntu blessed repo with current dev kernel for the latest stable ubuntu release).
There are two parts to this. One is a Ubuntu development kernel which we can give to large numbers of people to expand our testing pool. But if we don't do a better job of responding to bug reports that would be generated by expanded testing this won't necessarily help us. ...
The main problem is finding experienced developers who spend time on looking into bug reports.
There are already. IMO the problem is the development model.
There are tons of new features in each new kernel release and 'tons of new bugs' which are not fixed during the release cycle nor in the .XX stable kernels.
Maybe after XX kernel releases there should be one just with bug-fixes _without_ any new features, e.g. cleaning bugs from bugzilla, known regressions, cleaning up code, removing broken drivers and the like.
cu Adrian
Gabriel
On Tuesday 13 November 2007 11:57, Gabriel C wrote:
The main problem is finding experienced developers who spend time on looking into bug reports.
There are already. IMO the problem is the development model.
There are tons of new features in each new kernel release and 'tons of new bugs' which are not fixed during the release cycle nor in the .XX stable kernels.
Maybe after XX kernel releases there should be one just with bug-fixes _without_ any new features, e.g. cleaning bugs from bugzilla, known regressions, cleaning up code, removing broken drivers and the like.
Won't work. You cannot force people to work on things they don't find interesting, long-term. -- vda
On Tuesday 13 November 2007 10:56, Adrian Bunk wrote:
On Tue, Nov 13, 2007 at 12:13:56PM -0500, Theodore Tso wrote:
On Tue, Nov 13, 2007 at 04:52:32PM +0100, Benoit Boissinot wrote:
Btw, I used to test every -mm kernel. But since I've switched distros (gentoo->ubuntu) and I have less time, I feel it's harder to test -rc or -mm kernels (I know this isn't a lkml problem but more a distro problem, but I would love having an ubuntu blessed repo with current dev kernel for the latest stable ubuntu release).
There are two parts to this. One is a Ubuntu development kernel which we can give to large numbers of people to expand our testing pool. But if we don't do a better job of responding to bug reports that would be generated by expanded testing this won't necessarily help us. ...
The main problem isn't missing testers [1] - we already have relatively experienced people testing kernels and/or reporting bugs, and we slowly scare them away due to the many bug reports without any reaction.
The main problem is finding experienced developers who spend time on looking into bug reports.
Getting many relatively inexperienced users (who need more guidance for debugging issues) as additional testers is therefore IMHO not necessarily a good idea.
And where do experienced developers come from? They are not born with Linux kernel skills. They grow up from within the user base.
Bigger user base -> more developers (eventually) -- vda
On Tue, Nov 13, 2007 at 05:39:45PM -0700, Denys Vlasenko wrote:
On Tuesday 13 November 2007 10:56, Adrian Bunk wrote:
On Tue, Nov 13, 2007 at 12:13:56PM -0500, Theodore Tso wrote:
On Tue, Nov 13, 2007 at 04:52:32PM +0100, Benoit Boissinot wrote:
Btw, I used to test every -mm kernel. But since I've switched distros (gentoo->ubuntu) and I have less time, I feel it's harder to test -rc or -mm kernels (I know this isn't a lkml problem but more a distro problem, but I would love having an ubuntu blessed repo with current dev kernel for the latest stable ubuntu release).
There are two parts to this. One is a Ubuntu development kernel which we can give to large numbers of people to expand our testing pool. But if we don't do a better job of responding to bug reports that would be generated by expanded testing this won't necessarily help us. ...
The main problem isn't missing testers [1] - we already have relatively experienced people testing kernels and/or reporting bugs, and we slowly scare them away due to the many bug reports without any reaction.
The main problem is finding experienced developers who spend time on looking into bug reports.
Getting many relatively inexperienced users (who need more guidance for debugging issues) as additional testers is therefore IMHO not necessarily a good idea.
And where do experienced developers come from? They are not born with Linux kernel skills. They grow up from within the user base.
Bigger user base -> more developers (eventually)
You missed the following in my email: "we slowly scare them away due to the many bug reports without any reaction."
The problem is that bug reports take time. If you go away from easy things like compile errors then even things like describing what no longer works, ideally producing a scenario where you can reproduce it, and verifying whether it was present in previous kernels can easily take many hours that are spent before the initial bug report.
If the bug report then gets ignored we discourage the person who sent the bug report from doing any work related to the kernel again.
vda
cu Adrian
On Wednesday 14 November 2007 00:27, Adrian Bunk wrote:
You missed the following in my email: "we slowly scare them away due to the many bug reports without any reaction."
The problem is that bug reports take time. If you go away from easy things like compile errors then even things like describing what no longer works, ideally producing a scenario where you can reproduce it, and verifying whether it was present in previous kernels can easily take many hours that are spent before the initial bug report.
If the bug report then gets ignored we discourage the person who sent the bug report from doing any work related to the kernel again.
Cannot agree more. I am in a similar position right now. My patch to the aic7xxx driver was submitted four times with not much reaction from scsi guys.
Finally they replied and asked to rediff it against their git tree. I did that and sent patches back. No reply since then.
And mind you, the patch is not trying to do anything complex, it mostly moves code around, removes 'inline', adds 'const'. What should I think about it? -- vda
On Wed, Nov 14, 2007 at 12:46:20AM -0700, Denys Vlasenko wrote:
Finally they replied and asked to rediff it against their git tree. I did that and sent patches back. No reply since then.
And mind you, the patch is not trying to do anything complex, it mostly moves code around, removes 'inline', adds 'const'. What should I think about it?
I'm waiting for an ACK/NAK from Hannes, the maintainer. What should I do?
Matthew Wilcox wrote:
On Wed, Nov 14, 2007 at 12:46:20AM -0700, Denys Vlasenko wrote:
Finally they replied and asked to rediff it against their git tree. I did that and sent patches back. No reply since then.
And mind you, the patch is not trying to do anything complex, it mostly moves code around, removes 'inline', adds 'const'. What should I think about it?
I'm waiting for an ACK/NAK from Hannes, the maintainer. What should I do?
I haven't actually been able to test it here (too busy, sorry). If someone else confirms it does its job then
Acked-by: Hannes Reinecke <hare@suse.de>
Cheers,
Hannes
Denys Vlasenko wrote:
On Wednesday 14 November 2007 00:27, Adrian Bunk wrote:
You missed the following in my email: "we slowly scare them away due to the many bug reports without any reaction."
The problem is that bug reports take time. If you go away from easy things like compile errors then even things like describing what no longer works, ideally producing a scenario where you can reproduce it, and verifying whether it was present in previous kernels can easily take many hours that are spent before the initial bug report.
If the bug report then gets ignored we discourage the person who sent the bug report from doing any work related to the kernel again.
Cannot agree more. I am in a similar position right now. My patch to the aic7xxx driver was submitted four times with not much reaction from scsi guys.
Finally they replied and asked to rediff it against their git tree. I did that and sent patches back. No reply since then.
And mind you, the patch is not trying to do anything complex, it mostly moves code around, removes 'inline', adds 'const'. What should I think about it?
this has nothing to do with the bugs on bugzilla.
you're trying to send a janitor patch. It should be logical that the response to that is not heated or receiving a joyous reception :)
If you have a problem getting your cleanup patch to the driver maintainer, send it to the subsystem maintainer instead, or even the janitors, or even Adrian Bunk who will gladly push it to everyone. Or even to Andrew Morton, who will carry it in -mm for a while and then harass the subsystem maintainer to merge it for you!
Cheers,
Auke
On Tue, 13 Nov 2007, Theodore Tso wrote:
There are two parts to this. One is a Ubuntu development kernel which we can give to large numbers of people to expand our testing pool. But if we don't do a better job of responding to bug reports that would be generated by expanded testing this won't necessarily help us.
The other is an automated set of standard pre-built bisection points so that testers can more easily localize a bug down to a few hundred commits without needing to learn how to use "git bisect" (think Ubuntu users).
I don't see any reason that we couldn't have a tool accessible to Ubuntu users that does a real "git bisect". Git is really good at being scripted by fancy GUIs. It should be easy enough to have a drop down with all of the Ubuntu kernel package releases, where the user selects what works and what doesn't. Then the tool clones a git repository with flags to only get relevant parts, and then leads a bisect run, where it's also configuring, building, and installing the kernels (as a different grub entry), and providing instructions in general. Fundamentally, "git bisect" is a really low-interaction process: you tell it a couple of commits, and then it does stuff, and then you tell it "I tested, and it worked" or "I tested, and it had the problem" or "Something else went wrong", and it asks you something new. Other than that, it just takes time (and a build system hook, which this tool would handle for the kernel). Eventually, it tells you what to report, and you do so.
-Daniel *This .sig left intentionally blank*
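The three-answer loop Daniel describes ("it worked", "it had the problem", "something else went wrong") can be sketched in a few lines. This is a minimal illustration and not how git implements it: it assumes a linear history listed oldest first, with the first entry known good and the last known bad. The names `bisect`, `test`, and the verdict strings are hypothetical. Real "git bisect" walks a commit graph and, when skipped commits block it, reports a range of suspects instead of giving up.

```python
from typing import Callable, List, Optional

GOOD, BAD, SKIP = "good", "bad", "skip"


def bisect(commits: List[str], test: Callable[[str], str]) -> Optional[str]:
    """Return the first bad commit, or None if skipped commits block the search.

    Assumes commits[0] is known good and commits[-1] is known bad.
    """
    lo, hi = 0, len(commits) - 1      # lo: newest known good, hi: oldest known bad
    skipped = set()                   # indices the tester could not judge
    while hi - lo > 1:
        mid = (lo + hi) // 2
        # Untested commits between lo and hi, closest to the midpoint first.
        candidates = sorted(
            (i for i in range(lo + 1, hi) if i not in skipped),
            key=lambda i: abs(i - mid),
        )
        for i in candidates:
            verdict = test(commits[i])
            if verdict == GOOD:
                lo = i
                break
            if verdict == BAD:
                hi = i
                break
            skipped.add(i)            # "something else went wrong": try a neighbour
        else:
            return None               # nothing testable is left between lo and hi
    return commits[hi]


if __name__ == "__main__":
    history = [f"c{i}" for i in range(10)]

    def tester(commit: str) -> str:
        # Hypothetical tester: c4 fails to boot for an unrelated reason,
        # and the regression was introduced in c6.
        if commit == "c4":
            return SKIP
        return GOOD if int(commit[1:]) < 6 else BAD

    print(bisect(history, tester))
```

In a tool like the one proposed, `test` would be the step that builds the kernel, installs it as a separate grub entry, and asks the user how the boot went.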
On Wed, Nov 14, 2007 at 06:23:34PM -0500, Daniel Barkalow wrote:
I don't see any reason that we couldn't have a tool accessible to Ubuntu users that does a real "git bisect". Git is really good at being scripted by fancy GUIs. It should be easy enough to have a drop down with all of the Ubuntu kernel package releases, where the user selects what works and what doesn't.
It's possible users who haven't yet downloaded a git repository have to surmount some obstacles that might cause them to lose interest. First, they have to download some 190 megs of git repository, and if they have a slow link, that can take a while, and then they have to build each kernel, which can take a while. A full kernel build with everything selected can take a good 30 minutes or more, and that's on a fast dual-core machine with 4gigs of memory and 7200rpm disk drives. On a slower, memory limited laptop, doing a single kernel build can take more time than the user has patience; multiply that by 7 or 8 build and test boots, and it starts to get tiresome.
And then on top of that there are the issues about whether there is enough support for dealing with hitting kernel revisions that fail due to other bugs getting merged in during the -rc1 process, etc.
I agree that a tool that automated the bisection process and walked the user through it would be helpful, but I believe it would be possible for us to do better.
- Ted
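The "7 or 8 build and test boots" figure is consistent with halving a range of a few hundred commits. The sketch below is only a rough illustration of that arithmetic, not Ted's own calculation: it assumes a linear range with exactly one first-bad commit and no skipped revisions. The function names are hypothetical, and the 30-minute build time is the figure quoted above.

```python
def bisect_steps(num_commits: int) -> int:
    """Worst-case number of kernels to build and boot in order to single out
    one commit among num_commits candidates, halving the range each time."""
    if num_commits < 1:
        raise ValueError("need at least one candidate commit")
    # ceil(log2(n)) computed with integers, so there is no float rounding.
    return (num_commits - 1).bit_length()


def total_build_minutes(num_commits: int, minutes_per_build: float) -> float:
    """Total local build time if every bisection step needs a full build."""
    return bisect_steps(num_commits) * minutes_per_build


if __name__ == "__main__":
    for candidates in (128, 256):
        steps = bisect_steps(candidates)
        minutes = total_build_minutes(candidates, 30)
        print(f"{candidates} commits: {steps} test boots, {minutes} minutes of building")
```

Under these assumptions 128 candidates need 7 rounds and 256 need 8, so pre-built bisection points that narrow a bug to a few hundred commits would still leave about that many local builds.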
On Thu, 15 Nov 2007, Theodore Tso wrote:
On Wed, Nov 14, 2007 at 06:23:34PM -0500, Daniel Barkalow wrote:
I don't see any reason that we couldn't have a tool accessible to Ubuntu users that does a real "git bisect". Git is really good at being scripted by fancy GUIs. It should be easy enough to have a drop down with all of the Ubuntu kernel package releases, where the user selects what works and what doesn't.
It's possible users who haven't yet downloaded a git repository have to surmount some obstacles that might cause them to lose interest. First, they have to download some 190 megs of git repository, and if they have a slow link, that can take a while, and then they have to build each kernel, which can take a while.
It should be possible for it to clone only the portion that they actually care about based on where the known-good version is. It should also (in theory, anyway) be possible to put off some amount of the download until it's actually going to be relevant.
A full kernel build with everything selected can take a good 30 minutes or more, and that's on a fast dual-core machine with 4gigs of memory and 7200rpm disk drives. On a slower, memory limited laptop, doing a single kernel build can take more time than the user has patience; multiply that by 7 or 8 build and test boots, and it starts to get tiresome.
None of this is going to take as long, even on a slow link and a slow computer, as waiting for a response to a mailing list post. It'd annoy users who are specifically waiting for it, but if the interface is that the user says "kernel package X didn't work but the current kernel does", and it says "I'll let you know when I've got something to test", and the user watches a DVD, and afterward finds a message saying there's something to test, and tries it, and reports how it went, and the process repeats until it narrows it down to a single commit after a couple of days of the user getting occasional responses, it's not that different from asking for help online.
And then on top of that there are the issues about whether there is enough support for dealing with hitting kernel revisions that fail due to other bugs getting merged in during the -rc1 process, etc.
Could have a distro-provided mask of things that aren't worth testing and possibly back-ported fixes for revisions in particular ranges.
I agree that a tool that automated the bisection process and walked the user through it would be helpful, but I believe it would be possible for us to do better.
That would probably help for giving the user something to try right away. I still think that the main cost to the user is the number of times that the user has to stop doing stuff to reboot with a kernel to test, whether the test kernels are available quickly from the distro site, slowly built locally, or slowly as suggested by humans helping online.
-Daniel *This .sig left intentionally blank*
* Mark Lord <liml@rtr.ca> wrote:
Ingo Molnar wrote: ..
This is all QA-101 that _cannot be argued against on a rational basis_, it's just that these sorts of things have been largely ignored for years, in favor of the all-too-easy "open source means many eyeballs and that is our QA" answer, which is a _good_ answer but by far not the most intelligent answer! Today "many eyeballs" is simply not good enough and nature (and other OS projects) will route us around if we don't change.
..
QA-101 and "many eyeballs" are not at all in opposition.
yes, absolutely so - that's why i used the "good" qualifier. "Good is not good enough" calls for additional efforts to make it more efficient, not for the abolition of the many eyeballs concept (which would be absurd). So what i wanted to say is that _sole_ reliance on the large numbers of eyeballs is a fundamental mistake. It's even sometimes used as an excuse to merge questionable stuff. "we'll find any bugs, many eyeballs will make bugs shallow". In reality the many eyeballs are not infinite, nor should they be taken for granted if they are used for bogus things. We have to make sure the eyeballs stay 'many', and we also have to make sure they are not wasted. It's a physical resource that must be intelligently handled. Its positive effects can be easily wasted and we do that today.
for example git-bisect was a godsend. I remember that years ago bisection of a bug was a very laborious task so that it was only used as a final, last-ditch approach for really nasty bugs. Today we can autonomously bisect build bugs via a simple shell command around "git-bisect run", without any human interaction! This freed up testing resources enormously and made bisection one of the _first_ things that are tried when bugs are met. We just need more of this (distros should offer pre-built kernel rpm 'farms' for every important commit point and automated tools for users to easily specify breakage points, without them having to install those kernels individually), and everyone should be aware of the fact that we still suck (we merge too much crap and still don't have good enough tools to de-crappify what we merge) and that we are losing testers.
Ingo
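For a build bug, the script handed to "git bisect run" only has to turn the result of a build into an exit status that git understands: 0 marks the revision good, 125 asks git to skip it, any other status from 1 to 127 marks it bad, and anything else aborts the run. The sketch below is one way such a wrapper might look, not the command Ingo used. The file name `bisect_build.py` and the function names are hypothetical, and the build command is taken from the command line instead of being assumed.

```python
import subprocess
import sys
from typing import List

GOOD = 0      # git bisect run: revision is good
BAD = 1       # any status 1..127 except 125 means bad
ABORT = 255   # statuses outside 0..127 stop the bisection


def bisect_exit_code(build_returncode: int) -> int:
    """Map a build command's return code to a status for git bisect run."""
    if build_returncode == 0:
        return GOOD
    if build_returncode < 0:
        # subprocess reports death by signal as a negative code; the build
        # did not finish, so stop instead of blaming this revision.
        return ABORT
    return BAD


def main(build_cmd: List[str]) -> int:
    try:
        result = subprocess.run(build_cmd)
    except OSError:
        # The build tool could not be started at all: a setup problem,
        # not a property of the revision under test.
        return ABORT
    return bisect_exit_code(result.returncode)


# Only act when a build command was supplied on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv[1:]))
```

After marking one good and one bad revision, it could be started with something like `git bisect run python3 bisect_build.py make -j4`. A script hunting a runtime bug would instead return 125 for revisions that do not build, so that unrelated breakage is skipped.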
Ingo Molnar wrote:
for example git-bisect was a godsend. I remember that years ago bisection of a bug was a very laborious task so that it was only used as a final, last-ditch approach for really nasty bugs. Today we can autonomously bisect build bugs via a simple shell command around "git-bisect run", without any human interaction! This freed up testing resources
..
It's only a godsend for the few people who happen to be kernel developers and who happen to already use git.
It's a 540MByte download over a slow link for everyone else.
-ml
On Tue, Nov 13, 2007 at 12:50:08PM -0500, Mark Lord wrote:
Ingo Molnar wrote:
for example git-bisect was a godsend. I remember that years ago bisection of a bug was a very laborious task so that it was only used as a final, last-ditch approach for really nasty bugs. Today we can autonomously bisect build bugs via a simple shell command around "git-bisect run", without any human interaction! This freed up testing resources
..
It's only a godsend for the few people who happen to be kernel developers
It's also a godsend for users who want a regression they observe fixed.
If you can tell which patch broke it you often turned a very hard to debug problem into a relatively easy fixable problem.
As an example, [1] was an issue a normal user could discover, and bisecting made the difference between "nearly undebuggable" and "easily fixable by reverting a commit".
and who happen to already use git.
As already said in this thread, the required instructions for bisecting are relatively short and simple (assuming the user can build his own kernels).
It's a 540MByte download over a slow link for everyone else.
Not everyone has a slow connection.
For me, the speed of cloning a tree from git.kernel.org is completely cpu bound and limited by the speed of the 1.8 GHz Athlon in my computer...
But if there is a real life problem like people with extremely slow and expensive internet connections not being able to bisect bugs these problems should be named and fixed (e.g. by sending CDs).
-ml
cu Adrian
[1] http://lkml.org/lkml/2007/11/12/154
Adrian Bunk wrote:
On Tue, Nov 13, 2007 at 12:50:08PM -0500, Mark Lord wrote:
Ingo Molnar wrote:
for example git-bisect was a godsend. I remember that years ago bisection of a bug was a very laborious task so that it was only used as a final, last-ditch approach for really nasty bugs. Today we can autonomously bisect build bugs via a simple shell command around "git-bisect run", without any human interaction! This freed up testing resources
..
It's only a godsend for the few people who happen to be kernel developers
It's also a godsend for users who want a regression they observe fixed.
If you can tell which patch broke it you often turned a very hard to debug problem into a relatively easy fixable problem.
..
Oh yes, definitely. When that user happens to be a kernel dev + git user, it saves the *fool who broke it* a hell of a lot of time, because they can slough it off onto the poor bloke who notices it.
Mind you, no arguing that this is effective when that poor bloke has a day free to download the git-tree and build/reboot a dozen times.
On Tue, Nov 13, 2007 at 01:18:43PM -0500, Mark Lord wrote:
Adrian Bunk wrote:
On Tue, Nov 13, 2007 at 12:50:08PM -0500, Mark Lord wrote:
Ingo Molnar wrote:
for example git-bisect was a godsend. I remember that years ago bisection of a bug was a very laborious task so that it was only used as a final, last-ditch approach for really nasty bugs. Today we can autonomously bisect build bugs via a simple shell command around "git-bisect run", without any human interaction! This freed up testing resources
..
It's only a godsend for the few people who happen to be kernel developers
It's also a godsend for users who want a regression they observe fixed.
If you can tell which patch broke it you often turned a very hard to debug problem into a relatively easy fixable problem.
..
Oh yes, definitely. When that user happens to be a kernel dev + git user, it saves the *fool who broke it* a hell of a lot of time, because they can slough it off onto the poor bloke who notices it.
"fool who broke it" are hard words. Bugs are part of software development, so you'd have to name everyone who develops software a fool.
But the main point is that often you don't know who broke it until you know which commit broke it.
Mind you, no arguing that this is effective when that poor bloke has a day free to download the git-tree and build/reboot a dozen times.
I did bisecting myself, and I know that it costs time and work.
But the first point is the above one that it makes otherwise nearly undebuggable problems debuggable and fixable.
Another point is that it shifts the work from the few experienced developers to the many users. Users (and voluntary testers) we have many, but developer time for debugging bug reports is a quite scarce resource.
And why "poor bloke"? Bisecting takes time, but that's not different from e.g. writing code or cleaning up code or going through bug reports.
cu Adrian
Adrian Bunk wrote: ...
I did bisecting myself, and I know that it costs time and work.
But the first point is the above one that it makes otherwise nearly undebuggable problems debuggable and fixable.
..
Definitely useful, no question.
But the problem is now that kernel devs are addicted to it, many won't even consider resolving a problem any other way.
That's not "maintaining" (or supporting) one's code.
And when a "maintainer" is too busy to find/fix their own bugs, that could be a sign that they've bitten off too big of a chunk of the kernel, and it's time for them to distribute code maintainership.
Cheers
On Tue, Nov 13, 2007 at 01:47:10PM -0500, Mark Lord wrote:
Adrian Bunk wrote: ...
I did bisecting myself, and I know that it costs time and work.
But the first point is the above one that it makes otherwise nearly undebuggable problems debuggable and fixable.
..
Definitely useful, no question.
But the problem is now that kernel devs are addicted to it, many won't even consider resolving a problem any other way.
That's not "maintaining" (or supporting) one's code.
What you replaced with two dots contained the answer to this:
Another point is that it shifts the work from the few experienced developers to the many users. Users (and voluntary testers) we have many, but developer time for debugging bug reports is a quite scarce resource.
And when a "maintainer" is too busy to find/fix their own bugs, that could be a sign that they've bitten off too big of a chunk of the kernel, and it's time for them to distribute code maintainership.
The problem is: Maintainers don't grow on trees.
You need people who are both technically capable and willing to spend time on the non-sexy task of debugging problems.
Where do you plan to find them?
If you don't believe me, please find a maintainer for the currently unmaintained parallel port support.
Or if you want a harder task, find a maintainer for the floppy driver...
Cheers
cu Adrian
Adrian Bunk wrote:
On Tue, Nov 13, 2007 at 01:47:10PM -0500, Mark Lord wrote:
Adrian Bunk wrote: ...
I did bisecting myself, and I know that it costs time and work.
But the first point is the above one that it makes otherwise nearly undebuggable problems debuggable and fixable.
..
Definitely useful, no question.
But the problem is now that kernel devs are addicted to it, many won't even consider resolving a problem any other way.
That's not "maintaining" (or supporting) one's code.
What you replaced with two dots contained the answer to this:
Another point is that it shifts the work from the few experienced developers to the many users. Users (and voluntary testers) we have many, but developer time for debugging bug reports is a quite scarce resource.
And when a "maintainer" is too busy to find/fix their own bugs, that could be a sign that they've bitten off too big of a chunk of the kernel, and it's time for them to distribute code maintainership.
The problem is: Maintainers don't grow on trees.
You need people who are both technically capable and willing to spend time on the non-sexy task of debugging problems.
Where do you plan to find them?
If you don't believe me, please find a maintainer for the currently unmaintained parallel port support.
Or if you want a harder task, find a maintainer for the floppy driver...
..
Again, the problem is:
But the problem is now that kernel devs are addicted to it, many won't even consider resolving a problem any other way.
And that's simply not good enough.
On Tue, Nov 13, 2007 at 02:12:57PM -0500, Mark Lord wrote:
Adrian Bunk wrote:
On Tue, Nov 13, 2007 at 01:47:10PM -0500, Mark Lord wrote:
Adrian Bunk wrote: ...
I did bisecting myself, and I know that it costs time and work.
But the first point is the above one that it makes otherwise nearly undebuggable problems debuggable and fixable.
..
Definitely useful, no question.
But the problem is now that kernel devs are addicted to it, many won't even consider resolving a problem any other way.
That's not "maintaining" (or supporting) one's code.
What you replaced with two dots contained the answer to this:
Another point is that it shifts the work from the few experienced developers to the many users. Users (and voluntary testers) we have many, but developer time for debugging bug reports is a quite scarce resource.
And when a "maintainer" is too busy to find/fix their own bugs, that could be a sign that they've bitten off too big of a chunk of the kernel, and it's time for them to distribute code maintainership.
The problem is: Maintainers don't grow on trees.
You need people who are both technically capable and willing to spend time on the non-sexy task of debugging problems.
Where do you plan to find them?
If you don't believe me, please find a maintainer for the currently unmaintained parallel port support.
Or if you want a harder task, find a maintainer for the floppy driver...
..
Again, the problem is:
But the problem is now that kernel devs are addicted to it, many won't even consider resolving a problem any other way.
And that's simply not good enough.
There is this silly limit that no one can work more than 168 hours per week on the Linux kernel, and some kernel developers seem to take the liberty of spending even less time on kernel development...
Considering our problems to cope with the amount of incoming bug reports, everything that would require a kernel developer to spend more time for getting a bug fixed would be a horrible mistake.
cu Adrian
On Tue, Nov 13, 2007 at 08:30:35PM +0100, Adrian Bunk wrote:
There is this silly limit that no one can work more than 168 hours per week on the Linux kernel, and some kernel developers seem to take the liberty of spending even less time on kernel development...
That limit of 168 hours applies all around the world to everyone. Moreover, not all kernel developers are employed to hack on the kernel for 168 hours a week.
For me, personally, that figure is in reality about 24 hours a week. Yes, just 24. The rest of the time (like *now*) is time I'm volunteering because I happen to be reading my email...
... and happen to be wasting replying to discussions like this rather than reading that message which has just arrived on the ARM kernel mailing list from someone having problems using copy_from_user() with a kernel pointer.
So, please, stop this idea that somehow kernel developers can somehow spend infinite amounts of time solving lots and lots of bugs.
On Tue, Nov 13, 2007 at 07:46:49PM +0000, Russell King wrote:
On Tue, Nov 13, 2007 at 08:30:35PM +0100, Adrian Bunk wrote:
There is this silly limit that no one can work more than 168 hours per week on the Linux kernel, and some kernel developers seem to take the liberty of spending even less time on kernel development...
That limit of 168 hours applies all around the world to everyone. Moreover, not all kernel developers are employed to hack on the kernel for 168 hours a week.
For me, personally, that figure is in reality about 24 hours a week. Yes, just 24. The rest of the time (like *now*) is time I'm volunteering because I happen to be reading my email...
... and happen to be wasting replying to discussions like this rather than reading that message which has just arrived on the ARM kernel mailing list from someone having problems using copy_from_user() with a kernel pointer.
So, please, drop this idea that kernel developers can somehow spend infinite amounts of time solving lots and lots of bugs.
Sorry, that happens when using irony in a non-native language...
What I wanted to express: No one has unlimited time for kernel development.
Russell King
cu Adrian
Adrian Bunk wrote:
On Tue, Nov 13, 2007 at 01:47:10PM -0500, Mark Lord wrote:
Adrian Bunk wrote:
..
Another point is that it shifts the work from the few experienced developers to the many users. Users (and voluntary testers) we have many, but developer time for debugging bug reports is a quite scarce resource.
And when a "maintainer" is too busy to find/fix their own bugs, that could be a sign that they've bitten off too big of a chunk of the kernel, and it's time for them to distribute code maintainership.
The problem is: Maintainers don't grow on trees.
..
Hey, if somebody has time to break things, then they damn well ought to be able to make time to fix them again. And the best developers here on LKML do just that (fix what they break).
You broke it, you fix it. A simple rule.
Translation for the particularly daft:
If you've been making significant updates to a driver/subsystem, and people are reporting that it is now broken for them, then it's your job to make it right. The reporters can help, and many may even git-bisect or send patches.
But you cannot *expect* or *insist* upon them doing your job.
On Tue, Nov 13, 2007 at 02:26:05PM -0500, Mark Lord wrote:
Adrian Bunk wrote:
On Tue, Nov 13, 2007 at 01:47:10PM -0500, Mark Lord wrote:
Adrian Bunk wrote:
..
Another point is that it shifts the work from the few experienced developers to the many users. Users (and voluntary testers) we have many, but developer time for debugging bug reports is a quite scarce resource.
And when a "maintainer" is too busy to find/fix their own bugs, that could be a sign that they've bitten off too big of a chunk of the kernel, and it's time for them to distribute code maintainership.
The problem is: Maintainers don't grow on trees.
..
Hey, if somebody has time to break things, then they damn well ought to be able to make time to fix them again. And the best developers here on LKML do just that (fix what they break).
You broke it, you fix it. A simple rule.
Translation for the particularly daft:
If you've been making significant updates to a driver/subsystem, and people are reporting that it is now broken for them,
What are "significant updates"?
Sometimes one person makes one small patch and this patch contains a typo.
then it's your job to make it right.
We have some open drivers/ata/ regressions.
I see some person named "Mark Lord" being responsible for 4 commits.
What punishment do you plan for him if 2.6.24 ships with any libata regressions?
Let George W. Bush wrongly accuse him of possessing weapons of mass destruction and invade Canada?
The reporters can help, and many may even git-bisect or send patches. But you cannot *expect* or *insist* upon them doing your job.
Bullshit.
Bug fixing is not about finding someone to blame, it's about getting the bug fixed.
The bug reporter is the person who can reproduce the problem, and if it's a regression then bisecting is the natural way of getting nearer at getting it fixed.
cu Adrian
Adrian Bunk wrote:
On Tue, Nov 13, 2007 at 02:26:05PM -0500, Mark Lord wrote:
..
If you've been making significant updates to a driver/subsystem, and people are reporting that it is now broken for them,
What are "significant updates"?
Sometimes one person makes one small patch and this patch contains a typo.
..
Then that person should double check their changes against the problems reported, and re-convince themselves that the breakage wasn't from those. Simple.
then it's your job to make it right.
We have some open drivers/ata/ regressions.
..
Yup, but they're more specific than just that entire subsystem, and the maintainers are actively pursuing the problems. Exactly what should be happening.
I see some person named "Mark Lord" being responsible for 4 commits.
What punishment do you plan for him if 2.6.24 ships with any libata regressions?
..
If the code I'm touching breaks, then I'll fix it ASAP, exactly what the users of that code might expect.
The reporters can help, and many may even git-bisect or send patches. But you cannot *expect* or *insist* upon them doing your job.
Bullshit.
Bug fixing is not about finding someone to blame, it's about getting the bug fixed.
..
It's not about blame, it's about paying attention to breakages in code that a person claims to be supporting, and then doing their best to resolve the issues.
Again, if one has the time to actively write/modify code such that something breaks, then that person should also make time to fix the breakages.
The bug reporter is the person who can reproduce the problem, and if it's a regression then bisecting is the natural way of getting nearer at getting it fixed.
..
For the third time, no disagreement here. git-bisect can help in many cases, but not in all cases. And it requires a great time commitment from somebody whose system used to work and now doesn't work. The person who broke it has a fair bit of responsibility there, too.
cheers
On Tue, Nov 13, 2007 at 03:13:46PM -0500, Mark Lord wrote:
Adrian Bunk wrote:
On Tue, Nov 13, 2007 at 02:26:05PM -0500, Mark Lord wrote:
..
If you've been making significant updates to a driver/subsystem, and people are reporting that it is now broken for them,
What are "significant updates"?
Sometimes one person makes one small patch and this patch contains a typo.
..
Then that person should double check their changes against the problems reported, and re-convince themselves that the breakage wasn't from those. Simple.
Simple?
Everything you have in mind with "should double check their changes" is simply not realistic with dozens of known unfixed regressions within more than half a million changed or new lines of code written by more than 800 people - all numbers only counted since 2.6.23.
...
The reporters can help, and many may even git-bisect or send patches. But you cannot *expect* or *insist* upon them doing your job.
Bullshit.
Bug fixing is not about finding someone to blame, it's about getting the bug fixed.
..
It's not about blame, it's about paying attention to breakages in code that a person claims to be supporting, and then doing their best to resolve the issues.
Maintainers are just humans with limited time.
You were the one who suggested to "distribute code maintainership", so you should explain how to find the additional maintainers.
Again, if one has the time to actively write/modify code such that something breaks, then that person should also make time to fix the breakages.
code writer != subsystem maintainer
And git-bisect is the tool that tells you who broke it.
The bug reporter is the person who can reproduce the problem, and if it's a regression then bisecting is the natural way of getting nearer at getting it fixed.
..
For the third time, no disagreement here. git-bisect can help in many cases, but not in all cases. And it requires a great time commitment from somebody whose system used to work and now doesn't work. The person who broke it has a fair bit of responsibility there, too.
git-bisect can help only for regressions, and it can help for most regressions.
And you shouldn't try to make a problem out of something that isn't a problem:
Bug submitters are either volunteers who test -rc or even -git or -mm kernels for finding bugs or people who want a problem they experience fixed.
In both cases the submitters are usually willing to invest some time for helping to get the bug fixed.
cheers
cu Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed
On Tue, 13 Nov 2007 19:52:17 -0500 Chuck Ebbert cebbert@redhat.com wrote:
On 11/13/2007 04:12 PM, Alan Cox wrote:
Bug fixing is not about finding someone to blame, it's about getting the bug fixed.
Partly - it's also about understanding why the bug occurred and making it not happen again.
Very few people think about that part.
Why does the kernel have very few useful tests? Lack of interest? resources? expertise? Ideally each new feature would just be a small add on to an existing test.
Unlike developing new features, which seems to scale well with more developers, bug fixing seems to be a scarcity process. There often seem to be very few people who understand the problem well enough or have the necessary hardware to reproduce and fix the problem.
Recent changes like tickless and scheduler rework were well thought out and caused very little impact to 90% of the users. The problem is the 10% who do have problems. Worse, the developers often only hear about a small sample of those.
On Tue, 13 Nov 2007 17:11:36 -0800 Stephen Hemminger shemminger@linux-foundation.org wrote:
On Tue, 13 Nov 2007 19:52:17 -0500 Chuck Ebbert cebbert@redhat.com wrote:
On 11/13/2007 04:12 PM, Alan Cox wrote:
Bug fixing is not about finding someone to blame, it's about getting the bug fixed.
Partly - it's also about understanding why the bug occurred and making it not happen again.
Very few people think about that part.
Why does the kernel have very few useful tests?
Tests would of course be nice, but they aren't very useful(!)
Looking at this list which Natalie has generated I see around thirty which are dependent on the right hardware and ten which are not. This ratio is typical, I think. In fact I'd say that more than 75% of reported bugs are dependent on hardware.
So the best test of all for the kernel is "run it on a different machine". This is why we are sooooo dependent upon our volunteer testers/reporters to be able to do kernel development.
Lack of interest? resources? expertise? Ideally each new feature would just be a small add on to an existing test.
Sure. For system-call-visible features it would be good to do that.
But this tends not to be where bugs get exposed. Because the original developer can 100% exercise such code. That isn't the case with driver/arch/platform changes.
Unlike developing new features, which seems to scale well with more developers, bug fixing seems to be a scarcity process. There often seem to be very few people who understand the problem well enough or have the necessary hardware to reproduce and fix the problem.
We're 100% dead if "having the hardware" is a prerequisite to fixing a bug. The terminal state there is that the kernel runs on about 200 machines worldwide. We have to work with reporters via email to fix these sorts of things. As we of course do.
Recent changes like tickless and scheduler rework were well thought out and caused very little impact to 90% of the users. The problem is the 10% who do have problems. Worse, the developers often only hear about a small sample of those.
Yes. An unknown number of people just shrug and go back to an old kernel.
From: Mark Lord liml@rtr.ca Date: Tue, 13 Nov 2007 13:18:43 -0500
Mind you, no arguing that this is effective when that poor bloke has a day free to download the git-tree and build/reboot a dozen times.
Like the internet, this time spent is beneficial because it's pushing the work out to the end nodes. In fact git bisect is an awesome example of the end node principle in action for software development and QA.
For the end-user wanting their bug fixed and the developer it's a win win situation because the reporter is actually able to do something proactive which will help get the bug they want fixed faster.
So I don't agree with framing this person as a "poor bloke". Our testers are more empowered than ever to lead the process towards a fix.
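The bisection workflow under discussion can be tried on a throwaway repository without touching a kernel tree. The sketch below is an illustration only: the repository, the file `state.txt`, the commit messages and the `BROKEN` marker are hypothetical stand-ins for a real regression, and it uses the `git bisect run` form, which replaces the manual good/bad verdict with a command's exit status.

```shell
#!/bin/sh
# Toy demonstration of an automated bisect; all names here are made up.
set -eu
export GIT_AUTHOR_NAME=tester GIT_AUTHOR_EMAIL=tester@example.com
export GIT_COMMITTER_NAME=tester GIT_COMMITTER_EMAIL=tester@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q

# Ten commits; the seventh one introduces the "regression".
i=1
while [ "$i" -le 10 ]; do
    if [ "$i" -eq 7 ]; then
        echo "BROKEN" >> state.txt
    fi
    echo "change $i" >> state.txt
    git add state.txt
    git commit -q -m "commit $i"
    if [ "$i" -eq 1 ]; then good=$(git rev-parse HEAD); fi
    if [ "$i" -eq 7 ]; then culprit=$(git rev-parse HEAD); fi
    i=$((i + 1))
done

# HEAD is known bad, the first commit is known good.
git bisect start HEAD "$good"

# Exit status 0 marks a revision good, 1 marks it bad.
git bisect run sh -c '! grep -q BROKEN state.txt'

# After a completed bisect, refs/bisect/bad names the first bad commit.
found=$(git rev-parse refs/bisect/bad)
git bisect reset
echo "first bad commit: $found"
```

For a real kernel regression the test command would have to build, boot and check the kernel, which is where the time cost debated in this thread comes from.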
On Tue, 2007-11-13 at 12:50 -0500, Mark Lord wrote:
Ingo Molnar wrote:
for example git-bisect was godsent. I remember that years ago bisection of a bug was a very laborious task so that it was only used as a final, last-ditch approach for really nasty bugs. Today we can autonomously bisect build bugs via a simple shell command around "git-bisect run", without any human interaction! This freed up testing resources
...
It's only a godsend for the few people who happen to be kernel developers and who happen to already use git.
It's a 540MByte download over a slow link for everyone else.
Oh, come on. Leeching CDs is so yesterday. These days some distributions don't even offer CDs anymore in favour of DVDs.
I'd be amazed if a lot of the testers would still be on slownet; it's impossible to keep up with the latest distros without broadband.
On Tue, Nov 13, 2007 at 12:50:08PM -0500, Mark Lord wrote:
It's a 540MByte download over a slow link for everyone else.
Where do you get this number from?
$ du -sh .git/objects/pack/
249M    .git/objects/pack/
$ du -sh .git/objects/
253M    .git/objects/
ie about half what you claim.
Matthew Wilcox wrote:
On Tue, Nov 13, 2007 at 12:50:08PM -0500, Mark Lord wrote:
It's a 540MByte download over a slow link for everyone else.
Where do you get this number from?
$ du -sh .git/objects/pack/
249M    .git/objects/pack/
$ du -sh .git/objects/
253M    .git/objects/
ie about half what you claim.
..
No, it's from earlier in this very thread:
Adrian Bunk wrote:
The small instruction below is enough for everyone who is able to build his own kernel to do a git bisect.
..
<-- snip -->
# install git
# clone Linus' tree:
git clone \
    git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
..
mkdir t
cd t
git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
(wait half an hour)
/usr/bin/du -s linux-2.6
522732 linux-2.6
On Tue, Nov 13, 2007 at 01:43:53PM -0500, Mark Lord wrote:
Matthew Wilcox wrote:
ie about half what you claim.
..
No, it's from earlier in this very thread:
Adrian Bunk wrote:
git clone \
    git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
..
mkdir t
cd t
git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
(wait half an hour)
/usr/bin/du -s linux-2.6
522732 linux-2.6
You're assuming that everything in linux-2.6 was downloaded; that's not true. Everything in linux-2.6/.git was downloaded; but then you do a checkout which happens to approximately double the size of the linux-2.6 directory. If you do git-clone -n, you'll get a closer estimate to the size of the download.
I suppose git-clone should grow a -v option that it could pass to rsync to let us find out how many bytes are actually transferred, but i'm happy to go with 250MB as a close estimate to the amount of data to xfer.
When you compare it to the 60MB tarballs that are published, it's really not that bad.
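The distinction between what is transferred and what ends up on disk can be seen on a small scale. This is a sketch on a hypothetical local repository (`src`, `data.txt` and the clone directory names are invented for the example), comparing a clone made with `-n` (no checkout) against an ordinary clone.

```shell
#!/bin/sh
# Compare a no-checkout clone with a normal clone of a toy repository.
set -eu
export GIT_AUTHOR_NAME=tester GIT_AUTHOR_EMAIL=tester@example.com
export GIT_COMMITTER_NAME=tester GIT_COMMITTER_EMAIL=tester@example.com

work=$(mktemp -d)
cd "$work"

# A small source repository with one reasonably sized file.
git init -q src
(
    cd src
    awk 'BEGIN { for (i = 0; i < 20000; i++) print "line", i }' > data.txt
    git add data.txt
    git commit -q -m "initial import"
)

git clone -q -n src no-checkout     # object database only
git clone -q src with-checkout      # object database plus working tree

# Count regular files outside the .git directory.
count_files() {
    find "$1" -path "$1/.git" -prune -o -type f -print | wc -l | tr -d ' '
}
files_no_checkout=$(count_files no-checkout)
files_with_checkout=$(count_files with-checkout)

# Disk usage in kilobytes; the second figure includes the checked-out files.
du -sk no-checkout with-checkout
echo "working-tree files: $files_no_checkout vs $files_with_checkout"
```

Only the contents of `.git` correspond to transferred data, so `du` on the no-checkout clone is the closer estimate of download size.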
Matthew Wilcox wrote:
On Tue, Nov 13, 2007 at 01:43:53PM -0500, Mark Lord wrote:
mkdir t
cd t
git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
(wait half an hour)
/usr/bin/du -s linux-2.6
522732 linux-2.6
You're assuming that everything in linux-2.6 was downloaded; that's not true. Everything in linux-2.6/.git was downloaded; but then you do a checkout which happens to approximately double the size of the linux-2.6 directory.
..
Ah, I wondered why it took only half an hour to download.
..
When you compare it to the 60MB tarballs that are published, it's really not that bad.
..
The tarballs I download are only 45MB.
Cheers
On Tuesday, 13 of November 2007, Mark Lord wrote:
Matthew Wilcox wrote:
On Tue, Nov 13, 2007 at 01:43:53PM -0500, Mark Lord wrote:
mkdir t
cd t
git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
(wait half an hour)
/usr/bin/du -s linux-2.6
522732 linux-2.6
You're assuming that everything in linux-2.6 was downloaded; that's not true. Everything in linux-2.6/.git was downloaded; but then you do a checkout which happens to approximately double the size of the linux-2.6 directory.
..
Ah, I wondered why it took only half an hour to download.
..
When you compare it to the 60MB tarballs that are published, it's really not that bad.
..
The tarballs I download are only 45MB.
You clone the git repo once. Afterwards, you only update it, and that usually takes little time and little effort.
Greetings, Rafael
* Mark Lord liml@rtr.ca wrote:
You're assuming that everything in linux-2.6 was downloaded; that's not true. Everything in linux-2.6/.git was downloaded; but then you do a checkout which happens to approximately double the size of the linux-2.6 directory.
..
Ah, I wondered why it took only half an hour to download.
and you can get even lower than the 260MB by downloading a shallow clone of v2.6.23 and then populating the git tree from that point on. (see the --depth parameter of git-clone) [because most of the time you want to bisect back to the last stable release, not back to 2 years of git history.]
Ingo
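A minimal sketch of what `--depth` does, run against a hypothetical local repository rather than kernel.org (the `src` and `shallow` names and the five toy commits are assumptions for the example). A `file://` URL is used because git ignores `--depth` for plain local paths.

```shell
#!/bin/sh
# Shallow clone of a toy repository: only the most recent commit is fetched.
set -eu
export GIT_AUTHOR_NAME=tester GIT_AUTHOR_EMAIL=tester@example.com
export GIT_COMMITTER_NAME=tester GIT_COMMITTER_EMAIL=tester@example.com

work=$(mktemp -d)
cd "$work"

git init -q src
(
    cd src
    i=1
    while [ "$i" -le 5 ]; do
        echo "change $i" >> file.txt
        git add file.txt
        git commit -q -m "commit $i"
        i=$((i + 1))
    done
)

# --depth 1 truncates history to the tip commit.
git clone -q --depth 1 "file://$work/src" shallow

full_count=$(git -C src rev-list --count HEAD)
shallow_count=$(git -C shallow rev-list --count HEAD)
echo "commits in full history: $full_count, in shallow clone: $shallow_count"
```

A shallow clone only helps bisection if the last known-good release is still inside the fetched history, so the depth has to be chosen with that in mind.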
Ingo Molnar wrote:
* Mark Lord liml@rtr.ca wrote:
You're assuming that everything in linux-2.6 was downloaded; that's not true. Everything in linux-2.6/.git was downloaded; but then you do a checkout which happens to approximately double the size of the linux-2.6 directory.
..
Ah, I wondered why it took only half an hour to download.
and you can get even lower than the 260MB by downloading a shallow clone of v2.6.23 and then populating the git tree from that point on. (see the --depth parameter of git-clone) [because most of the time you want to bisect back to the last stable release, not back to 2 years of git history.]
When creating additional git trees (Linville's wireless-2.6 tree, for example) for driver development, you can save a lot of download bandwidth by using the --reference parameter of git-clone.
Larry
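The `--reference` approach can be sketched the same way on a toy setup. The repository names (`src`, `first`, `second`) are hypothetical; in the scenario described above, `first` would be an existing clone of Linus' tree and the URL would point at the additional tree.

```shell
#!/bin/sh
# Clone a second tree while borrowing objects from an existing local clone.
set -eu
export GIT_AUTHOR_NAME=tester GIT_AUTHOR_EMAIL=tester@example.com
export GIT_COMMITTER_NAME=tester GIT_COMMITTER_EMAIL=tester@example.com

work=$(mktemp -d)
cd "$work"

git init -q src
(
    cd src
    echo "hello" > file.txt
    git add file.txt
    git commit -q -m "initial commit"
)

# An ordinary clone that already exists on this machine.
git clone -q src first

# The second clone looks up objects in "first" before fetching them.
git clone -q --reference "$work/first" "file://$work/src" second

# git records the borrowed object store in the alternates file.
alternates=second/.git/objects/info/alternates
cat "$alternates"
```

The second clone depends on the reference repository staying in place, since objects it borrows are not copied.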
On Tue 2007-11-13 12:50:08, Mark Lord wrote:
Ingo Molnar wrote:
for example git-bisect was godsent. I remember that years ago bisection of a bug was a very laborious task so that it was only used as a final, last-ditch approach for really nasty bugs. Today we can autonomously bisect build bugs via a simple shell command around "git-bisect run", without any human interaction! This freed up testing resources
..
It's only a godsend for the few people who happen to be kernel developers and who happen to already use git.
It's a 540MByte download over a slow link for everyone else.
Hmmm, clean-cg is 7.7G on my machine, and yes I tried git-prune-packed. What am I doing wrong?
Pavel
On 18-11-07 13:44, Pavel Machek wrote:
On Tue 2007-11-13 12:50:08, Mark Lord wrote:
It's a 540MByte download over a slow link for everyone else.
Hmmm, clean-cg is 7.7G on my machine, and yes I tried git-prune-packed. What am I doing wrong?
clean-cg? But failure to run "git repack -a -d" every once in a while?
Rene.
On Sun, 2007-11-18 at 13:58 +0100, Rene Herman wrote:
On 18-11-07 13:44, Pavel Machek wrote:
On Tue 2007-11-13 12:50:08, Mark Lord wrote:
It's a 540MByte download over a slow link for everyone else.
Hmmm, clean-cg is 7.7G on my machine, and yes I tried git-prune-packed. What am I doing wrong?
clean-cg? But failure to run "git repack -a -d" every once in a while?
Actually, the best command is
git gc
which does a repack (into a single pack file rather than an incremental one), and then removes all the objects now in the pack. If, like me, you work on temporary branches which you keep rebasing, you can add a --prune to gc which will erase all unreferenced objects as it packs (use this one with care: I usually never use it, but run a git prune -n just to see what would be removed, and then run git prune separately if it looks OK).
James
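The effect of `git gc` is easy to observe on a small scratch repository. This is a sketch under the assumption of a hypothetical five-commit repository; the numbers it prints are tiny compared with a kernel tree, but the mechanism is the same: loose objects are collected into a pack.

```shell
#!/bin/sh
# Watch loose objects turn into a pack file after "git gc".
set -eu
export GIT_AUTHOR_NAME=tester GIT_AUTHOR_EMAIL=tester@example.com
export GIT_COMMITTER_NAME=tester GIT_COMMITTER_EMAIL=tester@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q

i=1
while [ "$i" -le 5 ]; do
    echo "change $i" >> file.txt
    git add file.txt
    git commit -q -m "commit $i"
    i=$((i + 1))
done

# "git count-objects" prints "<n> objects, <k> kilobytes" for loose objects.
loose_before=$(git count-objects | cut -d' ' -f1)

git gc -q

loose_after=$(git count-objects | cut -d' ' -f1)
packs=$(ls .git/objects/pack/*.pack | wc -l | tr -d ' ')

# Dry run: list unreachable objects that "git prune" would delete.
git prune -n

echo "loose objects: $loose_before before, $loose_after after; packs: $packs"
```

Running `git prune -n` first, as suggested above, shows what a real prune would remove before anything is deleted.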
On 18-11-07 15:35, James Bottomley wrote:
clean-cg? But failure to run "git repack -a -d" every once in a while?
Actually, the best command is
git gc
which does a repack (into a single pack file rather than an incremental one), and then removes all the objects now in the pack. If, like me, you work on temporary branches which you keep rebasing, you can add a --prune to gc which will erase all unreferenced objects as it packs (use this one with care: I usually never use it, but run a git prune -n just to see what would be removed, and then run git prune separately if it looks OK).
Thanks for the comment. That did indeed shave a few extra bytes off my repo, which was already packed with "repack -a -d".
Rene.
* Pavel Machek pavel@ucw.cz wrote:
On Tue 2007-11-13 12:50:08, Mark Lord wrote:
Ingo Molnar wrote:
for example git-bisect was godsent. I remember that years ago bisection of a bug was a very laborious task so that it was only used as a final, last-ditch approach for really nasty bugs. Today we can autonomously bisect build bugs via a simple shell command around "git-bisect run", without any human interaction! This freed up testing resources
..
It's only a godsend for the few people who happen to be kernel developers and who happen to already use git.
It's a 540MByte download over a slow link for everyone else.
Hmmm, clean-cg is 7.7G on my machine, and yes I tried git-prune-packed. What am I doing wrong?
"git-repack -a -d" gives me ~220 MB:
$ du -s .git
222064 .git
anyone who can download a 43 MB tar.bz2 tarball for a kernel release should be able to afford a _one time_ download size of 250 MB (the size of the current kernel.org git repository). If not, burning a CD or DVD and carrying it home ought to do the trick. Git is very bandwidth-efficient after that point - lots of people behind narrow pipes are using it - it's just the initial clone that takes time. And given all the history and metadata that the git repository carries (full changelogs, annotations, etc.) it's a no-brainer that kernel developers should be using it.
(and you can shrink the 250 MB further down by using shallow clones, etc.)
yes, some people complained when distros stopped doing floppy installs. Some people complained when distros stopped doing CD installs. Yes, i've myself done a 250+ MB download over a 56 kbit modem in the past, and while it indeed took overnight to finish, it's very much doable. It's not really qualitatively different from the 1.5 hours a kernel tar.bz2 took to download.
Ingo
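As a rough cross-check of the download times quoted above, the arithmetic can be written out. This is a back-of-the-envelope sketch that assumes decimal megabytes, a link that sustains its nominal 56 kbit/s, and no protocol overhead or compression; real modem throughput differs, so the results are only indicative.

```shell
#!/bin/sh
# Ideal transfer time over a 56 kbit/s link (assumptions: 1 MB = 10^6 bytes,
# full nominal line rate, no protocol overhead, no compression).
set -eu

link_kbit=56

# seconds = megabytes * 8 bits/byte * 1000 kbit/Mbit / link rate in kbit/s
seconds_for() {
    echo $(( $1 * 8 * 1000 / link_kbit ))
}

clone_s=$(seconds_for 250)      # one-time git clone
tarball_s=$(seconds_for 43)     # one release tarball

echo "250 MB clone:  $clone_s s (about $((clone_s / 60)) minutes)"
echo "43 MB tarball: $tarball_s s (about $((tarball_s / 60)) minutes)"
```

Under these assumptions the clone comes to roughly ten hours and the tarball to a bit under two, which matches the "overnight" versus "1.5 hours" orders of magnitude mentioned in the message.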
On Sun, Nov 18, 2007 at 03:56:11PM +0100, Ingo Molnar wrote:
* Pavel Machek pavel@ucw.cz wrote:
On Tue 2007-11-13 12:50:08, Mark Lord wrote:
Ingo Molnar wrote:
for example git-bisect was godsent. I remember that years ago bisection of a bug was a very laborious task so that it was only used as a final, last-ditch approach for really nasty bugs. Today we can autonomously bisect build bugs via a simple shell command around "git-bisect run", without any human interaction! This freed up testing resources
..
It's only a godsend for the few people who happen to be kernel developers and who happen to already use git.
It's a 540MByte download over a slow link for everyone else.
Hmmm, clean-cg is 7.7G on my machine, and yes I tried git-prune-packed. What am I doing wrong?
"git-repack -a -d" gives me ~220 MB:
$ du -s .git
222064 .git
anyone who can download a 43 MB tar.bz2 tarball for a kernel release should be able to afford a _one time_ download size of 250 MB (the size of the current kernel.org git repository). If not, burning a CD or DVD and carrying it home ought to do the trick. Git is very bandwidth-efficient after that point - lots of people behind narrow pipes are using it - it's just the initial clone that takes time. And given all the history and metadata that the git repository carries (full changelogs, annotations, etc.) it's a no-brainer that kernel developers should be using it.
(and you can shrink the 250 MB further down by using shallow clones, etc.)
yes, some people complained when distros stopped doing floppy installs. Some people complained when distros stopped doing CD installs. Yes, i've myself done a 250+ MB download over a 56 kbit modem in the past, and while it indeed took overnight to finish, it's very much doable. It's not really qualitatively different from the 1.5 hours a kernel tar.bz2 took to download.
Probably, once in a while, we should put a complete tree in tar.bz2 format on kernel.org. It would help a lot of people behind small pipes. I have been encountering problems with git-clone when the link is unstable. After the smallest error, it erases everything and you have to retry from the start, which is quite frustrating and expensive.
At least, downloading a tar.bz2 with FTP would be easier and a lot more reliable. Also, people could download it from their workplace and bring it home.
Willy
On Tue, Nov 13, 2007 at 09:08:32AM -0500, Mark Lord wrote:
Ingo Molnar wrote: ..
This is all QA-101 that _cannot be argued against on a rational basis_, it's just that these sorts of things have been largely ignored for years, in favor of the all-too-easy "open source means many eyeballs and that is our QA" answer, which is a _good_ answer but by far not the most intelligent answer! Today "many eyeballs" is simply not good enough and nature (and other OS projects) will route us around if we don't change.
..
QA-101 and "many eyeballs" are not at all in opposition. The latter is how we find out about bugs on uncommon hardware, and the former is what we need to track them and overall quality.
A HUGE problem I have with current "efforts", is that once someone reports a bug, the onus seems to be 99% on the *reporter* to find the exact line of code or commit. Ghad what a repressive method.
99% on the reporter? Is that why I always try to understand the reporter's problem (*provided* it's in an area I know about) and come up with a patch to test a theory or fix the issue?
I'm _less_ inclined to provide such a "service" for lazy maintainers who've moved off into new and wonderfully exciting technologies, to churn out more patches for me to merge (and eventually provide a free to them bug fixing service for.)
That's "less" inclined, not "won't".
Russell King wrote:
On Tue, Nov 13, 2007 at 09:08:32AM -0500, Mark Lord wrote:
Ingo Molnar wrote: ..
This is all QA-101 that _cannot be argued against on a rational basis_, it's just that these sorts of things have been largely ignored for years, in favor of the all-too-easy "open source means many eyeballs and that is our QA" answer, which is a _good_ answer but by far not the most intelligent answer! Today "many eyeballs" is simply not good enough and nature (and other OS projects) will route us around if we don't change.
..
QA-101 and "many eyeballs" are not at all in opposition. The latter is how we find out about bugs on uncommon hardware, and the former is what we need to track them and overall quality.
A HUGE problem I have with current "efforts", is that once someone reports a bug, the onus seems to be 99% on the *reporter* to find the exact line of code or commit. Ghad what a repressive method.
99% on the reporter? Is that why I always try to understand the reporter's problem (*provided* it's in an area I know about) and come up with a patch to test a theory or fix the issue?
..
Same here.
I just find it weird that something can be known broken for several -rc* kernels before I happen to install it, discover it's broken on my own machine, and then I track it down, fix it, and submit the patch, generally all within a couple of hours. Where the heck was the dude(ess) that broke it ?? AWOL.
And when I receive hostility from the "maintainers" of said code for fixing their bugs, well.. that really motivates me to continue reporting new ones..
I'm _less_ inclined to provide such a "service" for lazy maintainers who've moved off into new and wonderfully exciting technologies, to churn out more patches for me to merge (and eventually provide a free to them bug fixing service for.)
That's "less" inclined, not "won't".
On Tue, 13 November 2007 15:18:07 -0500, Mark Lord wrote:
I just find it weird that something can be known broken for several -rc* kernels before I happen to install it, discover it's broken on my own machine, and then I track it down, fix it, and submit the patch, generally all within a couple of hours. Where the heck was the dude(ess) that broke it ?? AWOL.
And when I receive hostility from the "maintainers" of said code for fixing their bugs, well.. that really motivates me to continue reporting new ones..
Given a decent bug report, I agree that having the bug not looked at is shameful. But what can a developer do if a bug report effectively reads "there is some bug somewhere in recent kernels"? How can I know that in this particular case it is my bug that I introduced? It could just as easily be 50 other people and none of them are eager to debug it unless they suspect it to be their bug.
This is a common problem and fairly unrelated to linux in general or the kernel in particular. Who is going to be the sucker that figures out which developer the bug belongs to? And I have yet to find a project, commercial or opensource, where volunteers flock to become such a sucker.
One option is to push this role to the bug reporter. Another is to strong-arm some developers into this role, by whatever means. A third would be for $LARGE_COMPANY to hire some people. If you have a better idea or would volunteer your time, I'd be grateful. Simply blaming one side, whether bug reporter or a random developer, for not being the sucker doesn't help anyone.
Jörn
On Tue, 13 Nov 2007 22:33:58 +0100 Jörn Engel joern@logfs.org wrote:
On Tue, 13 November 2007 15:18:07 -0500, Mark Lord wrote:
I just find it weird that something can be known broken for several -rc* kernels before I happen to install it, discover it's broken on my own machine, and then I track it down, fix it, and submit the patch, generally all within a couple of hours. Where the heck was the dude(ess) that broke it ?? AWOL.
And when I receive hostility from the "maintainers" of said code for fixing their bugs, well.. that really motivates me to continue reporting new ones..
Given a decent bug report, I agree that having the bug not looked at is shameful. But what can a developer do if a bug report effectively reads "there is some bug somewhere in recent kernels"? How can I know that in this particular case it is my bug that I introduced? It could just as easily be 50 other people and none of them are eager to debug it unless they suspect it to be their bug.
It's relatively common that a regression in subsystem A will manifest as a failure in subsystem B, and the report initially lands on the desk of the subsystem B developers.
But that's OK. The subsystem B people are the ones with the expertise to be able to work out where the bug resides and to help the subsystem A people understand what went wrong.
Alas, sometimes the B people will just roll eyes and do nothing because they know the problem wasn't in their code. Sometimes.
On Tue, 13 November 2007 13:56:58 -0800, Andrew Morton wrote:
It's relatively common that a regression in subsystem A will manifest as a failure in subsystem B, and the report initially lands on the desk of the subsystem B developers.
But that's OK. The subsystem B people are the ones with the expertise to be able to work out where the bug resides and to help the subsystem A people understand what went wrong.
Alas, sometimes the B people will just roll eyes and do nothing because they know the problem wasn't in their code. Sometimes.
And sometimes the A people will ignore the B people after the root cause has been worked out. Do you have a good idea how to shame A into action? Should I put you on Cc:? Right now I'm in the eye-rolling phase.
Jörn
On Tue, 13 Nov 2007 23:24:14 +0100 Jörn Engel joern@logfs.org wrote:
On Tue, 13 November 2007 13:56:58 -0800, Andrew Morton wrote:
It's relatively common that a regression in subsystem A will manifest as a failure in subsystem B, and the report initially lands on the desk of the subsystem B developers.
But that's OK. The subsystem B people are the ones with the expertise to be able to work out where the bug resides and to help the subsystem A people understand what went wrong.
Alas, sometimes the B people will just roll eyes and do nothing because they know the problem wasn't in their code. Sometimes.
And sometimes the A people will ignore the B people after the root cause has been worked out. Do you have a good idea how to shame A into action? Should I put you on Cc:? Right now I'm in the eye-rolling phase.
Well, that's the problem, isn't it?
The best I can come up with is to suggest that all the info be captured in a bugzilla report so that at least it doesn't get forgotten about.
I suppose that other options are
a) try to fix it yourself. I'll take the patch and as long as we make a big enough mess of it, someone who knows what they're doing might fix it for real.
b) If it was a regression, identify the offending commit and we'll just revert it.
On Tue, Nov 13, 2007 at 03:18:07PM -0500, Mark Lord wrote:
Russell King wrote:
On Tue, Nov 13, 2007 at 09:08:32AM -0500, Mark Lord wrote:
Ingo Molnar wrote: ..
This is all QA-101 that _cannot be argued against on a rational basis_, it's just that these sorts of things have been largely ignored for years, in favor of the all-too-easy "open source means many eyeballs and that is our QA" answer, which is a _good_ answer but by far not the most intelligent answer! Today "many eyeballs" is simply not good enough and nature (and other OS projects) will route us around if we don't change.
..
QA-101 and "many eyeballs" are not at all in opposition. The latter is how we find out about bugs on uncommon hardware, and the former is what we need to track them and overall quality.
A HUGE problem I have with current "efforts", is that once someone reports a bug, the onus seems to be 99% on the *reporter* to find the exact line of code or commit. Ghad what a repressive method.
99% on the reporter? Is that why I always try to understand the reporter's problem (*provided* it's in an area I know about) and come up with a patch to test a theory or fix the issue?
..
Same here.
I just find it weird that something can be known broken for several -rc* kernels before I happen to install it, discover it's broken on my own machine, and then I track it down, fix it, and submit the patch, generally all within a couple of hours. Where the heck was the dude(ess) that broke it ?? AWOL.
Same thing can be said for compile breakages as well. Looking at the latest kautobuild output:
ARM ep93xx defconfig has been broken since 2.6.23-git1 due to:
drivers/net/arm/ep93xx_eth.c:420: error: implicit declaration of function '__netif_rx_schedule_prep'
caused by: [NET]: Make NAPI polling independent of struct net_device objects.
ARM netx defconfig has been broken since 2.6.23-git1 due to:
drivers/net/netx-eth.c: In function 'netx_eth_hard_start_xmit':
drivers/net/netx-eth.c:131: error: 'dev' undeclared (first use in this function)
drivers/net/netx-eth.c:131: error: (Each undeclared identifier is reported only once
drivers/net/netx-eth.c:131: error: for each function it appears in.)
drivers/net/netx-eth.c: In function 'netx_eth_receive':
drivers/net/netx-eth.c:158: error: 'dev' undeclared (first use in this function)
caused by: [NET] drivers/net: statistics cleanup #1 -- save memory and shrink code
Haven't got a report for either of those, but Kautobuild lets people know if folk can be bothered to subscribe to its mailing list and/or look at the site occasionally.
I suspect the maintainers of the above drivers aren't aware that their drivers are broken.
From: Russell King rmk+lkml@arm.linux.org.uk Date: Tue, 13 Nov 2007 23:40:33 +0000
ARM ep93xx defconfig has been broken since 2.6.23-git1 due to:
drivers/net/arm/ep93xx_eth.c:420: error: implicit declaration of function '__netif_rx_schedule_prep'
caused by: [NET]: Make NAPI polling independent of struct net_device objects.
ARM netx defconfig has been broken since 2.6.23-git1 due to:
drivers/net/netx-eth.c: In function 'netx_eth_hard_start_xmit':
drivers/net/netx-eth.c:131: error: 'dev' undeclared (first use in this function)
drivers/net/netx-eth.c:131: error: (Each undeclared identifier is reported only once
drivers/net/netx-eth.c:131: error: for each function it appears in.)
drivers/net/netx-eth.c: In function 'netx_eth_receive':
drivers/net/netx-eth.c:158: error: 'dev' undeclared (first use in this function)
caused by: [NET] drivers/net: statistics cleanup #1 -- save memory and shrink code
I'll fix these up, thanks for the report.
On Tuesday 13 November 2007 07:08, Mark Lord wrote:
Ingo Molnar wrote: ..
This is all QA-101 that _cannot be argued against on a rational basis_, it's just that these sorts of things have been largely ignored for years, in favor of the all-too-easy "open source means many eyeballs and that is our QA" answer, which is a _good_ answer but by far not the most intelligent answer! Today "many eyeballs" is simply not good enough and nature (and other OS projects) will route us around if we don't change.
..
QA-101 and "many eyeballs" are not at all in opposition. The latter is how we find out about bugs on uncommon hardware, and the former is what we need to track them and overall quality.
A HUGE problem I have with current "efforts", is that once someone reports a bug, the onus seems to be 99% on the *reporter* to find the exact line of code or commit. Ghad what a repressive method.
This is the only method that scales.
Developer has only 24 hours in each day, and sometimes he needs to eat, sleep, and maybe even pay attention to e.g. his kids.
But bug reporters are much more numerous and they have more hours in one day combined.
BUT - it means that developers should try to increase user base, not scare users away.
And if the "developer" who broke the damn thing, or who at least "claims" to be supporting that code, cannot "reproduce" the bug, they drop it completely.
Developer should let reporter know that reporter needs to help a bit here. Sometimes a bit of hand holding is needed, but it pays off because you breed more qualified testers/bug reporters.
Contrast that flawed approach with how Linus does things.. he thinks through the symptoms, matches them to the code, and figures out what the few possibilities might be, and feeds back some trial balloon patches for the bug reporter to try.
MUCH better.
And remember, *I'm* an old-time Linux kernel developer.. just think about the people reporting bugs who haven't been around here since 1992..
Yes. Developers should not grow more and more unhelpful and arrogant towards their users just because inexperienced users send incomplete/poorly written bug reports. They need to provide help, not humiliate/ignore.
I think we agree here. -- vda
On Tue, 13 Nov 2007 14:40:29 +0100 Ingo Molnar wrote:
* Andrew Morton akpm@linux-foundation.org wrote:
Do you believe that our response to bug reports is adequate?
Do you feel that making us feel and look like shit helps?
That doesn't answer my question.
See, first we need to work out whether we have a problem. If we do this, then we can then have a think about what to do about it.
I tried to convince the 2006 KS attendees that we have a problem and I resoundingly failed. People seemed to think that we're doing OK.
We were a minority.
But it appears that data such as this contradicts that belief.
This is not a minor matter. If the kernel _is_ slowly deteriorating then this won't become readily apparent until it has been happening for a number of years. By that stage there will be so much work to do to get us back to an acceptable level that it will take a huge effort. And it will take a long time after that for the kernel to get its reputation back.
So it is important that we catch deterioration *early* if it is happening.
[agree with most of Ingo's moaning]
(and this is in no way directed at the networking folks - it holds for all of us. I have one main complaint about networking: the separate netdev list is a bad idea - networking regressions should be discussed and fixed on lkml, like most other subsystems are. Any artificial split of the lk discussion space is bad.)
but here I disagree. LKML is already too busy and noisy. Major subsystems need their own discussion areas.
--- ~Randy
* Randy Dunlap rdunlap@xenotime.net wrote:
(and this is in no way directed at the networking folks - it holds for all of us. I have one main complaint about networking: the separate netdev list is a bad idea - networking regressions should be discussed and fixed on lkml, like most other subsystems are. Any artificial split of the lk discussion space is bad.)
but here I disagree. LKML is already too busy and noisy. Major subsystems need their own discussion areas.
That's a stupid argument. We lose much more by forced isolation of discussion than what we win by having less traffic! It's _MUCH_ easier to narrow down information (by filter by threads, by topics, by people, etc.) than it is to gobble information together from various fractured sources. We learned it _again and again_ that isolation of kernel discussions causes bad things.
In fact this thread is the very example: David points out that on netdev some of those bugs were already discussed and resolved. Had it been all on lkml we'd all be aware of it.
this is a single kernel project that is released together as one codebase, so a central place of discussion is obvious and common-sense.
so please stop this "too busy and too noisy" nonsense already. It was nonsense 10 years ago and it's nonsense today. In 10 years the kernel grew from a 1 million lines codebase to an 8 million lines codebase, so what? Deal with it and be intelligent about filtering your information influx instead of imposing a hard pre-filtering criteria that restricts intelligent processing of information.
Ingo
On Wed, 14 Nov 2007 15:08:47 +0100 Ingo Molnar wrote:
* Randy Dunlap rdunlap@xenotime.net wrote:
(and this is in no way directed at the networking folks - it holds for all of us. I have one main complaint about networking: the separate netdev list is a bad idea - networking regressions should be discussed and fixed on lkml, like most other subsystems are. Any artificial split of the lk discussion space is bad.)
but here I disagree. LKML is already too busy and noisy. Major subsystems need their own discussion areas.
That's a stupid argument. We lose much more by forced isolation of discussion than what we win by having less traffic! It's _MUCH_ easier to narrow down information (by filter by threads, by topics, by people, etc.) than it is to gobble information together from various fractured sources. We learned it _again and again_ that isolation of kernel discussions causes bad things.
In fact this thread is the very example: David points out that on netdev some of those bugs were already discussed and resolved. Had it been all on lkml we'd all be aware of it.
or had <someone> been on netdev.
this is a single kernel project that is released together as one codebase, so a central place of discussion is obvious and common-sense.
Central doesn't have to mean one-and-only-one-list-for-everything.
so please stop this "too busy and too noisy" nonsense already. It was nonsense 10 years ago and it's nonsense today. In 10 years the kernel grew from a 1 million lines codebase to an 8 million lines codebase, so what? Deal with it and be intelligent about filtering your information influx instead of imposing a hard pre-filtering criteria that restricts intelligent processing of information.
So you have a preferred method of handling email. Please don't force it on the rest of us.
I'll plan to use lkml-list-only when you have convinced DaveM to drop all of the other mailing lists at vger.kernel.org. Yeah, sure.
--- ~Randy
On Wed, Nov 14, 2007 at 09:38:20AM -0800, Randy Dunlap wrote:
On Wed, 14 Nov 2007 15:08:47 +0100 Ingo Molnar wrote:
so please stop this "too busy and too noisy" nonsense already. It was nonsense 10 years ago and it's nonsense today. In 10 years the kernel grew from a 1 million lines codebase to an 8 million lines codebase, so what? Deal with it and be intelligent about filtering your information influx instead of imposing a hard pre-filtering criteria that restricts intelligent processing of information.
So you have a preferred method of handling email. Please don't force it on the rest of us.
I'd be curious for any pointers on tools, actually. I "read" (ok, skim) lkml but still overlook relevant bug reports occasionally. (Fortunately, between Trond and Andrew and others forwarding things it's not actually a problem, but I'm still curious).
--b.
* Randy Dunlap rdunlap@xenotime.net wrote:
On Wed, 14 Nov 2007 15:08:47 +0100 Ingo Molnar wrote:
* Randy Dunlap rdunlap@xenotime.net wrote:
(and this is in no way directed at the networking folks - it holds for all of us. I have one main complaint about networking: the separate netdev list is a bad idea - networking regressions should be discussed and fixed on lkml, like most other subsystems are. Any artificial split of the lk discussion space is bad.)
but here I disagree. LKML is already too busy and noisy. Major subsystems need their own discussion areas.
That's a stupid argument. We lose much more by forced isolation of discussion than what we win by having less traffic! It's _MUCH_
^^^^^^^^^^^^
easier to narrow down information (by filter by threads, by topics,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
by people, etc.) than it is to gobble information together from
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
various fractured sources. We learned it _again and again_ that
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
isolation of kernel discussions causes bad things.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
In fact this thread is the very example: David points out that on netdev some of those bugs were already discussed and resolved. Had it been all on lkml we'd all be aware of it.
or had <someone> been on netdev.
countered by the underlined sentences above, just in case you missed it.
Ingo
* Randy Dunlap rdunlap@xenotime.net wrote:
On Wed, 14 Nov 2007 21:16:39 +0100 Ingo Molnar wrote:
countered by the underlined sentences above, just in case you missed it.
I didn't miss your claim.
ok, then you conceded it by not replying to it? good ;-)
Ingo
* Randy Dunlap rdunlap@xenotime.net wrote:
so please stop this "too busy and too noisy" nonsense already. It was nonsense 10 years ago and it's nonsense today. In 10 years the kernel grew from a 1 million lines codebase to an 8 million lines codebase, so what? Deal with it and be intelligent about filtering your information influx instead of imposing a hard pre-filtering criteria that restricts intelligent processing of information.
So you have a preferred method of handling email. Please don't force it on the rest of us.
actually, posting to lkml is the preferred method of handling development for like 70% of all Linux kernel activities. It is _YOU_ who is the sore thumb sticking out, it is you who is forcing people to Cc: around various stupid fractured lists for no good reason. You dont have to read all of lkml, just like you dont read all of netdev either.
once someone decides to work on Linux, information should be fundamentally opt-out, not opt-in. And there should be a central place for people to go to. The "Subject:" line is enough of a filter key - in fact it's _far superior_ to the forced separation of topics that you advocate! Fact is, many regressions happen because they were posted to the wrong list and got ignored or under-handled. I claim that we'd have a much higher quality kernel if we had a single central mailing list instead of these elitist fractured lists. Every kernel topic would have global visibility, and it would be trivially easy to get the interest of other people, across subsystems.
damn, THINK ABOUT IT instead of just ignorantly dismissing my points without even answering them... Often when someone writes to the wrong list and he is told "wrong list", he'd have to repost to the "right list". Lots of extra bounces for the tester for _NO GOOD REASON_. All just because a few developers are too lazy to filter the lkml subjects for their main topic of interest.
I claim that we have a far larger lack of testing resources than we have a lack of development resources. So we might as well set up our mailing lists to favor information sharing, instead of imposing this insane separation of lists that some subsystems still insist on. We can evidently throttle development activities by forcing _every subsystem_ to lkml and exposing them to the harsh combined realities of all the crap that we are writing. Life might look nice and easy on an isolated list, and it's sure convenient not being exposed to ... users.
THAT is our main problem, not your bogus "lkml has too much traffic" argument.
Ingo
From: Ingo Molnar mingo@elte.hu Date: Wed, 14 Nov 2007 15:08:47 +0100
In fact this thread is the very example: David points out that on netdev some of those bugs were already discussed and resolved. Had it been all on lkml we'd all be aware of it.
That's a ridiculous argument.
One other reason these bugs are resolved, is that the networking developers only need to subscribe to netdev and not have to listen to all the noise on lkml.
People who want to manage bugs know what list to look on and contact about problems.
Dumping even more crap on lkml is not the answer.
On Wed, 2007-11-14 at 11:56 -0800, David Miller wrote:
From: Ingo Molnar mingo@elte.hu Date: Wed, 14 Nov 2007 15:08:47 +0100
In fact this thread is the very example: David points out that on netdev some of those bugs were already discussed and resolved. Had it been all on lkml we'd all be aware of it.
That's a ridiculous argument.
One other reason these bugs are resolved, is that the networking developers only need to subscribe to netdev and not have to listen to all the noise on lkml.
People who want to manage bugs know what list to look on and contact about problems.
Dumping even more crap on lkml is not the answer.
I agree totally with David, and this goes for SCSI too. If it's not reported on linux-scsi, there's a significant chance of us missing the bug report. The fact that some people notice bugs go past on LKML and forward them to linux-scsi is a happy accident and not necessarily something to rely on.
LKML has 10-20x the traffic of linux-scsi and a much smaller signal to noise ratio. Having a specialist list where all the experts in the field hang out actually enhances our ability to fix bugs.
James
* James Bottomley James.Bottomley@HansenPartnership.com wrote:
On Wed, 2007-11-14 at 11:56 -0800, David Miller wrote:
From: Ingo Molnar mingo@elte.hu Date: Wed, 14 Nov 2007 15:08:47 +0100
In fact this thread is the very example: David points out that on netdev some of those bugs were already discussed and resolved. Had it been all on lkml we'd all be aware of it.
That's a ridiculous argument.
One other reason these bugs are resolved, is that the networking developers only need to subscribe to netdev and not have to listen to all the noise on lkml.
People who want to manage bugs know what list to look on and contact about problems.
Dumping even more crap on lkml is not the answer.
I agree totally with David, and this goes for SCSI too. If it's not reported on linux-scsi, there's a significant chance of us missing the bug report. The fact that some people notice bugs go past on LKML and forward them to linux-scsi is a happy accident and not necessarily something to rely on.
LKML has 10-20x the traffic of linux-scsi and a much smaller signal to noise ratio. Having a specialist list where all the experts in the field hang out actually enhances our ability to fix bugs.
you are actually proving my point. People have to scan lkml for SCSI regressions _anyway_, because otherwise _you_ would miss them. In the case a user is fortunate enough to realize that a regression is SCSI related, and he is lucky enough to pre-select the SCSI mailing list in the first go, he might get a fix from you. That already reduces the number of useful bugreports by about an order of magnitude.
Ingo
* David Miller davem@davemloft.net wrote:
From: Ingo Molnar mingo@elte.hu Date: Wed, 14 Nov 2007 15:08:47 +0100
In fact this thread is the very example: David points out that on netdev some of those bugs were already discussed and resolved. Had it been all on lkml we'd all be aware of it.
That's a ridiculous argument.
One other reason these bugs are resolved, is that the networking developers only need to subscribe to netdev and not have to listen to all the noise on lkml.
what noise? If someone really wants networking discussions only, use this procmail rule:
  :0 HBc
  * .*net:
  * sched-patches
to separate it into an extra folder and use "net: " as an agreed upon Subject line if you really want to narrow things down. (But there would still be all the other mail just in case the developer has to look at the wider picture. There would be no "I'm only subscribed to netdev" excuse. )
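The Subject-tag idea above can be sketched in C as a plain substring test. This is only an illustration under stated assumptions: the function names and the folder names "net-patches" and "inbox" are hypothetical, and the sketch looks at the Subject string alone, whereas the procmail flags quoted above (H and B) match against header and body and the c flag delivers a copy while the original mail continues through the remaining rules.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return true if the subject contains the agreed-upon tag anywhere.
 * Like an unanchored pattern, this also matches longer words that end
 * in the tag (e.g. "ethernet: "). */
static bool subject_has_tag(const char *subject, const char *tag)
{
    if (subject == NULL || tag == NULL)
        return false;
    return strstr(subject, tag) != NULL;
}

/* Pick a (hypothetical) folder name for a message based on its subject. */
static const char *pick_folder(const char *subject)
{
    return subject_has_tag(subject, "net: ") ? "net-patches" : "inbox";
}
```

A subject such as "[PATCH] net: fix NAPI poll" would land in the tagged folder, while "[PATCH] scsi: fix tagging" would not. Because the match is unanchored, "ethernet: link flaps" is also caught, which is one reason an agreed-upon tag has to be chosen with some care.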
but there should still be one central repository for all kernel discussions - just like there is one central repository for all kernel code.
People who want to manage bugs know what list to look on and contact about problems.
i think that's the problem. Developers (and here i don't mean you) who want to do "development only", without being exposed to the global state of the kernel and without being exposed to bugs. I think that's the basic mindset difference. That is one of the factors that is causing asymmetric allocation of developers and the increasing detachment from reality.
Dumping even more crap on lkml is not the answer.
that "crap" that i'd like to see dumped upon lkml would be netdev traffic mainly - most of the other kernel development lists (and i'm subscribed to many of them) are low-traffic. netdev is the main reason why we cannot do a "one common discussion forum" approach.
Ingo
On Wed, 14 Nov 2007, Ingo Molnar wrote:
Dumping even more crap on lkml is not the answer.
that "crap" that i'd like to see dumped upon lkml would be netdev traffic mainly - most of the other kernel development lists (and i'm subscribed to many of them) are low-traffic. netdev is the main reason why we cannot do a "one common discussion forum" approach.
hmm, how much work would it be to tweak the mail software on vger to have a linux-all@vger.kernel.org that got a copy of any linux-* list hosted by vger.
this would solve half the problem (people on linux-kernel not seeing discussions on the other lists)
David Lang
Andrew Morton wrote:
On Mon, 12 Nov 2007 22:42:32 -0800 "Natalie Protasevich" protasnb@gmail.com wrote:
..
with CONFIG_NO_HZ and/or CONFIG_HPET_TIMER set kernel 2.6.23 doesn't boot (ARM, Timer) http://bugzilla.kernel.org/show_bug.cgi?id=9229 Kernel: 2.6.23
No response from developers
..
Note: that same bug exists/existed on i386 back when NO_HZ was introduced (2.6.21?). I still see it from time to time on my Quad core system (very rare), but not any more on my Duo notebook where it used to happen about 1 in n boots (n < 10).
AFAICT no fix was ever released for it.
Suspend to RAM resume hangs on a tickless (NO_HZ) kernel http://bugzilla.kernel.org/show_bug.cgi?id=9275 Kernel: 2.6.23 This is HP notebook nc6320 T2400 945GM
No response from developers
..
I *still* get very slow resume-from-RAM quite often here (new in 2.6.22 kernel, wasn't there in early 2.6.23-rc*). Something eventually times out after a minute or so and it comes back. Cannot make it happen reliably, unless I'm in a hurry to get something done. :) I suspect USB here, probably the same loopy bug that we added a "loop limit failsafe" for back in 2.6.21(?).
Mark Lord wrote:
Andrew Morton wrote:
On Mon, 12 Nov 2007 22:42:32 -0800 "Natalie Protasevich" protasnb@gmail.com wrote:
..
..
Suspend to RAM resume hangs on a tickless (NO_HZ) kernel http://bugzilla.kernel.org/show_bug.cgi?id=9275 Kernel: 2.6.23 This is HP notebook nc6320 T2400 945GM
No response from developers
..
I *still* get very slow resume-from-RAM quite often here (new in 2.6.22 kernel, wasn't there in early 2.6.23-rc*).
..
Typo. That should have said:
(new in 2.6.23 kernel, wasn't there in early 2.6.23-rc*).
On Tue, 13 Nov 2007, Mark Lord wrote:
Mark Lord wrote:
Andrew Morton wrote:
On Mon, 12 Nov 2007 22:42:32 -0800 "Natalie Protasevich" protasnb@gmail.com wrote:
..
..
Suspend to RAM resume hangs on a tickless (NO_HZ) kernel http://bugzilla.kernel.org/show_bug.cgi?id=9275 Kernel: 2.6.23 This is HP notebook nc6320 T2400 945GM
No response from developers
..
I *still* get very slow resume-from-RAM quite often here (new in 2.6.22 kernel, wasn't there in early 2.6.23-rc*).
..
Typo. That should have said:
(new in 2.6.23 kernel, wasn't there in early 2.6.23-rc*).
Just asked that :) Is there a chance to bisect that ?
Thanks,
tglx
On Tue, 13 Nov 2007, Mark Lord wrote:
Andrew Morton wrote:
On Mon, 12 Nov 2007 22:42:32 -0800 "Natalie Protasevich" protasnb@gmail.com wrote:
..
with CONFIG_NO_HZ and/or CONFIG_HPET_TIMER set kernel 2.6.23 doesn't boot (ARM, Timer) http://bugzilla.kernel.org/show_bug.cgi?id=9229 Kernel: 2.6.23
No response from developers
..
The bug report is bogus. ARM has no CONFIG_HPET_TIMER.
Note: that same bug exists/existed on i386 back when NO_HZ was introduced (2.6.21?). I still see it from time to time on my Quad core system (very rare), but not any more on my Duo notebook where it used to happen about 1 in n boots (n < 10).
AFAICT no fix was ever released for it.
Hmm, at which point does the boot stop ?
Suspend to RAM resume hangs on a tickless (NO_HZ) kernel http://bugzilla.kernel.org/show_bug.cgi?id=9275 Kernel: 2.6.23 This is HP notebook nc6320 T2400 945GM
No response from developers
..
I *still* get very slow resume-from-RAM quite often here (new in 2.6.22 kernel, wasn't there in early 2.6.23-rc*).
Hmm. Which one 22 or 23 ?
Something eventually times out after a minute or so and it comes back. Cannot make it happen reliably, unless I'm in a hurry to get something done. :) I suspect USB here, probably the same loopy bug that we added a "loop limit failsafe" for back in 2.6.21(?).
Do you have a pointer to that please ?
Thanks,
tglx
Thomas Gleixner wrote:
On Tue, 13 Nov 2007, Mark Lord wrote:
..
I *still* get very slow resume-from-RAM quite often here (new in 2.6.23 kernel, wasn't there in early 2.6.23-rc*).
..
Something eventually times out after a minute or so and it comes back. Cannot make it happen reliably, unless I'm in a hurry to get something done. :) I suspect USB here, probably the same loopy bug that we added a "loop limit failsafe" for back in 2.6.21(?).
Do you have a pointer to that please ?
..
The "limit" added in the code below, which was for messages of this form:
  hub 1-1:1.0: hub_port_status failed (err = -71)
  last message repeated 347 times
drivers/usb/hub.c:
  static void hub_tt_kevent (struct work_struct *work)
  {
          struct usb_hub *hub = container_of(work, struct usb_hub, tt.kevent);
          unsigned long flags;
          int limit = 100;

          spin_lock_irqsave (&hub->tt.lock, flags);
          while (--limit && !list_empty (&hub->tt.clear_list)) {
                  ...
I'm not yet sure what's happening on resume now, but there's this huge long pause with a dark screen and then suddenly the USB subsystem comes to life (my mouse lights up) and the system finally resumes.
More when I know more. But it doesn't happen every time, or even most times, so git-bisect is not possible either.
This one actually requires a developer/maintainer to put in some effort and think about things. Currently, that's me.
-ml
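A minimal userspace sketch of the "loop limit failsafe" shape quoted from drivers/usb/hub.c above. It is not kernel code: struct work_queue and drain_bounded() are hypothetical stand-ins for the hub's clear_list and its worker, used only to show how the `--limit` test bounds the number of passes.

```c
#include <stddef.h>

/* Hypothetical stand-in for a pending-work list (not the kernel's list_head). */
struct work_queue {
    size_t pending;   /* number of items still queued */
};

/*
 * Process queued items, but give up once the pre-decremented limit
 * reaches zero. Because the decrement happens before the test, a
 * positive limit N allows at most N - 1 passes through the loop.
 * Returns the number of items processed.
 */
static size_t drain_bounded(struct work_queue *q, int limit)
{
    size_t done = 0;

    while (--limit && q->pending > 0) {
        q->pending--;   /* "handle" one item */
        done++;
    }
    return done;
}
```

With a limit of 100, a queue holding 347 items is cut off after 99 passes and 248 items stay queued, while a queue of 5 items drains completely. The sketch assumes a positive limit; a limit of zero or less would never hit the zero test on the way down.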
Thomas Gleixner wrote:
On Tue, 13 Nov 2007, Mark Lord wrote:
Andrew Morton wrote:
On Mon, 12 Nov 2007 22:42:32 -0800 "Natalie Protasevich" protasnb@gmail.com wrote:
..
with CONFIG_NO_HZ and/or CONFIG_HPET_TIMER set kernel 2.6.23 doesn't boot (ARM, Timer) http://bugzilla.kernel.org/show_bug.cgi?id=9229 Kernel: 2.6.23
No response from developers
..
The bug report is bogus. ARM has no CONFIG_HPET_TIMER.
Note: that same bug exists/existed on i386 back when NO_HZ was introduced (2.6.21?). I still see it from time to time on my Quad core system (very rare), but not any more on my Duo notebook where it used to happen about 1 in n boots (n < 10).
AFAICT no fix was ever released for it.
Hmm, at which point does the boot stop ?
..
Just as it prints out these messages, sometimes one of them, sometimes both (or all four on the quad core):
  kernel: switched to high resolution mode on cpu 1
  kernel: switched to high resolution mode on cpu 0
On Tue, 13 Nov 2007, Mark Lord wrote:
Thomas Gleixner wrote:
On Tue, 13 Nov 2007, Mark Lord wrote:
Andrew Morton wrote:
On Mon, 12 Nov 2007 22:42:32 -0800 "Natalie Protasevich" protasnb@gmail.com wrote:
..
with CONFIG_NO_HZ and/or CONFIG_HPET_TIMER set kernel 2.6.23 doesn't boot (ARM, Timer) http://bugzilla.kernel.org/show_bug.cgi?id=9229 Kernel: 2.6.23
No response from developers
..
The bug report is bogus. ARM has no CONFIG_HPET_TIMER.
Note: that same bug exists/existed on i386 back when NO_HZ was introduced (2.6.21?). I still see it from time to time on my Quad core system (very rare), but not any more on my Duo notebook where it used to happen about 1 in n boots (n < 10).
AFAICT no fix was ever released for it.
Hmm, at which point does the boot stop ?
..
Just as it prints out these messages, sometimes one of them, sometimes both (or all four on the quad core):
  kernel: switched to high resolution mode on cpu 1
  kernel: switched to high resolution mode on cpu 0
It's completely dead afterwards ?
tglx
Thomas Gleixner wrote:
On Tue, 13 Nov 2007, Mark Lord wrote:
Thomas Gleixner wrote:
On Tue, 13 Nov 2007, Mark Lord wrote:
Andrew Morton wrote:
On Mon, 12 Nov 2007 22:42:32 -0800 "Natalie Protasevich" protasnb@gmail.com wrote:
..
with CONFIG_NO_HZ and/or CONFIG_HPET_TIMER set kernel 2.6.23 doesn't boot (ARM, Timer) http://bugzilla.kernel.org/show_bug.cgi?id=9229 Kernel: 2.6.23
No response from developers
..
The bug report is bogus. ARM has no CONFIG_HPET_TIMER.
Note: that same bug exists/existed on i386 back when NO_HZ was introduced (2.6.21?). I still see it from time to time on my Quad core system (very rare), but not any more on my Duo notebook where it used to happen about 1 in n boots (n < 10).
AFAICT no fix was ever released for it.
Hmm, at which point does the boot stop ?
..
Just as it prints out these messages, sometimes one of them, sometimes both (or all four on the quad core):
  kernel: switched to high resolution mode on cpu 1
  kernel: switched to high resolution mode on cpu 0
It's completely dead afterwards ?
Yeah. No magic sysrq key or anything. There's gotta be a race somewhere that's causing it, but it's not obvious where to look for it.
My regular 2-core notebook no longer suffers from it, and subtle .config changes used to make it come and go back when it first appeared.
The quad-core has only done it twice on me thus far.
Tracking this one down looks tricky. It might require some early lockup detection code to be tailor made or something.
Cheers
On Tue, Nov 13, 2007 at 05:07:21PM +0100, Thomas Gleixner wrote:
On Tue, 13 Nov 2007, Mark Lord wrote:
Andrew Morton wrote:
On Mon, 12 Nov 2007 22:42:32 -0800 "Natalie Protasevich" protasnb@gmail.com wrote:
..
with CONFIG_NO_HZ and/or CONFIG_HPET_TIMER set kernel 2.6.23 doesn't boot (ARM, Timer) http://bugzilla.kernel.org/show_bug.cgi?id=9229 Kernel: 2.6.23
No response from developers
..
The bug report is bogus. ARM has no CONFIG_HPET_TIMER.
Plus we've just merged a fix for NO_HZ on PXA platforms due to an utterly broken one-shot implementation. So chances are this problem is now fixed.
However, I object strongly to Andrew's responses to these bugs. He's completely out of line.
Given the wide range of ARM platforms today, it is utterly idiotic to expect a single person to be able to provide responses for all ARM bugs. I for one wish I'd never *VOLUNTEERED* to be a part of the kernel bugzilla, and really *WISH* I could pull out of that function.
Given the wide range of ARM platforms today, it is utterly idiotic to expect a single person to be able to provide responses for all ARM bugs. I for one wish I'd never *VOLUNTEERED* to be a part of the kernel bugzilla, and really *WISH* I could pull out of that function.
You can. Perhaps that bugzilla needs to point to some kind of arm-maintainers@vger.kernel.org list for the various ARM platform maintainers ?
Alan
On Tue, Nov 13, 2007 at 06:25:16PM +0000, Alan Cox wrote:
Given the wide range of ARM platforms today, it is utterly idiotic to expect a single person to be able to provide responses for all ARM bugs. I for one wish I'd never *VOLUNTEERED* to be a part of the kernel bugzilla, and really *WISH* I could pull out of that function.
You can. Perhaps that bugzilla needs to point to some kind of arm-maintainers@vger.kernel.org list for the various ARM platform maintainers ?
That might work - though it would be hard to get all the platform maintainers to be signed up to yet another mailing list, I'm sure sufficient would do.
On Tue, Nov 13, 2007 at 10:34:37PM +0000, Russell King wrote:
On Tue, Nov 13, 2007 at 06:25:16PM +0000, Alan Cox wrote:
Given the wide range of ARM platforms today, it is utterly idiotic to expect a single person to be able to provide responses for all ARM bugs. I for one wish I'd never *VOLUNTEERED* to be a part of the kernel bugzilla, and really *WISH* I could pull out of that function.
You can. Perhaps that bugzilla needs to point to some kind of arm-maintainers@vger.kernel.org list for the various ARM platform maintainers ?
That might work - though it would be hard to get all the platform maintainers to be signed up to yet another mailing list, I'm sure sufficient would do.
As long as it would just be bug reports, I'm sure that most of us could be persuaded to subscribe. Another list for general discussions is probably not going to be read; the current list provides more than enough to keep us busy.
On Nov 13, 2007 12:15 PM, Andrew Morton akpm@linux-foundation.org wrote:
On Mon, 12 Nov 2007 22:42:32 -0800 "Natalie Protasevich" protasnb@gmail.com wrote:
This is the listing of the open bugs that are relatively new, around 2.6.22 and up. They are vaguely classified by specific area. (not a full list, there are more :)
[...]
IDE/SATA=========================================================
[...]
DVD-RAM umount and disk free bug http://bugzilla.kernel.org/show_bug.cgi?id=9265 Kernel: 2.6.15 (asked to try current kernel)
No response from developers
The bug was filed under IO/Storage-Other, so it is assigned to other_other@kernel-bugs.osdl.org.
It could be a FS problem as well, but it is best to wait for confirmation with 2.6.23 before proceeding further...
On Tue, 2007-11-13 at 03:15 -0800, Andrew Morton wrote:
SCSI==================================================================
qla2xxx: driver initialization does not complete when booting with Port connected http://bugzilla.kernel.org/show_bug.cgi?id=9267 Kernel: 2.6.23.1
No response from developers
Urm, well, if no-one ever tells the SCSI list it's unrealistic to expect anyone to be working on it. As far as I can tell, email was sent to Andrew Vasquez only on 31 October. However, the fault looks to be generic, so he probably just dropped it.
This seems to be the significant line from the trace:
Oct 7 23:35:07 t-host kernel: ISP2422: PCI-X Mode 1 (133 MHz) @ 0000:01:03.0 hdma-, host#=1, fw=4.00.27 [IP]
Oct 7 23:35:07 t-host kernel: ACPI: PCI Interrupt 0000:01:03.1[B] -> GSI 29 (level, low) -> IRQ 22
Oct 7 23:35:07 t-host kernel: qla2xxx 0000:01:03.1: Found an ISP2422, irq 22, iobase 0xf8cf4000
Oct 7 23:35:07 t-host kernel: qla2xxx 0000:01:03.1: Configuring PCI space...
Oct 7 23:35:07 t-host kernel: qla2xxx 0000:01:03.1: Configure NVRAM parameters...
Oct 7 23:35:07 t-host kernel: qla2xxx 0000:01:03.1: Verifying loaded RISC code...
Oct 7 23:35:07 t-host kernel: qla2xxx 0000:01:03.1: Allocated (64 KB) for EFT...
Oct 7 23:35:07 t-host kernel: qla2xxx 0000:01:03.1: Allocated (1413 KB) for firmware dump...
Oct 7 23:35:07 t-host kernel: scsi2 : qla2xxx
Oct 7 23:35:07 t-host kernel: qla2xxx 0000:01:03.1:
Oct 7 23:35:07 t-host kernel: QLogic Fibre Channel HBA Driver: 8.01.07-k7
Oct 7 23:35:07 t-host kernel: QLogic QLA2462 - PCI-X 2.0 to 4Gb FC, Dual Channel
Oct 7 23:35:07 t-host kernel: ISP2422: PCI-X Mode 1 (133 MHz) @ 0000:01:03.1 hdma-, host#=2, fw=4.00.27 [IP]
Oct 7 23:35:07 t-host kernel: qla2xxx 0000:01:03.0: LIP reset occured (f8f7).
Oct 7 23:35:07 t-host kernel: qla2xxx 0000:01:03.0: LIP occured (f8f7).
Oct 7 23:35:07 t-host kernel: qla2xxx 0000:01:03.0: LOOP UP detected (4 Gbps).
Oct 7 23:35:07 t-host kernel: ohci_hcd 0000:03:00.0: auto-stop root hub
Oct 7 23:35:07 t-host kernel: ohci_hcd 0000:03:00.1: auto-stop root hub
Oct 7 23:35:07 t-host kernel: scsi 1:0:0:0: Direct-Access transtec PV610F16R1C 348B PQ: 0 ANSI: 4
Oct 7 23:35:07 t-host kernel: kobject_add failed for 1:0:0:0 with -EEXIST, don't try to register things with the same name in the same directory.
Oct 7 23:35:07 t-host kernel: [<c022c841>] kobject_shadow_add+0x111/0x190
Oct 7 23:35:07 t-host kernel: [<c0286814>] device_add+0xc4/0x570
Oct 7 23:35:07 t-host kernel: [<c02c90ce>] scsi_adjust_queue_depth+0x9e/0xf0
Oct 7 23:35:07 t-host kernel: [<c02249b2>] __blk_queue_init_tags+0x32/0x70
Oct 7 23:35:07 t-host kernel: [<c02d302f>] scsi_sysfs_add_sdev+0x4f/0x230
Oct 7 23:35:07 t-host kernel: [<f8d93421>] qla2xxx_slave_configure+0x71/0x100 [qla2xxx]
Oct 7 23:35:07 t-host kernel: [<c02d0ecf>] scsi_probe_and_add_lun+0xa5f/0xb40
Oct 7 23:35:07 t-host kernel: [<c02d1559>] __scsi_scan_target+0xd9/0x6c0
Oct 7 23:35:07 t-host kernel: [<c03a5be1>] schedule+0x2e1/0x950
Oct 7 23:35:07 t-host kernel: [<c02d21f9>] scsi_scan_target+0xa9/0xe0
Oct 7 23:35:07 t-host kernel: [<c02d5640>] fc_scsi_scan_rport+0x0/0x80
Oct 7 23:35:07 t-host kernel: [<c02d56a9>] fc_scsi_scan_rport+0x69/0x80
Oct 7 23:35:07 t-host kernel: [<c012b032>] run_workqueue+0x72/0x100
Oct 7 23:35:07 t-host kernel: [<c012e8d0>] prepare_to_wait+0x20/0x70
Oct 7 23:35:07 t-host kernel: [<c012b8c0>] worker_thread+0x0/0x100
Oct 7 23:35:07 t-host kernel: [<c012b964>] worker_thread+0xa4/0x100
Oct 7 23:35:07 t-host kernel: [<c012e720>] autoremove_wake_function+0x0/0x50
Oct 7 23:35:07 t-host kernel: [<c012b8c0>] worker_thread+0x0/0x100
Oct 7 23:35:07 t-host kernel: [<c012e462>] kthread+0x42/0x70
Oct 7 23:35:07 t-host kernel: [<c012e420>] kthread+0x0/0x70
Oct 7 23:35:07 t-host kernel: [<c0103573>] kernel_thread_helper+0x7/0x14
Oct 7 23:35:07 t-host kernel: =======================
Oct 7 23:35:07 t-host kernel: error 1
It looks like some type of sysfs/kobject race in SCSI ... and I think we might have seen it before, just not able to reproduce it reliably.
My bet would be that the LIP, which acts like a reset, occurs in the middle of the scan, so the initialising object is killed on the first scan but is not yet dead, and then can't be re-added on the second scan.
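To make the suspected sequence concrete, here is a toy model of it. This is only a sketch under assumptions: the `SysfsDir` class and its method names are hypothetical and are not kernel APIs; the only thing taken from the trace is that adding a name that already exists fails with -EEXIST.

```python
import errno


class SysfsDir:
    """Toy stand-in for a sysfs directory: at most one entry per name."""

    def __init__(self):
        self.entries = {}  # name -> "live" or "dying"

    def add(self, name):
        # A name that is still present cannot be registered again,
        # even if its previous owner is already being torn down.
        if name in self.entries:
            return -errno.EEXIST
        self.entries[name] = "live"
        return 0

    def start_removal(self, name):
        # Teardown has begun, but the entry has not been unlinked yet.
        self.entries[name] = "dying"

    def finish_removal(self, name):
        # Last reference dropped: the name becomes free again.
        del self.entries[name]


def scan_interrupted_by_reset():
    """Replay the suspected ordering and collect each add() result."""
    d = SysfsDir()
    results = [d.add("1:0:0:0")]       # first scan registers the device
    d.start_removal("1:0:0:0")         # reset kills it in mid-scan
    results.append(d.add("1:0:0:0"))   # second scan races with the teardown
    d.finish_removal("1:0:0:0")
    results.append(d.add("1:0:0:0"))   # works once the old entry is gone
    return results
```

In this model the second add fails only because the old entry has not finished going away, which would also explain why the problem is hard to reproduce reliably.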
Hannes has patches to help with this, but they're rather complex, and not really 2.6.24 material. I could see if there's a simpler fix.
James
On Tue, 13 Nov 2007 09:33:21 -0600 James Bottomley wrote:
On Tue, 2007-11-13 at 03:15 -0800, Andrew Morton wrote:
SCSI==================================================================
qla2xxx: driver initialization does not complete when booting with Port connected http://bugzilla.kernel.org/show_bug.cgi?id=9267 Kernel: 2.6.23.1
No response from developers
Urm, well, if no-one ever tells the SCSI list it's unrealistic to expect anyone to be working on it. As far as I can tell, email was sent to Andrew Vasquez only on 31 October. However, the fault looks to be generic, so he probably just dropped it.
It seems that new SCSI bugs need to be sent to linux-scsi@vger.kernel.org.
Martin, can you arrange that to happen automatically instead of Andrew having to do it manually?
--- ~Randy
pata_pdc202xx_old excessive ATA bus errors http://bugzilla.kernel.org/show_bug.cgi?id=9337 2.6.24-rc2
No response from developers
Untrue. We've been discussing it on list in the past and it's now on bugzilla. Not obvious from outside, I realise. That one I'm afraid is probably a longer term item.
DVD-RAM umount and disk free bug http://bugzilla.kernel.org/show_bug.cgi?id=9265 Kernel: 2.6.15 (asked to try current kernel)
No response from developers
Not actually sure who is looking after this now ?
LPC IT8705 POST port making noise on parallel port http://bugzilla.kernel.org/show_bug.cgi?id=9306 Kernel: 2.6.16+
No response from developers
Not sure who really owns parallel. Have grabbed and will sort out.
FILE SYSTEMS=======================================================
ext4: delalloc space accounting problem drops data http://bugzilla.kernel.org/show_bug.cgi?id=9329 Kernel: 2.6.24-rc1
No response from developers
Actually, there has been a response (Eric asked on the mailing list, created a bug, and got an answer on the mailing list): http://marc.info/?l=linux-ext4&m=119454449014728&w=2
POSIX Access Control Lists cause bogus file system check errors http://bugzilla.kernel.org/show_bug.cgi?id=9241 Kernel: 2.6.23.1
Andreas did some work, seemed to lose interest.
As I read the bug, the cause seems to have been a filesystem with errors (the errors were in the ACLs, so the kernel failed to boot only with ACLs enabled), and fsck fixed the problem... I would close this one as invalid (OK, I know the filesystem had to be corrupted somehow, but unless this is at least occasionally reproducible, there's little chance of finding the bug).
Honza
On Tue, Nov 13, 2007 at 03:15:53AM -0800, Andrew Morton wrote:
On Mon, 12 Nov 2007 22:42:32 -0800 "Natalie Protasevich" protasnb@gmail.com wrote:
PLATFORM===============================================================
xipImage is built so that uBoot cant run it (ARM) http://bugzilla.kernel.org/show_bug.cgi?id=9356 Kernel: 2.6.21
Zero responses from developers
For christ sake Andrew. Some of us are not employed to do kernel work 24h x 365days a year. You might be, I'm not.
First thing, it's not a regression. Second thing, it's *not* a bug.
uboot requires kernel images to be specially wrapped up in their crappy formats before uboot will recognise it. This means that if someone wants to boot a binary image with uboot, they need to either:
1. work out the correct 'mkimage' command and run that program after the kernel build has completed.
2. sort out adding a new target to the kernel makefiles to run this uboot specific 'mkimage' command automatically.
And Alexandre (the original feature-missing reporter) has linked to a message where a patch was proposed to do (2). So obviously it's no longer a problem for the reporter.
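For option (1), the wrapping step can be sketched roughly as follows. The helper name `mkimage_command` is hypothetical, and the load and entry addresses in the usage note are placeholders, not values from the bug report: they are board specific and have to come from the platform's memory map. The option letters are the usual ones of U-Boot's mkimage tool.

```python
def mkimage_command(image, output, load_addr, entry_addr, name="Linux"):
    """Build the argv for wrapping a kernel binary for U-Boot.

    load_addr and entry_addr are board specific; callers must supply
    the values that match their platform.
    """
    return [
        "mkimage",
        "-A", "arm",              # target architecture
        "-O", "linux",            # operating system
        "-T", "kernel",           # image type
        "-C", "none",             # payload is not compressed
        "-a", f"{load_addr:#x}",  # load address
        "-e", f"{entry_addr:#x}", # entry point
        "-n", name,               # image name stored in the header
        "-d", image,              # input binary
        output,                   # wrapped image to write
    ]
```

A caller would pass the result to something like `subprocess.run(cmd, check=True)` once the build has finished, for example with `mkimage_command("arch/arm/boot/xipImage", "uImage", 0x00080000, 0x00080000)`, where both addresses are made up for illustration.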
with CONFIG_NO_HZ and/or CONFIG_HPET_TIMER set kernel 2.6.23 doesn't boot (ARM, Timer) http://bugzilla.kernel.org/show_bug.cgi?id=9229 Kernel: 2.6.23
No response from developers
Bug was assigned to reporter, so I ignored it on the grounds that the reporter was resolving it. Plus, until recently I didn't have any workable PXA systems to test stuff on.
In the end, a similar issue has been resolved anyway after a lot of discussion on the ARM lists about how PXA should handle one-shot mode with clockevents. It took absolutely ages to get agreement on what was a simple patch.
commit 91bc51d8a10b00d8233dd5b6f07d7eb40828b87d
Author: Russell King rmk@dyn-67.arm.linux.org.uk
Date: Thu Nov 8 23:35:46 2007 +0000
[ARM] pxa: fix one-shot timer mode
One-shot timer mode on PXA has various bugs which prevent kernels build with NO_HZ enabled booting. They end up spinning on a permanently asserted timer interrupt because we don't properly clear it down - clearing the OIER bit does not stop the pending interrupt status. Fix this in the set_mode handler as well.
Moreover, the code which sets the next expiry point may race with the hardware, and we might not set the match register sufficiently in the future. If we encounter that situation, return -ETIME so the generic time code retries.
Acked-by: Thomas Gleixner tglx@linutronix.de
Acked-by: Nicolas Pitre nico@cam.org
Signed-off-by: Russell King rmk+kernel@arm.linux.org.uk
Ergo, the bug can be closed provided the reporter re-tests a recent git snapshot. Sorry, no idea how the above commit relates to Linus' releases and/or git snapshots.
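For anyone wondering what the two fixes in that commit message amount to, here is a small simulation of the idea. It is a sketch under assumptions, not the driver code: `OneShotTimer`, `fake_counter`, the 32-bit counter width and the `MIN_DELTA` margin are all hypothetical names and values chosen for illustration.

```python
import errno

MASK = 0xFFFFFFFF   # assume a 32-bit free-running counter
MIN_DELTA = 5       # hypothetical safety margin, in counter ticks


def fake_counter(values):
    """Return a callable that yields successive counter readings."""
    it = iter(values)
    return lambda: next(it)


class OneShotTimer:
    def __init__(self, read_counter):
        self.read_counter = read_counter
        self.match = None
        self.irq_enabled = False
        self.irq_pending = False

    def set_next_event(self, delta):
        # Program the match value relative to the current count.
        self.match = (self.read_counter() + delta) & MASK
        self.irq_enabled = True
        # Read the counter again: it kept running while we were writing.
        remaining = (self.match - self.read_counter()) & MASK
        if remaining >= 0x80000000:
            remaining -= 1 << 32   # interpret as a signed 32-bit difference
        # Too close to (or already past) the match point: ask for a retry.
        return -errno.ETIME if remaining <= MIN_DELTA else 0

    def shutdown(self):
        # Masking the interrupt is not enough; the pending status
        # must be cleared as well or the interrupt stays asserted.
        self.irq_enabled = False
        self.irq_pending = False
```

With readings of 100 and then 103, a delta of 5 leaves only 2 ticks of margin and the call returns -ETIME, while a delta of 1000 returns 0. The signed difference is what keeps the check correct when the counter wraps.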
On Tue, 13 Nov 2007, Andrew Morton wrote:
HID====================================================================
Kernel NULL pointer dereference at :usbhid:hiddev_ioctl+0x2f/0xabc http://bugzilla.kernel.org/show_bug.cgi?id=9216 Kernel: 2.6.23.1 Looks like this is a regression
No response from developers
Hi,
it is assigned to 'other_modules@kernel-bugs.osdl.org', so I didn't notice, it's as simple as that.
Hi!
Suspend to RAM resume hangs on a tickless (NO_HZ) kernel http://bugzilla.kernel.org/show_bug.cgi?id=9275 Kernel: 2.6.23 This is HP notebook nc6320 T2400 945GM
No response from developers
Maybe I'm optimistic, but I expected Ingo/Thomas to look after nohz problems. nohz=off highres=off fixes more than one suspend problem...
...stuff I've seen with NOHZ even without suspend (cursor blinking irregularly) makes me think that nohz perhaps should not be used in production just yet...
Pavel
FWIW, I see the same problem with another HP notebook, DV4378EA with radeon X700 video card. It does not happen frequently but I can say that since I disabled the tickless feature I can't reproduce the problem anymore.
On Nov 14, 2007 2:24 PM, Pavel Machek pavel@ucw.cz wrote:
Hi!
Suspend to RAM resume hangs on a tickless (NO_HZ) kernel http://bugzilla.kernel.org/show_bug.cgi?id=9275 Kernel: 2.6.23 This is HP notebook nc6320 T2400 945GM
No response from developers
Maybe I'm optimistic, but I expected Ingo/Thomas to look after nohz problems. nohz=off highres=off fixes more than one suspend problem...
...stuff I've seen with NOHZ even without suspend (cursor blinking irregularly) makes me think that nohz perhaps should not be used in production just yet...
Pavel
On Wed, Nov 14, 2007 at 01:24:48PM +0000, Pavel Machek wrote:
Hi!
Suspend to RAM resume hangs on a tickless (NO_HZ) kernel http://bugzilla.kernel.org/show_bug.cgi?id=9275 Kernel: 2.6.23 This is HP notebook nc6320 T2400 945GM
No response from developers
Maybe I'm optimistic, but I expected Ingo/Thomas to look after nohz problems. nohz=off highres=off fixes more than one suspend problem...
...stuff I've seen with NOHZ even without suspend (cursor blinking irregularly) makes me think that nohz perhaps should not be used in production just yet...
It appears that bug 9229 has been solved, and the reporter of that bug now says that:
If I unset NO_TZ suspend/resume works. If I set it suspend/resume doesn't works.
So I think this guy is now suffering from bug #9275
Is there any chance this can be distributed individually to the respective lists (i.e. ACPI issues go to their mailing list, ALSA to their mailing list, etc.)?
I just spent the last 35 minutes reading 80 emails, mostly flame throwing, that had absolutely NOTHING to do with the bugs I work on (Alsa, snd-hda-intel to be specific). Besides, a lot of the language is very unprofessional, and some of us have to read these at work.
Thanks,
Tobin Davis Part-time Alsa developer
participants (47)
- Adrian Bunk
- Alan Cox
- Andrew Morton
- Bartlomiej Zolnierkiewicz
- Ben Dooks
- Benoit Boissinot
- Bron Gondwana
- Chuck Ebbert
- Daniel Barkalow
- David Miller
- david@lang.hm
- Denys Vlasenko
- Fabio Comolli
- Frans Pop
- Gabriel C
- Giacomo A. Catenazzi
- Hannes Reinecke
- Ingo Molnar
- J. Bruce Fields
- James Bottomley
- Jan Kara
- Jaroslav Kysela
- Jens Axboe
- Jiri Kosina
- Jörn Engel
- Kok, Auke
- Larry Finger
- Mark Lord
- Matthew Wilcox
- Natalie Protasevich
- Olivier Galibert
- Pavel Machek
- Peter Zijlstra
- Rafael J. Wysocki
- Randy Dunlap
- Randy Dunlap
- Ray Lee
- Rene Herman
- Romano Giannetti
- Russell King
- Sam Ravnborg
- Stephen Hemminger
- Takashi Iwai
- Theodore Tso
- Thomas Gleixner
- Tobin Davis
- Willy Tarreau