On Tue, Jan 25, 2022 at 01:56:05PM +0100, Matthias Schiffer wrote:
On Tue, 2022-01-25 at 09:25 +0100, Geert Uytterhoeven wrote:
On Mon, Jan 24, 2022 at 10:02 PM Sergey Shtylyov s.shtylyov@omp.ru wrote:
On 1/24/22 6:01 PM, Andy Shevchenko wrote:
...
- The vIRQ0 handling: a) WARN() followed by b) returned value 0
I'm happy with the vIRQ0 handling. Today platform_get_irq() and its silent variant return either a valid and usable irq number or a negative error value. That's totally fine.
It might return 0. Actually it seems that the WARN() can only be issued in two cases:
- SPARC with vIRQ0 in one of the array members
- fallback to ACPI for GPIO IRQ resource with index 0
You have probably missed the recent discovery that arch/sh/boards/board-aps4*.c causes IRQ0 to be passed as a direct IRQ resource?
So far no one reported seeing the big fat warning ;-)
FWIW, we had a similar issue with an IRQ resource passed from the tqmx86 MFD driver to the GPIO driver, which we noticed due to this warning, and which was fixed in a946506c48f3bd09363c9d2b0a178e55733bcbb6 and 9b87f43537acfa24b95c236beba0f45901356eb2.
No, it's not, unfortunately :-( You just band aided the warning issue, but the root cause is the WARN() and possibility to see valid (v)IRQ0 in the resources. See below.
I believe these changes are what prompted this whole discussion and led to my "Reported-by" on the patch?
It is not entirely clear to me when IRQ 0 is valid and when it isn't, but the warning seems useful to me. Maybe it would make more sense to warn when such an IRQ resource is registered for a platform device, and not when it is looked up?
My opinion is that it would be very confusing if there are any places in the kernel (on some platforms) where IRQ 0 is valid,
And those places are board files like yours :( They have to be fixed eventually. Ideally by using IRQ domains. At least that's how it's done elsewhere.
but for platform_get_irq() it would suddenly mean "not found". Keeping a negative return value seems preferable to me for this reason.
IRQ 0 is valid, vIRQ0 (or read it as cookie) is not.
Now, the problem in your case is that you are talking about board files, while ACPI and DT never give a resource with vIRQ0. For board files some (legacy) code decides that it's fine to supply a HW IRQ, while the de facto case is that platform_get_resource() returns whatever is in the resource, while platform_get_irq() should return a cookie.
(An alternative, more involved idea would be to add 1 to all IRQ "cookies", so IRQ 0 would return 1, leaving 0 as a special value. I have absolutely no idea how big the API surface is that would need changes, and it is likely not worth the effort at all.)
This is what IRQ domains do, they start vIRQs from 1.
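To illustrate the point about vIRQs starting from 1, here is a minimal userspace sketch. It is not kernel code: `toy_domain`, `toy_create_mapping()` and `TOY_NR_VIRQS` are hypothetical names invented for this example, and the real IRQ domain code is considerably more involved. The only property it is meant to show is that hardware IRQ 0 can be perfectly valid while the cookie 0 is never handed out and stays free to mean "no mapping".

```c
#define TOY_NR_VIRQS 8	/* arbitrary table size for this sketch */

/* Hypothetical userspace model; not the kernel's struct irq_domain. */
struct toy_domain {
	int used[TOY_NR_VIRQS];		/* used[v] != 0 if vIRQ v is mapped */
	int hwirq[TOY_NR_VIRQS];	/* hardware IRQ behind vIRQ v */
};

/*
 * Return the vIRQ (cookie) for hwirq, creating a mapping if needed.
 * Slot 0 is never allocated, so 0 can only mean "no mapping".
 */
unsigned int toy_create_mapping(struct toy_domain *d, int hwirq)
{
	unsigned int v;

	/* Reuse an existing mapping for this hardware IRQ, if any. */
	for (v = 1; v < TOY_NR_VIRQS; v++)
		if (d->used[v] && d->hwirq[v] == hwirq)
			return v;

	/* Otherwise take the first free slot, starting at 1. */
	for (v = 1; v < TOY_NR_VIRQS; v++) {
		if (!d->used[v]) {
			d->used[v] = 1;
			d->hwirq[v] = hwirq;
			return v;
		}
	}

	return 0;	/* table full: no cookie available */
}
```

With an empty table, mapping hardware IRQ 0 yields cookie 1, so a board file that went through such a mapping would never hand a 0 to platform_get_irq() callers.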
The bottom line here is the SPARC case. Anybody familiar with the platform can shed light on this. If there is no such case, we may remove the warning along with the ret = 0 case from platform_get_irq().
I'm afraid you're too fast here... :-) We'll have a really hard time if we continue to allow IRQ0 to be returned by platform_get_irq() -- we'll have to filter it out in the callers then...
So far no one reported seeing the big fat warning?
- The specific cookie for "IRQ not found, while no error happened" case
Not sure what you mean here. I have no problem that a situation I can cope with is called an error for the query function. I just do error handling and continue happily. So the part "while no error happened" is irrelevant to me.
I meant that instead of using a special error code, 0 is very much good for the cases when the IRQ is not found. It allows distinguishing -ENXIO from the low layer from -ENXIO with this magic meaning.
I don't see how -ENXIO can trickle from the lower layers, frankly...
It might one day, leading to very hard to track bugs.
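The disagreement above can be made concrete with a small userspace model. Everything here is an assumption made for illustration: `toy_res`, `toy_get_irq()` and `toy_get_irq_optional()` are hypothetical names, not the kernel functions, and the `lookup_err` field merely simulates a lower layer that fails with an errno of its own. The sketch contrasts the convention where "absent" is reported as -ENXIO with the proposed one where "absent" is reported as 0.

```c
#include <errno.h>

/* Hypothetical description of one IRQ lookup; not a kernel structure. */
struct toy_res {
	int present;	/* non-zero if an IRQ is described at all */
	int virq;	/* the cookie (> 0) when present */
	int lookup_err;	/* negative errno from a simulated lower layer, or 0 */
};

/* Convention A: "absent" is reported as -ENXIO. */
int toy_get_irq(const struct toy_res *r)
{
	if (r->lookup_err)
		return r->lookup_err;
	if (!r->present)
		return -ENXIO;
	return r->virq;
}

/* Convention B (as proposed in the thread): "absent" is reported as 0. */
int toy_get_irq_optional(const struct toy_res *r)
{
	if (r->lookup_err)
		return r->lookup_err;
	if (!r->present)
		return 0;
	return r->virq;
}
```

If the simulated lower layer ever fails with -ENXIO, convention A gives the caller the same value as for a missing IRQ, whereas convention B still separates the two (0 versus -ENXIO). Whether a real lower layer can produce -ENXIO is exactly the point under dispute, so the model takes no position on that.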
As gregkh noted, changing the return value without also making the compile fail will be a huge PITA whenever driver patches are back- or forward-ported, as it would require subtle changes in error paths, which can easily slip through unnoticed, in particular with half-automated stable backports.
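A rough userspace sketch of that porting hazard follows, under loud assumptions: `port_lookup_old()`, `port_lookup_new()` and `port_setup_old_style()` are hypothetical stand-ins, not kernel APIs, and the "new" convention is the 0-means-absent proposal from this thread. The error path is written for the old convention; it compiles unchanged against the new one, yet takes a different branch.

```c
#include <errno.h>

enum port_mode { PORT_POLLING, PORT_IRQ };

/* Hypothetical lookups for an absent optional IRQ under each convention. */
int port_lookup_old(void) { return -ENXIO; }	/* old: absent is -ENXIO */
int port_lookup_new(void) { return 0; }		/* new: absent is 0 */

/*
 * Driver error path written against the old convention.
 * Returns 0 on success and stores the chosen mode, or a negative errno.
 */
int port_setup_old_style(int irq, enum port_mode *mode)
{
	if (irq == -ENXIO) {
		*mode = PORT_POLLING;	/* no IRQ described: poll instead */
		return 0;
	}
	if (irq < 0)
		return irq;		/* genuine error */

	*mode = PORT_IRQ;		/* would go on to request this IRQ */
	return 0;
}
```

Fed with `port_lookup_old()`, the function selects polling. Fed with `port_lookup_new()`, the same unmodified code selects interrupt mode with IRQ 0, and nothing at compile time flags the difference.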
Let's not modify kernel at all then, because in many cases it is a PITA for back- or forward-porting :-)
Even if another return value like -ENODEV might be better aligned with ...regulator_get_optional() and similar functions, or we even find a way to make 0 usable for this, none of the proposed changes strike me as big enough a win to outweigh the churn caused by making such a change at all.
Yeah, let's continue to suffer from an ugly interface and see more band aids landing around...