Re: [PATCH] driver core: platform: Rename platform_get_irq_optional() to platform_get_irq_silent()
On Thu, Jan 20, 2022 at 08:57:18AM +0100, Uwe Kleine-König wrote:
On Wed, Jan 19, 2022 at 08:51:29PM +0200, Andy Shevchenko wrote:
On Sat, Jan 15, 2022 at 04:45:39PM +0100, Uwe Kleine-König wrote:
On Fri, Jan 14, 2022 at 03:04:38PM +0200, Andy Shevchenko wrote:
On Thu, Jan 13, 2022 at 08:43:58PM +0100, Uwe Kleine-König wrote:
It'd certainly be good to name anything that doesn't correspond to one of the existing semantics for the API (!) something different rather than adding yet another potentially overloaded meaning.
It seems we're (at least) three who agree about this. Here is a patch fixing the name.
And a similar number of people are on the other side.
If someone already opposed the renaming (and not only the name), I must have missed that.
So you think it's a good idea to keep the name platform_get_irq_optional() even though the "not found" value it returns isn't usable as if it were a normal irq number?
I meant that the people on the other side are those in favour of Sergey's patch. I have already commented that I oppose the renaming as a standalone change.
Do you agree that we have several issues with platform_get_irq*() APIs?
- The unfortunate naming
unfortunate naming for the currently implemented semantic, yes.
Yes.
- The vIRQ0 handling: a) WARN() followed by b) returned value 0
I'm happy with the vIRQ0 handling. Today platform_get_irq() and its silent variant return either a valid and usable irq number or a negative error value. That's totally fine.
It might return 0. Actually it seems that the WARN() can only be issued in two cases:
- SPARC with vIRQ0 in one of the array members
- fallback to ACPI for GPIO IRQ resource with index 0
But the latter is bogus, because it would mean a bug in the ACPI code.
The bottom line here is the SPARC case. Anybody familiar with the platform can shed light on this. If there is no such case, we may remove the warning along with the ret = 0 case from platform_get_irq().
- The specific cookie for "IRQ not found, while no error happened" case
Not sure what you mean here. I have no problem that a situation I can cope with is called an error for the query function. I just do error handling and continue happily. So the part "while no error happened" is irrelevant to me.
I meant that instead of using a special error code, returning 0 works well for the case where the IRQ is not found. It allows distinguishing -ENXIO coming from the lower layers from -ENXIO with this magic meaning.
Additionally I see the problems:
- The semantic as implemented in Sergey's patch isn't better than the current one.
I disagree on this. See above on why.
platform_get_irq*() is still considerably different from (clk|gpiod)_get* because the not-found value for the _optional variant isn't usable for the irq case. For clk and gpio I get rid of a whole if branch, for irq I only change the if-condition. (And whether that change is considered good or bad seems to be subjective.)
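A minimal userspace sketch of the two contracts being compared may help here. It assumes hypothetical stand-ins `get_irq_current()` and `get_irq_proposed()`; neither is kernel code, and the IRQ number 42 and the `lookup_err` parameter are made up for illustration. It shows that the caller keeps its `if` branch either way and only the condition changes, and that under the current contract an -ENXIO from a lower layer cannot be told apart from "not found".

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical model of the current contract: > 0 is a usable IRQ,
 * anything else is a negative errno, and "not found" is -ENXIO.
 */
static int get_irq_current(bool present, int lookup_err)
{
	if (lookup_err)
		return lookup_err;
	return present ? 42 : -ENXIO;
}

/* Hypothetical model of the proposed contract: 0 means "not found". */
static int get_irq_proposed(bool present, int lookup_err)
{
	if (lookup_err)
		return lookup_err;
	return present ? 42 : 0;
}

/* Caller written against the current contract. */
static int probe_current(bool present, int lookup_err, bool *use_irq)
{
	int irq = get_irq_current(present, lookup_err);

	if (irq < 0 && irq != -ENXIO)
		return irq;		/* real error */
	*use_irq = irq > 0;		/* -ENXIO: run without an IRQ */
	return 0;
}

/* Caller written against the proposed contract. */
static int probe_proposed(bool present, int lookup_err, bool *use_irq)
{
	int irq = get_irq_proposed(present, lookup_err);

	if (irq < 0)
		return irq;		/* every negative value is an error */
	*use_irq = irq > 0;		/* 0: run without an IRQ */
	return 0;
}
```

In this model `probe_current(true, -ENXIO, ...)` silently continues without an IRQ, while `probe_proposed(true, -ENXIO, ...)` propagates the error; whether that difference matters in practice is exactly what the thread disagrees about.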
You are mixing up two things:
- semantics of the pointer
- semantics of the cookie
Like I said previously, the mistake is in putting an equal sign between apples and oranges (or, in terms of Python, which is a good example here, the None and False objects: in both cases 0 is magic, and Python's `if X` and `while X` will work in the same way, but `type(X)` is different semantically).
For the idea to add a warning to platform_get_irq_optional for all but -ENXIO (and -EPROBE_DEFER), I see the problem:
- platform_get_irq*() issuing an error message is only correct most of the time, and given proper error handling in the caller (which might be able to handle not only -ENXIO but maybe also -EINVAL[1]) the error message is irritating. Today platform_get_irq() emits an error message for all but -EPROBE_DEFER. As soon as we find a driver that handles -EINVAL we need a function platform_get_irq_variant1 to be silent for -EINVAL, -EPROBE_DEFER and -ENXIO (or platform_get_irq_variant2 that is only silent for -EINVAL and -EPROBE_DEFER?)
IMHO a query function should always be silent and let the caller do the error handling. And if it's only because
mydev: IRQ index 0 not found
is worse than
mydev: neither TX irq nor a muxed RX/TX irq found
Also, "index 0" is irritating for devices that are expected to have only a single irq (i.e. the majority of all devices).
Yeah, ack the #5.
Yes, I admit, we can save some code by pushing the error message into a query function. But that doesn't only have advantages.
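The "silent query, caller reports" argument can be sketched in plain C. The helper `lookup_irq_silent()` and the index layout (index 0 is a dedicated TX irq, index 1 is a muxed RX/TX irq) are assumptions made for this illustration and not part of any kernel API. The lookup never prints; only the caller, which knows what it was looking for, formats the message.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical silent lookup: returns a positive IRQ number or a
 * negative errno, and never emits a message itself.
 */
static int lookup_irq_silent(const int *irqs, int count, int index)
{
	if (index < 0 || index >= count || irqs[index] <= 0)
		return -ENXIO;
	return irqs[index];
}

/*
 * Caller: prefer the dedicated TX irq (index 0), fall back to the
 * muxed RX/TX irq (index 1). The error text is written into msg only
 * when both lookups fail.
 */
static int find_tx_irq(const int *irqs, int count, char *msg, size_t msg_len)
{
	int irq;

	irq = lookup_irq_silent(irqs, count, 0);
	if (irq > 0)
		return irq;

	irq = lookup_irq_silent(irqs, count, 1);
	if (irq > 0)
		return irq;

	snprintf(msg, msg_len,
		 "mydev: neither TX irq nor a muxed RX/TX irq found");
	return irq;
}
```

With a noisy lookup, the fallback path would already have logged an "index 0 not found" message for a situation the driver handles perfectly well.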
[1] Looking through the source I wonder: What are the errors that can happen in platform_get_irq*()? (calling everything but a valid irq number an error) Looking at many callers, they only seem to expect "not found" and some "probe defer" (even platform_get_irq() interprets everything but -EPROBE_DEFER as "IRQ index %u not found\n".) IMHO, before we consider introducing a platform_get_irq*() variant with improved semantics, some cleanup in the internals of the irq lookup is necessary.
Hello!
On 1/24/22 6:01 PM, Andy Shevchenko wrote:
It'd certainly be good to name anything that doesn't correspond to one of the existing semantics for the API (!) something different rather than adding yet another potentially overloaded meaning.
It seems we're (at least) three who agree about this. Here is a patch fixing the name.
And a similar number of people are on the other side.
If someone already opposed the renaming (and not only the name), I must have missed that.
So you think it's a good idea to keep the name platform_get_irq_optional() even though the "not found" value it returns isn't usable as if it were a normal irq number?
I meant that the people on the other side are those in favour of Sergey's patch. I have already commented that I oppose the renaming as a standalone change.
Do you agree that we have several issues with platform_get_irq*() APIs?
[...]
- The vIRQ0 handling: a) WARN() followed by b) returned value 0
I'm happy with the vIRQ0 handling. Today platform_get_irq() and its silent variant return either a valid and usable irq number or a negative error value. That's totally fine.
It might return 0. Actually it seems that the WARN() can only be issued in two cases:
- SPARC with vIRQ0 in one of the array members
- fallback to ACPI for GPIO IRQ resource with index 0
You have probably missed the recent discovery that arch/sh/boards/board-aps4*.c causes IRQ0 to be passed as a direct IRQ resource?
But the latter is bogus, because it would mean a bug in the ACPI code.
Worth changing >= 0 to > 0 there, maybe?
The bottom line here is the SPARC case. Anybody familiar with the platform can shed light on this. If there is no such case, we may remove the warning along with the ret = 0 case from platform_get_irq().
I'm afraid you're too fast here... :-) We'll have a really hard time if we continue to allow IRQ0 to be returned by platform_get_irq() -- we'll have to filter it out in the callers then...
- The specific cookie for "IRQ not found, while no error happened" case
Not sure what you mean here. I have no problem that a situation I can cope with is called an error for the query function. I just do error handling and continue happily. So the part "while no error happened" is irrelevant to me.
I meant that instead of using a special error code, returning 0 works well for the case where the IRQ is not found. It allows distinguishing -ENXIO coming from the lower layers from -ENXIO with this magic meaning.
I don't see how -ENXIO can trickle from the lower layers, frankly...
[...]
MBR, Sergey
Hi Sergey,
On Mon, Jan 24, 2022 at 10:02 PM Sergey Shtylyov s.shtylyov@omp.ru wrote:
On 1/24/22 6:01 PM, Andy Shevchenko wrote:
It'd certainly be good to name anything that doesn't correspond to one of the existing semantics for the API (!) something different rather than adding yet another potentially overloaded meaning.
It seems we're (at least) three who agree about this. Here is a patch fixing the name.
And a similar number of people are on the other side.
If someone already opposed the renaming (and not only the name), I must have missed that.
So you think it's a good idea to keep the name platform_get_irq_optional() even though the "not found" value it returns isn't usable as if it were a normal irq number?
I meant that the people on the other side are those in favour of Sergey's patch. I have already commented that I oppose the renaming as a standalone change.
Do you agree that we have several issues with platform_get_irq*() APIs?
[...]
- The vIRQ0 handling: a) WARN() followed by b) returned value 0
I'm happy with the vIRQ0 handling. Today platform_get_irq() and its silent variant return either a valid and usable irq number or a negative error value. That's totally fine.
It might return 0. Actually it seems that the WARN() can only be issued in two cases:
- SPARC with vIRQ0 in one of the array members
- fallback to ACPI for GPIO IRQ resource with index 0
You have probably missed the recent discovery that arch/sh/boards/board-aps4*.c causes IRQ0 to be passed as a direct IRQ resource?
So far no one reported seeing the big fat warning ;-)
The bottom line here is the SPARC case. Anybody familiar with the platform can shed light on this. If there is no such case, we may remove the warning along with the ret = 0 case from platform_get_irq().
I'm afraid you're too fast here... :-) We'll have a really hard time if we continue to allow IRQ0 to be returned by platform_get_irq() -- we'll have to filter it out in the callers then...
So far no one reported seeing the big fat warning?
- The specific cookie for "IRQ not found, while no error happened" case
Not sure what you mean here. I have no problem that a situation I can cope with is called an error for the query function. I just do error handling and continue happily. So the part "while no error happened" is irrelevant to me.
I meant that instead of using a special error code, returning 0 works well for the case where the IRQ is not found. It allows distinguishing -ENXIO coming from the lower layers from -ENXIO with this magic meaning.
I don't see how -ENXIO can trickle from the lower layers, frankly...
It might one day, leading to very hard to track bugs.
Gr{oetje,eeting}s,
Geert
-- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
On Tue, 2022-01-25 at 09:25 +0100, Geert Uytterhoeven wrote:
Hi Sergey,
On Mon, Jan 24, 2022 at 10:02 PM Sergey Shtylyov s.shtylyov@omp.ru wrote:
On 1/24/22 6:01 PM, Andy Shevchenko wrote:
It'd certainly be good to name anything that doesn't correspond to one of the existing semantics for the API (!) something different rather than adding yet another potentially overloaded meaning.
It seems we're (at least) three who agree about this. Here is a patch fixing the name.
And a similar number of people are on the other side.
If someone already opposed the renaming (and not only the name), I must have missed that.
So you think it's a good idea to keep the name platform_get_irq_optional() even though the "not found" value it returns isn't usable as if it were a normal irq number?
I meant that the people on the other side are those in favour of Sergey's patch. I have already commented that I oppose the renaming as a standalone change.
Do you agree that we have several issues with platform_get_irq*() APIs?
[...]
- The vIRQ0 handling: a) WARN() followed by b) returned value 0
I'm happy with the vIRQ0 handling. Today platform_get_irq() and its silent variant return either a valid and usable irq number or a negative error value. That's totally fine.
It might return 0. Actually it seems that the WARN() can only be issued in two cases:
- SPARC with vIRQ0 in one of the array members
- fallback to ACPI for GPIO IRQ resource with index 0
You have probably missed the recent discovery that arch/sh/boards/board-aps4*.c causes IRQ0 to be passed as a direct IRQ resource?
So far no one reported seeing the big fat warning ;-)
FWIW, we had a similar issue with an IRQ resource passed from the tqmx86 MFD driver to the GPIO driver, which we noticed due to this warning, and which was fixed in a946506c48f3bd09363c9d2b0a178e55733bcbb6 and 9b87f43537acfa24b95c236beba0f45901356eb2. I believe these changes are what prompted this whole discussion and led to my "Reported-by" on the patch?
It is not entirely clear to me when IRQ 0 is valid and when it isn't, but the warning seems useful to me. Maybe it would make more sense to warn when such an IRQ resource is registered for a platform device, and not when it is looked up?
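The idea of warning at registration time rather than at lookup time could look roughly like the following userspace sketch. The `toy_resource` type and `toy_register_device()` are hypothetical names invented for this illustration; they only mimic the shape of a platform device's resource table and do not reflect how the driver core actually registers devices.

```c
#include <stddef.h>
#include <stdio.h>

enum toy_res_type { TOY_RES_MEM, TOY_RES_IRQ };

/* Hypothetical, simplified stand-in for a platform resource entry. */
struct toy_resource {
	enum toy_res_type type;
	unsigned long start;
};

/*
 * Hypothetical registration hook: walks the resource table once and
 * complains about every IRQ resource that carries the number 0.
 * Returns the number of warnings so callers (and tests) can check it;
 * the device is still considered registered.
 */
static size_t toy_register_device(const struct toy_resource *res, size_t n)
{
	size_t i, warnings = 0;

	for (i = 0; i < n; i++) {
		if (res[i].type == TOY_RES_IRQ && res[i].start == 0) {
			fprintf(stderr, "resource %zu: IRQ 0 is invalid\n", i);
			warnings++;
		}
	}
	return warnings;
}
```

The check then runs once, in the code that supplied the bad resource, instead of in whichever driver happens to look the IRQ up later.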
My opinion is that it would be very confusing if there are any places in the kernel (on some platforms) where IRQ 0 is valid, but for platform_get_irq() it would suddenly mean "not found". Keeping a negative return value seems preferable to me for this reason.
(An alternative, more involved idea would be to add 1 to all IRQ "cookies", so IRQ 0 would return 1, leaving 0 as a special value. I have absolutely no idea how big the API surface is that would need changes, and it is likely not worth the effort at all.)
The bottom line here is the SPARC case. Anybody familiar with the platform can shed light on this. If there is no such case, we may remove the warning along with the ret = 0 case from platform_get_irq().
I'm afraid you're too fast here... :-) We'll have a really hard time if we continue to allow IRQ0 to be returned by platform_get_irq() -- we'll have to filter it out in the callers then...
So far no one reported seeing the big fat warning?
- The specific cookie for "IRQ not found, while no error happened" case
Not sure what you mean here. I have no problem that a situation I can cope with is called an error for the query function. I just do error handling and continue happily. So the part "while no error happened" is irrelevant to me.
I meant that instead of using a special error code, returning 0 works well for the case where the IRQ is not found. It allows distinguishing -ENXIO coming from the lower layers from -ENXIO with this magic meaning.
I don't see how -ENXIO can trickle from the lower layers, frankly...
It might one day, leading to very hard to track bugs.
As gregkh noted, changing the return value without also making the compile fail will be a huge PITA whenever driver patches are back- or forward-ported, as it would require subtle changes in error paths, which can easily slip through unnoticed, in particular with half-automated stable backports.
Even if another return value like -ENODEV might be better aligned with ...regulator_get_optional() and similar functions, or we even find a way to make 0 usable for this, none of the proposed changes strike me as big enough a win to outweigh the churn caused by making such a change at all.
Kind regards, Matthias
On Tue, Jan 25, 2022 at 01:56:05PM +0100, Matthias Schiffer wrote:
On Tue, 2022-01-25 at 09:25 +0100, Geert Uytterhoeven wrote:
On Mon, Jan 24, 2022 at 10:02 PM Sergey Shtylyov s.shtylyov@omp.ru wrote:
On 1/24/22 6:01 PM, Andy Shevchenko wrote:
...
- The vIRQ0 handling: a) WARN() followed by b) returned value 0
I'm happy with the vIRQ0 handling. Today platform_get_irq() and its silent variant return either a valid and usable irq number or a negative error value. That's totally fine.
It might return 0. Actually it seems that the WARN() can only be issued in two cases:
- SPARC with vIRQ0 in one of the array members
- fallback to ACPI for GPIO IRQ resource with index 0
You have probably missed the recent discovery that arch/sh/boards/board-aps4*.c causes IRQ0 to be passed as a direct IRQ resource?
So far no one reported seeing the big fat warning ;-)
FWIW, we had a similar issue with an IRQ resource passed from the tqmx86 MFD driver to the GPIO driver, which we noticed due to this warning, and which was fixed in a946506c48f3bd09363c9d2b0a178e55733bcbb6 and 9b87f43537acfa24b95c236beba0f45901356eb2.
No, it's not, unfortunately :-( You just band-aided the warning issue, but the root cause is the WARN() and the possibility of seeing a valid (v)IRQ0 in the resources. See below.
I believe these changes are what prompted this whole discussion and led to my "Reported-by" on the patch?
It is not entirely clear to me when IRQ 0 is valid and when it isn't, but the warning seems useful to me. Maybe it would make more sense to warn when such an IRQ resource is registered for a platform device, and not when it is looked up?
My opinion is that it would be very confusing if there are any places in the kernel (on some platforms) where IRQ 0 is valid,
And those places are board files like yours :( They have to be fixed eventually. Ideally by using IRQ domains. At least that's how it's done elsewhere.
but for platform_get_irq() it would suddenly mean "not found". Keeping a negative return value seems preferable to me for this reason.
IRQ 0 is valid, vIRQ0 (or read it as cookie) is not.
Now, the problem in your case is that you are talking about board files, while ACPI and DT never give a resource with vIRQ0. For board files, some (legacy) code decides that it's fine to supply the HW IRQ, while the de facto rule is that platform_get_resource() returns whatever is in the resource, whereas platform_get_irq() should return a cookie.
(An alternative, more involved idea would be to add 1 to all IRQ "cookies", so IRQ 0 would return 1, leaving 0 as a special value. I have absolutely no idea how big the API surface is that would need changes, and it is likely not worth the effort at all.)
This is what IRQ domains do, they start vIRQs from 1.
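As a rough illustration of why starting cookies at 1 frees 0 as a sentinel, here is a toy mapping table in userspace C. It is a sketch under stated assumptions only: `toy_domain`, `toy_map()` and `toy_find()` are invented names and do not correspond to the kernel's irq_domain implementation or API.

```c
#define TOY_MAX_IRQS 8

/* Toy mapping from hardware IRQ numbers to virtual IRQ cookies. */
struct toy_domain {
	unsigned int hwirq[TOY_MAX_IRQS + 1];	/* indexed by virq, slot 0 unused */
	unsigned int next_virq;			/* next free cookie, starts at 1 */
};

static void toy_domain_init(struct toy_domain *d)
{
	d->next_virq = 1;
}

/* Returns the cookie for hwirq, or 0 when no mapping exists. */
static unsigned int toy_find(const struct toy_domain *d, unsigned int hwirq)
{
	unsigned int v;

	for (v = 1; v < d->next_virq; v++)
		if (d->hwirq[v] == hwirq)
			return v;
	return 0;
}

/*
 * Returns the existing cookie for hwirq or allocates a new one.
 * Cookies are always >= 1, so 0 can only mean "table full".
 */
static unsigned int toy_map(struct toy_domain *d, unsigned int hwirq)
{
	unsigned int v = toy_find(d, hwirq);

	if (v)
		return v;
	if (d->next_virq > TOY_MAX_IRQS)
		return 0;
	d->hwirq[d->next_virq] = hwirq;
	return d->next_virq++;
}
```

Hardware IRQ 0 is perfectly mappable in this model (it simply gets cookie 1), which matches the distinction drawn above between a valid IRQ 0 and an invalid vIRQ0.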
The bottom line here is the SPARC case. Anybody familiar with the platform can shed light on this. If there is no such case, we may remove the warning along with the ret = 0 case from platform_get_irq().
I'm afraid you're too fast here... :-) We'll have a really hard time if we continue to allow IRQ0 to be returned by platform_get_irq() -- we'll have to filter it out in the callers then...
So far no one reported seeing the big fat warning?
- The specific cookie for "IRQ not found, while no error happened" case
Not sure what you mean here. I have no problem that a situation I can cope with is called an error for the query function. I just do error handling and continue happily. So the part "while no error happened" is irrelevant to me.
I meant that instead of using a special error code, returning 0 works well for the case where the IRQ is not found. It allows distinguishing -ENXIO coming from the lower layers from -ENXIO with this magic meaning.
I don't see how -ENXIO can trickle from the lower layers, frankly...
It might one day, leading to very hard to track bugs.
As gregkh noted, changing the return value without also making the compile fail will be a huge PITA whenever driver patches are back- or forward-ported, as it would require subtle changes in error paths, which can easily slip through unnoticed, in particular with half-automated stable backports.
Let's not modify the kernel at all then, because in many cases it is a PITA for back- or forward-porting :-)
Even if another return value like -ENODEV might be better aligned with ...regulator_get_optional() and similar functions, or we even find a way to make 0 usable for this, none of the proposed changes strike me as big enough a win to outweigh the churn caused by making such a change at all.
Yeah, let's continue to suffer from an ugly interface and see more band-aids landing around...
participants (4)
- Andy Shevchenko
- Geert Uytterhoeven
- Matthias Schiffer
- Sergey Shtylyov