[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <5c0d8998-6cb2-5ce5-e3a5-67237a51049f@kontron.com>
Date: Mon, 8 Sep 2025 16:54:50 +0200
From: Gilles BULOZ <gilles.buloz@...tron.com>
To: corey@...yard.net
Cc: Linux Kernel <linux-kernel@...r.kernel.org>,
OpenIPMI Developers <openipmi-developer@...ts.sourceforge.net>
Subject: Re: user->nr_msgs going negative in ipmi_msghandler.c
Le 05/09/2025 à 20:50, Corey Minyard a écrit :
> I'm adding the OpenIPMI mailing list and the LKML.
>
> On Fri, Sep 05, 2025 at 05:04:28PM +0200, Gilles BULOZ wrote:
>> Le 05/09/2025 à 15:15, Gilles BULOZ a écrit :
>>> Le 05/09/2025 à 14:03, Corey Minyard a écrit :
>>>> On Fri, Sep 05, 2025 at 06:54:23AM -0500, Corey Minyard wrote:
>>>>> On Fri, Sep 05, 2025 at 10:16:19AM +0200, Gilles BULOZ wrote:
>>>>>> Hi Corey,
>>>>>>
>>>>>> I'm HW/SW developer at Kontron (computer manufacturer) and don't know what to
>>>>>> think about an issue with user->nr_msgs going negative in ipmi_msghandler.c.
>>>>>> Not sure if it's a weakness in ipmi_msghandler.c or a bug in the IPMC software
>>>>>> of our computer board (satellite) with event messages.
>>>>> I worked with people from Kontron a long time ago. Those were good
>>>>> times :).
>>>>>
>>>>>> I see this when I run ipmitool on this board while some event messages already
>>>>>> available. In this case deliver_response() is processing the messages and is
>>>>>> decreasing user->nr_msgs below 0. Then when ipmitool calls
>>>>>> ioctl(IPMICTL_SEND_COMMAND) it gets an error with errno=EBUSY because in
>>>>>> i_ipmi_request() user->nr_msgs is incremented but still negative, casted to
>>>>>> unsigned int so becomes huge, and found greater than max_msgs_per_user (100).
>>>>> Thanks for the detailed description. The fix for the bug is:
>>>>>
>>>>> diff --git a/drivers/char/ipmi/ipmi_msghandler.c b/drivers/char/ipmi/ipmi_msghandler.c
>>>>> index e12b531f5c2f..ba33070622b1 100644
>>>>> --- a/drivers/char/ipmi/ipmi_msghandler.c
>>>>> +++ b/drivers/char/ipmi/ipmi_msghandler.c
>>>>> @@ -1634,6 +1634,7 @@ int ipmi_set_gets_events(struct ipmi_user *user, bool val)
>>>>>
>>>>> list_for_each_entry_safe(msg, msg2, &msgs, link) {
>>>>> msg->user = user;
>>>>> + atomic_add(1, &usr->nr_msgs);
>>>> Sorry, that should obviously be user->nr_msgs
>>> Thanks very much !
>>> I've tried it with kernel 6.11.8 and it's better but still not enough.
>>> Running "ipmitool sensor" with some debug in ipmi_msghandler.c shows that I always have nr_msgs=1 right after the atomic_add (called
>>> 9 times), then when in i_ipmi_request() I reach line "rv = -EBUSY;" with nr_msgs=-2 (twice).
>> My understanding is that ipmitool calls ioctl(IPMICTL_SET_GETS_EVENTS_CMD) calling ipmi_set_gets_events() and thanks to your patch
>> for each event available the nr_msgs is incremented here and decremented in deliver_response(). So your patch is OK for that.
>> But after that if other events are coming, deliver_response() is called and nr_msgs gets decremented. So when ipmitool calls
>> ioctl(IPMICTL_SEND_COMMAND), this calls i_ipmi_request() with nr_msgs < 0 and that returns -EBUSY
>
> Yeah, after I sent my email I started looking through the driver for
> other issues around this, and there were several. That change wasn't
> well thought through.
>
> So, I've added some tests around this in my test suite, and I've
> reworked this to be a much better implementation that's harder to get
> wrong.
>
> I'm going to send the fix in a separate email.
>
> Thanks,
>
> -corey
>
Thanks Corey
I confirm your fix (separate email) is working for me. Applied on kernel 6.17-rc5 sources, with few changes under drivers/char/ipmi/
to build the kernel modules there for kernel 6.11.8, and then using these modules.
Another simple fix that worked for me on 6.11.8 was to replace atomic_dec() with atomic_dec_if_positive() in deliver_response), in
addition to your suggested change to add an atomic_add() to ipmi_set_gets_events().
>>>>> kref_get(&user->refcount);
>>>>> deliver_local_response(intf, msg);
>>>>> }
>>>>>
>>>>>
>>>>> Can you try that out?
>>>>>
>>>>> I'll forward port this to current kernel (if appropriate, this has all
>>>>> been rewritten) and required a backport.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> -corey
>>>>>
>>>>>> As workaround I set module parameter max_msgs_per_user to 0xffffffff so that
>>>>>> user->nr_msgs is never greater than this value.
>>>>>> I was using kernel 6.11.8 but updated to 6.16.3 and get the same issue.
>>>>>> I'm also not sure if our board is supposed to receive such event messages as
>>>>>> it is not Shelf Manager, even if these events come from the local sensors of
>>>>>> the board.
>>>>>>
>>>>>> Best regards
>>>>>>
>>>>>> Gilles Buloz
>>>>>> Kontron Modular Computers France
>>>>>> R&D SW/HW senior developer
>>>>>>
>>>> .
> .
Powered by blists - more mailing lists