linux-kernel - Re: [PATCH] hwmon: xgene: Fix crash when alarm occurs before driver probe

Message-ID: <CAFHUOYzFDXwPdS3OHz=t-iZ6L8uNq1goFafJhDnpU0bAtsEWLw@mail.gmail.com>
Date:   Tue, 6 Sep 2016 23:07:07 -0700
From:   Hoan Tran <hotran@....com>
To:     Guenter Roeck <linux@...ck-us.net>
Cc:     Jean Delvare <jdelvare@...e.com>, linux-hwmon@...r.kernel.org,
        lkml <linux-kernel@...r.kernel.org>,
        Itaru Kitayama <itaru.kitayama@...en.jp>, Loc Ho <lho@....com>,
        Duc Dang <dhdang@....com>
Subject: Re: [PATCH] hwmon: xgene: Fix crash when alarm occurs before driver probe

Hi Guenter,

On Tue, Sep 6, 2016 at 10:50 PM, Guenter Roeck <linux@...ck-us.net> wrote:
> On 09/06/2016 10:21 PM, Hoan Tran wrote:
>>
>> Hi Guenter,
>>
>> Thanks for your quick review!
>>
>> On Tue, Sep 6, 2016 at 9:35 PM, Guenter Roeck <linux@...ck-us.net> wrote:
>>>
>>> On 09/06/2016 08:46 PM, Hoan Tran wrote:
>>>>
>>>>
>>>> The system crashes while probing the xgene-hwmon driver if a temperature
>>>> alarm interrupt is already pending.
>>>> This happens because:
>>>>  - xgene_hwmon_probe() requests the PCC mailbox channel, which also
>>>>    enables the mailbox interrupt.
>>>>  - As the temperature alarm interrupt is pending, the ISR runs and crashes
>>>>    when it accesses an invalid resource, such as the unmapped PCC shared
>>>>    memory.
>>>>
>>>> This patch fixes the issue by saving the alarm message and scheduling the
>>>> bottom handler after xgene_hwmon_probe() finishes.
>>>>
>>>
>>> I am not completely happy with this fix. The main problem I have is that
>>> the processing associated with resp_pending doesn't happen until init_flag
>>> is set. Since the hwmon functions can be called right after
>>> hwmon_device_register_with_groups(), there is now a new race condition
>>> between that call and setting init_flag.
>>
>>
>> I think it's still good if hwmon functions are called right after
>> hwmon_device_register_with_groups().
>> The response message will be queued into FIFO and be processed later.
>>
> Yes, but the call to complete() won't happen in this case, or am I
> missing something?

Yes, I think the xgene_hwmon_rd() and xgene_hwmon_pcc_rd() functions have
to check "init_flag == true" before issuing the read command.
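
A minimal userspace sketch of that check, only to illustrate the idea. The
struct, the helper name, and the -EAGAIN return value are hypothetical
stand-ins and not the driver's code; the real read functions would send a
mailbox command where the sketch fills in a placeholder value.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct xgene_hwmon_dev; only the flag matters here. */
struct hwmon_ctx_sketch {
	bool init_flag;	/* set to true at the end of probe, as in the patch */
};

/*
 * Hypothetical read helper. It refuses to issue a read command until
 * probe has finished; -EAGAIN is an assumed error code.
 */
static int hwmon_rd_sketch(struct hwmon_ctx_sketch *ctx, uint32_t addr,
			   uint32_t *data)
{
	if (!ctx->init_flag)
		return -EAGAIN;

	/* Placeholder for the mailbox round trip: echo the address back. */
	*data = addr;
	return 0;
}
```

Before init_flag is set, the helper returns -EAGAIN and leaves *data
untouched; afterwards it returns 0. In the driver, the flag would also need
the locking discussed in the quoted review below.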

Thanks
Hoan

>
> Guenter
>
>
>>>
>>> I am also a bit concerned with init_flag and rx_pending not being atomic
>>> and protected. What happens if a second interrupt occurs right after
>>> init_flag is set but before (or while) rx_pending is evaluated?
>>
>>
>> Yah, that's a good catch. I can re-use the "kfifo_lock" spinlock for
>> this atomic protection.
>>
>>>
>>> On top of that, init_flag and thus the added complexity is (unless I am
>>> missing something) only needed if ACPI is enabled. Penalizing non-ACPI
>>> code doesn't seem to be optimal.
>>
>>
>> I think, with DT, we still need this flag. In the case of a temperature
>> alarm, the driver needs to set the "temp1_critical_alarm" sysfs attribute.
>> This "temp1_critical_alarm" attribute should be created before
>> "init_flag" is set to true.
>>
>> Thanks
>> Hoan
>>
>>>
>>> How do other drivers handle this situation? This must be a common problem
>>> with all mbox users.
>>>
>>> Thanks,
>>> Guenter
>>>
>>>
>>>> Signed-off-by: Hoan Tran <hotran@....com>
>>>> Reported-by: Itaru Kitayama <itaru.kitayama@...en.jp>
>>>> ---
>>>>  drivers/hwmon/xgene-hwmon.c | 75 +++++++++++++++++++++++++++++++++------------
>>>>  1 file changed, 56 insertions(+), 19 deletions(-)
>>>>
>>>> diff --git a/drivers/hwmon/xgene-hwmon.c b/drivers/hwmon/xgene-hwmon.c
>>>> index bc78a5d..e3b4e84 100644
>>>> --- a/drivers/hwmon/xgene-hwmon.c
>>>> +++ b/drivers/hwmon/xgene-hwmon.c
>>>> @@ -107,6 +107,8 @@ struct xgene_hwmon_dev {
>>>>         struct completion       rd_complete;
>>>>         int                     resp_pending;
>>>>         struct slimpro_resp_msg sync_msg;
>>>> +       bool                    init_flag;
>>>> +       bool                    rx_pending;
>>>>
>>>>         struct work_struct      workq;
>>>>         struct kfifo_rec_ptr_1  async_msg_fifo;
>>>> @@ -465,13 +467,35 @@ static void xgene_hwmon_evt_work(struct work_struct *work)
>>>>         }
>>>>  }
>>>>
>>>> +static int xgene_hwmon_rx_ready(struct xgene_hwmon_dev *ctx, void *msg)
>>>> +{
>>>> +       if (!ctx->init_flag) {
>>>> +               ctx->rx_pending = true;
>>>> +               /* Enqueue to the FIFO */
>>>> +               kfifo_in_spinlocked(&ctx->async_msg_fifo, msg,
>>>> +                                   sizeof(struct slimpro_resp_msg),
>>>> +                                   &ctx->kfifo_lock);
>>>> +               return -EBUSY;
>>>> +       }
>>>> +
>>>> +       return 0;
>>>> +}
>>>> +
>>>>  /*
>>>>   * This function is called when the SLIMpro Mailbox received a message
>>>>   */
>>>>  static void xgene_hwmon_rx_cb(struct mbox_client *cl, void *msg)
>>>>  {
>>>>         struct xgene_hwmon_dev *ctx = to_xgene_hwmon_dev(cl);
>>>> -       struct slimpro_resp_msg amsg;
>>>> +
>>>> +       /*
>>>> +        * While the driver registers with the mailbox framework, an interrupt
>>>> +        * can be pending before the probe function completes its
>>>> +        * initialization. If such condition occurs, just queue up the message
>>>> +        * as the driver is not ready for servicing the callback.
>>>> +        */
>>>> +       if (xgene_hwmon_rx_ready(ctx, msg) < 0)
>>>> +               return;
>>>>
>>>>         /*
>>>>          * Response message format:
>>>> @@ -500,12 +524,8 @@ static void xgene_hwmon_rx_cb(struct mbox_client *cl, void *msg)
>>>>                 return;
>>>>         }
>>>>
>>>> -       amsg.msg   = ((u32 *)msg)[0];
>>>> -       amsg.param1 = ((u32 *)msg)[1];
>>>> -       amsg.param2 = ((u32 *)msg)[2];
>>>> -
>>>>         /* Enqueue to the FIFO */
>>>> -       kfifo_in_spinlocked(&ctx->async_msg_fifo, &amsg,
>>>> +       kfifo_in_spinlocked(&ctx->async_msg_fifo, msg,
>>>>                             sizeof(struct slimpro_resp_msg), &ctx->kfifo_lock);
>>>>         /* Schedule the bottom handler */
>>>>         schedule_work(&ctx->workq);
>>>> @@ -520,6 +540,15 @@ static void xgene_hwmon_pcc_rx_cb(struct mbox_client *cl, void *msg)
>>>>         struct acpi_pcct_shared_memory *generic_comm_base = ctx->pcc_comm_addr;
>>>>         struct slimpro_resp_msg amsg;
>>>>
>>>> +       /*
>>>> +        * While the driver registers with the mailbox framework, an interrupt
>>>> +        * can be pending before the probe function completes its
>>>> +        * initialization. If such condition occurs, just queue up the message
>>>> +        * as the driver is not ready for servicing the callback.
>>>> +        */
>>>> +       if (xgene_hwmon_rx_ready(ctx, &amsg) < 0)
>>>> +               return;
>>>> +
>>>>         msg = generic_comm_base + 1;
>>>>         /* Check if platform sends interrupt */
>>>>         if (!xgene_word_tst_and_clr(&generic_comm_base->status,
>>>> @@ -596,6 +625,17 @@ static int xgene_hwmon_probe(struct platform_device *pdev)
>>>>         platform_set_drvdata(pdev, ctx);
>>>>         cl = &ctx->mbox_client;
>>>>
>>>> +       spin_lock_init(&ctx->kfifo_lock);
>>>> +       mutex_init(&ctx->rd_mutex);
>>>> +
>>>> +       rc = kfifo_alloc(&ctx->async_msg_fifo,
>>>> +                        sizeof(struct slimpro_resp_msg) * ASYNC_MSG_FIFO_SIZE,
>>>> +                        GFP_KERNEL);
>>>> +       if (rc)
>>>> +               goto out_mbox_free;
>>>> +
>>>> +       INIT_WORK(&ctx->workq, xgene_hwmon_evt_work);
>>>> +
>>>>         /* Request mailbox channel */
>>>>         cl->dev = &pdev->dev;
>>>>         cl->tx_done = xgene_hwmon_tx_done;
>>>> @@ -676,17 +716,6 @@ static int xgene_hwmon_probe(struct platform_device *pdev)
>>>>                 ctx->usecs_lat = PCC_NUM_RETRIES * cppc_ss->latency;
>>>>         }
>>>>
>>>> -       spin_lock_init(&ctx->kfifo_lock);
>>>> -       mutex_init(&ctx->rd_mutex);
>>>> -
>>>> -       rc = kfifo_alloc(&ctx->async_msg_fifo,
>>>> -                        sizeof(struct slimpro_resp_msg) * ASYNC_MSG_FIFO_SIZE,
>>>> -                        GFP_KERNEL);
>>>> -       if (rc)
>>>> -               goto out_mbox_free;
>>>> -
>>>> -       INIT_WORK(&ctx->workq, xgene_hwmon_evt_work);
>>>> -
>>>>         ctx->hwmon_dev = hwmon_device_register_with_groups(ctx->dev,
>>>>                                                            "apm_xgene",
>>>>                                                            ctx,
>>>> @@ -697,17 +726,25 @@ static int xgene_hwmon_probe(struct platform_device *pdev)
>>>>                 goto out;
>>>>         }
>>>>
>>>> +       ctx->init_flag = true;
>>>> +       if (ctx->rx_pending) {
>>>> +               /*
>>>> +                * If there is a pending message, schedule the bottom handler
>>>> +                */
>>>> +               schedule_work(&ctx->workq);
>>>> +       }
>>>> +
>>>>         dev_info(&pdev->dev, "APM X-Gene SoC HW monitor driver registered\n");
>>>>
>>>>         return 0;
>>>>
>>>>  out:
>>>> -       kfifo_free(&ctx->async_msg_fifo);
>>>> -out_mbox_free:
>>>>         if (acpi_disabled)
>>>>                 mbox_free_channel(ctx->mbox_chan);
>>>>         else
>>>>                 pcc_mbox_free_channel(ctx->mbox_chan);
>>>> +out_mbox_free:
>>>> +       kfifo_free(&ctx->async_msg_fifo);
>>>>
>>>>         return rc;
>>>>  }
>>>>
>>>
>>
>