linux-kernel - Re: [PATCH RFC stable 4.14 1/1] mmc: core: fix hung task caused by race condition on context


Message-ID: <CAPDyKFo_izPD7z-GmSEZ_8H_AX+KiVuLqN7JcD2Kdjjuukk-7g@mail.gmail.com>
Date:   Thu, 29 Sep 2022 14:41:26 +0200
From:   Ulf Hansson <ulf.hansson@...aro.org>
To:     "dinggao.pan" <dinggao.pan@...izon.ai>
Cc:     "bigeasy@...utronix.de" <bigeasy@...utronix.de>,
        "tglx@...utronix.de" <tglx@...utronix.de>,
        "rostedt@...dmis.org" <rostedt@...dmis.org>,
        "ming.yu" <ming.yu@...izon.ai>,
        "yunqian.wang" <yunqian.wang@...izon.ai>,
        "linux-mmc@...r.kernel.org" <linux-mmc@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-rt-users@...r.kernel.org" <linux-rt-users@...r.kernel.org>
Subject: Re: [PATCH RFC stable 4.14 1/1] mmc: core: fix hung task caused by
 race condition on context_info

On Mon, 5 Sept 2022 at 08:22, dinggao.pan <dinggao.pan@...izon.ai> wrote:
>
> Hi,
> After applying the RT patches to our 4.14 kernel and enabling PREEMPT_RT, we hit a hung task during boot, caused by a race condition on the context_info stored in struct mmc_host.
> From our investigation, context_info should not be changed by threads that have not claimed the host, hence the following fix.
>
> Any comments are much appreciated.
> Dinggao Pan

Hi Dinggao,

Apologies for the delay.

The 4.14 kernel is too old for me to be able to comment. In
particular, the mmc block layer moved to blk-mq in v4.16, which means
the path you are investigating doesn't exist any more, sorry.

Kind regards
Uffe

>
> From: "Dinggao Pan" <dinggao.pan@...izon.ai>
>
> A race condition happens under the following circumstances:
>     (mmc_thread1)               |              (mmc_thread2)
>     mmc_issue_rq(req1)          |
>       > qcnt++ for req1         |
>         host handling req1      |
>     mmc_queue_thread(req=null)  |
>       > enter queue thread      |
>         again, fetches blk req  |
>         (return null), sets     |
>         is_waiting_last_req 1   |  mmc_request_fn(req1) -> set is_new_req 1
>                                 |                   and wake_up wait_queue
>     mmc_issue_rq(req2)          |   > mmc_thread2 tries to claim host
>       > **qcnt++ for req2**     |
>       mmc_finalize_req(req2)    |
>         > should wait for req1  |
>           done but req2 return  |
>           MMC_BLK_NEW_REQ       |
>           due to is_new_req     |
>           already set to 1      |
>                                 |
>                                 |
>     req1 done                   |
>       > qcnt-- for req1         |
>     mmc_issue_rq(req3)          |
>       > qcnt++ for req3         |
> req2 is not handled, but qcnt has already been incremented for it (marked
> with ** above), so mmc_thread1 never releases the host and every mmc
> thread other than thread1 hangs. Fix the race by moving the wake_up ahead
> of the context_info update.
>
> Reviewed-by: Yunqian Wang <yunqian.wang@...izon.ai>
> Signed-off-by: Dinggao Pan <dinggao.pan@...izon.ai>
> Signed-off-by: Ming Yu <ming.yu@...izon.ai>
> ---
> drivers/mmc/core/queue.c | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
> index 0a4e77a5b..58318c102 100644
> --- a/drivers/mmc/core/queue.c
> +++ b/drivers/mmc/core/queue.c
> @@ -107,6 +107,11 @@ static void mmc_request_fn(struct request_queue *q)
>                return;
>       }
>
> +      if (mq->asleep) {
> +               wake_up_process(mq->thread);
> +               return;
> +      }
> +
>       cntx = &mq->card->host->context_info;
>
>       if (cntx->is_waiting_last_req) {
> @@ -114,8 +119,6 @@ static void mmc_request_fn(struct request_queue *q)
>                wake_up_interruptible(&cntx->wait);
>       }
>
> -       if (mq->asleep)
> -                wake_up_process(mq->thread);
> }
>
> static struct scatterlist *mmc_alloc_sg(int sg_len, gfp_t gfp)
> --
> 2.36.1
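
The interleaving described in the commit message can be replayed in a small single-threaded user-space model. This is only a sketch built from the description above, not kernel code: struct ctx_model, model_request_fn(), model_wait_saw_new_req() and replay_interleaving() are hypothetical names, the counter handling is a simplification of what the 4.14 queue code does, and the model assumes that mmc_thread2's own queue thread is asleep when its request_fn runs. The "patched" flag selects the ordering proposed in the diff, where the sleeping thread is woken and the function returns before context_info is touched.

```c
#include <stdbool.h>

/*
 * Simplified, hypothetical model of the state named in the commit message.
 * These are NOT the kernel's structures; only the field names are borrowed.
 */
struct ctx_model {
	bool is_waiting_last_req; /* owner found no new block request */
	bool is_new_req;          /* "a new request arrived" notification */
	int qcnt;                 /* requests counted as in flight */
};

/*
 * Models mmc_request_fn() running for a second queue that shares the host.
 * 'asleep' says whether that queue's own thread is sleeping; 'patched'
 * selects the ordering proposed in the diff (wake the thread and return
 * before context_info is touched).
 */
static void model_request_fn(struct ctx_model *c, bool asleep, bool patched)
{
	if (patched && asleep)
		return; /* only wake_up_process(mq->thread) would happen here */

	if (c->is_waiting_last_req)
		c->is_new_req = true; /* the write the host owner does not expect */
}

/*
 * Models the owner waiting for the previous request: returns true when it
 * bails out early because is_new_req was set (the MMC_BLK_NEW_REQ case).
 */
static bool model_wait_saw_new_req(struct ctx_model *c)
{
	if (!c->is_new_req)
		return false;
	c->is_new_req = false;
	return true;
}

/* Replays the interleaving from the table and returns the final qcnt. */
int replay_interleaving(bool patched)
{
	struct ctx_model c = { false, false, 0 };
	bool req2_skipped;

	c.qcnt++;                             /* mmc_issue_rq(req1) */
	c.is_waiting_last_req = true;         /* queue thread fetched NULL */
	model_request_fn(&c, true, patched);  /* mmc_thread2's request_fn */

	c.qcnt++;                             /* mmc_issue_rq(req2) */
	req2_skipped = model_wait_saw_new_req(&c);

	c.qcnt--;                             /* req1 done */
	if (!req2_skipped)
		c.qcnt--;                     /* req2 done */

	c.qcnt++;                             /* mmc_issue_rq(req3) */
	c.qcnt--;                             /* req3 done */

	return c.qcnt;                        /* nonzero: host is never released */
}
```

In this model replay_interleaving(false) returns 1, which corresponds to the leaked count for req2, while replay_interleaving(true) returns 0. It says nothing about whether the reordering is safe in the real 4.14 code paths, only that the accounting in the description is self-consistent.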