Message-ID: <CA+WzARkyhLrntJfZ2cCB+Z5kiiLAB=OzhERgWQ66bVKr++Yk-A@mail.gmail.com>
Date:   Mon, 12 Jul 2021 11:45:00 +0800
From:   zhenguo yao <yaozhenguo1@...il.com>
To:     Jens Axboe <axboe@...nel.dk>
Cc:     oleg@...hat.com, linux-kernel@...r.kernel.org, yaozhenguo@...com
Subject: Re: [PATCH] task_work: return -EBUSY when adding same work

This issue happens in a stress test of memory UE (uncorrectable error)
injection. More than one UE is reported to the OS at the same moment in
the test, so do_machine_check-->queue_task_work is called many times and
the mce_kill_me work is added to the list many times. When mce_kill_me
is added to the list, it becomes the list head. If another mce_kill_me
is then added before task_work_run is called, the list becomes a loop:
task->task_works = mce_kill_me, mce_kill_me->next = mce_kill_me. When
the task wants to return to user mode and runs task_work_run, it loops
forever, so it never returns to user mode and never processes the
SIGBUS signal that mce_kill_me sent to it. I fixed this with the
following patch:
--
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 22791aa..9333696 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1299,7 +1299,9 @@ static void queue_task_work(struct mce *m, int kill_current_task)
        else
                current->mce_kill_me.func = kill_me_maybe;

-       task_work_add(current, &current->mce_kill_me, TWA_RESUME);
+       /* Avoid endless loops when task_work_run is running */
+       if (READ_ONCE(current->task_works) != &current->mce_kill_me)
+               task_work_add(current, &current->mce_kill_me, TWA_RESUME);
 }
--
But I think it is better to return an error from task_work_add when the
same work is added to the list again. Similar problems may happen in
other scenarios, and they are hard to debug when they occur only rarely.
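
To make the failure mode concrete, here is a minimal user-space sketch
of the list behaviour described above. It is not kernel code: struct
work, push_unchecked and push_checked are hypothetical names used only
for illustration, the model is single-threaded, and the list walk in
push_checked is just one way a duplicate could be detected. It does not
claim to be how task_work_add should implement the check.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's callback_head (assumption:
 * only the next pointer matters for this illustration). */
struct work {
    struct work *next;
};

/* Unchecked push onto the list head. Pushing a node that is already
 * on the list overwrites its next pointer and creates a cycle. */
static void push_unchecked(struct work **head, struct work *w)
{
    w->next = *head;
    *head = w;
}

/* Hypothetical checked push: walk the list first and refuse to add a
 * node that is already queued anywhere, not only at the head. */
static int push_checked(struct work **head, struct work *w)
{
    for (struct work *p = *head; p != NULL; p = p->next)
        if (p == w)
            return -EBUSY;

    push_unchecked(head, w);
    return 0;
}
```

Pushing the same node twice with push_unchecked leaves head == &a and
a.next == &a, which is the self-loop described above. If another node b
is pushed after a, the head is b, so a comparison against the head alone
would not notice that a is already queued, while push_checked returns
-EBUSY for it.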

On Mon, Jul 12, 2021 at 10:44 AM Jens Axboe <axboe@...nel.dk> wrote:
>
> On 7/11/21 8:13 PM, zhenguo yao wrote:
> > Yes I hit this condition.  The caller is queue_task_work in
> > arch/x86/kernel/cpu/mce/core.c.
> > It is really a BUG. I have submitted another patch to fix it:
> > https://lkml.org/lkml/2021/7/9/186.
>
> That patch seems broken, what happens if mce_kill_me is added already,
> but it isn't the first work item in the list?
>
> --
> Jens Axboe
>
