lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [day] [month] [year] [list]
Date:   Mon, 12 Mar 2018 00:42:14 +0800
From:   焦晓冬 <milestonejxd@...il.com>
To:     linux-kernel@...r.kernel.org
Cc:     mingo@...e.hu
Subject: finish_wait with list_empty_careful maybe fragile or buggy

void finish_wait(struct wait_queue_head *wq_head, struct
wait_queue_entry *wq_entry)
{
    ....
->    if (!list_empty_careful(&wq_entry->entry)) {
->        spin_lock_irqsave(&wq_head->lock, flags);
->        list_del_init(&wq_entry->entry);
->        spin_unlock_irqrestore(&wq_head->lock, flags);
->    }
}

After careful look into the stop/wakeup code, I found the above code
fragile or even
buggy. This code was introduced at least 14 years ago and seems
fragile or buggy now
after years of study on SMP synchronization by us.

I understand this code are being used a lot and no bug seems to
emerge. But, as I'll
explain, it depends on a lot of unreliable implementation details.

Firstly,

static inline int list_empty_careful(const struct list_head *head)
{
    struct list_head *next = head->next;
    return (next == head) && (next == head->prev);
}

note that the read of head->next & head->prev is not protected by READ_ONCE. So
the read is free to be optimized out entirely. Luckily, this
optimization is hard for compilers
now, since all other accesses are out of finish_wait. And still, GCC
won't spit aligned
accesses into multiple instructions so it is atomic so far.

Secondly,

if ( ! list_empty_careful(&wq_entry->entry) ) {
     <remove entry with spinning-lock>
}
<ends stack frame of the function calling finish_wait>
<overwrites wq_entry with another frame>

and

__wake_up_common()   -->
<read wq_entry->func>  -->
<read wq_entry->flags>  -->
autoremove_wake_function()  -->
<remove wq_entry->entry from wait queue>  -->

are not properly ordered for SMP so that  <read wq_entry->flags> may
be reordered after
<remove wq_entry->entry from wait queue> since no dependency or memory
barrier forbids
it. This may cause <overwrites wq_entry with another frame> on one CPU
takes place before
<read wq_entry->flags> on another CPU and cause <read wq_entry->flags>
to return bad
value.

This behavior is not reported may thank to:
- quite a few code is using autoremove_wake_function
- CPU pipeline is not as deep as to make this emerge

Best regards,
Patrick Trol

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ