lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 16 Oct 2017 18:08:37 +0100
From:   Dean Jenkins <Dean_Jenkins@...tor.com>
To:     Ronald Tschalär <ronald@...ovation.ch>,
        Marcel Holtmann <marcel@...tmann.org>,
        Gustavo Padovan <gustavo@...ovan.org>,
        Johan Hedberg <johan.hedberg@...il.com>
CC:     Lukas Wunner <lukas@...ner.de>, <linux-bluetooth@...r.kernel.org>,
        <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] Bluetooth: hci_ldisc: Allow sleeping while proto locks
 are held.

Hi Ronald,

On 15/10/17 11:40, =?UTF-8?q?Ronald=20Tschal=C3=A4r?= wrote:
> Lastly, the read-lock in the hci_uart_tx_wakeup() callback now needed
> to be removed because that function is called from an IRQ context. But
> since it doesn't actually call any proto functions, instead just
> queueing the work, and the other operations are atomic bit operations,
> this is ok.
You are discussing my commits so I'll try to answer your queries. I am a 
bit busy but I'll attempt to answer you in a timely manner.

Sorry, I know nothing about bcm, I will explain using BCSP scenarios.

I use SMP ARM based embedded systems in a commercial environment. These 
systems can suffer heavy CPU loading which can expose race conditions. 
The crashes are very rare in a PC environment.

I will try to explain why the locking is needed in hci_uart_tx_wakeup().

For reference, I am using the HEAD of Linux 4.14-rc4

There is at least 1 race condition to consider. The atomic variables do 
not help because the codeblock as a whole is not atomic. Hence locking 
is needed as follows.

The issue is that hci_uart_tty_close() runs asynchronously to 
hci_uart_tx_wakeup() which is shown below (assuming a SMP system).

int hci_uart_tx_wakeup(struct hci_uart *hu)
{
         read_lock(&hu->proto_lock);

In parallel, hci_uart_tty_close() can be running here:

         if (!test_bit(HCI_UART_PROTO_READY, &hu->flags))
                 goto no_schedule;

In parallel, hci_uart_tty_close() can be running here:

         if (test_and_set_bit(HCI_UART_SENDING, &hu->tx_state)) {

In parallel, hci_uart_tty_close() can be running here:

                 set_bit(HCI_UART_TX_WAKEUP, &hu->tx_state);
                 goto no_schedule;
         }

         BT_DBG("");

In parallel, hci_uart_tty_close() can be running here:

         schedule_work(&hu->write_work);

In parallel, hci_uart_tty_close() can be running here:

no_schedule:
         read_unlock(&hu->proto_lock);

         return 0;
}

hci_uart_tty_close() progressively removes the resources needed by the 
scheduled work queue hu->write_work which runs hci_uart_write_work(). 
Also there is a delay between the scheduling and hci_uart_write_work 
actually running. This means hci_uart_write_work() can race with 
hci_uart_tty_close(), sometimes causing hci_uart_tty_close() to crash, 
for example due to the write buffer no longer being there.

static void hci_uart_write_work(struct work_struct *work)
{
<snipped>
         set_bit(TTY_DO_WRITE_WAKEUP, &tty->flags);
         len = tty->ops->write(tty, skb->data, skb->len);    <== can 
crash inside the write function pointer as the write buffer is no longer 
valid
         hdev->stat.byte_tx += len;    <== can crash here as hdev now 
invalid
<snipped>
}

Therefore, a robust solution is to lock out hci_uart_tty_close() when 
hci_uart_tx_wakeup() runs, that is the reason why 
read_lock(&hu->proto_lock) is used in hci_uart_write_work(). The atomic 
flag HCI_UART_PROTO_READY is prevented from being cleared whilst 
hci_uart_tx_wakeup() runs.

In fact, hci_uart_tty_close() is really a bit of a mess because it 
progressively removes resources. It is is based on old code which has 
been patched over the many years. Therefore it has kept code which is 
probably obsolete or duplicated. Ideally, hci_uart_tty_close() should be 
rewritten.

For example, there is a call
cancel_work_sync(&hu->write_work);
in  hci_uart_tty_close() which at first glance seems to be helpful but 
it is flawed because hci_uart_tx_wakeup() can schedule a new work item 
AFTER cancel_work_sync(&hu->write_work) runs. Therefore, locking is 
needed to prevent hci_uart_tx_wakeup() from being scheduled whilst 
HCI_UART_PROTO_READY is being cleared in hci_uart_tty_close().

In the case of BCSP, hci_bscp.c calls hci_uart_tx_wakeup() to schedule 
the transmission of BCSP frames. This includes BCSP retransmissions. 
Therefore, the scenario where a BCSP retransmission is being scheduled 
whilst hci_uart_tty_close() runs needs to be protected to avoid crashes. 
BCSP has a re-transmission timer which can expire whilst 
hci_uart_tty_close() runs.

In older kernels without my commits, when BCSP attempts to transmit 
during the execution of hci_uart_tty_close(), it can result in crashes 
as outlined above. The flag HCI_UART_PROTO_READY now prevents 
transmission and error messages can sometimes be seen from 
hci_uart_send_frame() because transmission is prevented.

It is possible that my commits might have been too robust so perhaps 
some simplification is possible.

Feel free to send me further questions about my commits. I will try to 
answer your queries within a couple of days each time.

Regards,
Dean

-- 
Dean Jenkins
Embedded Software Engineer
Linux Transportation Solutions
Mentor Embedded Software Division
Mentor Graphics (UK) Ltd.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ