lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CACT4Y+axb-FagKnftWDkOynT8xWoqOw758Q9t_Whs-bZPqU1Xg@mail.gmail.com>
Date:	Fri, 29 Jan 2016 14:40:33 +0100
From:	Dmitry Vyukov <dvyukov@...gle.com>
To:	syzkaller <syzkaller@...glegroups.com>
Cc:	linux-ide@...r.kernel.org, LKML <linux-kernel@...r.kernel.org>,
	Alan Cox <alan@...rguk.ukuu.org.uk>,
	Jeff Garzik <jgarzik@...hat.com>,
	Sergei Shtylyov <sshtylyov@...mvista.com>,
	Kostya Serebryany <kcc@...gle.com>,
	Alexander Potapenko <glider@...gle.com>,
	Sasha Levin <sasha.levin@...cle.com>
Subject: Re: ata: BUG in ata_sff_hsm_move

On Fri, Jan 29, 2016 at 2:18 PM, Dmitry Vyukov <dvyukov@...gle.com> wrote:
> On Fri, Jan 29, 2016 at 1:23 PM, Tejun Heo <tj@...nel.org> wrote:
>> Hello, Dmitry.
>>
>> On Fri, Jan 29, 2016 at 12:59:49PM +0100, Dmitry Vyukov wrote:
>>> > Hmmm... the port interrupt handler checks for IDLE before calling into
>>> > hsm_move, so the only explanation would be that something is resetting
>>> > it to IDLE inbetween.  ce7514526742 ("libata: prevent HSM state change
>>> > race between ISR and PIO") describes and fixes the same problem.  The
>>> > fix seems correct and I can't find anywhere else where this can
>>> > happen.  :(
>>> >
>>> > Can you please post the kernel log leading to the BUG?  Also, I don't
>>> > think that condition needs to be BUG.  I'll change it to WARN.
>>>
>>> Here are two logs, in both cases no kernel messages prior to the bug:
>>> https://gist.githubusercontent.com/dvyukov/5087d633e3620280b6c7/raw/31c9ab1ced92ac5f85cfb15eaf48ec5793c2c3a1/gistfile1.txt
>>> https://gist.githubusercontent.com/dvyukov/825b2e3d5fb80ae08a9a/raw/03c5a4f4c4bd9d0a304a71cda2da4c92f4b7f1ba/gistfile1.txt
>>
>> lol, this is kinda embarrassing.  It looks like the poll path wasn't
>> doing any locking.  Can you please verify the following patch at least
>> doesn't crash the machine immediately and if so keep it applied to the
>> test kernel so that we can verify that the problem actually goes away?
>
>
> Great that you managed to debug it without a repro!
> I've applied this patch to my tree and will rerun fuzzer. I will
> notify you if I see this warning again.
> Thanks

It now deadlocks at this stack. It is probably not OK to call
ata_sff_hsm_move holding the spinlock.

[    8.168809] NMI backtrace for cpu 2
[    8.168809] CPU: 2 PID: 17 Comm: kworker/2:0 Not tainted 4.5.0-rc1+ #302
[    8.168809] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS Bochs 01/01/2011
[    8.168809] Workqueue: ata_sff ata_sff_pio_task
[    8.168809] task: ffff88006d24c740 ti: ffff88006d090000 task.ti:
ffff88006d090000
[    8.168809] RIP: 0010:[<ffffffff82c0e388>]  [<ffffffff82c0e388>]
delay_tsc+0x18/0x70
[    8.168809] RSP: 0000:ffff88006d097a60  EFLAGS: 00000086
[    8.168809] RAX: 000000008edaeede RBX: ffff880036e88028 RCX: 000000000000001e
[    8.168809] RDX: 0000000000000033 RSI: 000000338edaee9c RDI: 0000000000000001
[    8.168809] RBP: ffff88006d097a60 R08: 0000000000000002 R09: 0000000000000000
[    8.168809] R10: fffffbfff11baf82 R11: 1ffffffff134edc5 R12: ffff880036e88038
[    8.168809] R13: 000000009a9d2d40 R14: ffff880036e88030 R15: 0000000099383d98
[    8.168809] FS:  0000000000000000(0000) GS:ffff88006d600000(0000)
knlGS:0000000000000000
[    8.168809] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[    8.168809] CR2: 00000000ffffffff CR3: 0000000007a4b000 CR4: 00000000000006e0
[    8.168809] Stack:
[    8.168809]  ffff88006d097a70 ffffffff82c0e2da ffff88006d097aa8
ffffffff814661a9
[    8.168809]  ffff880036e88028 0000000000000082 ffff880036e90008
ffff880036e90010
[    8.168809]  0000000000000001 ffff88006d097ad0 ffffffff86652c47
ffffffff83a80c4c
[    8.168809] Call Trace:
[    8.168809]  [<ffffffff82c0e2da>] __delay+0xa/0x10
[    8.168809]  [<ffffffff814661a9>] do_raw_spin_lock+0x149/0x2b0
[    8.168809]  [<ffffffff86652c47>] _raw_spin_lock_irqsave+0xa7/0xd0
[    8.168809]  [<ffffffff83a80c4c>] ? ata_hsm_qc_complete+0x12c/0x420
[    8.168809]  [<ffffffff83a80c4c>] ata_hsm_qc_complete+0x12c/0x420
[    8.168809]  [<ffffffff83a82f42>] ata_sff_hsm_move+0x1e2/0x1c60
[    8.168809]  [<ffffffff83a85f17>] ? ata_sff_pio_task+0x87/0x530
[    8.168809]  [<ffffffff83a86296>] ata_sff_pio_task+0x406/0x530
[    8.168809]  [<ffffffff813a2386>] process_one_work+0x796/0x1440
[    8.168809]  [<ffffffff813a22ba>] ? process_one_work+0x6ca/0x1440
[    8.168809]  [<ffffffff813d6190>] ? finish_task_switch+0x120/0x5f0
[    8.168809]  [<ffffffff813a1bf0>] ? pwq_dec_nr_in_flight+0x2e0/0x2e0
[    8.168809]  [<ffffffff813a310b>] worker_thread+0xdb/0xfc0
[    8.168809]  [<ffffffff813b637f>] kthread+0x23f/0x2d0
[    8.168809]  [<ffffffff813a3030>] ? process_one_work+0x1440/0x1440
[    8.168809]  [<ffffffff813b6140>] ? kthread_create_on_node+0x3b0/0x3b0
[    8.168809]  [<ffffffff813b6140>] ? kthread_create_on_node+0x3b0/0x3b0
[    8.168809]  [<ffffffff8665362f>] ret_from_fork+0x3f/0x70
[    8.168809]  [<ffffffff813b6140>] ? kthread_create_on_node+0x3b0/0x3b0




>> diff --git a/drivers/ata/libata-sff.c b/drivers/ata/libata-sff.c
>> index 608677d..6991efc 100644
>> --- a/drivers/ata/libata-sff.c
>> +++ b/drivers/ata/libata-sff.c
>> @@ -1362,12 +1362,14 @@ static void ata_sff_pio_task(struct work_struct *work)
>>         u8 status;
>>         int poll_next;
>>
>> +       spin_lock_irq(ap->lock);
>> +
>>         BUG_ON(ap->sff_pio_task_link == NULL);
>>         /* qc can be NULL if timeout occurred */
>>         qc = ata_qc_from_tag(ap, link->active_tag);
>>         if (!qc) {
>>                 ap->sff_pio_task_link = NULL;
>> -               return;
>> +               goto out_unlock;
>>         }
>>
>>  fsm_start:
>> @@ -1382,11 +1384,14 @@ static void ata_sff_pio_task(struct work_struct *work)
>>          */
>>         status = ata_sff_busy_wait(ap, ATA_BUSY, 5);
>>         if (status & ATA_BUSY) {
>> +               spin_unlock_irq(ap->lock);
>>                 ata_msleep(ap, 2);
>> +               spin_lock_irq(ap->lock);
>> +
>>                 status = ata_sff_busy_wait(ap, ATA_BUSY, 10);
>>                 if (status & ATA_BUSY) {
>>                         ata_sff_queue_pio_task(link, ATA_SHORT_PAUSE);
>> -                       return;
>> +                       goto out_unlock;
>>                 }
>>         }
>>
>> @@ -1403,6 +1408,8 @@ static void ata_sff_pio_task(struct work_struct *work)
>>          */
>>         if (poll_next)
>>                 goto fsm_start;
>> +out_unlock:
>> +       spin_unlock_irq(ap->lock);
>>  }
>>
>>  /**
>>
>> --
>> You received this message because you are subscribed to the Google Groups "syzkaller" group.
>> To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller+unsubscribe@...glegroups.com.
>> For more options, visit https://groups.google.com/d/optout.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ