linux-kernel - Re: Deadlocks due to per-process plugging

Message-ID: <x49ehoii8ps.fsf@segfault.boston.devel.redhat.com>
Date:	Wed, 11 Jul 2012 12:05:51 -0400
From:	Jeff Moyer <jmoyer@...hat.com>
To:	Jan Kara <jack@...e.cz>
Cc:	LKML <linux-kernel@...r.kernel.org>, linux-fsdevel@...r.kernel.org,
	Tejun Heo <tj@...nel.org>, Jens Axboe <jaxboe@...ionio.com>
Subject: Re: Deadlocks due to per-process plugging

Jan Kara <jack@...e.cz> writes:

>   Hello,
>
>   we've recently hit a deadlock in our QA runs which is caused by the
> per-process plugging code. The problem is as follows:
>   process A					process B (kjournald)
>   generic_file_aio_write()
>     blk_start_plug(&plug);
>     ...
>     somewhere in here we allocate memory and
>     direct reclaim submits buffer X for IO
>     ...
>     ext3_write_begin()
>       ext3_journal_start()
>         we need more space in a journal
>         so we want to checkpoint old transactions,
>         we block waiting for kjournald to commit
>         a currently running transaction.
> 						journal_commit_transaction()
> 						  wait for IO on buffer X
> 						  to complete as it is part
> 						  of the current transaction
>
>   => deadlock since A waits for B and B waits for A to do unplug.
> BTW: I don't think this is really ext3/ext4 specific. I think other
> filesystems can get into problems as well when direct reclaim submits some
> IO and the process subsequently blocks without submitting the IO.
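
For illustration only, the scenario above can be reduced to a small
user-space model. This is a sketch under stated assumptions, not kernel
code: every toy_* name is hypothetical, and the model only encodes the
three dependencies described in the report (buffer X sits in A's plug,
B waits for X, A waits for B). It shows that A makes progress only if
going to sleep flushes the plug.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the report; all names are hypothetical, not kernel API. */
enum toy_where { TOY_IN_PLUG, TOY_DISPATCHED, TOY_DONE };

struct toy_state {
    enum toy_where buffer_x; /* where the I/O for buffer X currently sits */
    bool commit_done;        /* has B (kjournald) finished its commit? */
    bool a_running;          /* has A got past ext3_journal_start()? */
};

/* Apply one transition of the model. Returns true if anything changed. */
static bool toy_step(struct toy_state *s, bool flush_on_sleep)
{
    /* A is blocked; if sleeping flushes the plug, X reaches the queue. */
    if (!s->a_running && flush_on_sleep && s->buffer_x == TOY_IN_PLUG) {
        s->buffer_x = TOY_DISPATCHED;
        return true;
    }
    /* The device completes I/O that has been dispatched. */
    if (s->buffer_x == TOY_DISPATCHED) {
        s->buffer_x = TOY_DONE;
        return true;
    }
    /* B can commit once the I/O on buffer X has completed. */
    if (s->buffer_x == TOY_DONE && !s->commit_done) {
        s->commit_done = true;
        return true;
    }
    /* A can continue once B has committed the transaction. */
    if (s->commit_done && !s->a_running) {
        s->a_running = true;
        return true;
    }
    return false;
}

/* Returns true if A eventually runs again, false if the model is stuck. */
bool toy_a_makes_progress(bool flush_on_sleep)
{
    struct toy_state s = { TOY_IN_PLUG, false, false };
    int steps = 0;

    while (toy_step(&s, flush_on_sleep)) {
        steps++;
        assert(steps < 16); /* the model has at most four transitions */
    }
    return s.a_running;
}
```

Calling toy_a_makes_progress(false) leaves the model stuck with X still
in the plug, which corresponds to the reported deadlock;
toy_a_makes_progress(true) lets all three parties finish.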

So, I thought schedule() would do the flush.  Checking the code:

asmlinkage void __sched schedule(void)
{
        struct task_struct *tsk = current;

        sched_submit_work(tsk);
        __schedule();
}

And sched_submit_work looks like this:

static inline void sched_submit_work(struct task_struct *tsk)
{
        if (!tsk->state || tsk_is_pi_blocked(tsk))
                return;
        /*
         * If we are going to sleep and we have plugged IO queued,
         * make sure to submit it to avoid deadlocks.
         */
        if (blk_needs_flush_plug(tsk))
                blk_schedule_flush_plug(tsk);
}
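
To make the early-return condition easier to see, here is a user-space
sketch that mirrors the control flow of sched_submit_work(). It is a
simplification under assumptions: the toy_* structures and helpers are
hypothetical stand-ins, the plug is reduced to a request counter, and
"flushing" just moves that count to a submitted total.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real layouts. */
struct toy_plug {
    int queued;            /* requests parked on the plug list */
};

struct toy_task {
    long state;            /* 0 means runnable, as with TASK_RUNNING */
    bool pi_blocked;       /* stands in for tsk_is_pi_blocked() */
    struct toy_plug *plug; /* NULL when the task has no active plug */
    int submitted;         /* requests handed on to the queue */
};

/* True when the task has a plug with requests still parked on it. */
static bool toy_needs_flush_plug(const struct toy_task *tsk)
{
    return tsk->plug != NULL && tsk->plug->queued > 0;
}

/* Hand every parked request on and empty the plug. */
static void toy_flush_plug(struct toy_task *tsk)
{
    tsk->submitted += tsk->plug->queued;
    tsk->plug->queued = 0;
}

/* Same shape as sched_submit_work(): skip runnable or PI-blocked tasks. */
void toy_sched_submit_work(struct toy_task *tsk)
{
    assert(tsk != NULL);
    if (!tsk->state || tsk->pi_blocked)
        return;
    if (toy_needs_flush_plug(tsk))
        toy_flush_plug(tsk);
}

/*
 * Build a task in the given state with n plugged requests, run the toy
 * submit path once, and report how many requests were submitted.
 */
int toy_submitted_on_schedule(long state, int n)
{
    struct toy_plug plug = { n };
    struct toy_task tsk = { state, false, &plug, 0 };

    toy_sched_submit_work(&tsk);
    return tsk.submitted;
}
```

In this model a task that is about to sleep (non-zero state) submits
everything on its plug, while a task that is still runnable (state 0)
submits nothing, which matches the first check in the function above.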

This eventually ends in a call to blk_run_queue_async(q) after
submitting the I/O from the plug list, right?  So is the real question
why the kblockd workqueue doesn't get scheduled?

Cheers,
Jeff