Message-ID: <Y/NT1/ynarp9cDlS@linutronix.de>
Date:   Mon, 20 Feb 2023 12:04:55 +0100
From:   Sebastian Andrzej Siewior <bigeasy@...utronix.de>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     Crystal Wood <swood@...hat.com>, John Keeping <john@...anate.com>,
        linux-rt-users@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: rtmutex, pi_blocked_on, and blk_flush_plug()

On 2023-02-20 10:49:26 [+0100], Thomas Gleixner wrote:
> > The logic is different but the deadlock should be avoided:
> > - mutex_t and rw_semaphore invoke schedule() while blocking on a lock.
> >   As part of schedule() sched_submit_work() is invoked.
> >   This is the same in RT and !RT so I don't expect any dead lock since
> >   the involved locks are the same.
> 
> Huch?
> 
> xlog_cil_commit()
>   down_read(&cil->xc_ctx_lock)
>     __rwbase_read_lock()
>        __rt_mutex_slowlock()
>          current->pi_blocked_on = ...
>          schedule()
>            __blk_flush_plug()
>              dd_insert_requests()
>                rt_spin_lock()
>                  WARN_ON(current->pi_blocked_on);
> 
> So something like the below is required. But that might not cut it
> completely. wq_worker_sleeping() is fine, but I'm not convinced that
> io_wq_worker_sleeping() is safe. That needs some investigation.

Okay, so this makes sense.

> Thanks,
> 
>         tglx
> ---
> 
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -6666,6 +6666,9 @@ static inline void sched_submit_work(str
>  	 */
>  	SCHED_WARN_ON(current->__state & TASK_RTLOCK_WAIT);
>  
> +	if (current->pi_blocked_on)
> +		return;
> +

The ->pi_blocked_on field is set by __rwbase_read_lock() before
schedule() is invoked while blocking on the sleeping lock. With this
early return we skip __blk_flush_plug() and as such may deadlock: we are
going to sleep having made I/O progress earlier which is not globally
visible but might be (s/might be/is/ in the deadlock case) expected by
the owner of the lock.

We could trylock and if this fails, flush and do the proper lock.
This would ensure that we set pi_blocked_on after we flushed.
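
To illustrate the ordering I mean, here is a rough userspace sketch, not
kernel code. struct plug, plug_flush() and lock_flush_first() are made-up
names, and a pthread mutex stands in for the rtmutex-based lock. The only
point is that the flush happens before the task becomes a waiter:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the plugged I/O of a task. */
struct plug {
	int pending;	/* requests queued but not yet submitted */
	int submitted;	/* requests that are globally visible */
};

/* Hypothetical analogue of __blk_flush_plug(): submit everything queued. */
static void plug_flush(struct plug *p)
{
	p->submitted += p->pending;
	p->pending = 0;
}

/*
 * Try the lock first. Only if that fails, flush the plug and then take
 * the blocking path. In the kernel the blocking path is where
 * pi_blocked_on would be set, so the flush has to come before it.
 * Returns true if the lock was acquired without blocking.
 */
static bool lock_flush_first(pthread_mutex_t *lock, struct plug *p)
{
	if (pthread_mutex_trylock(lock) == 0)
		return true;		/* uncontended, plug left untouched */

	plug_flush(p);			/* contended: flush before waiting */
	assert(p->pending == 0);
	pthread_mutex_lock(lock);	/* may sleep, nothing left to flush */
	return false;
}
```

In the uncontended case the plug is not touched at all, so the fast path
stays cheap. Whether a flush in that position is acceptable for every
caller in the kernel is a separate question that this sketch does not
answer.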

>  	/*
>  	 * If we are going to sleep and we have plugged IO queued,
>  	 * make sure to submit it to avoid deadlocks.

Sebastian
