lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Mon, 21 Nov 2022 17:52:25 +0000
From:   Jonathan Cameron <Jonathan.Cameron@...wei.com>
To:     Ira Weiny <ira.weiny@...el.com>
CC:     Hillf Danton <hdanton@...a.com>,
        Dan Williams <dan.j.williams@...el.com>,
        Bjorn Helgaas <bhelgaas@...gle.com>,
        Li Ming <ming4.li@...el.com>,
        "Vishal Verma" <vishal.l.verma@...el.com>,
        Lukas Wunner <lukas@...ner.de>,
        "Alison Schofield" <alison.schofield@...el.com>,
        <linux-kernel@...r.kernel.org>, <linux-pci@...r.kernel.org>
Subject: Re: [PATCH] PCI/DOE: Remove asynchronous task support

On Sun, 20 Nov 2022 05:57:22 -0800
Ira Weiny <ira.weiny@...el.com> wrote:

> On Sun, Nov 20, 2022 at 10:27:35AM +0800, Hillf Danton wrote:
> > On Sat, 19 Nov 2022 14:25:27 -0800 Ira Weiny <ira.weiny@...el.com>  
> > > @@ -529,8 +492,18 @@ int pci_doe_submit_task(struct pci_doe_mb *doe_mb, struct pci_doe_task *task)
> > >  		return -EIO;
> > >  
> > >  	task->doe_mb = doe_mb;
> > > -	INIT_WORK(&task->work, doe_statemachine_work);
> > > -	queue_work(doe_mb->work_queue, &task->work);
> > > +
> > > +again:
> > > +	if (!mutex_trylock(&doe_mb->exec_lock)) {
> > > +		if (wait_event_timeout(task->doe_mb->wq,
> > > +				test_bit(PCI_DOE_FLAG_CANCEL, &doe_mb->flags),
> > > +				PCI_DOE_POLL_INTERVAL))
> > > +			return -EIO;  
> > 
> > Is EIO worth a line of pr_warn()?  
> 
> Maybe but I'm not sure it is worth it.  This was paralleling the original code
> which called pci_doe_flush_mb() to shut down the mailbox.  So this is likely to
> never happen.  The callers could print something if needed.
> 
> >   
> > > +		goto again;
> > > +	}
> > > +	exec_task(task);
> > > +	mutex_unlock(&doe_mb->exec_lock);
> > > +  
> > 
> > If it is likely to take two minutes to acquire the exec_lock after
> > rounds of trying again, trylock + wait timeout barely make sense given EIO.  
> 
> I'm not sure where 2 minutes come from?
> 
> #define PCI_DOE_TIMEOUT HZ
> #define PCI_DOE_POLL_INTERVAL   (PCI_DOE_TIMEOUT / 128)
> 
> It is also not anticipated that more than 1 task is being given to the mailbox
> but the protection needs to be there because exec_task() will get confused if
> more than 1 thread submits at the same time.

Given multiple protocols can be on the same DOE and they may be handled by
either subdrivers or indeed driven by userspace interface, there is a high
chance that more than one task will be queued up (once we have a few more
supported protocols).

> 
> All this said I've now convinced myself that there is a race in the use of
> PCI_DOE_FLAG_CANCEL even with the existing code.
> 
> I believe that if the pci device goes away the doe_mb structure may get free'ed
> prior to other threads having a chance to check doe_mb->flags.  Worse yet the
> work queue itself (doe_mb->wq) may become invalid...
> 
> I don't believe this can currently happen because anyone using the doe_mb
> structure has a reference to the pci device.
> 
> With this patch I think all the doe_mb->flags and the wait queue can go away.
> pci_doe_wait() can be replaced with a simple msleep_interruptible().
> 
> Let me work through that a bit.
> 
> Ira
> 
> > 
> > Hillf
> > 
> > /**
> >  * wait_event_timeout - sleep until a condition gets true or a timeout elapses
> >  * @wq_head: the waitqueue to wait on
> >  * @condition: a C expression for the event to wait for
> >  * @timeout: timeout, in jiffies
> >  *
> >  * The process is put to sleep (TASK_UNINTERRUPTIBLE) until the
> >  * @condition evaluates to true. The @condition is checked each time
> >  * the waitqueue @wq_head is woken up.
> >  *
> >  * wake_up() has to be called after changing any variable that could
> >  * change the result of the wait condition.
> >  *
> >  * Returns:
> >  * 0 if the @condition evaluated to %false after the @timeout elapsed,
> >  * 1 if the @condition evaluated to %true after the @timeout elapsed,
> >  * or the remaining jiffies (at least 1) if the @condition evaluated
> >  * to %true before the @timeout elapsed.
> >  */  

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ