Message-ID: <4CAE58B5.1030901@intel.com>
Date:	Thu, 07 Oct 2010 16:33:09 -0700
From:	Dan Williams <dan.j.williams@...el.com>
To:	Ilya Yanok <yanok@...raft.com>
CC:	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	'Vladimir K' <vlad@...raft.com>, Wolfgang Denk <wd@...x.de>
Subject: Re: [RFC] CONFIG_NET_DMA  can hang the system if DMA engine driver
 uses tasklets

On 10/7/2010 4:14 PM, Ilya Yanok wrote:
[..]
> We can see that the network stack calls dma_memcpy_to_iovec() function
> from the softirq context and it never returns in case of DMA driver runs
> out of descriptors and thus blocks the tasklet from being executed. We
> have a deadlock.
>
> Dan, I'd like to ask your opinion, do you think this is a problem of
> CONFIG_NET_DMA feature implementation or should the DMA engine drivers
> be aware of it? How should we fix it?
>
> I can imagine the following possible solutions:
> 1. Add a possibility to return a failure to the dma_memcpy_to_iovec()
> function (and reschedule it from the upper level) to give tasklets a
> chance to be executed.
> 2. Place a restriction on the DMA drivers that descriptors should be
> freed from the hard-irq context, not soft-irq and fix the existing drivers.
> 3. Try to free the descriptors not only from tasklet but also from the
> place they get requested.

This is what ioatdma and iop-adma do, i.e. they process descriptor reclaim
from the allocation failure path.  For example, in ioat2_check_space_lock():

    /* progress reclaim in the allocation failure case we may be
     * called under bh_disabled so we need to trigger the timer
     * event directly
     */
    if (jiffies > chan->timer.expires && timer_pending(&chan->timer)) {
            struct ioatdma_device *device = chan->device;

            mod_timer(&chan->timer, jiffies + COMPLETION_TIMEOUT);
            device->timer_fn((unsigned long) &chan->common);
    }

The assumption is that a free descriptor is always only a short delay away.
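
To illustrate the idea outside the kernel, here is a minimal userspace sketch of reclaim driven from the allocation failure path. It is not the ioatdma code: `model_chan`, `model_cleanup()`, `model_alloc()` and `model_hw_complete()` are hypothetical names, and the ring is modelled with three free-running counters instead of real hardware descriptors. The point is only that the allocator calls the cleanup routine itself when the ring is full, so it does not depend on a tasklet or timer getting a chance to run.

```c
#include <assert.h>
#include <stdbool.h>

#define RING_SIZE 4u

/* Hypothetical model of a DMA channel; all counters only ever increase. */
struct model_chan {
    unsigned int head;    /* descriptors handed out so far */
    unsigned int tail;    /* descriptors reclaimed so far */
    unsigned int hw_done; /* descriptors the "hardware" has completed */
};

/* Number of descriptors currently free in the ring. */
static unsigned int ring_space(const struct model_chan *c)
{
    return RING_SIZE - (c->head - c->tail);
}

/* Stand-in for the cleanup normally run from a tasklet or timer:
 * reclaim every descriptor the hardware has finished with. */
static void model_cleanup(struct model_chan *c)
{
    assert(c->hw_done <= c->head);
    c->tail = c->hw_done;
}

/* Allocate one descriptor. If the ring is full, run the cleanup
 * directly from this failure path and check again, instead of
 * waiting for deferred context to do it. */
static bool model_alloc(struct model_chan *c)
{
    if (ring_space(c) == 0) {
        model_cleanup(c);
        if (ring_space(c) == 0)
            return false; /* hardware has not completed anything yet */
    }
    c->head++;
    return true;
}

/* Pretend the hardware completed up to n outstanding descriptors. */
static void model_hw_complete(struct model_chan *c, unsigned int n)
{
    while (n-- > 0 && c->hw_done < c->head)
        c->hw_done++;
}
```

In this model a caller that fills the ring and then sees the hardware complete some work gets a descriptor on its next `model_alloc()` call, with no deferred cleanup having run in between. The allocation still fails if the hardware has completed nothing, which is where the "short delay" assumption comes in.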

> Maybe somebody has a better solution.

Not really.  Extending dmatest with a test for this expectation would
help make it clearer, but that would need a config option that injects
descriptor allocation failures.
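
As a rough sketch of what such an injection knob could look like, again in plain userspace C and not as a patch against dmatest: `fail_inject`, `should_inject_failure()`, `alloc_desc()` and `submit_copies()` are all hypothetical names, and the "fail every Nth attempt" policy is just one simple choice. The test side then checks that a caller with a bounded retry budget still completes all of its copies.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical injection knob: fail every 'interval'-th allocation
 * attempt; interval == 0 disables injection. */
struct fail_inject {
    unsigned int interval;
    unsigned int calls;    /* allocation attempts seen so far */
    unsigned int injected; /* failures injected so far */
};

static bool should_inject_failure(struct fail_inject *fi)
{
    if (fi->interval == 0)
        return false;
    if (++fi->calls % fi->interval != 0)
        return false;
    fi->injected++;
    return true;
}

/* Stand-in for a driver's descriptor allocation. The only way it can
 * fail in this model is through an injected failure. */
static bool alloc_desc(struct fail_inject *fi)
{
    return !should_inject_failure(fi);
}

/* Submit 'count' copies, retrying each failed allocation at most
 * 'max_retries' times. Returns the number of copies submitted, so a
 * return value below 'count' means the caller gave up. */
static unsigned int submit_copies(struct fail_inject *fi,
                                  unsigned int count,
                                  unsigned int max_retries)
{
    unsigned int done = 0;

    for (unsigned int i = 0; i < count; i++) {
        unsigned int tries = 0;

        while (!alloc_desc(fi)) {
            if (++tries > max_retries)
                return done;
        }
        done++;
    }
    return done;
}
```

With `interval = 3` and one retry allowed, ten copies all go through despite the injected failures. With `interval = 1` every attempt fails and the helper gives up once its retry budget is spent, which is the case a real test would want to flag.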

--
Dan