netdev - Re: [RFC PATCH v2 1/2] net: af_packet support for direct ring access in user space


Message-ID: <54BA9D70.50403@gmail.com>
Date:	Sat, 17 Jan 2015 09:35:44 -0800
From:	John Fastabend <john.fastabend@...il.com>
To:	David Miller <davem@...emloft.net>
CC:	netdev@...r.kernel.org, danny.zhou@...el.com,
	nhorman@...driver.com, dborkman@...hat.com, john.ronciak@...el.com,
	hannes@...essinduktion.org, brouer@...hat.com
Subject: Re: [RFC PATCH v2 1/2] net: af_packet support for direct ring access
 in user space

On 01/14/2015 12:35 PM, David Miller wrote:
> From: John Fastabend <john.fastabend@...il.com>
> Date: Mon, 12 Jan 2015 20:35:11 -0800
>
>> +		if ((region.direction != DMA_BIDIRECTIONAL) &&
>> +		    (region.direction != DMA_TO_DEVICE) &&
>> +		    (region.direction != DMA_FROM_DEVICE))
>> +			return -EFAULT;
>   ...
>> +		if ((umem->nmap == npages) &&
>> +		    (0 != dma_map_sg(dev->dev.parent, umem->sglist,
>> +				     umem->nmap, region.direction))) {
>> +			region.iova = sg_dma_address(umem->sglist) + offset;
>
> I am having trouble seeing how this can work.
>
> dma_map_{single,sg}() mappings need synchronization after a DMA
> transfer takes place.
>
> For example if the DMA occurs to the device, then that region can
> be cached in the PCI controller's internal caches and thus future
> cpu writes into that memory region will not be seen, until a
> dma_sync_*() is invoked.
>
> That isn't going to happen when the device transmit queue is
> being completely managed in userspace.
>
> And this takes us back to the issue of protection, I don't think
> it is addressed properly yet.
>
> CAP_NET_ADMIN privileges do not mean "can crap all over memory"
> yet with this feature that can still happen.
>
> If we are dealing with a device which cannot provide strict protection
> to only the process's locked local pages, you have to do something
> to implement that protection.
>
> And you have _exactly_ one option to do that, abstracting the page
> addresses and eating a system call to trigger the sends, so that you
> can read from the user's (fake) descriptors and write into the real
> descriptors (translating the DMA addresses along the way) and
> triggering the TX doorbell.
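
The translation step described above can be sketched in plain C. This is
only an illustration under stated assumptions, not code from the patch:
the structure layouts, field names, and the tx_translate() helper are all
hypothetical, the doorbell is modelled as an ordinary integer rather than
a device register, and ring-full/completion tracking is left out. The
point it shows is that the process only ever supplies offsets into its
own registered region, and the kernel side bounds-checks each one before
it writes a real DMA address into the hardware ring.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layouts, for illustration only; not taken from the patch. */
struct user_desc {          /* "fake" descriptor written by the process */
	uint64_t offset;    /* byte offset into the registered region */
	uint32_t len;       /* bytes to transmit */
};

struct hw_desc {            /* "real" descriptor the device would read */
	uint64_t dma_addr;
	uint32_t len;
};

struct region {             /* pinned, DMA-mapped user memory */
	uint64_t iova;      /* DMA address of the start of the region */
	uint64_t size;      /* region size in bytes */
};

struct tx_ring {
	struct hw_desc *desc;
	uint32_t count;     /* number of slots */
	uint32_t tail;      /* next slot to fill */
	uint32_t doorbell;  /* stands in for the device tail register */
};

/*
 * Validate and translate up to n user descriptors into the hardware ring.
 * Returns the number queued; stops at the first descriptor that would
 * reach outside the region. No ring-full check (sketch only).
 */
static int tx_translate(struct tx_ring *ring, const struct region *r,
			const struct user_desc *ud, uint32_t n)
{
	uint32_t i;

	assert(ring && r && ud);
	for (i = 0; i < n; i++) {
		/* Overflow-safe form of: offset + len <= size. */
		if (ud[i].len == 0 || ud[i].offset > r->size ||
		    ud[i].len > r->size - ud[i].offset)
			break;
		ring->desc[ring->tail].dma_addr = r->iova + ud[i].offset;
		ring->desc[ring->tail].len = ud[i].len;
		ring->tail = (ring->tail + 1) % ring->count;
	}
	if (i > 0)
		ring->doorbell = ring->tail;  /* one doorbell write per batch */
	return (int)i;
}
```

In a real driver this would run inside the send system call, and the
final store would be an MMIO write to the device's tail register; a
descriptor that fails the bounds check never reaches the hardware.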

OK, I think this brings us back to some of the original designs/ideas
we were thinking about with Daniel/Neil. We are going to take a look
at this. At least on the RX side we can have the af_packet logic give
us a set of DMA addresses. I wonder if we can also make the busy
poll logic per queue and use it.

>
> I am not going to consider seriously an implementation that says "yeah
> sometimes the user can crap onto other people's memory", this isn't
> MS-DOS, it's a system where proper memory protections are mandatory
> rather than optional.
>

More to sort out on our side. Thanks for looking at the patches.

.John

-- 
John Fastabend         Intel Corporation