netdev - RE: [PATCH] Packet socket: mmapped IO: PACKET_TX

Message-ID: <FCC0EC655BD1AE408C047268D1F5DF4C3BA612A5@NASANEXMB10.na.qualcomm.com>
Date:	Wed, 12 Nov 2008 14:33:53 -0800
From:	"Lovich, Vitali" <vlovich@...lcomm.com>
To:	Evgeniy Polyakov <zbr@...emap.net>
CC:	Johann Baudy <johaahn@...il.com>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: RE: [PATCH] Packet socket: mmapped IO: PACKET_TX_RING



> -----Original Message-----
> From: Evgeniy Polyakov [mailto:zbr@...emap.net]
> Sent: November-12-08 1:46 PM
> To: Lovich, Vitali
> Cc: Johann Baudy; netdev@...r.kernel.org
> Subject: Re: [PATCH] Packet socket: mmapped IO: PACKET_TX_RING
> 
> On Wed, Nov 12, 2008 at 01:23:33PM -0800, Lovich, Vitali
> (vlovich@...lcomm.com) wrote:
> > I don't care whether or not the data was sent - I care whether or not
> > the driver might still use the data in the frame the skb is referring
> > to.  In the destructor, clearly the driver can't since it gave up its
> > reference.  After dev_queue_xmit, we don't know because the driver (or
> > the skb queue layer) may have decided to delay packet transmission.
> >
> > Potentially the user might even have written half the payload of a
> > packet when the device decides to send out the skb for that frame and
> > thus send out half the payload from one packet and half the payload
> > from another.
> 
> And what's the point in waiting for data to be unused?
So that the application in user space can actually send uncorrupted packets.
For instance, if I have a 1GB pcap dump and I want to replay it, then without
waiting for the data to be unused I could potentially send out corrupt packets
or skip some.
> You want to implement a system, which will behave more consistent than
> existing zero-copy approach, but yet not 100% correctly...
I'm trying to implement a system which is zero-copy with traditional socket send()
semantics (or at least as close as is possible).  I don't see why it wouldn't work
100% correctly.

> > > So you can update whatever flags you have to after return of the
> > > dev_queue_xmit() and will get the same behaviour as sendfile:
> > > immediate write into the same memory area results in sending new
> > > content (on some NICs).
> > But using your approach, how can a user ever know whether or not he
> > actually sent a packet?
> 
> There is no way to know that. At all. skb can be dropped by zillions of
> reasons and after it was submitted to the qdisc layer, there is no way
> to know how its life will continue. Well, in some cases it is possible
> to know (when qdisc just frees skb), but it is far from 100% of the
> cases.
Right - so I don't really care if the skb gets silently dropped by lower
layers.  My only concern is that there is some kind of protection that the
user can use to ensure that he doesn't overwrite data that is in the middle of
being transmitted.  I think that's the communication disconnect we're having.
You think I'm concerned about dropped packets - but that's just a side issue.
I recognize that some lower layer may abandon the skb for whatever reason.  The
main issue is data integrity - the pages in the scatter/gather list must not be
touched by user space until the device lets go of the skb.  With your approach,
a race is actually extremely likely (guaranteed once we wrap around the ring
buffer).  Consider:

Thread A
Calls send() in a loop to flush the buffer.

Thread B
Fills the buffer.
As soon as the next frame is free (according to its status flag), fills it in
and updates the status flag.

Clearly, Thread B would see the status flag cleared before the device actually
tried to send the skb (unless the skb managed to get sent out before
dev_queue_xmit returned), and would then overwrite a frame that is still
pending.
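
To make the scenario concrete, here is a minimal user-space simulation of it.
This is only a sketch, not kernel code and not the patch itself: the names
frame_status, fake_skb, queue_xmit, device_transmit, try_fill and
replay_is_intact are all hypothetical, and the "device" is just a function
that is called after the producer has had a chance to run.  It assumes a
single frame and a transmit that is always deferred past the return of the
queueing call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FRAME_SIZE 8

/* Hypothetical frame states; illustrative names, not the patch's. */
enum frame_status { FRAME_FREE, FRAME_IN_FLIGHT };

struct frame {
    enum frame_status status;
    char data[FRAME_SIZE];
};

/* Stand-in for an skb that references the frame's memory (zero-copy). */
struct fake_skb {
    struct frame *frame;
};

/* Models dev_queue_xmit() deferring the packet: no payload is read yet. */
static void queue_xmit(struct fake_skb *skb, struct frame *f,
                       int release_on_return)
{
    skb->frame = f;
    f->status = FRAME_IN_FLIGHT;
    if (release_on_return)
        f->status = FRAME_FREE; /* flag cleared before the device reads */
}

/* Models the device reading the payload, followed by the destructor. */
static void device_transmit(struct fake_skb *skb, char *wire)
{
    assert(skb->frame != NULL);
    memcpy(wire, skb->frame->data, FRAME_SIZE);
    skb->frame->status = FRAME_FREE; /* destructor-time release */
}

/* Models Thread B: refill the frame only if it is marked free. */
static int try_fill(struct frame *f, const char *payload)
{
    if (f->status != FRAME_FREE)
        return 0;
    memcpy(f->data, payload, FRAME_SIZE);
    return 1;
}

/* Returns 1 if the bytes on the wire match the first packet written. */
static int replay_is_intact(int release_on_return)
{
    struct frame f = { FRAME_FREE, {0} };
    struct fake_skb skb;
    char wire[FRAME_SIZE];

    try_fill(&f, "AAAAAAA");
    queue_xmit(&skb, &f, release_on_return);
    try_fill(&f, "BBBBBBB"); /* Thread B races ahead of the device */
    device_transmit(&skb, wire);
    return memcmp(wire, "AAAAAAA", FRAME_SIZE) == 0;
}
```

With release_on_return set, the second fill succeeds and the device sends the
second payload in place of the first; with the release left to the
destructor-time step, the second fill is refused and the first payload goes
out intact.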

> 
> > Am I missing something fundamental in my understanding?  I don't see
> > any way, outside of using the skb destructor, to notify the user when
> > he can safely write to a frame without interfering with any pending
> > skbs.
> 
> Having a callback at destruction time does mean that no one uses skb,
> but are you sure this is needed? With existing zero-copy
> (splice/sendfile) this is not true, but you want to extend this
> approach...
>
> If you _do_ want to make it that way, you can remove destructor at all
> and implement own packet-socket-only allocation policy and thus have
> own private destructor without extending skb.
Can you please elaborate on this further?  What do you mean by a custom
allocation policy and a private destructor?  Isn't that exactly what we're
doing now?  We're just trying to figure out in our destructor which frame the
skb was built from.
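
For reference, one possible way to do that lookup, shown as a plain C sketch
rather than as what the patch does: if the ring were a single contiguous
mapping, the frame could be recovered from the payload address by pointer
arithmetic.  The struct tx_ring and frame_index_of names are hypothetical,
and the contiguity is an assumption; a ring built from separately allocated
blocks would need a per-block search first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ring descriptor; assumes one contiguous mapping. */
struct tx_ring {
    unsigned char *base; /* start of the first frame */
    size_t frame_size;   /* bytes per frame */
    size_t frame_count;  /* number of frames in the ring */
};

/* Returns the index of the frame containing payload, or -1 if the
 * address does not fall inside the ring. */
static long frame_index_of(const struct tx_ring *ring,
                           const unsigned char *payload)
{
    uintptr_t start = (uintptr_t)ring->base;
    uintptr_t addr = (uintptr_t)payload;
    size_t span = ring->frame_size * ring->frame_count;

    assert(ring->frame_size > 0);
    if (addr < start || addr - start >= span)
        return -1;
    return (long)((addr - start) / ring->frame_size);
}
```

A destructor could call something like this on the address of the skb's first
fragment and then clear the status flag of the frame it gets back.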