Date:	Wed, 12 Nov 2008 09:59:20 -0800
From:	"Lovich, Vitali" <vlovich@...lcomm.com>
To:	Evgeniy Polyakov <zbr@...emap.net>
CC:	Johann Baudy <johaahn@...il.com>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: RE: [PATCH] Packet socket: mmapped IO: PACKET_TX_RING



> -----Original Message-----
> From: Evgeniy Polyakov [mailto:zbr@...emap.net]
> Sent: November-12-08 9:41 AM
> To: Lovich, Vitali
> Cc: Johann Baudy; netdev@...r.kernel.org
> Subject: Re: [PATCH] Packet socket: mmapped IO: PACKET_TX_RING
> 
> Hi Vitali.
> 
> On Wed, Nov 12, 2008 at 09:07:43AM -0800, Lovich, Vitali
> (vlovich@...lcomm.com) wrote:
> > I think you misunderstand how the API works.  Yes, the user could
> > call send() from the thread he's filling in data from.  But the
> > idea is that he would ideally do this from another thread, so that
> > it's possible to control the latency between packet sends.
> > Additionally, even without trying to control latency, there would
> > still have to be complicated logic in userspace to determine when
> > enough packets have been placed in the buffer to overcome the cost
> > of a system call.
> >
> > This is why we need to know which frame in the ring buffer the skb
> > is associated with.
> 
> What's the problem with invoking the send machinery from a different
> thread?  You can wait in the appropriate syscall until multiple
> packets have been sent and then return, or update some shared flag in
> the mapped area which says that something is being sent.  When
> dev_queue_xmit() for the selected set of packets completes, you can
> update that variable, and based on its value userspace can overwrite
> the area used by already-sent packets.
> 
> I see only one reason to have a notification about skb completion: an
> absolute need to send synchronous data, i.e. packet data that cannot
> be overwritten until the packet reaches the media.  But given that
> the existing Linux splice/sendfile and any other ->sendpage() users
> have been racy in this regard for ages, this does not look like a
> strong demand to me.
> 
They aren't racy, at least not in the sense that your suggestion would make it.  With the current splice & sendfile, when the syscall returns, the user knows the data has been transmitted and can therefore continue using the file descriptors & memory (in the case of vmsplice).  With your suggestion, however, the user can never know when it's safe to write into the memory, so it's racy even in a single-threaded program, and even on a UP machine, which is a pretty amazing feat.  Consider:

The user fills up the 100 frames available in the ring buffer and calls send().  send(), after calling dev_queue_xmit() on each frame, marks it as free for use, and returns.  The user continues filling up the buffer.
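In code, your scheme amounts to something like this (the frame layout
and names below are invented purely for illustration; they're not from
the patch):

#include <string.h>

/* Invented userspace view of one TX frame. */
#define FRAME_FREE    0   /* userspace may (re)fill this slot        */
#define FRAME_PENDING 1   /* handed to the kernel, awaiting transmit */

struct tx_frame {
        volatile unsigned long status;
        unsigned int           len;
        char                   data[2048 - 16];
};

/*
 * As I read your proposal, send() walks the pending frames, calls
 * dev_queue_xmit() on each, and immediately flips status back to
 * FRAME_FREE.  The producer thread then does:
 */
static void produce(struct tx_frame *f, const void *pkt, unsigned int len)
{
        while (f->status != FRAME_FREE)
                ;       /* spin (or poll()) until the kernel is "done" */

        memcpy(f->data, pkt, len);      /* but the NIC may still be
                                         * DMAing from f->data here! */
        f->len = len;
        f->status = FRAME_PENDING;
}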

In this situation, the user now has no clue whether or not the frames were actually sent out.  This is going to cause huge problems, because the user won't understand why their data is being sent out corrupted (and has no way to prevent it).  It only gets worse if the card can't keep up with the packet rate: the NIC will still be working through its tx queue while userspace is already overwriting the ring buffer.
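The fix is to release the frame from the skb's destructor instead,
since that only runs once the driver is genuinely done with the data.
That is exactly why each skb has to know which ring frame it is
associated with.  Roughly (skb_frame() is made-up shorthand for
however the skb-to-frame association ends up being stored):

#include <linux/skbuff.h>

/* Kernel-side sketch: recycle the frame only when the skb is freed. */
static void packet_tx_destructor(struct sk_buff *skb)
{
        struct tx_frame *f = skb_frame(skb);    /* hypothetical accessor */

        /*
         * The destructor runs when the skb is freed, i.e. after the
         * driver (and its DMA engine) has consumed or dropped the
         * data, not merely after dev_queue_xmit() has queued it.
         */
        f->status = FRAME_FREE;
}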

This state-machine approach is the exact parallel of how the RX code works: on receive, the kernel tries to place the skb data in a free frame and relies on the user to release that frame.
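For reference, here is a simplified sketch of the existing
PACKET_RX_RING consumer loop that this mirrors (ring setup via
setsockopt() and error handling omitted):

#include <linux/if_packet.h>
#include <poll.h>

/* The kernel hands a frame to userspace by setting TP_STATUS_USER;
 * userspace hands it back by resetting tp_status to TP_STATUS_KERNEL. */
static void rx_loop(int fd, char *ring, unsigned int nframes,
                    unsigned int frame_sz)
{
        unsigned int i = 0;

        for (;;) {
                struct tpacket_hdr *hdr =
                        (struct tpacket_hdr *)(ring + i * frame_sz);

                if (!(hdr->tp_status & TP_STATUS_USER)) {
                        struct pollfd pfd = { .fd = fd, .events = POLLIN };
                        poll(&pfd, 1, -1);  /* frame still owned by kernel */
                        continue;
                }

                /* ... hdr->tp_len bytes of packet data start at
                 *     (char *)hdr + hdr->tp_mac ... */

                hdr->tp_status = TP_STATUS_KERNEL;  /* give it back */
                i = (i + 1) % nframes;
        }
}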
