Message-ID: <20081112191400.GA6291@ioremap.net>
Date: Wed, 12 Nov 2008 22:14:00 +0300
From: Evgeniy Polyakov <zbr@...emap.net>
To: "Lovich, Vitali" <vlovich@...lcomm.com>
Cc: Johann Baudy <johaahn@...il.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: [PATCH] Packet socket: mmapped IO: PACKET_TX_RING
On Wed, Nov 12, 2008 at 11:05:03AM -0800, Lovich, Vitali (vlovich@...lcomm.com) wrote:
> I still don't see it. This can only be a problem for vmsplice, since I believe
> sendpage & splice copy the data from the source pipe if necessary.
> vmsplice solves this through the SPLICE_F_GIFT flag (if not specified,
> I'm assuming it copies the data into a temporary buffer). So I don't
> believe that these are actually racy functions if used properly.
Sendpage only copies data if the underlying device does not support
scatter-gather and hardware checksumming. Otherwise all it does is
increase the reference counter of the page (no matter whether it is an
anonymous mapping or VFS page cache) and submit an skb, which in the
best case ends up in dev_queue_xmit() just like in your approach. Then
the syscall returns, and userspace will never know whether the page was
transmitted. The skb can even be dropped right there without ever
reaching the wire if the hardware decides so. That is why hardware
checksumming is needed: the hardware calculates the checksums over
whatever data is in the given pages at real send time, not at the time
userspace called sendpage().
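The copy-versus-reference decision described above can be modelled in a
few lines of userspace C. This is only a sketch: SKETCH_F_SG,
SKETCH_F_HW_CSUM and sketch_sendpage_must_copy() are made-up names for
illustration, not kernel symbols, and the real check sits in the
protocol's sendpage path.

```c
#include <stdbool.h>

/* Hypothetical feature bits standing in for the device capabilities. */
#define SKETCH_F_SG      0x1u  /* scatter-gather */
#define SKETCH_F_HW_CSUM 0x2u  /* hardware checksumming */

/*
 * Returns true when the data has to be copied into a private buffer,
 * false when the page can simply be referenced and handed to the device.
 * Assumption: zero-copy needs BOTH capabilities, as the text describes.
 */
static bool sketch_sendpage_must_copy(unsigned int features)
{
    bool zero_copy_ok = (features & SKETCH_F_SG) &&
                        (features & SKETCH_F_HW_CSUM);

    return !zero_copy_ok;
}
```

With both bits set the page is only referenced, so whatever it contains
at real send time is what gets checksummed and transmitted.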
> However, your suggestion makes non-racy usage of the tx ring impossible
> unless you know ahead of time how many frames you will need (in which case, resetting
> the status flag is pointless). But for proper ring buffer behaviour, it needs to
> clear the flag in the skb destructor, once we know the data will no longer be used by
> the driver.
Here is the main point: why do you care at all whether the data was or
was not transmitted, and why do you want to update something at
destruction time rather than when dev_queue_xmit() returns? As pointed
out above, destruction time does not guarantee that the skb was sent any
more than the return from dev_queue_xmit() does. So you can update
whatever flags you have after dev_queue_xmit() returns and get the same
behaviour as sendfile: an immediate write into the same memory area
results in the new content being sent (on some NICs).
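A rough userspace model of that behaviour, assuming the frame is queued
by reference and the device reads memory only at real send time. All
names here (struct sketch_frame, sketch_queue_xmit(),
sketch_hw_transmit(), sketch_rewrite_race()) are hypothetical and exist
only for this illustration; nothing below is kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an skb that references user memory
 * instead of copying it. */
struct sketch_frame {
    const unsigned char *data;  /* points into the caller's buffer */
    size_t len;
};

/* "Queue" the frame: only the reference is taken, nothing is copied. */
static struct sketch_frame sketch_queue_xmit(const unsigned char *buf,
                                             size_t len)
{
    struct sketch_frame f = { buf, len };

    return f;
}

/* "Real send time": the device reads whatever is in memory right now. */
static void sketch_hw_transmit(const struct sketch_frame *f,
                               unsigned char *wire, size_t wire_len)
{
    assert(f->len <= wire_len);
    memcpy(wire, f->data, f->len);
}

/* Returns 1 if the wire ends up carrying the rewritten content. */
static int sketch_rewrite_race(void)
{
    unsigned char ring_slot[4] = { 'o', 'l', 'd', '\0' };
    unsigned char wire[4];
    struct sketch_frame f = sketch_queue_xmit(ring_slot, sizeof(ring_slot));

    /* Userspace reuses the slot as soon as the "syscall" has returned. */
    memcpy(ring_slot, "new", 4);

    sketch_hw_transmit(&f, wire, sizeof(wire));
    return memcmp(wire, "new", 4) == 0;
}
```

In this model the slot is overwritten between queueing and transmit, so
the wire sees the new bytes, which is the sendfile-like behaviour
mentioned above.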
> > Please also update your mailer to wrap strings into 80-or-so lines, it
> > is hard to answer into the middle of the paragraph.
> Sorry - I hate using Outlook because it doesn't seem to honour my settings.
> I'll split up the lines manually instead of trusting Outlook.
Non-trivial solution for long mails :)
--
Evgeniy Polyakov