Message-ID: <alpine.LFD.1.10.0807311142510.3277@nehalem.linux-foundation.org>
Date:	Thu, 31 Jul 2008 11:54:56 -0700 (PDT)
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Jamie Lokier <jamie@...reable.org>
cc:	Miklos Szeredi <miklos@...redi.hu>, jens.axboe@...cle.com,
	akpm@...ux-foundation.org, nickpiggin@...oo.com.au,
	linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
	linux-mm@...ck.org
Subject: Re: [patch v3] splice: fix race with page invalidation



On Thu, 31 Jul 2008, Jamie Lokier wrote:
> 
> But did you miss the bit where you DON'T COPY ANYTHING EVER*?  COW is
> able to provide _correctness_ for the rare corner cases which you're not
> optimising for.  You don't actually copy more than 0.0% (*approx).

The thing is, just even _marking_ things COW is the expensive part. If we 
have to walk page tables - we're screwed.

> The cost of COW is TLB flushes*.  But for splice, there ARE NO TLB
> FLUSHES because such files are not mapped writable!

For splice, there are also no flags to set, no extra tracking costs, etc 
etc.

But yes, we could make splice (from a file) do something like

 - just fall back to copy if the page is already mapped (page->mapcount 
   gives us that)

 - set a bit ("splicemapped") when we splice it in, and increment 
   page->mapcount for each splice copy.

 - if a "splicemapped" page is ever mmap'ed or written to (either through 
   write or truncate), we COW it then (and actually move the page cache 
   page - it would be a "woc": a reverse cow, not a normal one).

 - do all of this with page lock held, to make sure that there are no 
   writers or new mappers happening.

So it's probably doable. 

(We could have a separate "splicecount", and actually allow non-writable 
mappings, but I suspect we cannot afford the space in the "struct page" 
for a whole new count).
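
To make the four steps above concrete, here is a minimal user-space sketch 
in C. It is a toy model under stated assumptions, not kernel code: 
"struct toy_page", "struct toy_slot", toy_splice() and toy_write() are 
hypothetical names, the "locked" flag only stands in for the page lock, and 
the mmap and truncate triggers are left out (they would take the same 
"woc" path as the write shown here).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define TOY_PAGE_SIZE 64

/* Hypothetical stand-in for a page cache page; NOT the kernel's struct page. */
struct toy_page {
    int  mapcount;      /* mmap mappings plus one per splice reference */
    bool splicemapped;  /* the proposed "splicemapped" bit */
    bool locked;        /* stands in for the page lock (single-threaded toy) */
    char data[TOY_PAGE_SIZE];
};

/* One page cache slot: the page that file reads and writes currently see. */
struct toy_slot {
    struct toy_page *page;
};

enum splice_result { SPLICE_COPIED, SPLICE_SHARED };

/* Splice from the file: share the page unless it is already mmap'ed. */
static enum splice_result toy_splice(struct toy_slot *slot, char *copy_out,
                                     struct toy_page **shared_out)
{
    struct toy_page *p = slot->page;
    enum splice_result res;

    assert(!p->locked);
    p->locked = true;               /* no writers or new mappers from here on */
    if (p->mapcount > 0 && !p->splicemapped) {
        /* Already mapped by someone else: fall back to a plain copy. */
        memcpy(copy_out, p->data, TOY_PAGE_SIZE);
        res = SPLICE_COPIED;
    } else {
        p->splicemapped = true;
        p->mapcount++;              /* one count per splice copy */
        *shared_out = p;
        res = SPLICE_SHARED;
    }
    p->locked = false;
    return res;
}

/*
 * Write to the file. If the page is shared with splice, do the "woc":
 * the page cache moves to a fresh page, and the old page stays untouched
 * for the splice holders.
 */
static int toy_write(struct toy_slot *slot, size_t off,
                     const char *buf, size_t len)
{
    struct toy_page *p = slot->page;

    if (off > TOY_PAGE_SIZE || len > TOY_PAGE_SIZE - off)
        return -1;

    assert(!p->locked);
    p->locked = true;
    if (p->splicemapped) {
        struct toy_page *fresh = calloc(1, sizeof(*fresh));
        if (!fresh) {
            p->locked = false;
            return -1;
        }
        memcpy(fresh->data, p->data, TOY_PAGE_SIZE);
        slot->page = fresh;         /* page cache now points at the new page */
        p->locked = false;          /* old page is left to the splice side */
        p = fresh;
        p->locked = true;
    }
    memcpy(p->data + off, buf, len);
    p->locked = false;
    return 0;
}
```

In this model the spliced data stays stable because the writer is the one 
that moves away, which is why it is a reverse COW: the splice side never 
has to be found or touched when the file changes.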

> You're missing the real point of network splice().
> 
> It's not just for speed.
> 
> It's for sharing data.  Your TCP buffers can share data, when the same
> big lump is in flight to lots of clients.  Think static file / web /
> FTP server, the kind with 80% of hits to 0.01% of the files, roughly
> the same size as your RAM.

Maybe. Does it really show up as a big thing?

			Linus