lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20120523053403.GB14312@dhcp-172-18-216-138.mtv.corp.google.com>
Date:	Wed, 23 May 2012 01:34:03 -0400
From:	Kent Overstreet <koverstreet@...gle.com>
To:	Tejun Heo <tj@...nel.org>
Cc:	Tejun Heo <tejun@...gle.com>, linux-bcache@...r.kernel.org,
	linux-kernel@...r.kernel.org, dm-devel@...hat.com, agk@...hat.com
Subject: Re: [Bcache v13 11/16] bcache: Core btree code

On Tue, May 22, 2012 at 10:20:54PM -0700, Tejun Heo wrote:
> Hmmm... I would prefer it to be defined explicitly as union.  It's
> rather easy to define it incorrectly (ie. using struct bkey) and then
> pass it around expecting it to have the pad.

Thing is, things don't expect the pad - bkeys are normally just in a big
chunk of memory concatenated together, and the same functions have to
work both with those and with bare bkeys the code occasionally
manipulates.

> It doesn't have to be full bcache.  e.g. words starting with cache can
> simply have 'b' in front and others can use things like bc_ or
> whatever.

Ok, that sounds quite reasonable.

> 
> > > So, apart from the the macros, key is 64bit containing the backend
> > > device and extent offset and size with the ptr fields somehow point to
> > > cache.  Am I understanding it correctly?  If right, I'm *tiny* bit
> > > apprehensive about using only 43bits for offset.  While the block 9
> > > bits means 52bit addressing, the 9 bit block size is now there mostly
> > > to work as buffer between memory bitness growth and storage device
> > > size growth so that we have 9 bit buffer as storage device reaches
> > > ulong addressing limit.  Probably those days are far out enough.
> > 
> > You're exactly right. I had similar thoughts about the offset size,
> > but... it'll last until we have 8 exabyte cache devices, and I can't
> 
> I'm a bit confused.  Cache device or cached device?  Isn't the key
> dev:offset:size of the cached device?

No - bkey->key is the offset on the cached device, PTR_OFFSET is on the
cache.

Confusing, I know. Any ideas for better terminology?

> 
> > > mca_data_alloc() failure path seems like a resource leak but it isn't
> > > because mca_data_alloc() puts it in free list.  Is the extra level of
> > > caching necessary?  How is it different from sl?b allocator cache?
> > > And either way, comments please.
> > 
> > This btree in memory cache code is probably the nastiest, subtlest,
> > trickiest code in bcache. I have some cleanups in a branch somewhere as
> > part of some other work that I need to finish off.
> > 
> > The btree_cache_freed list isn't for caching - we never free struct
> > btrees except on shutdown, to simplify the code. It doesn't cost much
> > memory since the memory usage is dominated by the buffers struct btree
> > points to, and those are freed when necessary.
> 
> Out of curiosity, how does not freeing make the code simpler?  Is it
> something synchronization related?

Yeah - looking up btree nodes in the in memory cache involves checking a
lockless hash table (i.e. using hlist_for_each_rcu()).

It would be fairly trivial to free them with kfree_rcu(), but I'd have
to go through and make sure there aren't any other places where there
could be dangling references - i.e. io related stuff. And I wouldn't be
able to delete any code - I need the btree_cache_freed list anyways so I
can preallocate a reserve at startup.

I all the changes I've made based on your review feedback so far up -
git://evilpiepirate.org/~kent/linux-bcache.git bcache-3.3-dev

Kent Overstreet (7):
      Document some things and incorporate some review feedback
      bcache: Fix a bug in submit_bbio_split()
      bcache: sprint_string_list() -> snprint_string_list()
      Add human-readable units modifier to vsnprintf()
      bcache: Kill hprint()
      bcache: Review feedback
      bcache: Kill popcount()
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ