lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date:	Thu, 3 Aug 2006 00:17:41 -0700
From:	Andrew Morton <akpm@...l.org>
To:	reiserfs-dev@...esys.com
Cc:	linux-kernel@...r.kernel.org
Subject: partial reiser4 review comments


I'm only partway through this, but let me unload my observations thus far. 
I still need to find a chunk of time for a file-by-file walkthrough and
it's unobvious where that chunk will come from at present :(


reiser4 as found in 2.6.18-rc2-mm1.

- set_page_dirty_internal() pokes around in VFS internals.  Use
  __set_page_dirty_no_buffers() or create a new library function in
  mm/page-writeback.c.

  In particular, it gets the radix-tree dirty tagging out of sync.

- running igrab() in the writepage() path is really going to hammer
  inode_lock.  Something else will need to be done here.

- The preferred way of solving the above would be to mark the page as
  PageWriteback() with set_page_writeback() prior to unlocking it.  That'll
  pin the page and the inode.  It does require that the page actually get
  written later on.  If we cannot do that then more thought is needed.

- wbq.sem should be using a completion for the "wait until entd finishes",
  not a semaphore.  Because there's a teeny theoretical race when using
  semaphores this way which completions were designed to avoid.  (The waker
  can still be playing with the semaphore when it has gone out of scope on the
  wakee's stack).

- write_page_by_ent(): the "spin until entd thread" thing is gross.

  This function is really lock-intensive.

- If poss, use wake_up_process() rather than wake_up().  That'll save some
  locking.

- Running iput() in entd() is a bit surprising.  iirc there are various ways
  in which this can recur into the filesystem, perform I/O, etc.  I guess it
  works..

  But again, it will hammer inode_lock.

- entd_flush(): bug:

	rq->wbc->range_start = rq->page->index << PAGE_CACHE_SHIFT;

  this can overflow on 32-bit.  Need to cast rq->page->index to loff_t.

- the writeout logic in entd_flush() is interesting (as in "holy cow"). 
  It's very central and really needs some good comments describing what's
  going on in there - what problems are being solved, which decisions were
  taken and why, etc.

  The big comment in page_cache.c is useful.  Please maintain it well.

  Boy, it has some old stuff in it.

- writeout() is a poor name for a global function.  Even things like
  "txn_restart" are a bit generic-sounding.  Low-risk, but the kernel's
  getting bigger...

  If it were mine, I'd prefix all these symbols with "r4_".

  prepare_to_sleep(), page_io(), drop_page(), statfs_type(),
  pre_commit_hook(), etc, etc, etc, etc.  Much namespace pollution.

- reiser4_wait_page_writeback() needs commenting.

- reading the comment in txnmgr.c regarding MAP_SHARED pages: a number of
  things have changed since then.  We have page-becoming-writeable
  notifications and probably soon we'll always take a pagefault when a
  MAP_SHARED page transitions from pte-clean to pte-dirty (although I wouldn't
  recommend that a filesystem rely upon the latter for a while yet).

- invalidate_list() is a poorly-chosen global identifier.  We already have
  an invalidate_list() in fs/inode.c, too.

  Please audit all of reiser4's global identifiers (use nm *.o) for suitable
  naming choices.

- semaphores are deprecated.  Please switch to mutexes and/or completions
  where appropriate and possible.

- page_cache.c: yes, mpage_end_io_write() and mpage_end_io_read() are pretty
  generic - we might as well export them.

- drop_page() is a worry.  Why _does_ reiser4 need to remove pages from
  pagecache?  That isn't a filesystem function.

  drop_page() appears to leave the no-longer-present page tagged as dirty in
  the radix-tree.

- truncate_jnodes_range() looks wrong to me.  When we populate gang[], there
  can be any number of NULL entries placed in it.  But the loop which iterates
  across the now-populated gang[] will bale out when it its the _first_ NULL
  entry.  Any following entries will have a leaked jref() against them.

- reiser4_invalidate_pages() is a mix of reiser4 things and of
  things-which-the-vfs-is-supposed-to-do.  It is uncommented and I am unable
  to work out why it was necessary to do this, and hence what we can do about
  it.

- reiser4_readpages() shouldn't need to clean up the remaining pages on
  *pages.  read_cache_pages() does that now.

- <wonders what formatted and unformatted nodes are> A brief glossary might
  help.

- REISER4_ERROR_CODE_BASE actually overlaps real errnos (see
  include/linux/errno.h).  Suggest that it be changed to 1000000 or something.

- blocknr_set_add() modifies a list_head without any apparent locking. 
  Certainly without any _documented_ locking...

  Ditto blocknr_set_destroy().  I'm sure there's locking, but it's harder
  than it should be to work out what it is.  Given that proper locking is in
  place, the filesystem seems to use list_*_careful() a lot more than is
  necessary?

- It would be clearer to remove `struct blocknr_set' and just use list_head.

- Waaaaaaaaaaaaaaay too many typedefs.

- There are many coding-style nits.  One I will mention is very large number
  of unneeded braces:

	if (foo) {
		bar();
	}

  it'd be nice to fix these up sometime.


- General comment: the lack of support for extended attributes, access
  control lists and direct-io is a big problem and it's getting bigger.  I
  don't see how a vendor could support reiser4 without these features and
  permanent lack of vendor support will hurt.

  What's the plan here?


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ