lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <Pine.LNX.4.58.0701141902010.2649@be1.lrz>
Date:	Sun, 14 Jan 2007 19:51:21 +0100 (CET)
From:	Bodo Eggert <7eggert@....de>
To:	Bill Davidsen <davidsen@....com>
cc:	7eggert@....de,
	Linux Kernel mailing List <linux-kernel@...r.kernel.org>
Subject: Re: O_DIRECT question

On Sat, 13 Jan 2007, Bill Davidsen wrote:

> Bodo Eggert wrote:
> 
> > (*) This would allow fadvise_size(), too, which could reduce fragmentation
> >     (and give an early warning on full disks) without forcing e.g. fat to
> >     zero all blocks. OTOH, fadvise_size() would allow users to reserve the
> >     complete disk space without his filesizes reflecting this.
> 
> Please clarify how this would interact with quota, and why it wouldn't 
> allow someone to run me out of disk.

I fell into the "write-will-never-fail"-pit. Therefore I have to talk 
about the original purpose, write with O_DIRECT, too.

- Reserved blocks should be taken out of the quota, since they are about
  to be written right now. If you emptied your quota doing this, it's
  your fault. It it was the group's quota, just run fast enough.-)

- If one write failed that extended the reserved range, the reserved area 
  should be shrunk again. Obviously you'll need something clever here.
  * You can't shrink carelessly while there are O_DIRECT writes.
  * You can't just try to grab the semaphore[0] for writing, this will 
    deadlock with other write()s.
  * If you drop the read lock, it will work out, because you aren't
    writing anymore, and if you get the write lock, there won't be anybody 
    else writing. Therefore you can clear the reservation for the not-
    written blocks. You may unreserve blocks that should stay reserved,
    but that won't harm much. At worst, you'll get fragmentation, loss
    of speed and an aborted (because of no free space) write command.
    Document this, it's a feature.-)

- If you fadvise_size on a non-quota-disk, you can possibly reserve it 
  completely, without being the easy-to-spot offender. You can do the
  same by actually writing these files, keeping them open and unlinking 
  them. The new quality is: You can't just look at the file sizes in
  /proc in order to spot the offender. However, if you reflect the 
  reserved blocks in the used-blocks-field of struct stat, du will
  work as expected and the BOFH will know whom to LART.

  BTW: If the fs supports holes, using du would be the right thing
  to do anyway.


BTW2: I don't know if reserving without actually assigning blocks is 
supported or easy to support at all. These reservations are the result of 
"These blocks are not yet written, therefore they contain possibly secret 
data that would leak on failed writes, therefore they may not be actually 
assigned to the file before write finishes. They may not be on the free 
list either. And hey, if we support pre-reserving blocks to the file, we 
may additionally use it for fadvise_size. I'll mention that briefly."




[0] r/w semaphore, read={r,w}_odirect, write=ftruncate

-- 
Fun things to slip into your budget
Paradigm pro-activator (a whole pack)
	(you mean beer?)
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ