lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date:	Mon, 15 Oct 2012 14:05:41 +0000
From:	Arnd Bergmann <arnd@...db.de>
To:	Changman Lee <cm224.lee@...il.com>
Cc:	Vyacheslav Dubeyko <slava@...eyko.com>,
	Jaegeuk Kim <jaegeuk.kim@...il.com>,
	김재극 <jaegeuk.kim@...sung.com>,
	"viro@...iv.linux.org.uk" <viro@...iv.linux.org.uk>,
	"Theodore Ts'o" <tytso@....edu>,
	"gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"chur.lee@...sung.com" <chur.lee@...sung.com>,
	"cm224.lee@...sung.com" <cm224.lee@...sung.com>,
	"jooyoung.hwang@...sung.com" <jooyoung.hwang@...sung.com>
Subject: Re: [PATCH 11/16] f2fs: add inode operations for special inodes

On Monday 15 October 2012, Changman Lee wrote:
> 2012년 10월 15일 월요일에 Arnd Bergmann<arnd@...db.de>님이 작성:
> > It is only a performance hint though, so it is not a correctness issue the
> > file system gets it wrong. In order to do efficient garbage collection, a log
> > structured file system should take all the information it can get about the
> > expected life of data it writes. I agree that the list, even in the form of
> > mkfs time settings, is not a clean abstraction, but in the place of an Android
> > phone manufacturer I would still enable it if it promises a significant
> > performance advantage over not using it. I guess it would be nice if this
> > could be overridden in some form, e.g. using an ioctl on the file as ext4 does.
> >
> Right. This is related with HOT/COLD separation policy of f2fs. If we know
> that data is COLD, we can manage gc effectively.
> I think that ext lists are placed in sb is better like your advice because
> it's difficult to fix user app. Although it's nasty way.

Ok. I think you should adapt the terminology though. Right now, the optimization
is to mark the data as COLD because we expect it to be written less often than
other kinds of data. However, the hot/cold terms are usually only applied to
data that we assume is going to be written soon or not based on how often
the same data has been accessed in the past.

Anything you detect from the file name is not really a hint on hot/cold
files, but rather on the expected access pattern: These files are going
to be written once, and will be read-only after that, they are probably
multiple megabytes in size, and if you have a lot of them, they are likely
to live for the same time.

It may well be possible that we later decide to use the hint in a different
way, e.g. to put these files into yet another separate log, aside from
other hot or cold files.

> > We should also take the kinds of access we have seen on a file into account.
> > E.g. if someone opens a file O_RDWR and performs seek or pwrite on it, we can
> > assume that it's not in the category of typical media files, and a file that
> > gets written to disk linearly in multiple megabytes might belong into the
> > category even if it is named otherwise.
> >
> This is more general but it's hard to adapt now.

I think it's important to leave the option open for a future optimization.
Right now, what we have to get agreement on is the on-disk format, because
we absolutely don't want to make incompatible changes to that once f2fs
has been merged into the kernel and is getting used on real systems.

This is independent of how the code is implemented at the moment, and
any tuning regarding how to group different kinds of data into the six
logs is completely up to how things work out in practice. But you should
definitely ensure that those changes don't require changing the format
if we decide to use a different number of logs in the future, or to
use the logs differently.

The split between logs for nodes on the one hand and data on the other
is something that can well be hardcoded, and it's ok to have a hard
upper bound on the number of logs in the file system, possibly higher
than 6.

	Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ