Message-Id: <20180815194811.9423-26-krisman@collabora.co.uk>
Date: Wed, 15 Aug 2018 15:48:11 -0400
From: Gabriel Krisman Bertazi <krisman@...labora.co.uk>
To: tytso@....edu
Cc: linux-ext4@...r.kernel.org, kernel@...labora.com,
Gabriel Krisman Bertazi <krisman@...labora.co.uk>
Subject: [PATCH v2 25/25] docs: ext4.txt: Document encoding and case-insensitive lookups
Introduce the encoding-awareness feature for ext4, and explain some of the
design decisions and the mount options that enable it.
Signed-off-by: Gabriel Krisman Bertazi <krisman@...labora.co.uk>
---
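The invalid-name fallback that section 2.3 of this patch describes can be
sketched in user space as follows. This is only an illustration under stated
assumptions, not the kernel implementation: the function names lookup_key and
names_match are hypothetical, UTF-8 is assumed as the encoding, and NFD is
assumed as the normalization form (the patch leaves both to the encoding and
encoding_flags mount options).

```python
import unicodedata


def lookup_key(raw: bytes, casefold: bool = True) -> bytes:
    """Return the bytes a hypothetical encoding-aware directory compares on."""
    try:
        text = raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        # Invalid sequence: fall back to treating the whole name as one
        # opaque byte string, so only an exact byte match can find it.
        return raw
    # Assumption: NFD stands in for whichever normalization is configured.
    text = unicodedata.normalize("NFD", text)
    if casefold:
        # Casefolding can denormalize a string, so normalize again afterwards.
        text = unicodedata.normalize("NFD", text.casefold())
    return text.encode("utf-8")


def names_match(a: bytes, b: bytes, casefold: bool = True) -> bool:
    """Compare two on-disk names the way the sketch above would."""
    return lookup_key(a, casefold) == lookup_key(b, casefold)
```

With casefold=True the sketch models a case-insensitive directory; with
casefold=False it models a case-sensitive directory on an encoding-aware
filesystem, where only equivalent sequences match. A name that is not valid
UTF-8 matches only itself in either mode.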
Documentation/filesystems/ext4.txt | 37 ++++++++++++++++++++++++++++++
1 file changed, 37 insertions(+)
diff --git a/Documentation/filesystems/ext4.txt b/Documentation/filesystems/ext4.txt
index 7f628b9f7c4b..57ce78c18b26 100644
--- a/Documentation/filesystems/ext4.txt
+++ b/Documentation/filesystems/ext4.txt
@@ -99,6 +99,8 @@ Note: More extensive information for getting started with ext4 can be
* large block (up to pagesize) support
* efficient new ordered mode in JBD2 and ext4 (avoid using buffer head to force
the ordering)
+* Encoding-aware file names
+* Case-insensitive file name lookups
[1] Filesystems with a block size of 1k may see a limit imposed by the
directory hash tree having a maximum depth of two.
@@ -122,6 +124,32 @@ grouping of bitmaps and inode tables. Some test results available here:
- http://www.bullopensource.org/ext4/20080818-ffsb/ffsb-write-2.6.27-rc1.html
- http://www.bullopensource.org/ext4/20080818-ffsb/ffsb-readwrite-2.6.27-rc1.html
+2.3 Encoding-aware file names and case-insensitive lookups
+==========================================================
+
+Ext4 optionally supports filesystem-wide charset knowledge when handling
+file names. This lets the user look up a file using any charset-equivalent
+form of its name and, optionally, ensures that the filesystem holds no
+invalid names. Charset encoding awareness is also essential for
+case-insensitive lookups, because the encoding is what defines the
+casefold operation.
+
+The case-insensitive file name lookup feature is supported at a finer
+granularity, on a per-directory basis, allowing the user to mix
+case-insensitive and case-sensitive directories in the same filesystem.
+It is enabled by setting a file attribute on an empty directory. Because
+the encoding defines the casefold operation, the filesystem must have
+encoding enabled to use this feature.
+
+When we stop treating file names as opaque byte sequences and start
+treating them as encoded strings, we need to address what happens when a
+program tries to create a file with an invalid name. The Native Language
+Support (NLS) subsystem within the kernel leaves that decision to the
+filesystem, through the NLS strict mode setting. When Ext4 encounters an
+invalid string, it falls back to treating the entire string as one opaque
+byte sequence. The user can still operate on that file, but
+case-insensitive and equivalent-sequence lookups won't work for it.
+
3. Options
==========
@@ -388,6 +416,15 @@ dax Use direct access (no page cache). See
Documentation/filesystems/dax.txt. Note that
this option is incompatible with data=journal.
+encoding Enable a specific encoding for file name lookups.
+ This cannot be used with per-directory encryption and
+ will fail on filesystems that have the encryption flag enabled.
+
+encoding_flags A bitmask that configures how the encoding-aware mechanism
+ functions. It specifies whether to refuse invalid
+ sequences and which normalization and casefold
+ operations to use.
+
Data Mode
=========
There are 3 different data modes:
--
2.18.0