Message-Id: <B9A0A2A9-25B3-4238-A24D-4F77DD1FEABC@dilger.ca>
Date: Thu, 18 Jun 2020 20:31:05 -0600
From: Andreas Dilger <adilger@...ger.ca>
To: Eric Sandeen <sandeen@...hat.com>
Cc: "linux-ext4@...r.kernel.org" <linux-ext4@...r.kernel.org>
Subject: Re: [PATCH 0/1] ext4: fix potential negative array index in do_split
On Jun 17, 2020, at 1:01 PM, Eric Sandeen <sandeen@...hat.com> wrote:
>
> We recently had a report of a panic in do_split; the filesystem in question
> panicked a distribution kernel when trying to add a new directory entry;
> the behavior/bug persists upstream.
>
> The directory block in question had lots of unused and un-coalesced
> entries, like this, printed from the loop in ext4_insert_dentry():
>
> [32778.024654] reclen 44 for name len 36
> [32778.028745] start: de ffff9f4cb5309800 top ffff9f4cb5309bd4
> [32778.034971] offset 0 nlen 28 rlen 40, rlen-nlen 12, reclen 44 name <empty>
> [32778.042744] offset 40 nlen 28 rlen 28, rlen-nlen 0, reclen 44 name <empty>
> [32778.050521] offset 68 nlen 32 rlen 32, rlen-nlen 0, reclen 44 name <empty>
> [32778.058294] offset 100 nlen 28 rlen 28, rlen-nlen 0, reclen 44 name <empty>
> [32778.066166] offset 128 nlen 28 rlen 28, rlen-nlen 0, reclen 44 name <empty>
> [32778.074035] offset 156 nlen 28 rlen 28, rlen-nlen 0, reclen 44 name <empty>
> [32778.081907] offset 184 nlen 24 rlen 24, rlen-nlen 0, reclen 44 name <empty>
> [32778.089779] offset 208 nlen 36 rlen 36, rlen-nlen 0, reclen 44 name <empty>
> [32778.097648] offset 244 nlen 12 rlen 12, rlen-nlen 0, reclen 44 name REDACTED
> [32778.105227] offset 256 nlen 24 rlen 24, rlen-nlen 0, reclen 44 name <empty>
> [32778.113099] offset 280 nlen 24 rlen 24, rlen-nlen 0, reclen 44 name REDACTED
> [32778.122134] offset 304 nlen 20 rlen 20, rlen-nlen 0, reclen 44 name REDACTED
> [32778.130780] offset 324 nlen 16 rlen 16, rlen-nlen 0, reclen 44 name REDACTED
> [32778.138746] offset 340 nlen 24 rlen 24, rlen-nlen 0, reclen 44 name <empty>
> [32778.146616] offset 364 nlen 28 rlen 28, rlen-nlen 0, reclen 44 name <empty>
> [32778.154487] offset 392 nlen 24 rlen 24, rlen-nlen 0, reclen 44 name <empty>
> [32778.162362] offset 416 nlen 16 rlen 16, rlen-nlen 0, reclen 44 name <empty>
> ...
>
> the file we were trying to insert needed a record length of 44, and none of the
> non-coalesced <empty> slots were big enough, so we failed and told do_split
> to get to work.
>
> However, the sum of the non-empty entries didn't exceed half the block size, so
> the loop in do_split() iterated over all of the entries, ended at "count," and
> told us to split at (count - move) which is zero, and eventually:
>
> continued = hash2 == map[split - 1].hash;
>
> exploded on the negative index.
>
> It's an open question as to how this directory got into this format; I'm not
> sure if this should ever happen or not. But at a minimum, I think we should
> be defensive here, hence [PATCH 1/1] will do that as an expedient fix and
> backportable patch for this situation. There may be some other underlying
> problem which led to this directory structure if it's unexpected, and maybe that
> can come as another patch if anyone can investigate.
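
For anyone following along, the failure path described above can be modelled in userspace. This is only a sketch under stated assumptions: `struct rec`, `find_slot()`, `split_index()` and `split_index_guarded()` are made-up names, not kernel functions. The 1024-byte block size is inferred from the `de`/`top` addresses in the log, and the `count / 2` fallback is just one possible way of being defensive, not necessarily what [PATCH 1/1] does.

```c
#include <assert.h>
#include <stddef.h>

/* Usual dirent sizing: 8-byte header plus name, rounded up to 4 bytes. */
static unsigned int dir_rec_len(unsigned int name_len)
{
	return (name_len + 8 + 3) & ~3u;
}

/* One directory record as printed in the log (hypothetical struct). */
struct rec {
	unsigned int inode;	/* 0 for an <empty> record */
	unsigned int nlen;	/* space the stored name needs */
	unsigned int rlen;	/* space the record occupies */
};

/* Index of the first record with room for reclen bytes, or -1 if none. */
static int find_slot(const struct rec *r, int count, unsigned int reclen)
{
	for (int i = 0; i < count; i++) {
		/* A live record offers only its slack; an empty one offers it all. */
		unsigned int avail = r[i].inode ? r[i].rlen - r[i].nlen
						: r[i].rlen;
		if (avail >= reclen)
			return i;
	}
	return -1;
}

/*
 * Walk the live-entry sizes from the end, moving entries until roughly
 * half a block is covered. Returns count - move, which is 0 when the
 * live entries never add up to half a block.
 */
static int split_index(const unsigned int *size, int count,
		       unsigned int blocksize)
{
	unsigned int used = 0;
	int move = 0;

	assert(size != NULL && count > 0);
	for (int i = count - 1; i >= 0; i--) {
		if (used + size[i] / 2 > blocksize / 2)
			break;
		used += size[i];
		move++;
	}
	return count - move;
}

/* Assumed defensive variant: fall back to splitting by count. */
static int split_index_guarded(const unsigned int *size, int count,
			       unsigned int blocksize)
{
	int split = split_index(size, count, blocksize);

	/* Still 0 when count < 2; a caller would have to handle that. */
	return split > 0 ? split : count / 2;
}
```

`dir_rec_len(36)` gives the 44 bytes from the first log line. Feeding `split_index()` the four live sizes visible in the (truncated) excerpt, 12, 24, 20 and 16 bytes, with a 1024-byte block returns 0, so `map[split - 1]` would be `map[-1]`. The guarded variant returns 2 for the same input.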
I thought this might be a bit of a conundrum. There is *supposed* to be
merging of adjacent entries, but some quick testing on RHEL7 (kernel
3.10.0-957.12.1.el7; same result with Debian 4.14.79) shows this to be
broken if the files are deleted in dirent order (which would seem to be
the most common order):
# mkdir tmp; cd tmp
# touch file{1..100}
# rm file{33,36,37,39,41,42,43,46,47}
# debugfs -c -R "ls -ld tmp" /dev/sda1
366 100644 (1) 0 0 0 18-Jun-2020 18:43 file30
< 369> 0 (1) 0 0 <reclen= 16> <deleted> file33
< 372> 0 (1) 0 0 <reclen= 16> <deleted> file36
< 373> 0 (1) 0 0 <reclen= 16> <deleted> file37
< 375> 0 (1) 0 0 <reclen= 16> <deleted> file39
< 377> 0 (1) 0 0 <reclen= 16> <deleted> file41
< 378> 0 (1) 0 0 <reclen= 16> <deleted> file42
< 379> 0 (1) 0 0 <reclen= 16> <deleted> file43
< 382> 0 (1) 0 0 <reclen= 16> <deleted> file46
< 383> 0 (1) 0 0 <reclen= 16> <deleted> file47
386 100644 (1) 0 0 0 18-Jun-2020 18:43 file50
The above shows (using a debugfs modified to print reclen for deleted
files) that the dirents are *not* combined. If the dirent *before* the
other entries is deleted, then they are merged:
# rm file30
< 366> 0 (1) 0 0 <reclen= 160> <deleted> file30
< 369> 0 (1) 0 0 <reclen= 16> <deleted> file33
< 372> 0 (1) 0 0 <reclen= 16> <deleted> file36
< 373> 0 (1) 0 0 <reclen= 16> <deleted> file37
< 375> 0 (1) 0 0 <reclen= 16> <deleted> file39
< 377> 0 (1) 0 0 <reclen= 16> <deleted> file41
< 378> 0 (1) 0 0 <reclen= 16> <deleted> file42
< 379> 0 (1) 0 0 <reclen= 16> <deleted> file43
< 382> 0 (1) 0 0 <reclen= 16> <deleted> file46
< 383> 0 (1) 0 0 <reclen= 16> <deleted> file47
386 100644 (1) 0 0 0 18-Jun-2020 18:43 file50
Cheers, Andreas