Message-ID: <87o73cygtq.fsf@gmail.com>
Date: Tue, 22 Oct 2024 08:39:05 +0530
From: Ritesh Harjani (IBM) <ritesh.list@...il.com>
To: Michael Ellerman <mpe@...erman.id.au>, linuxppc-dev@...ts.ozlabs.org
Cc: kasan-dev@...glegroups.com, linux-mm@...ck.org,
 Marco Elver <elver@...gle.com>, Alexander Potapenko <glider@...gle.com>,
 Heiko Carstens <hca@...ux.ibm.com>, Nicholas Piggin <npiggin@...il.com>,
 Madhavan Srinivasan <maddy@...ux.ibm.com>,
 Christophe Leroy <christophe.leroy@...roup.eu>,
 Hari Bathini <hbathini@...ux.ibm.com>,
 "Aneesh Kumar K . V" <aneesh.kumar@...nel.org>,
 Donet Tom <donettom@...ux.vnet.ibm.com>,
 Pavithra Prakash <pavrampu@...ux.vnet.ibm.com>,
 LKML <linux-kernel@...r.kernel.org>, Disha Goel <disgoel@...ux.ibm.com>
Subject: Re: [PATCH v3 01/12] powerpc: mm/fault: Fix kfence page fault reporting
Michael Ellerman <mpe@...erman.id.au> writes:
> Hi Ritesh,
>
> "Ritesh Harjani (IBM)" <ritesh.list@...il.com> writes:
>> copy_from_kernel_nofault() can be called when doing read of /proc/kcore.
>> /proc/kcore can have some unmapped kfence objects which when read via
>> copy_from_kernel_nofault() can cause page faults. Since *_nofault()
>> functions define their own fixup table for handling fault, use that
>> instead of asking kfence to handle such faults.
>>
>> Hence we search the exception tables for the nip which generated the
>> fault. If there is an entry then we let the fixup table handler handle the
>> page fault by returning an error from within ___do_page_fault().
>>
>> This can be easily triggered if someone tries to do dd from /proc/kcore.
>> dd if=/proc/kcore of=/dev/null bs=1M
>>
>> <some example false negatives>
>> ===============================
>> BUG: KFENCE: invalid read in copy_from_kernel_nofault+0xb0/0x1c8
>> Invalid read at 0x000000004f749d2e:
>> copy_from_kernel_nofault+0xb0/0x1c8
>> 0xc0000000057f7950
>> read_kcore_iter+0x41c/0x9ac
>> proc_reg_read_iter+0xe4/0x16c
>> vfs_read+0x2e4/0x3b0
>> ksys_read+0x88/0x154
>> system_call_exception+0x124/0x340
>> system_call_common+0x160/0x2c4
>
> I haven't been able to reproduce this. Can you give some more details on
> the exact machine/kernel-config/setup where you saw this?
Without this patch I am able to hit this on book3s64 with both Radix and
Hash. I believe the configs below should do the job. We should be able to
reproduce it on qemu, LPAR, or bare metal.
root-> cat .out-ppc/.config |grep -i KFENCE
CONFIG_HAVE_ARCH_KFENCE=y
CONFIG_KFENCE=y
CONFIG_KFENCE_SAMPLE_INTERVAL=100
CONFIG_KFENCE_NUM_OBJECTS=255
# CONFIG_KFENCE_DEFERRABLE is not set
# CONFIG_KFENCE_STATIC_KEYS is not set
CONFIG_KFENCE_STRESS_TEST_FAULTS=0
CONFIG_KFENCE_KUNIT_TEST=y
root-> cat .out-ppc/.config |grep -i KCORE
CONFIG_PROC_KCORE=y
root-> cat .out-ppc/.config |grep -i KUNIT
CONFIG_KFENCE_KUNIT_TEST=y
CONFIG_KUNIT=y
CONFIG_KUNIT_DEFAULT_ENABLED=y
Then running dd as below can hit the issue. It may need to run for a
few minutes before it triggers.
~ # dd if=/proc/kcore of=/dev/null bs=1M
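The steps above can be wrapped into a small script, roughly as sketched below. This is only an illustration, not part of the patch series: the function names are made up for this example, the 300 second default is an arbitrary choice, and it assumes a `timeout` utility (coreutils or busybox) is available and that the script is run as root.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not from the patch series.

# Print the KFENCE and PROC_KCORE options found in a kernel .config ($1).
show_kfence_config() {
    grep -E '^(# )?CONFIG_(KFENCE|PROC_KCORE)' "$1"
}

# Count KFENCE reports in kernel log text read from stdin.
# grep -c exits non-zero when the count is 0, so mask the status.
count_kfence_reports() {
    grep -c 'BUG: KFENCE:' || true
}

# Read /proc/kcore for a bounded time ($1 seconds, default 300, an
# arbitrary value), then print how many KFENCE reports dmesg contains.
# Needs root; assumes a `timeout` utility is installed.
reproduce_kcore_read() {
    secs="${1:-300}"
    timeout "$secs" dd if=/proc/kcore of=/dev/null bs=1M 2>/dev/null
    dmesg | count_kfence_reports
}
```

After sourcing the file, `show_kfence_config .out-ppc/.config` lists the relevant options and `reproduce_kcore_read 300` prints the number of KFENCE reports seen in dmesg; a non-zero count on an unpatched kernel would suggest the issue was hit.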
Alternatively, running this kfence kunit test [1] can also reproduce the
same bug. The configs above enable the kfence kunit test as well, so it
will run at boot time itself.
[1]: https://lore.kernel.org/linuxppc-dev/210e561f7845697a32de44b643393890f180069f.1729272697.git.ritesh.list@gmail.com/
Note: This was originally reported internally by a tester who was
running perf test 'Object code reading' [2].
[2]: https://github.com/torvalds/linux/blob/master/tools/perf/tests/code-reading.c#L737
Thanks for looking into this. Let me know if this helped.
-ritesh