[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJuCfpF=FpMvGGzVr5E+9R629SfXB8oWm2NbowHg=mksUQ113A@mail.gmail.com>
Date: Sat, 17 May 2025 09:39:30 -0700
From: Suren Baghdasaryan <surenb@...gle.com>
To: David Wang <00107082@....com>
Cc: kent.overstreet@...ux.dev, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: Re: BUG: unable to handle page fault for address
On Sat, May 17, 2025 at 12:02 AM David Wang <00107082@....com> wrote:
>
>
> At 2025-05-17 08:11:24, "Suren Baghdasaryan" <surenb@...gle.com> wrote:
> >On Fri, May 16, 2025 at 10:03 AM Suren Baghdasaryan <surenb@...gle.com> wrote:
> >>
> >> Hi David,
> >>
> >> On Fri, May 16, 2025 at 6:13 AM David Wang <00107082@....com> wrote:
> >> >
> >> > Hi,
> >> >
> >> > I caught a page fault when I was changing my nvidia driver:
> >> > (This happens randomly, I can reproduce it with about 1/3 probability)
> >> >
> >> > [Fri May 16 12:05:41 2025] BUG: unable to handle page fault for address: ffff9d28984c3000
> >> > [Fri May 16 12:05:41 2025] #PF: supervisor read access in kernel mode
> >> > [Fri May 16 12:05:41 2025] #PF: error_code(0x0000) - not-present page
> >> > ...
> >> > [Fri May 16 12:05:41 2025] RIP: 0010:release_module_tags+0x103/0x1b0
> >> > ...
> >> > [Fri May 16 12:05:41 2025] Call Trace:
> >> > [Fri May 16 12:05:41 2025] <TASK>
> >> > [Fri May 16 12:05:41 2025] codetag_unload_module+0x135/0x160
> >> > [Fri May 16 12:05:41 2025] free_module+0x19/0x1a0
> >> > ...
> >> > (full kernel logs are pasted at the end.)
> >> >
> >> > Using a image with DEBUG_INFO, the calltrack parses as:
> >> >
> >> > RIP: 0010:release_module_tags (./include/linux/alloc_tag.h:134 lib/alloc_tag.c:352 lib/alloc_tag.c:573)
> >> > [Fri May 16 12:05:41 2025] codetag_unload_module (lib/codetag.c:355)
> >> > [Fri May 16 12:05:41 2025] free_module (kernel/module/main.c:1305)
> >> > [Fri May 16 12:05:41 2025] __do_sys_delete_module (kernel/module/main.c:795)
> >> >
> >> > The offending lines in my codebase:
> >> > 126 static inline struct alloc_tag_counters alloc_tag_read(struct alloc_tag *tag)
> >> > 127 {
> >> > ...
> >> > 131
> >> > 132 for_each_possible_cpu(cpu) {
> >> > 133 counter = per_cpu_ptr(tag->counters, cpu);
> >> > >>>> 134 v.bytes += counter->bytes; <--------------here
> >> > 135 v.calls += counter->calls;
> >> >
> >> >
> >> > Nvidia drivers are out-tree... there could be some strange behavior in it causes this.. but,
> >> > when I check the code, I got concerned about lifecycle of tag->counters.
> >> > Based on following defination:
> >> > 108 #define DEFINE_ALLOC_TAG(_alloc_tag) \
> >> > 109 static DEFINE_PER_CPU(struct alloc_tag_counters, _alloc_tag_cntr); \
> >> > 110 static struct alloc_tag _alloc_tag __used __aligned(8) \
> >> > 111 __section(ALLOC_TAG_SECTION_NAME) = { \
> >> > 112 .ct = CODE_TAG_INIT, \
> >> > 113 .counters = &_alloc_tag_cntr };
> >> > 114
> >> > _alloc_tag_cntr is the data referenced by tag->counters, but they are in different section,
> >> > and alloc_tag only prepare storage for section ALLOC_TAG_SECTION_NAME.
> >> > right?
> >> > Then what happens to those ".data..percpu" section when module is unloaded?
> >> > Is it safe to keep using those ".data..percpu" section after module unloaded,
> >> > or even during module is unloading?
> >>
> >> Yes, I think you are right, free_module() calls percpu_modfree() which
> >> would free the per-cpu memory allocated for the module. Before
> >> 0db6f8d7820a ("alloc_tag: load module tags into separate contiguous
> >> memory") we would not unload the module if there were tags which were
> >> still in use. After that change we load module tags into separate
> >> memory, so I expected this to work but due to this external reference
> >> it indeed should lead to UAF.
> >> I think the simplest way to fix this would be to bypass
> >> percpu_modfree() inside free_module() when there are module tags still
> >> referenced, store mod->percpu inside alloc_tag_module_section and free
> >> it inside clean_unused_module_areas_locked() once we know the counters
> >> are not used anymore. I'll take a stab at it and will send a patch for
> >> testing today.
> >
> >Ok, I went with another implementation, instead dynamically allocating
> >percpu memory for modules at the module load time. This has another
> >advantage of not needing extra PERCPU_MODULE_RESERVE currently
> >required for memory allocation tagging to work.
> >David, the patch is posted at [1]. Please give it a try and let me
> >know if the fix works for you.
> >Thanks,
> >Suren.
> >
> >[1] https://lore.kernel.org/all/20250517000739.5930-1-surenb@google.com/
> >
> >
> >> Thanks,
> >> Suren.
> >>
>
> Hi, the patch does fix my issue.
> I now have another similar concern about modules RO data,
> The codetag defined as
> 24 struct codetag {
> 25 unsigned int flags; /* used in later patches */
> 26 unsigned int lineno;
> 27 const char *modname;
> 28 const char *function;
> 29 const char *filename;
> 30 } __aligned(8);
>
> Those modname/function/filename would refer to RO data section, right?
> When module unloaded, its RO data section would be released at some point.
> My question is is it safe to use RO data during module unload? because these
> lines seems to access those data:
>
> + pr_info("%s:%u module %s func:%s has %llu allocated at module unload\n",
> + tag->ct.filename, tag->ct.lineno, tag->ct.modname,
> + tag->ct.function, counter.bytes);
These lines are called from release_module_tags() using this call chain:
delete_module()
free_module()
codetag_unload_module()
release_module_tags()
and codetag_unload_module() is called at the very beginning of
free_module(), when no other module memory has been freed yet. So,
from what I understand, this should be safe.
After we unload the module these pointers inside the tags will be
dandling but we should not be using them anymore since we do not
report unloaded modules. Do you see some usage that I missed?
>
>
>
> Thanks
> David
>
>
Powered by blists - more mailing lists