Overview
About vulnerability
In the Linux kernel, the following vulnerability has been resolved:
perf: Fix sample vs do_exit()
Baisheng Gao reported an ARM64 crash, which Mark decoded as being a synchronous external abort – most likely due to trying to access MMIO in bad ways.
The crash further shows perf trying to do a user stack sample while in exit_mmap()’s tlb_finish_mmu() – i.e. while tearing down the address space it is trying to access.
It turns out that we stop perf after we tear down the userspace mm; a receipie for disaster, since perf likes to access userspace for various reasons.
Flip this order by moving up where we stop perf in do_exit().
Additionally, harden PERF_SAMPLE_CALLCHAIN and PERF_SAMPLE_STACK_USER to abort when the current task does not have an mm (exit_mm() makes sure to set current->mm = NULL; before commencing with the actual teardown). Such that CPU wide events don’t trip on this same problem.
Details
- Affected product:
- AlmaLinux 9.2 ESU , CentOS 7 ELS , CentOS 8.4 ELS , CentOS 8.5 ELS , CentOS Stream 8 ELS , CloudLinux 7 ELS , Oracle Linux 7 ELS , RHEL 7 ELS , TuxCare 9.6 ESU , Ubuntu 16.04 ELS , Ubuntu 18.04 ELS , Ubuntu 20.04 ELS
- Affected packages:
- linux-hwe @ 4.15.0 (+13 more)
In the Linux kernel, the following vulnerability has been resolved:
perf: Fix sample vs do_exit()
Baisheng Gao reported an ARM64 crash, which Mark decoded as being a synchronous external abort – most likely due to trying to access MMIO in bad ways.
The crash further shows perf trying to do a user stack sample while in exit_mmap()’s tlb_finish_mmu() – i.e. while tearing down the address space it is trying to access.
It turns out that we stop perf after we tear down the userspace mm; a receipie for disaster, since perf likes to access userspace for various reasons.
Flip this order by moving up where we stop perf in do_exit().
Additionally, harden PERF_SAMPLE_CALLCHAIN and PERF_SAMPLE_STACK_USER to abort when the current task does not have an mm (exit_mm() makes sure to set current->mm = NULL; before commencing with the actual teardown). Such that CPU wide events don’t trip on this same problem.