Vulnerability in netfilter code allows local privilege escalation
Most high-level technologies in the IT industry are built on top of existing features, and containers are a prime example. This lightweight virtualization layer relies heavily on kernel building blocks such as cgroups and namespaces. Cgroups, aka control groups, limit and account for the resources (memory, CPU, I/O) that a group of processes may use, while namespaces give processes their own view of things like the network stack, process IDs, or users. Together they make it possible to hide parts of the system from specific processes, enabling some processes to run “isolated” from the rest of the system or “inside a container”.
But this layering of technologies brings some risks. When a vulnerability appears in one of the foundation components, the whole stack is potentially vulnerable. This is what happened with CVE-2022-25636, a recently disclosed vulnerability in the kernel's netfilter code, which container networking relies on. It impacts distributions running kernel 4.18.0-240.el8 and above (RHEL 8 and derivatives like AlmaLinux 8, CentOS 8, Oracle EL 8), as well as Ubuntu and others, and it allows privilege escalation for local users. KernelCare Enterprise patches will be made available soon, and this post will be updated to reflect their availability as it happens.
The actual issue exists in the function nft_fwd_dup_netdev_offload, in the file net/netfilter/nf_dup_netdev.c of the Linux kernel source tree. This is part of netfilter, the code responsible for network packet management, which underpins common things like the iptables firewall. The specific code is used primarily in the context of network cgroups and, in turn, by containers, where it enables network traffic segregation between different containers and the host system.
It was discovered that a local user could trigger an out-of-bounds memory access in this function and corrupt specific memory locations. The original vulnerability report included exploit code that demonstrated the problem and achieved local privilege escalation. The vulnerability was assigned a CVSS 3 score of 7.8, reflecting its impact and ease of exploitation as well as the existence of readily available exploit code.
The affected code was introduced with version 5.4 of the Linux kernel but, as part of normal support processes, it was backported to older versions like the ones included in the Enterprise Linux 8 family of distributions (RHEL 8, CentOS 8, AlmaLinux 8, etc.). For these, check the running kernel version: if it is 4.18.0-240.el8 or greater, the system is vulnerable.
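The version check described above can be sketched as a small shell function. This is a minimal sketch, not a vendor tool: the function name `el8_kernel_in_affected_range` is hypothetical, and it assumes an EL8-style release string such as the one printed by `uname -r` (for example `4.18.0-240.el8.x86_64`). It only tests whether the build number falls in the range named in this post; it cannot tell whether a fix has since been applied to that kernel.

```shell
#!/bin/sh
# Hypothetical helper (not part of any distribution tooling).
# Exit status: 0 = release is at or above 4.18.0-240.el8,
#              1 = EL8 4.18.0 kernel older than build 240,
#              2 = not an EL8 4.18.0 release string, check does not apply.
el8_kernel_in_affected_range() {
    # Use the argument if given, otherwise the running kernel's release.
    release="${1:-$(uname -r)}"

    case "$release" in
        4.18.0-*.el8*) ;;          # looks like an EL8 kernel, keep going
        *) return 2 ;;
    esac

    build="${release#4.18.0-}"     # e.g. "348.12.2.el8_5.x86_64"
    build="${build%%.*}"           # keep the leading build number, e.g. "348"

    case "$build" in
        ''|*[!0-9]*) return 2 ;;   # build number is not a plain integer
    esac

    [ "$build" -ge 240 ]
}
```

Calling the function with no argument checks the running kernel. A status of 0 means the release is in the affected range, 1 means it is older than the threshold, and 2 means the string is not an EL8 4.18.0 kernel, so this particular check does not apply.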
There are mitigations that can be applied to affected systems until a patch is available, but they are only viable where the affected functionality is not critical. If the system is running containers, they will be impacted, and best judgment should be employed in weighing the pros and cons of such mitigations. It is possible to avoid the affected functionality by running, as root:
echo 0 > /proc/sys/user/max_user_namespaces
This change will revert upon system reboot. For a more detailed and permanent method of disabling this functionality, check the mitigation section found here. Again, it is important to stress that this will impact regular container execution and should only be deployed where that is acceptable, for example on systems that are not running containerized workloads.
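Before and after applying the mitigation, it can help to confirm what the kernel currently reports. The following is a minimal sketch: the function name `userns_status` and its optional path argument are hypothetical conveniences, and the only thing taken from the text above is the `/proc/sys/user/max_user_namespaces` file.

```shell
#!/bin/sh
# Hypothetical helper: report whether user namespaces are currently disabled.
# The optional argument overrides the file to read (useful for testing).
# Exit status: 0 = disabled, 1 = enabled, 2 = value could not be read.
userns_status() {
    knob="${1:-/proc/sys/user/max_user_namespaces}"

    if [ ! -r "$knob" ]; then
        echo "unknown"
        return 2
    fi

    value="$(cat "$knob")"
    if [ "$value" = "0" ]; then
        echo "disabled"
    else
        echo "enabled (limit: $value)"
        return 1
    fi
}
```

Running `userns_status` right after the `echo 0` command should print `disabled`; after a reboot it would be expected to report the distribution's default limit again, since the change does not persist.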
Live patches for supported and affected distributions are being prepared and will be made available to KernelCare Enterprise subscribers in the coming days. This post will be updated to reflect actual patch availability as development and testing are finalized for each distribution.
For more information about KernelCare Enterprise and other TuxCare products that enhance your system security without affecting your uptime, check https://tuxcare.com/live-patching-services/.