Overview
About vulnerability
In the Linux kernel, the following vulnerability has been resolved:
f2fs: fix IS_CHECKPOINTED flag inconsistency issue caused by concurrent atomic commit and checkpoint writes
During SPO tests, when mounting F2FS, an -EINVAL error was returned from f2fs_recover_inode_page. The issue occurred under the following scenario
Thread A Thread B f2fs_ioc_commit_atomic_write
- f2fs_do_sync_file // atomic = true
- f2fs_fsync_node_pages : last_folio = inode folio : schedule before folio_lock(last_folio) f2fs_write_checkpoint
- block_operations// writeback last_folio
- schedule before f2fs_flush_nat_entries : set_fsync_mark(last_folio, 1) : set_dentry_mark(last_folio, 1) : folio_mark_dirty(last_folio)
- __write_node_folio(last_folio) : f2fs_down_read(&sbi->node_write)//block
- f2fs_flush_nat_entries : {struct nat_entry}->flag |= BIT(IS_CHECKPOINTED)
- unblock_operations : f2fs_up_write(&sbi->node_write) f2fs_write_checkpoint//return : f2fs_do_write_node_page() f2fs_ioc_commit_atomic_write//return SPO
Thread A calls f2fs_need_dentry_mark(sbi, ino), and the last_folio has already been written once. However, the {struct nat_entry}->flag did not have the IS_CHECKPOINTED set, causing set_dentry_mark(last_folio, 1) and write last_folio again after Thread B finishes f2fs_write_checkpoint.
After SPO and reboot, it was detected that {struct node_info}->blk_addr was not NULL_ADDR because Thread B successfully write the checkpoint.
This issue only occurs in atomic write scenarios. For regular file fsync operations, the folio must be dirty. If block_operations->f2fs_sync_node_pages successfully submit the folio write, this path will not be executed. Otherwise, the f2fs_write_checkpoint will need to wait for the folio write submission to complete, as sbi->nr_pages[F2FS_DIRTY_NODES] > 0. Therefore, the situation where f2fs_need_dentry_mark checks that the {struct nat_entry}->flag /wo the IS_CHECKPOINTED flag, but the folio write has already been submitted, will not occur.
Therefore, for atomic file fsync, sbi->node_write should be acquired through __write_node_folio to ensure that the IS_CHECKPOINTED flag correctly indicates that the checkpoint write has been completed.
Details
- Affected product:
- AlmaLinux 9.2 ESU , CentOS 8.4 ELS , CentOS 8.5 ELS , CentOS Stream 8 ELS , Oracle Linux 7 ELS , TuxCare 9.6 ESU , Ubuntu 16.04 ELS , Ubuntu 18.04 ELS , Ubuntu 20.04 ELS
- Affected packages:
- kernel @ 5.14.0 (+8 more)
In the Linux kernel, the following vulnerability has been resolved:
f2fs: fix IS_CHECKPOINTED flag inconsistency issue caused by concurrent atomic commit and checkpoint writes
During SPO tests, when mounting F2FS, an -EINVAL error was returned from f2fs_recover_inode_page. The issue occurred under the following scenario
Thread A Thread B f2fs_ioc_commit_atomic_write
- f2fs_do_sync_file // atomic = true
- f2fs_fsync_node_pages : last_folio = inode folio : schedule before folio_lock(last_folio) f2fs_write_checkpoint
- block_operations// writeback last_folio
- schedule before f2fs_flush_nat_entries : set_fsync_mark(last_folio, 1) : set_dentry_mark(last_folio, 1) : folio_mark_dirty(last_folio)
- __write_node_folio(last_folio) : f2fs_down_read(&sbi->node_write)//block
- f2fs_flush_nat_entries : {struct nat_entry}->flag |= BIT(IS_CHECKPOINTED)
- unblock_operations : f2fs_up_write(&sbi->node_write) f2fs_write_checkpoint//return : f2fs_do_write_node_page() f2fs_ioc_commit_atomic_write//return SPO
Thread A calls f2fs_need_dentry_mark(sbi, ino), and the last_folio has already been written once. However, the {struct nat_entry}->flag did not have the IS_CHECKPOINTED set, causing set_dentry_mark(last_folio, 1) and write last_folio again after Thread B finishes f2fs_write_checkpoint.
After SPO and reboot, it was detected that {struct node_info}->blk_addr was not NULL_ADDR because Thread B successfully write the checkpoint.
This issue only occurs in atomic write scenarios. For regular file fsync operations, the folio must be dirty. If block_operations->f2fs_sync_node_pages successfully submit the folio write, this path will not be executed. Otherwise, the f2fs_write_checkpoint will need to wait for the folio write submission to complete, as sbi->nr_pages[F2FS_DIRTY_NODES] > 0. Therefore, the situation where f2fs_need_dentry_mark checks that the {struct nat_entry}->flag /wo the IS_CHECKPOINTED flag, but the folio write has already been submitted, will not occur.
Therefore, for atomic file fsync, sbi->node_write should be acquired through __write_node_folio to ensure that the IS_CHECKPOINTED flag correctly indicates that the checkpoint write has been completed.