Commit feb2697

LiBaokun96 authored and BobBeckett committed
[FROM-UPSTREAM] ext4: fix race between writepages and remount
commit 745f17a upstream.

We got a WARNING in ext4_add_complete_io:
==================================================================
 WARNING: at fs/ext4/page-io.c:231 ext4_put_io_end_defer+0x182/0x250
 CPU: 10 PID: 77 Comm: ksoftirqd/10 Tainted: 6.3.0-rc2 #85
 RIP: 0010:ext4_put_io_end_defer+0x182/0x250 [ext4]
 [...]
 Call Trace:
  <TASK>
  ext4_end_bio+0xa8/0x240 [ext4]
  bio_endio+0x195/0x310
  blk_update_request+0x184/0x770
  scsi_end_request+0x2f/0x240
  scsi_io_completion+0x75/0x450
  scsi_finish_command+0xef/0x160
  scsi_complete+0xa3/0x180
  blk_complete_reqs+0x60/0x80
  blk_done_softirq+0x25/0x40
  __do_softirq+0x119/0x4c8
  run_ksoftirqd+0x42/0x70
  smpboot_thread_fn+0x136/0x3c0
  kthread+0x140/0x1a0
  ret_from_fork+0x2c/0x50
==================================================================

Above issue may happen as follows:

            cpu1                        cpu2
----------------------------|----------------------------
mount -o dioread_lock
ext4_writepages
 ext4_do_writepages
  *if (ext4_should_dioread_nolock(inode))*
    // rsv_blocks is not assigned here
                                 mount -o remount,dioread_nolock
  ext4_journal_start_with_reserve
   __ext4_journal_start
    __ext4_journal_start_sb
     jbd2__journal_start
      *if (rsv_blocks)*
        // h_rsv_handle is not initialized here
  mpage_map_and_submit_extent
    mpage_map_one_extent
      dioread_nolock = ext4_should_dioread_nolock(inode)
      if (dioread_nolock && (map->m_flags & EXT4_MAP_UNWRITTEN))
        mpd->io_submit.io_end->handle = handle->h_rsv_handle
        ext4_set_io_unwritten_flag
          io_end->flag |= EXT4_IO_END_UNWRITTEN
      // now io_end->handle is NULL but has EXT4_IO_END_UNWRITTEN flag

scsi_finish_command
 scsi_io_completion
  scsi_io_completion_action
   scsi_end_request
    blk_update_request
     req_bio_endio
      bio_endio
       bio->bi_end_io  > ext4_end_bio
        ext4_put_io_end_defer
         ext4_add_complete_io
          // trigger WARN_ON(!io_end->handle && sbi->s_journal);

The immediate cause of this problem is that ext4_should_dioread_nolock()
function returns inconsistent values in the ext4_do_writepages() and
mpage_map_one_extent(). There are four conditions in this function that
can be changed at mount time to cause this problem. These four conditions
can be divided into two categories:

    (1) journal_data and EXT4_EXTENTS_FL, which can be changed by ioctl
    (2) DELALLOC and DIOREAD_NOLOCK, which can be changed by remount

The two in the first category have been fixed by commit c8585c6
("ext4: fix races between changing inode journal mode and ext4_writepages")
and commit cb85f4d ("ext4: fix race between writepages and enabling
EXT4_EXTENTS_FL") respectively.

Two cases in the other category have not yet been fixed, and the above
issue is caused by this situation. We refer to the fix for the first
category, when applying options during remount, we grab s_writepages_rwsem
to avoid racing with writepages ops to trigger this problem.

Fixes: 6b523df ("ext4: use transaction reservation for extent conversion in ext4_end_io")
Cc: stable@vger.kernel.org
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230524072538.2883391-1-libaokun1@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit a937cf1)
Signed-off-by: Robert Beckett <bob.beckett@collabora.com>
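The interleaving described in the commit message can be modelled as a deterministic, single-threaded userspace sketch. This is not kernel code: `toy_sb`, `toy_io_end`, and every `toy_*` function are hypothetical names introduced for illustration, and the extent is simply assumed to be unwritten. Only the order of the two option samples and the remount in between follows the diagram.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel state; not the real ext4 structures. */
struct toy_sb {
	bool dioread_nolock;	/* models the DIOREAD_NOLOCK mount option */
};

struct toy_io_end {
	const void *handle;	/* models io_end->handle */
	bool unwritten;		/* models EXT4_IO_END_UNWRITTEN */
};

/* Models ext4_should_dioread_nolock(): reads the live mount option. */
static bool toy_should_dioread_nolock(const struct toy_sb *sb)
{
	return sb->dioread_nolock;
}

/*
 * Models one writepages pass. When remount_midway is true, the option is
 * switched on between the two samples, as cpu2 does in the diagram.
 * Returns true when the io_end ends up flagged unwritten without a
 * reserved handle, i.e. the state the WARN_ON() reports.
 */
static bool toy_writepages_hits_warn(struct toy_sb *sb, bool remount_midway)
{
	static const int reserved_handle = 1;	/* dummy object to point at */
	struct toy_io_end io_end = { NULL, false };
	const void *rsv_handle = NULL;

	/* First sample: decides whether a reserved handle is set up. */
	if (toy_should_dioread_nolock(sb))
		rsv_handle = &reserved_handle;

	/* Models "mount -o remount,dioread_nolock" racing in on cpu2. */
	if (remount_midway)
		sb->dioread_nolock = true;

	/* Second sample: decides whether the io_end is marked unwritten. */
	if (toy_should_dioread_nolock(sb)) {
		io_end.handle = rsv_handle;
		io_end.unwritten = true;
	}

	return io_end.unwritten && io_end.handle == NULL;
}
```

Calling `toy_writepages_hits_warn()` with `remount_midway` set on a filesystem that started without the option yields the inconsistent state; without the mid-pass change, both samples agree and the state stays consistent.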
1 parent 0602e5b commit feb2697

2 files changed, +16 -1 lines changed

fs/ext4/ext4.h

+2 -1
@@ -1674,7 +1674,8 @@ struct ext4_sb_info {

 	/*
 	 * Barrier between writepages ops and changing any inode's JOURNAL_DATA
-	 * or EXTENTS flag.
+	 * or EXTENTS flag or between writepages ops and changing DELALLOC or
+	 * DIOREAD_NOLOCK mount options on remount.
 	 */
 	struct percpu_rw_semaphore s_writepages_rwsem;
 	struct dax_device *s_daxdev;

fs/ext4/super.c

+14
@@ -6425,6 +6425,7 @@ static int __ext4_remount(struct fs_context *fc, struct super_block *sb)
 	struct ext4_mount_options old_opts;
 	ext4_group_t g;
 	int err = 0;
+	int alloc_ctx;
 #ifdef CONFIG_QUOTA
 	int enable_quota = 0;
 	int i, j;
@@ -6465,7 +6466,16 @@ static int __ext4_remount(struct fs_context *fc, struct super_block *sb)

 	}

+	/*
+	 * Changing the DIOREAD_NOLOCK or DELALLOC mount options may cause
+	 * two calls to ext4_should_dioread_nolock() to return inconsistent
+	 * values, triggering WARN_ON in ext4_add_complete_io(). we grab
+	 * here s_writepages_rwsem to avoid race between writepages ops and
+	 * remount.
+	 */
+	alloc_ctx = ext4_writepages_down_write(sb);
 	ext4_apply_options(fc, sb);
+	ext4_writepages_up_write(sb, alloc_ctx);

 	if ((old_opts.s_mount_opt & EXT4_MOUNT_JOURNAL_CHECKSUM) ^
 	    test_opt(sb, JOURNAL_CHECKSUM)) {
@@ -6683,6 +6693,8 @@ static int __ext4_remount(struct fs_context *fc, struct super_block *sb)
 	if ((sb->s_flags & SB_RDONLY) && !(old_sb_flags & SB_RDONLY) &&
 	    sb_any_quota_suspended(sb))
 		dquot_resume(sb, -1);
+
+	alloc_ctx = ext4_writepages_down_write(sb);
 	sb->s_flags = old_sb_flags;
 	sbi->s_mount_opt = old_opts.s_mount_opt;
 	sbi->s_mount_opt2 = old_opts.s_mount_opt2;
@@ -6691,6 +6703,8 @@ static int __ext4_remount(struct fs_context *fc, struct super_block *sb)
 	sbi->s_commit_interval = old_opts.s_commit_interval;
 	sbi->s_min_batch_time = old_opts.s_min_batch_time;
 	sbi->s_max_batch_time = old_opts.s_max_batch_time;
+	ext4_writepages_up_write(sb, alloc_ctx);
+
 	if (!test_opt(sb, BLOCK_VALIDITY) && sbi->s_system_blks)
 		ext4_release_system_zone(sb);
 #ifdef CONFIG_QUOTA
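The effect of bracketing the option changes with the write side of the barrier can be sketched with a toy, single-threaded reader/writer model. This is a sketch under loud assumptions: `toy_barrier`, `toy_sb`, and all `toy_*` functions are hypothetical and are not the kernel's percpu rwsem API; a real semaphore would block the remount until writepages finishes, whereas this model simply refuses the write attempt so the behaviour stays deterministic.

```c
#include <stdbool.h>

/* Hypothetical single-threaded model of a reader/writer barrier. */
struct toy_barrier {
	int readers;	/* active writepages passes */
	bool writer;	/* a remount holds the write side */
};

struct toy_sb {
	struct toy_barrier writepages_barrier;
	bool dioread_nolock;	/* models the DIOREAD_NOLOCK mount option */
};

static bool toy_try_down_read(struct toy_barrier *b)
{
	if (b->writer)
		return false;
	b->readers++;
	return true;
}

static void toy_up_read(struct toy_barrier *b)
{
	b->readers--;
}

static bool toy_try_down_write(struct toy_barrier *b)
{
	if (b->writer || b->readers > 0)
		return false;
	b->writer = true;
	return true;
}

static void toy_up_write(struct toy_barrier *b)
{
	b->writer = false;
}

/* Models remount: the option changes only under the write side. */
static bool toy_remount_set_nolock(struct toy_sb *sb, bool value)
{
	if (!toy_try_down_write(&sb->writepages_barrier))
		return false;	/* a real rwsem would wait here instead */
	sb->dioread_nolock = value;
	toy_up_write(&sb->writepages_barrier);
	return true;
}

/*
 * Models a writepages pass that samples the option twice while a remount
 * is attempted in between. Returns true when both samples agree.
 */
static bool toy_writepages_samples_agree(struct toy_sb *sb)
{
	bool first, second;

	if (!toy_try_down_read(&sb->writepages_barrier))
		return false;
	first = sb->dioread_nolock;
	/* Refused by the barrier because a reader is still active. */
	(void)toy_remount_set_nolock(sb, !first);
	second = sb->dioread_nolock;
	toy_up_read(&sb->writepages_barrier);

	return first == second;
}
```

In this model the remount attempt made during the pass is turned away, so both samples match; once the read side is released, the same `toy_remount_set_nolock()` call succeeds. The same bracketing applies to the restore path, which writes the old option values back under the write side.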
