# git rev-parse -q --verify d3110f256d126b44d34c1f662310cd295877c447^{commit}
d3110f256d126b44d34c1f662310cd295877c447
already have revision, skipping fetch
# git checkout -q -f -B kisskb d3110f256d126b44d34c1f662310cd295877c447
# git clean -qxdf
# < git log -1
# commit d3110f256d126b44d34c1f662310cd295877c447
# Merge: d0df9aabefda ee2e3f50629f
# Author: Linus Torvalds
# Date: Wed Mar 10 10:01:35 2021 -0800
#
# Merge tag 'for-linus-2021-03-10' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux
#
# Pull detached mounts fix from Christian Brauner:
# "Creating a series of detached mounts, attaching them to the
# filesystem, and unmounting them can be used to trigger an integer
# overflow in ns->mounts causing the kernel to block any new mounts in
# count_mounts() and returning ENOSPC because it falsely assumes that
# the maximum number of mounts in the mount namespace has been reached,
# i.e. it thinks it can't fit the new mounts into the mount namespace
# anymore.
#
# Without this fix heavy use of the new mount API with move_mount() will
# cause the host to become unuseable and thus blocks some xfstest
# patches I want to resend.
#
# Depending on the number of mounts in your system, this can be
# reproduced on any kernel that supportes open_tree() and move_mount().
#
# A reproducer has been sent for inclusion with xfstests. It takes care
# to do this in another mount namespace, not in the host's mount
# namespace so there shouldn't be any risk in running it but if one did
# run it on the host it would require a reboot in order to be able to
# mount again. See
#
# https://lore.kernel.org/fstests/20210309121041.753359-1-christian.brauner@ubuntu.com
#
# The root cause of this is that detached mounts aren't handled
# correctly when source and target mount are identical and reside on a
# shared mount causing a broken mount tree where the detached source
# itself is propagated which propagation prevents for regular
# bind-mounts and new mounts.
#
# This ultimately leads to a miscalculation of the number of mounts in
# the mount namespace.
#
# Detached mounts created via 'open_tree(fd, path, OPEN_TREE_CLONE)' are
# essentially like an unattached bind-mount. They can then later on be
# attached to the filesystem via move_mount() which calls into
# attach_recursive_mount().
#
# Part of attaching it to the filesystem is making sure that mounts get
# correctly propagated in case the destination mountpoint is MS_SHARED,
# i.e. is a shared mountpoint. This is done by calling into
# propagate_mnt() which walks the list of peers calling propagate_one()
# on each mount in this list making sure it receives the propagation
# event. The propagate_one() function thereby skips both new mounts and
# bind mounts to not propagate them "into themselves". Both are
# identified by checking whether the mount is already attached to any
# mount namespace in mnt->mnt_ns. The is what the IS_MNT_NEW() helper is
# responsible for.
#
# However, detached mounts have an anonymous mount namespace attached to
# them stashed in mnt->mnt_ns which means that IS_MNT_NEW() doesn't
# realize they need to be skipped causing the mount to propagate "into
# itself" breaking the mount table and causing a disconnect between the
# number of mounts recorded as being beneath or reachable from the
# target mountpoint and the number of mounts actually recorded/counted
# in ns->mounts ultimately causing an overflow which in turn prevents
# any new mounts via the ENOSPC issue.
#
# So teach propagation to handle detached mounts by making it aware of
# them. I've been tracking this issue down for the last couple of days
# and then verifying that the fix is correct by unmounting everything in
# my current mount table leaving only /proc and /sys mounted and running
# the reproducer above overnight verifying the number of mounts counted
# in ns->mounts. With this fix the counts are correct and the ENOSPC
# issue can't be reproduced.
#
# This change will only have an effect on mounts created with the new
# mount API since detached mounts cannot be created with the old mount
# API so regressions are extremely unlikely.
#
# Here's an illustration:
#
# #### mount():
# ubuntu@f1-vm:~$ sudo mount --bind /mnt/ /mnt/
# ubuntu@f1-vm:~$ findmnt | grep -i mnt
# ├─/mnt /dev/sda2[/mnt] ext4 rw,relatime
#
# #### open_tree(OPEN_TREE_CLONE) + move_mount() with bug:
# ubuntu@f1-vm:~$ sudo ./mount-new /mnt/ /mnt/
# ubuntu@f1-vm:~$ findmnt | grep -i mnt
# ├─/mnt /dev/sda2[/mnt] ext4 rw,relatime
# │ └─/mnt /dev/sda2[/mnt] ext4 rw,relatime
#
# #### open_tree(OPEN_TREE_CLONE) + move_mount() with the fix:
# ubuntu@f1-vm:~$ sudo ./mount-new /mnt /mnt
# ubuntu@f1-vm:~$ findmnt | grep -i mnt
# └─/mnt /dev/sda2[/mnt] ext4 rw,relatime"
#
# * tag 'for-linus-2021-03-10' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
# mount: fix mounting of detached mounts onto targets that reside on shared mounts
# < /opt/cross/kisskb/br-mipsel-o32-full-2016.08-613-ge98b4dd/bin/mipsel-linux-gcc --version
# < /opt/cross/kisskb/br-mipsel-o32-full-2016.08-613-ge98b4dd/bin/mipsel-linux-ld --version
# < git log --format=%s --max-count=1 d3110f256d126b44d34c1f662310cd295877c447
# < make -s -j 48 ARCH=mips O=/kisskb/build/linus_mips-defconfig_mipsel CROSS_COMPILE=/opt/cross/kisskb/br-mipsel-o32-full-2016.08-613-ge98b4dd/bin/mipsel-linux- defconfig
# < make -s -j 48 ARCH=mips O=/kisskb/build/linus_mips-defconfig_mipsel CROSS_COMPILE=/opt/cross/kisskb/br-mipsel-o32-full-2016.08-613-ge98b4dd/bin/mipsel-linux- help
# make -s -j 48 ARCH=mips O=/kisskb/build/linus_mips-defconfig_mipsel CROSS_COMPILE=/opt/cross/kisskb/br-mipsel-o32-full-2016.08-613-ge98b4dd/bin/mipsel-linux- olddefconfig
# make -s -j 48 ARCH=mips O=/kisskb/build/linus_mips-defconfig_mipsel CROSS_COMPILE=/opt/cross/kisskb/br-mipsel-o32-full-2016.08-613-ge98b4dd/bin/mipsel-linux-
FIT description: Linux 5.12.0-rc2-95256-gd3110f256d1
Created: Thu Mar 11 06:40:53 2021
 Image 0 (kernel@0)
  Description: Linux 5.12.0-rc2-95256-gd3110f256d1
  Created: Thu Mar 11 06:40:53 2021
  Type: Kernel Image
  Compression: gzip compressed
  Data Size: 5252848 Bytes = 5129.73 KiB = 5.01 MiB
  Architecture: MIPS
  OS: Linux
  Load Address: 0x80100000
  Entry Point: 0x809a7f68
  Hash algo: sha1
  Hash value: 22a90bef10666978dc85b315554dd5783c25388d
 Image 1 (fdt@boston)
  Description: img,boston Device Tree
  Created: Thu Mar 11 06:40:53 2021
  Type: Flat Device Tree
  Compression: uncompressed
  Data Size: 3793 Bytes = 3.70 KiB = 0.00 MiB
  Architecture: MIPS
  Hash algo: sha1
  Hash value: 4799f50d688573234da6e9d7701234d394759ef4
 Image 2 (fdt@ni169445)
  Description: NI 169445 device tree
  Created: Thu Mar 11 06:40:53 2021
  Type: Flat Device Tree
  Compression: uncompressed
  Data Size: 1871 Bytes = 1.83 KiB = 0.00 MiB
  Architecture: MIPS
  Hash algo: sha1
  Hash value: 51b89b31605ee62038c8468c429af091dfc75ec7
 Image 3 (fdt@ocelot_pcb123)
  Description: MSCC Ocelot PCB123 Device Tree
  Created: Thu Mar 11 06:40:53 2021
  Type: Flat Device Tree
  Compression: uncompressed
  Data Size: 4659 Bytes = 4.55 KiB = 0.00 MiB
  Architecture: MIPS
  Hash algo: sha1
  Hash value: 5bcb6e4f21e8e5372544aa130b3bd097355a9050
 Image 4 (fdt@ocelot_pcb120)
  Description: MSCC Ocelot PCB120 Device Tree
  Created: Thu Mar 11 06:40:53 2021
  Type: Flat Device Tree
  Compression: uncompressed
  Data Size: 5418 Bytes = 5.29 KiB = 0.01 MiB
  Architecture: MIPS
  Hash algo: sha1
  Hash value: 93d882f2009a217e0fa9dab94788535ed2be8476
 Image 5 (fdt@xilfpga)
  Description: MIPSfpga (xilfpga) Device Tree
  Created: Thu Mar 11 06:40:53 2021
  Type: Flat Device Tree
  Compression: uncompressed
  Data Size: 2708 Bytes = 2.64 KiB = 0.00 MiB
  Architecture: MIPS
  Hash algo: sha1
  Hash value: 63d058b780f65e22da30f0a183433765f1807f1d
 Default Configuration: 'conf@default'
 Configuration 0 (conf@default)
  Description: Generic Linux kernel
  Kernel: kernel@0
 Configuration 1 (conf@boston)
  Description: Boston Linux kernel
  Kernel: kernel@0
  FDT: fdt@boston
 Configuration 2 (conf@ni169445)
  Description: NI 169445 Linux Kernel
  Kernel: kernel@0
  FDT: fdt@ni169445
 Configuration 3 (conf@ocelot_pcb123)
  Description: Ocelot Linux kernel
  Kernel: kernel@0
  FDT: fdt@ocelot_pcb123
 Configuration 4 (conf@ocelot_pcb120)
  Description: Ocelot Linux kernel
  Kernel: kernel@0
  FDT: fdt@ocelot_pcb120
 Configuration 5 (conf@xilfpga)
  Description: MIPSfpga Linux kernel
  Kernel: kernel@0
  FDT: fdt@xilfpga
Completed OK
# rm -rf /kisskb/build/linus_mips-defconfig_mipsel
# Build took: 0:01:48.255880