# git rev-parse -q --verify 11f5acce2fa43b015a8120fa7620fa4efd0a2952^{commit} 11f5acce2fa43b015a8120fa7620fa4efd0a2952 already have revision, skipping fetch # git checkout -q -f -B kisskb 11f5acce2fa43b015a8120fa7620fa4efd0a2952 # git clean -qxdf # < git log -1 # commit 11f5acce2fa43b015a8120fa7620fa4efd0a2952 # Author: Alexey Kardashevskiy # Date: Wed Feb 13 14:38:18 2019 +1100 # # powerpc/powernv/ioda: Fix locked_vm counting for memory used by IOMMU tables # # We store 2 multilevel tables in iommu_table - one for the hardware and # one with the corresponding userspace addresses. Before allocating # the tables, the iommu_table_group_ops::get_table_size() hook returns # the combined size of the two and VFIO SPAPR TCE IOMMU driver adjusts # the locked_vm counter correctly. When the table is actually allocated, # the amount of allocated memory is stored in iommu_table::it_allocated_size # and used to decrement the locked_vm counter when we release the memory # used by the table; .get_table_size() and .create_table() calculate it # independently but the result is expected to be the same. # # However the allocator does not add the userspace table size to # .it_allocated_size so when we destroy the table because of VFIO PCI # unplug (i.e. VFIO container is gone but the userspace keeps running), # we decrement locked_vm by just a half of size of memory we are # releasing. # # To make things worse, since we enabled on-demand allocation of # indirect levels, it_allocated_size contains only the amount of memory # actually allocated at the table creation time which can just be a # fraction. It is not a problem with incrementing locked_vm (as # get_table_size() value is used) but it is with decrementing. # # As the result, we leak locked_vm and may not be able to allocate more # IOMMU tables after few iterations of hotplug/unplug. # # This sets it_allocated_size in the pnv_pci_ioda2_ops::create_table() # hook to what pnv_pci_ioda2_get_table_size() returns so from now on we # have a single place which calculates the maximum memory a table can # occupy. The original meaning of it_allocated_size is somewhat lost now # though. # # We do not ditch it_allocated_size whatsoever here and we do not call # get_table_size() from vfio_iommu_spapr_tce.c when decrementing # locked_vm as we may have multiple IOMMU groups per container and even # though they all are supposed to have the same get_table_size() # implementation, there is a small chance for failure or confusion. # # Fixes: 090bad39b237 ("powerpc/powernv: Add indirect levels to it_userspace") # Signed-off-by: Alexey Kardashevskiy # Reviewed-by: David Gibson # Signed-off-by: Michael Ellerman # < /opt/cross/kisskb/korg/gcc-5.5.0-nolibc/powerpc64-linux/bin/powerpc64-linux-gcc --version # < /opt/cross/kisskb/korg/gcc-5.5.0-nolibc/powerpc64-linux/bin/powerpc64-linux-ld --version # < git log --format=%s --max-count=1 11f5acce2fa43b015a8120fa7620fa4efd0a2952 # < make -s -j 8 ARCH=powerpc O=/kisskb/build/powerpc-next_83xx_mpc8315_rdb_defconfig_powerpc-gcc5 CROSS_COMPILE=/opt/cross/kisskb/korg/gcc-5.5.0-nolibc/powerpc64-linux/bin/powerpc64-linux- 83xx/mpc8315_rdb_defconfig # make -s -j 8 ARCH=powerpc O=/kisskb/build/powerpc-next_83xx_mpc8315_rdb_defconfig_powerpc-gcc5 CROSS_COMPILE=/opt/cross/kisskb/korg/gcc-5.5.0-nolibc/powerpc64-linux/bin/powerpc64-linux- INFO: Uncompressed kernel (size 0x5de1f0) overlaps the address of the wrapper(0x400000) INFO: Fixing the link_address of wrapper to (0x600000) Image Name: Linux-5.0.0-rc2-g11f5acce2fa4 Created: Thu Feb 28 16:36:40 2019 Image Type: PowerPC Linux Kernel Image (gzip compressed) Data Size: 3002676 Bytes = 2932.30 KiB = 2.86 MiB Load Address: 00000000 Entry Point: 00000000 Completed OK # rm -rf /kisskb/build/powerpc-next_83xx_mpc8315_rdb_defconfig_powerpc-gcc5 # Build took: 0:01:03.429592