numa: fix /proc/<pid>/numa_maps for THP

commit 28093f9f34cedeaea0f481c58446d9dac6dd620f upstream.

In gather_pte_stats() a THP pmd is cast into a pte, which is wrong
because the layouts may differ depending on the architecture.  On s390
this will lead to inaccurate numa_maps accounting in /proc because of
misguided pte_present() and pte_dirty() checks on the fake pte.
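
The mismatch can be pictured with a small userspace model.  This is only
a sketch: the toy_* names and the bit positions below are invented for
illustration and are not the s390 (or any real) page table layout.  It
shows pte accessors giving wrong answers when applied to a pmd value
that was simply reinterpreted as a pte.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical entry types; real kernels define pte_t/pmd_t per arch. */
typedef struct { uint64_t val; } toy_pte_t;
typedef struct { uint64_t val; } toy_pmd_t;

/* Invented bit positions: the two levels deliberately disagree. */
#define TOY_PTE_PRESENT (UINT64_C(1) << 0)
#define TOY_PTE_DIRTY   (UINT64_C(1) << 1)
#define TOY_PMD_PRESENT (UINT64_C(1) << 4)
#define TOY_PMD_DIRTY   (UINT64_C(1) << 5)

static inline bool toy_pte_present(toy_pte_t pte)
{
	return (pte.val & TOY_PTE_PRESENT) != 0;
}

static inline bool toy_pte_dirty(toy_pte_t pte)
{
	return (pte.val & TOY_PTE_DIRTY) != 0;
}

static inline bool toy_pmd_present(toy_pmd_t pmd)
{
	return (pmd.val & TOY_PMD_PRESENT) != 0;
}

static inline bool toy_pmd_dirty(toy_pmd_t pmd)
{
	return (pmd.val & TOY_PMD_DIRTY) != 0;
}

/* The problematic pattern: copy the raw pmd bits into a "fake" pte. */
static inline toy_pte_t toy_fake_pte(toy_pmd_t pmd)
{
	toy_pte_t pte = { pmd.val };
	return pte;
}
```

In this model a pmd that is both present and dirty looks neither present
nor dirty once it is viewed through the pte accessors, which is the kind
of miscounting described above.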

On other architectures pte_present() and pte_dirty() may work by chance,
but there may be an issue with direct-access (dax) mappings without
underlying struct pages when HAVE_PTE_SPECIAL is set and THP is
available.  In vm_normal_page() the fake pte will be checked with
pte_special(), and because there is no "special" bit in a pmd, this will
always return false and the VM_PFNMAP | VM_MIXEDMAP checking will be
skipped.  On dax mappings without struct pages, an invalid struct page
pointer would then be returned, which can crash the kernel.

This patch fixes the numa_maps THP handling by introducing new "_pmd"
variants of the can_gather_numa_stats() and vm_normal_page() functions.
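
The difference between the two lookups can be sketched in a userspace
model.  This is a deliberately simplified illustration, not the kernel
code: the toy_* names, the flag values and the "first four pfns have a
struct page" rule are all assumptions made for the example, and the real
vm_normal_page() has more cases than are modelled here.

```c
#include <stdbool.h>

/* Invented stand-ins for the VM_PFNMAP / VM_MIXEDMAP vma flags. */
#define TOY_VM_PFNMAP      0x1u
#define TOY_VM_MIXEDMAP    0x2u
/* Assumption: only pfns below this value have a backing struct page. */
#define TOY_NR_BACKED_PFNS 4ul

enum toy_lookup { TOY_NO_PAGE, TOY_VALID_PAGE, TOY_INVALID_PAGE };

static inline bool toy_pfn_has_page(unsigned long pfn)
{
	return pfn < TOY_NR_BACKED_PFNS;
}

/*
 * Lookup through a fake pte: a pmd has no "special" bit, so the special
 * check never fires and the vma flag check behind it is never reached.
 */
static inline enum toy_lookup toy_lookup_fake_pte(unsigned int vm_flags,
						  unsigned long pfn)
{
	const bool special = false;	/* always false for a pmd */

	if (special && (vm_flags & (TOY_VM_PFNMAP | TOY_VM_MIXEDMAP)))
		return TOY_NO_PAGE;
	return toy_pfn_has_page(pfn) ? TOY_VALID_PAGE : TOY_INVALID_PAGE;
}

/*
 * pmd-aware lookup: consult the vma flags directly instead of relying
 * on a bit that does not exist at this level.
 */
static inline enum toy_lookup toy_lookup_pmd(unsigned int vm_flags,
					     unsigned long pfn)
{
	if ((vm_flags & (TOY_VM_PFNMAP | TOY_VM_MIXEDMAP)) &&
	    !toy_pfn_has_page(pfn))
		return TOY_NO_PAGE;
	return toy_pfn_has_page(pfn) ? TOY_VALID_PAGE : TOY_INVALID_PAGE;
}
```

For a mixed-map style mapping whose pfn has no struct page, the fake-pte
path in this model hands back an invalid page, while the pmd-aware path
reports that there is no page to account.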

Signed-off-by: Gerald Schaefer <>
Cc: Naoya Horiguchi <>
Cc: "Kirill A . Shutemov" <>
Cc: Konstantin Khlebnikov <>
Cc: Michal Hocko <>
Cc: Vlastimil Babka <>
Cc: Jerome Marchand <>
Cc: Johannes Weiner <>
Cc: Dave Hansen <>
Cc: Mel Gorman <>
Cc: Dan Williams <>
Cc: Martin Schwidefsky <>
Cc: Heiko Carstens <>
Cc: Michael Holzheu <>
Signed-off-by: Andrew Morton <>
Signed-off-by: Linus Torvalds <>
Signed-off-by: Greg Kroah-Hartman <>
Signed-off-by: Lee Jones <>
Change-Id: I9e71374115af2a3ff116ba5a31d06f843aceb955
3 files changed