
Commit ea63763

jiez-ad authored and torvalds committed
nommu: fix malloc performance by adding uninitialized flag
The NOMMU code currently clears all anonymous mmapped memory. While this is
what we want in the default case, all memory allocation from userspace under
NOMMU has to go through this interface, including malloc() which is allowed to
return uninitialized memory. This can easily be a significant performance
penalty. So for constrained embedded systems where security is irrelevant,
allow people to avoid clearing memory unnecessarily.

This also alters the ELF-FDPIC binfmt such that it obtains uninitialised
memory for the brk and stack region.

Signed-off-by: Jie Zhang <jie.zhang@analog.com>
Signed-off-by: Robin Getz <rgetz@blackfin.uclinux.org>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Acked-by: Greg Ungerer <gerg@snapgear.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
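
To illustrate the clearing cost described above, here is a rough userspace benchmark sketch, not part of this commit: it times anonymous mmap()/munmap() pairs with and without MAP_UNINITIALIZED. The fallback define, the mapping size and the iteration count are illustrative assumptions; a measurable gap is only expected on a no-MMU kernel built with CONFIG_MMAP_ALLOW_UNINITIALIZED, since elsewhere the flag is a no-op.

/*
 * Rough benchmark sketch - not part of this commit.  Times anonymous
 * mmap()/munmap() pairs with and without MAP_UNINITIALIZED.
 */
#include <stdio.h>
#include <time.h>
#include <sys/mman.h>

#ifndef MAP_UNINITIALIZED
# define MAP_UNINITIALIZED 0	/* libc headers may predate the flag */
#endif

static double time_maps(int extra_flags, size_t len, int iterations)
{
	struct timespec start, end;
	int i;

	clock_gettime(CLOCK_MONOTONIC, &start);
	for (i = 0; i < iterations; i++) {
		void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
			       MAP_PRIVATE | MAP_ANONYMOUS | extra_flags,
			       -1, 0);
		if (p == MAP_FAILED) {
			perror("mmap");
			break;
		}
		munmap(p, len);
	}
	clock_gettime(CLOCK_MONOTONIC, &end);

	return (end.tv_sec - start.tv_sec) +
	       (end.tv_nsec - start.tv_nsec) / 1e9;
}

int main(void)
{
	size_t len = 256 * 1024;	/* keep requests modest on no-MMU targets */

	printf("cleared:       %.3f s\n", time_maps(0, len, 1000));
	printf("uninitialized: %.3f s\n", time_maps(MAP_UNINITIALIZED, len, 1000));
	return 0;
}
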
1 parent 5dc3764 commit ea63763

File tree: 5 files changed (+60, -4 lines)


Documentation/nommu-mmap.txt (+26)

@@ -119,6 +119,32 @@ FURTHER NOTES ON NO-MMU MMAP
     granule but will only discard the excess if appropriately configured as
     this has an effect on fragmentation.
 
+ (*) The memory allocated by a request for an anonymous mapping will normally
+     be cleared by the kernel before being returned in accordance with the
+     Linux man pages (ver 2.22 or later).
+
+     In the MMU case this can be achieved with reasonable performance as
+     regions are backed by virtual pages, with the contents only being mapped
+     to cleared physical pages when a write happens on that specific page
+     (prior to which, the pages are effectively mapped to the global zero page
+     from which reads can take place).  This spreads out the time it takes to
+     initialize the contents of a page - depending on the write-usage of the
+     mapping.
+
+     In the no-MMU case, however, anonymous mappings are backed by physical
+     pages, and the entire map is cleared at allocation time.  This can cause
+     significant delays during a userspace malloc() as the C library does an
+     anonymous mapping and the kernel then does a memset for the entire map.
+
+     However, for memory that isn't required to be precleared - such as that
+     returned by malloc() - mmap() can take a MAP_UNINITIALIZED flag to
+     indicate to the kernel that it shouldn't bother clearing the memory before
+     returning it.  Note that CONFIG_MMAP_ALLOW_UNINITIALIZED must be enabled
+     to permit this, otherwise the flag will be ignored.
+
+     uClibc uses this to speed up malloc(), and the ELF-FDPIC binfmt uses this
+     to allocate the brk and stack region.
+
  (*) A list of all the private copy and anonymous mappings on the system is
     visible through /proc/maps in no-MMU mode.
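
The flag documented above is straightforward to use from userspace. Below is a minimal sketch, not taken from uClibc or the kernel tree: it requests an anonymous mapping with MAP_UNINITIALIZED, falling back to a local define of the value from asm-generic/mman-common.h if the libc headers predate the flag. On kernels without CONFIG_MMAP_ALLOW_UNINITIALIZED the flag is ignored (or defined as 0) and the mapping is cleared as usual.

/* Minimal usage sketch (assumption: not from uClibc or the kernel tree). */
#include <stdio.h>
#include <sys/mman.h>

#ifndef MAP_UNINITIALIZED
# define MAP_UNINITIALIZED 0x4000000	/* value used by asm-generic/mman-common.h */
#endif

int main(void)
{
	size_t len = 64 * 1024;
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS | MAP_UNINITIALIZED, -1, 0);

	if (p == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	/* On a no-MMU kernel with the option enabled, the contents may be
	 * stale data rather than zeroes - treat them as undefined. */
	munmap(p, len);
	return 0;
}
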

fs/binfmt_elf_fdpic.c (+2, -1)

@@ -380,7 +380,8 @@ static int load_elf_fdpic_binary(struct linux_binprm *bprm,
 	down_write(&current->mm->mmap_sem);
 	current->mm->start_brk = do_mmap(NULL, 0, stack_size,
 					 PROT_READ | PROT_WRITE | PROT_EXEC,
-					 MAP_PRIVATE | MAP_ANONYMOUS | MAP_GROWSDOWN,
+					 MAP_PRIVATE | MAP_ANONYMOUS |
+					 MAP_UNINITIALIZED | MAP_GROWSDOWN,
 					 0);
 
 	if (IS_ERR_VALUE(current->mm->start_brk)) {

include/asm-generic/mman-common.h (+5)

@@ -19,6 +19,11 @@
 #define MAP_TYPE	0x0f		/* Mask for type of mapping */
 #define MAP_FIXED	0x10		/* Interpret addr exactly */
 #define MAP_ANONYMOUS	0x20		/* don't use a file */
+#ifdef CONFIG_MMAP_ALLOW_UNINITIALIZED
+# define MAP_UNINITIALIZED 0x4000000	/* For anonymous mmap, memory could be uninitialized */
+#else
+# define MAP_UNINITIALIZED 0x0		/* Don't support this flag */
+#endif
 
 #define MS_ASYNC	1		/* sync memory asynchronously */
 #define MS_INVALIDATE	2		/* invalidate the caches */

init/Kconfig (+22)

@@ -1079,6 +1079,28 @@ config SLOB
 
 endchoice
 
+config MMAP_ALLOW_UNINITIALIZED
+	bool "Allow mmapped anonymous memory to be uninitialized"
+	depends on EMBEDDED && !MMU
+	default n
+	help
+	  Normally, and according to the Linux spec, anonymous memory obtained
+	  from mmap() has its contents cleared before it is passed to
+	  userspace.  Enabling this config option allows you to request that
+	  mmap() skip that if it is given a MAP_UNINITIALIZED flag, thus
+	  providing a huge performance boost.  If this option is not enabled,
+	  then the flag will be ignored.
+
+	  This is taken advantage of by uClibc's malloc(), and also by
+	  ELF-FDPIC binfmt's brk and stack allocator.
+
+	  Because of the obvious security issues, this option should only be
+	  enabled on embedded devices where you control what is run in
+	  userspace.  Since that isn't generally a problem on no-MMU systems,
+	  it is normally safe to say Y here.
+
+	  See Documentation/nommu-mmap.txt for more information.
+
 config PROFILING
 	bool "Profiling support (EXPERIMENTAL)"
 	help

mm/nommu.c (+5, -3)

@@ -1143,9 +1143,6 @@ static int do_mmap_private(struct vm_area_struct *vma,
 		if (ret < rlen)
 			memset(base + ret, 0, rlen - ret);
 
-	} else {
-		/* if it's an anonymous mapping, then just clear it */
-		memset(base, 0, rlen);
 	}
 
 	return 0;
@@ -1343,6 +1340,11 @@ unsigned long do_mmap_pgoff(struct file *file,
 		goto error_just_free;
 	add_nommu_region(region);
 
+	/* clear anonymous mappings that don't ask for uninitialized data */
+	if (!vma->vm_file && !(flags & MAP_UNINITIALIZED))
+		memset((void *)region->vm_start, 0,
+		       region->vm_end - region->vm_start);
+
 	/* okay... we have a mapping; now we have to register it */
 	result = vma->vm_start;
