The code you write is itself a form of memory use. Every class, method, and string constant in your application must be loaded into RAM when it is executed. The larger your application's code base, the more memory it will consume just to exist.
Android loads your app's executable code (the DEX files inside your .apk, plus compiled .oat/.odex files and native .so libraries) using mmap. This means the code is file-backed: the pages in RAM are copies of data that also exists in a file on storage.
Crucially, Android uses demand paging. When your app starts, the kernel does not load the entire APK into RAM immediately. Instead, it only maps the file into the process's virtual address space. As your app executes and the CPU jumps to a function whose page is not yet resident, it triggers a “page fault.” The kernel pauses the thread, reads that specific page of code (typically 4KB) from storage into physical RAM, and resumes execution.
This means that code you package but never execute does not use physical memory for the code pages themselves. However, unused libraries still increase the overall APK size and can significantly increase the memory used by the system's internal metadata (like DEX indices and class descriptors), which must be read to even know that the code exists. Furthermore, many libraries contain static initializers or are touched by dependency injection frameworks during app startup, causing them to be paged into RAM anyway.
Because file-backed memory can always be re-read from storage, the kernel considers these pages “clean”. When the system experiences memory pressure, the kernel will evict (drop) these clean code pages from RAM to make room for other things.
If your app later needs to execute that code again, the CPU will fault, and the kernel must re-read the page from storage. The more code your app has, the more vulnerable it is to having its code evicted. When a user returns to your bloated app after using other apps, they will experience random jank and slowdowns as the CPU constantly stalls waiting for code to be paged back in from storage.
The Cost of a Page Fault: The cost varies heavily with the device's storage speed (UFS vs. eMMC) and kernel state, but a major page fault (reading a 4KB page from storage) can take anywhere from 0.5ms to 5ms. If your startup path touches 500 different pages of unoptimized code and each one has to be faulted in individually, that adds between 250ms and 2.5 seconds of pure I/O latency to your app startup time.
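The arithmetic behind that estimate can be sketched as follows. This is a deliberately simplified model: the class and method names are made up for illustration, and it assumes every touched page costs one serialized major fault, ignoring any kernel readahead that may batch neighboring pages.

```java
/** Back-of-the-envelope estimate of startup stall time caused by major page faults. */
final class PageFaultCost {
    /** Total stall in milliseconds, assuming one serialized major fault per page. */
    static double totalStallMillis(int pagesTouched, double millisPerFault) {
        return pagesTouched * millisPerFault;
    }

    public static void main(String[] args) {
        int pagesTouched = 500;    // pages on the startup path (figure from the text)
        double fastFaultMs = 0.5;  // lower bound per fault (figure from the text)
        double slowFaultMs = 5.0;  // upper bound per fault (figure from the text)

        System.out.println("fast storage: " + totalStallMillis(pagesTouched, fastFaultMs) + " ms");
        System.out.println("slow storage: " + totalStallMillis(pagesTouched, slowFaultMs) + " ms");
    }
}
```

Treat the result as a scale indicator rather than a prediction; real traces are the only reliable source for actual stall times.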
To build an intuition for how your Java or Kotlin code translates into native machine code (and thus memory bytes), you can use Compiler Explorer (also known as Godbolt).
Android support is built directly into Compiler Explorer. It lets you see how different parts of the Android toolchain (D8, R8, and dex2oat) transform your source code.
* d8: Shows the Dalvik bytecode (.dex). This is the closest representation to your original code and is easier to read.
* r8: Shows how the R8 optimizer shrinks and optimizes your bytecode.
* dex2oat: Shows the final ARM64 machine code that actually executes on the device. This is where you can see the real memory impact (4 bytes per instruction). dex2oat can target different ISAs, but ARM64 is the most common for mobile phones.

Every instruction you see in the dex2oat output targeting the ARM64 ISA takes up 4 bytes in your app's executable (.odex or .oat) file.
Try entering code that utilizes different language features and study the compiler’s output:
* A simple indexed loop over an int[] might compile to ~10 instructions (~40 bytes).
* A foreach loop over a List implicitly uses an Iterator. This can result in 30-40 instructions (~120-160 bytes) because of the extra method calls (hasNext(), next()) and the allocation of the iterator object itself.
* In some cases (for example, when the List is proven to be an ArrayList), the R8 optimizer can transform a foreach loop back into a simple indexed loop, eliminating the iterator overhead and reducing both code size and runtime memory churn.
* A virtual method call involves loading the object's class, looking up the method in the vtable, and then branching. This usually takes 4-5 instructions (~16-20 bytes).
* A direct (non-virtual) call can be a single bl (Branch with Link) instruction (4 bytes).

By using Compiler Explorer, you can see how sophisticated language features (like Kotlin lambdas, stream APIs, or heavy use of generics) impact the final compiled size of your application, and how optimizers such as R8 can counteract the cost of language abstractions in some cases. This tool can help you make informed tradeoffs in designing and implementing an app.
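As a starting point for that experiment, the snippet below puts the constructs discussed above side by side. It is a minimal sketch: all class and method names are invented for illustration, and the instruction counts you observe will depend on the compiler version and flags you select.

```java
import java.util.List;

final class LoopShapes {
    // Indexed loop over a primitive array: no allocation and no method calls.
    static int sumArray(int[] values) {
        int total = 0;
        for (int i = 0; i < values.length; i++) {
            total += values[i];
        }
        return total;
    }

    // Foreach over a List: desugars to iterator(), hasNext(), and next(), plus unboxing.
    static int sumList(List<Integer> values) {
        int total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    // Overridable method: the call target depends on the runtime class of the receiver.
    static class Shape {
        int area() { return 0; }
    }

    static final class Square extends Shape {
        private final int side;
        Square(int side) { this.side = side; }
        @Override int area() { return side * side; }
    }

    // Virtual call through the base type.
    static int areaVirtual(Shape shape) {
        return shape.area();
    }

    // Static call: the target is known at compile time.
    static int square(int side) {
        return side * side;
    }

    static int areaStatic(int side) {
        return square(side);
    }
}
```

Compare the dex2oat output of sumArray against sumList, and areaVirtual against areaStatic, then switch to the r8 view to see which abstractions the optimizer manages to remove.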
Broadly speaking, more complexity in your app's code leads to higher memory use. Conversely, simpler code (or code that R8 has simplified) results in a smaller representation, both as CPU instructions and as bytes in storage and RAM.
meminfo and showmap
You can use the standard Android memory tools to see how much memory your app's code is consuming.
dumpsys meminfo
When you run adb shell dumpsys meminfo <package>, the Code category in the App Summary section provides a high-level view of code-related memory:

     App Summary
                           Pss(KB)
                            ------
               Java Heap:     3244
             Native Heap:     5412
                    Code:    24512   # <--- Sum of .so, .dex, .oat, .art, etc.

showmap
For a more granular view, use showmap. It shows which regions of specific files are mapped into memory:
adb shell showmap $(pidof <package>) | grep -E "\.oat|\.odex|\.dex|\.apk"
You will see entries for your application's compiled code:

        size      RSS      PSS    clean    dirty    clean    dirty     swap  swapPSS object
    -------- -------- -------- -------- -------- -------- -------- -------- -------- ----------------
       12288     8192     8192     8192        0        0        0        0        0 /data/app/.../base.odex

Because every executed method takes up memory, having a “bloated” app with unnecessary initializations or unused libraries can severely impact startup performance and baseline memory usage.
This is why tools like R8 (the successor to ProGuard) are critical. R8 analyzes your application's bytecode and removes any classes or methods that are never called (“dead code stripping”).
To demonstrate the impact of code size, we have created two versions of an application in the samples/CodeBloat/ directory.
The build script generate_code.sh artificially creates 300 Java classes, each with 500 methods containing unique strings.
The MainActivity in both apps attempts to touch all 300 classes on a background thread during startup.

    m CodeBloat CodeBloatOptimized
    adb install -r $OUT/system/app/CodeBloat/CodeBloat.apk
    adb install -r $OUT/system/app/CodeBloatOptimized/CodeBloatOptimized.apk

To maximize the file-backed memory impact, we will use the cmd package compile tool to ahead-of-time (AOT) compile the apps into .oat files.

    adb shell cmd package compile -m speed -f com.android.codebloat
    adb shell cmd package compile -m speed -f com.android.codebloat.optimized

Please note that this is a synthetic example. Typically, apps will use the speed-profile compilation mode (see further below).
Launch the unoptimized app and check its memory footprint:

    adb shell am start -W -n com.android.codebloat/.MainActivity
    sleep 5 # Wait for the background thread to load classes
    adb shell dumpsys meminfo -s com.android.codebloat

Now do the same for the optimized app:

    adb shell am start -W -n com.android.codebloat.optimized/com.android.codebloat.MainActivity
    sleep 5
    adb shell dumpsys meminfo -s com.android.codebloat.optimized

The Results:
If you look at the Code row in the App Summary section, you will see a massive difference:
* Unoptimized (CodeBloat): Code: ~30,000 KB (30 MB)
* Optimized (CodeBloatOptimized): Code: ~2,000 KB (2 MB)

Because R8 determined that the 500 methods inside those classes were never actually doing anything useful (the doSomething() method only calls method0(), and the results are ignored), it stripped almost all of the artificially generated code out of the final APK.
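To make it concrete why R8 can discard this code, here is a hypothetical stand-in for one generated class. It is not the actual output of generate_code.sh, which may look different; only the names doSomething() and method0() come from the description above, and the string contents are invented.

```java
// Hypothetical stand-in for one generated class; the real generator output may differ.
final class GeneratedClass0 {
    static String method0() {
        return "GeneratedClass0.method0 unique filler string";
    }

    static String method1() {
        return "GeneratedClass0.method1 unique filler string";
    }

    // ... the description above says each class carries 500 such methods ...

    static void doSomething() {
        // The return value is discarded, so this call has no observable effect.
        method0();
    }
}
```

In a shape like this, nothing reads the returned strings and most methods are never called at all, so a whole-program optimizer has no reason to keep them.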
The impact of this code bloat is extremely visible during application startup.
Unoptimized App (mem.rss.file climbs massively):
In the unoptimized app, the mem.rss.file track (representing file-backed memory) increases dramatically during the application startup phase. As the app touches the bloated, artificially generated classes, the operating system is forced to page in large amounts of code from the compiled .oat file on storage. You can visually see this impact in the thread state track for the main thread: the high frequency of yellow slices indicates the thread is frequently blocked and stalling on file I/O while waiting for these code pages to be read. The bottom panel shows a massive delta value, adding up to over 141MB of file-backed memory paged into RAM. This heavy I/O causes the startup to take over 1.2 seconds, resulting in a noticeably sluggish user experience.
Note: the trace screenshots demonstrate a memory trend, but actual magnitudes will vary by device characteristics.
Optimized App (mem.rss.file peaks at a lower value):
In the optimized app, R8 has stripped the dead code out of the APK during the build process, leaving fewer executable pages to read from storage. The mem.rss.file track climbs less (a delta of ~114 MB versus ~141 MB), and the total startup time drops from about 1.25 seconds to roughly 743 ms. Fewer I/O stalls also leave more free memory for the rest of the system.
Startup Comparison:
| Metric | Unoptimized (CodeBloat) | Optimized (CodeBloatOptimized) |
|---|---|---|
| Startup Time | ~1.25 seconds | ~743 ms |
| mem.rss.file Delta | ~141 MB | ~114 MB |
You can run a query to track the maximum amount of file-backed memory that any codebloat application touched during its execution:

    SELECT
      p.name AS process_name,
      max(c.value) / 1024.0 / 1024.0 AS max_rss_file_mb
    FROM counter c
    JOIN process_counter_track t ON c.track_id = t.id
    JOIN process p USING (upid)
    WHERE p.name LIKE 'com.android.codebloat%'
      AND t.name = 'mem.rss.file'
    GROUP BY p.name;

The Android Runtime (ART) can compile your application code in one of several different modes, also known as compiler filters. The selected compiler filter has a direct impact on your app's memory footprint.
* verify: ART only performs bytecode verification. No AOT compilation is performed. Code is executed via the Interpreter or compiled at runtime by the JIT compiler. Compiled code lives in the JIT Cache (anonymous dirty memory).
* speed: ART performs full AOT compilation of all methods. This produces the largest .odex size and maximizes file-backed (clean) memory usage.
* speed-profile: ART only compiles methods that have been marked as “hot” in a startup profile.

The most common filter is speed-profile, which is used when installing user apps. This is configured in the system properties pm.dexopt.install and pm.dexopt.bg-dexopt, and is typically set in build/make/target/product/runtime_libart.mk.
Some system apps will use speed compilation, and will also compile at system image build time. verify is typically only used in development use cases.
| Use case | Typical compiler filter |
|---|---|
| Development | verify |
| System image | speed |
| User apps | speed-profile |
We can use the CodeBloat app to see how these filters affect memory. To reproduce these measurements, compile the app with the chosen filter, force-stop and relaunch it, wait a few seconds for startup to finish, and then run adb shell dumpsys meminfo com.android.codebloat.
Note: These measurements were taken on a Pixel 10 Pro Fold. Actual numbers will vary by device; these are for scale.
Mode: verify (No AOT)

    adb shell cmd package compile -m verify -f com.android.codebloat
    adb shell am force-stop com.android.codebloat
    adb shell am start -W -n com.android.codebloat/.MainActivity
    sleep 5
    adb shell dumpsys meminfo com.android.codebloat

In verify mode, the App Summary shows:
* Code PSS: ~8,000 KB
* Dalvik Other (JIT): ~25,000 KB

Because no code is compiled AOT, the runtime must JIT-compile hot methods into the JIT Cache, which shows up as dirty anonymous memory (Dalvik Other).
Mode: speed (Full AOT)

    adb shell cmd package compile -m speed -f com.android.codebloat
    adb shell am force-stop com.android.codebloat
    adb shell am start -W -n com.android.codebloat/.MainActivity
    sleep 5
    adb shell dumpsys meminfo com.android.codebloat

In speed mode, the results shift dramatically:
* Code PSS: ~24,000 KB
* Dalvik Other (JIT): ~5,000 KB

The application's code is now mapped from the .odex file as clean file-backed memory. This reduces pressure on the JIT cache and makes the memory eligible for eviction under pressure, rather than being “stuck” as dirty RAM.
Mode: speed-profile (Selective AOT)
Modern apps may bundle a baseline.prof startup profile. ART uses this to selectively compile only the code needed for a fast, memory-efficient startup.
In this exercise we will create a baseline profile that lists the app's startup classes. In practice, the compiler may also receive profiles from external sources, such as the application store (“cloud profiles”). These provide crowdsourced startup profiles for apps regardless of whether the developer bundled a baseline profile of their own.
To see the impact of speed-profile, you can generate your own profile on-device:
Reset and Start:
adb shell am force-stop com.android.codebloat
Interact: Start the app and let it run its startup sequence.
Dump Profile:
adb shell kill -s SIGUSR1 $(pidof com.android.codebloat)
(This forces the app to write its current profile to disk).
Install Profile:

    adb shell cp /data/misc/profiles/cur/0/com.android.codebloat/primary.prof \
        /data/misc/profiles/ref/com.android.codebloat/primary.prof

Compile:
adb shell cmd package compile -m speed-profile -f com.android.codebloat
When you launch again, you'll see a balance: Code PSS will be lower than speed (e.g., ~16,000 KB) because only the “hot” startup methods were compiled, leaving the rest to be handled by the interpreter or JIT only if they are ever actually used.
If you want to see exactly what instructions ART is generating, refer to the Disassembly Guide.
It provides detailed instructions on using:
* oatdump: To see ARM64 instructions inside an existing .odex file.
* dex2oat: To simulate compilation with verbose debug flags.

One reason compiled code can grow unexpectedly is method inlining. The compiler may decide to copy the body of a small, frequently called method directly into its callers.
In our CodeBloat app, the doSomething() method in every generated class simply calls method0(). When compiled in speed mode, ART's Optimizing compiler will likely inline method0() into doSomething().
Exercise: Verify this using oatdump on your device:

    # 1. Find the path to the application's APK and compiled .odex file
    adb shell pm path com.android.codebloat
    # Output: package:/data/app/~~.../base.apk
    adb shell "dumpsys package com.android.codebloat | grep 'location is' | head -n 1"
    # Example output: [location is /data/app/~~.../oat/arm64/base.odex]

    # 2. Run oatdump (substituting the correct path to base.odex)
    adb shell oatdump --oat-file=/data/app/~~.../oat/arm64/base.odex \
        --class-filter=com.android.codebloat.GeneratedClass0

Look for the doSomething method in the output. If it was inlined, you will see the instructions to load the long string constant directly within doSomething, rather than a bl instruction targeting method0.
To see exactly when the compiler decided to inline the method, you can produce a Control Flow Graph (CFG). This shows the state of the code at every stage of the optimization pipeline, with every transformation over the compiler's Intermediate Representation (IR) until the code is lowered to the target ISA (e.g. ARM64).
Run dex2oat with dump flags: Use the --verbose-methods flag to limit the output to specific methods; otherwise, the .cfg file for a large app can grow to several gigabytes.

    # Substitution of actual paths required:
    adb shell dex2oat64 --dex-file=/data/app/~~.../base.apk \
        --oat-file=/data/local/tmp/dump.odex \
        --compiler-filter=speed \
        --dump-cfg=/data/local/tmp/codebloat.cfg \
        --verbose-methods=doSomething

Pull and View: Pull the .cfg file to your workstation and open it with IR Hydra.
Find the Inliner: In IR Hydra, load the compilation artifacts and search for doSomething. Compare the representation before and after the Inliner pass. You will see the graph expand as the instructions from method0 are merged into the caller.
Alternatively, use the Opt Pipeline tool in Compiler Explorer (as described in the section above) and enter similar code to see a similar transformation performed at the Inliner pass.
In the MemoryLab app, the mGarbageSink field is marked as volatile. This ensures that the compiler doesn't optimize away our garbage allocations.
public volatile byte[] mGarbageSink;
In the ARM64 disassembly, you will see that every store to this field is accompanied by a Memory Barrier (dmb ish) or use of Load-Acquire/Store-Release instructions (ldar/stlr). This ensures thread visibility but adds a few extra instructions to every access, increasing the code size slightly compared to a regular field.
Exercise: Find the field accesses and the associated memory barriers in the disassembly.
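To compare the two kinds of field access side by side, a minimal sketch along these lines can be pasted into Compiler Explorer or compiled into a test app. The class name, the mPlainSink field, and both method names are hypothetical; only mGarbageSink comes from the text above.

```java
final class VolatileSink {
    // Mirrors the MemoryLab field named in the text: stores need ordering guarantees.
    volatile byte[] mGarbageSink;

    // Hypothetical non-volatile counterpart, added only for comparison.
    byte[] mPlainSink;

    // Store to the volatile field; look for a barrier or store-release in the output.
    void storeVolatile(int size) {
        mGarbageSink = new byte[size];
    }

    // Store to the plain field; expect an ordinary store instruction.
    void storePlain(int size) {
        mPlainSink = new byte[size];
    }
}
```

The two methods differ only in the field they write, so any difference in the generated instructions can be attributed to the volatile qualifier.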
If you disassemble a loop, like the one in generateAllocationChurn, you will notice a curious instruction at the end of the loop body:
ldr x21, [x21]
This is an Implicit Suspend Check. ART uses this to allow the Garbage Collector to pause threads safely. Register x21 normally holds the address of a memory location that contains its own address, so the load leaves x21 unchanged. When the GC needs to suspend the thread, it “poisons” that memory location. When the thread next runs through the check, the ldr triggers a fault, which the runtime catches and uses to transition the thread into a suspended state.
This pattern is repeated in every loop and at the start of every method, contributing to the total code size of your application.
Exercise: Find all the implicit suspend checks in the method disassembly, and try to correlate them to the original source code.
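If you do not have the MemoryLab source at hand, a small method like the following gives you something to disassemble. It is a hypothetical stand-in, not the real generateAllocationChurn. The comments mark where the text above says suspend checks appear; the compiler may omit or move some of them, so treat the markers as places to look rather than guarantees.

```java
final class SuspendCheckDemo {
    // Hypothetical allocation loop; check the method entry for a suspend check.
    static long churn(int iterations) {
        long totalBytes = 0;
        for (int i = 0; i < iterations; i++) {
            byte[] garbage = new byte[64]; // short-lived allocation
            totalBytes += garbage.length;
            // Check the end of the loop body (the back edge) for a suspend check.
        }
        return totalBytes;
    }
}
```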
Next: Threads and Memory