Big nfs_inode_cache

The story

Boxes with various kernel versions have weird free memory problems. After examining the memory usage, it seems that the memory used by processes doesn’t add up to the memory that is actually in use.

Taking a look at /proc/meminfo we see something like this:

MemTotal:      8161544 kB
MemFree:        115676 kB
Buffers:          3900 kB
Cached:         200520 kB
SwapCached:      42336 kB
Active:         546824 kB
Inactive:       138336 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:      8161544 kB
LowFree:        115676 kB
SwapTotal:     2096472 kB
SwapFree:       547480 kB
Dirty:            1020 kB
Writeback:           0 kB
AnonPages:      453480 kB
Mapped:          66928 kB
Slab:          7250176 kB
PageTables:      75408 kB
...

Notice that Slab is about 7.25GB, almost the whole memory (8GB) (!).
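
To put a number on that, here is a minimal shell sketch that reports Slab as a percentage of MemTotal. The function name slab_share is made up for illustration, and it only assumes the "Name: value kB" layout of /proc/meminfo shown above.

```shell
# slab_share [FILE]: print Slab as a percentage of MemTotal.
# FILE defaults to /proc/meminfo; slab_share is an illustrative name.
slab_share() {
    awk '$1 == "MemTotal:" { total = $2 }
         $1 == "Slab:"     { slab = $2 }
         END { if (total > 0) printf "%.1f\n", 100 * slab / total }' "${1:-/proc/meminfo}"
}
```

With the values above, slab_share prints 88.8, meaning roughly 89% of the box's memory is held by the slab allocator.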

Slab is memory held by the kernel's slab allocator for its own object caches, and we can see which caches it is allocated to by examining /proc/slabinfo. Here’s an excerpt:

# name            <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
nfs_direct_cache       0      0    136   28    1 : tunables  120   60    8 : slabdata      0      0      0
nfs_write_data        62     63    832    9    2 : tunables   54   27    8 : slabdata      7      7      0
nfs_read_data        215    297    832    9    2 : tunables   54   27    8 : slabdata     33     33     54
nfs_inode_cache   5384386 5399040   1032    3    1 : tunables   24   12    8 : slabdata 1799680 1799680     40
nfs_page             534    750    128   30    1 : tunables  120   60    8 : slabdata     25     25    264
rpc_buffers            8      8   2048    2    1 : tunables   24   12    8 : slabdata      4      4      0
...

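Notice the nfs_inode_cache, which holds 5.4M objects of 1032 bytes each, adding up to about 5.4GB of objects. Only 3 objects fit in each one-page slab, so (assuming 4 kB pages) its 1,799,680 slabs actually occupy about 7.2GB, which accounts for nearly all of the Slab figure above.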
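
The footprint of each cache can be estimated straight from /proc/slabinfo as num_slabs × pagesperslab × page size. Below is a rough sketch of that calculation. slab_usage_kb is a hypothetical helper name, the column positions assume the slabinfo layout shown in the header line above, and the page size comes from getconf PAGESIZE unless it is passed explicitly.

```shell
# slab_usage_kb [FILE] [PAGE_KB]: list slab caches by estimated memory (kB),
# largest first. FILE defaults to /proc/slabinfo (usually readable by root only).
# Assumes the column layout from the header line above:
#   $6 = pagesperslab, $15 = num_slabs
slab_usage_kb() {
    file="${1:-/proc/slabinfo}"
    page_kb="${2:-$(( $(getconf PAGESIZE) / 1024 ))}"
    awk -v page_kb="$page_kb" '
        /^#/ || /^slabinfo/ { next }      # skip the version and header lines
        NF >= 15 { printf "%d %s\n", $15 * $6 * page_kb, $1 }
    ' "$file" | sort -rn
}
```

Run as root, something like "slab_usage_kb | head -5" shows the biggest caches. For the excerpt above with 4 kB pages, the nfs_inode_cache line comes out as 7198720 kB.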

The workaround

Searching the internet a bit suggests that this is most probably a bug. Fortunately there are two workarounds, a slow one and a fast one:

Slow workaround: Log in to that box and run “sync”. Then leave it alone while the nfs_inode_cache memory goes down and down. It may take a couple of minutes before it starts going down, and there may be pauses in the process. It can take more than an hour to free the memory.

Fast workaround: Log in to that box and run:

# sync
# echo 2 > /proc/sys/vm/drop_caches

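I’m not sure why the first one works, but it looks like it triggers a chain reaction that frees the memory. The second one is more direct: writing 2 to drop_caches tells the kernel to free reclaimable slab objects, which include dentries and inodes.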
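
To see how much the fast workaround actually releases, the two commands can be wrapped with a before/after reading of the Slab line in /proc/meminfo. This is only a sketch: slab_kb and drop_slab_and_report are illustrative names, and the second function has to be run as root.

```shell
# slab_kb [FILE]: print the Slab value (kB) from a meminfo-style file.
# FILE defaults to /proc/meminfo.
slab_kb() {
    awk '$1 == "Slab:" { print $2 }' "${1:-/proc/meminfo}"
}

# drop_slab_and_report: run the fast workaround and print Slab before and
# after. Needs root because it writes to /proc/sys/vm/drop_caches.
drop_slab_and_report() {
    before=$(slab_kb)
    sync
    echo 2 > /proc/sys/vm/drop_caches
    after=$(slab_kb)
    echo "Slab: ${before} kB -> ${after} kB (freed $((before - after)) kB)"
}
```

Nothing runs until drop_slab_and_report is called explicitly, so the block is safe to paste into a root shell first and invoke later.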

One comment

  1. Thank you, thank you, thank you. This post just saved us from having to reboot a box.

    Details for future Googlers: We had several pdflush processes running wild and almost all RAM was slab memory. Running sync cleared a few GB of slab memory and made pdflush quiet down, but several GB of slab memory remained. We then dropped caches as above and that cleared out the rest of the slab waste.
