Findlargedir: Find all “blackhole” directories with a huge number of filesystem entries

Findlargedir is a tool that helps quickly identify “black hole” directories on any filesystem: directories holding more than 100k entries in a single flat structure.

When a directory accumulates many entries (files or subdirectories), listing it becomes slower and slower, hurting every process that needs a directory listing. Processes reading a large directory inode block while doing so, ending up in uninterruptible sleep (the “D” state) for longer and longer periods. Depending on the filesystem, this can become visible at around 100k entries, with a very noticeable performance impact at 1M+ entries.
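Such stalls show up in the process state: on Linux, a process stuck reading a large directory inode appears with state D. A minimal sketch (assuming a Linux /proc layout; the helper name is made up) that lists such processes:

```python
import os

def d_state_pids():
    """Return (pid, comm) pairs for processes in uninterruptible sleep ('D')."""
    out = []
    for pid in filter(str.isdigit, os.listdir("/proc")):
        try:
            with open(f"/proc/{pid}/stat") as f:
                # The comm field may contain spaces, so split after the
                # closing parenthesis; the state letter follows it.
                fields = f.read().rsplit(")", 1)[1].split()
            with open(f"/proc/{pid}/comm") as f:
                comm = f.read().strip()
        except OSError:
            continue  # process exited while we were scanning
        if fields[0] == "D":
            out.append((int(pid), comm))
    return out
```

Processes that keep reappearing here while touching the same directory are a strong hint that the directory inode has grown out of control.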

Such directories cannot shrink back even after their contents are cleaned up, since most Linux and Un*x filesystems do not support directory inode shrinking. This often happens with forgotten web session directories (e.g. a PHP sessions folder whose GC interval was set to several days), various cache folders (CMS compiled templates and caches), POSIX-filesystem-emulating object storage, etc.
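Because the inode cannot shrink in place, the usual remedy is to recreate the directory and discard the bloated inode. A rough sketch of that workaround (the `shrink_directory` helper is hypothetical, and assumes writers have been paused first):

```python
import os

def shrink_directory(path: str) -> None:
    """Replace a bloated directory with a fresh one holding the same entries.

    The directory inode itself cannot shrink, so we swap in a new
    directory and move any surviving entries across.
    """
    fresh = path + ".new"
    old = path + ".old"
    os.mkdir(fresh)         # new directory with a minimal inode
    os.rename(path, old)    # swap the bloated inode out of the hot path
    os.rename(fresh, path)
    for name in os.listdir(old):
        # Preserve whatever entries remain after cleanup.
        os.rename(os.path.join(old, name), os.path.join(path, name))
    os.rmdir(old)
```

After the swap, the slow, bloated inode can be deleted at leisure without blocking readers of the live directory.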

The program attempts to identify such directories and report on them based on calibration, i.e. an estimate of how many directory entries are packed into each directory inode on a given filesystem. During calibration it determines the growth ratio of directory inode size to the number of entries/inodes, then uses that ratio to scan the filesystem quickly, avoiding expensive/slow directory lookups.
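The idea behind calibration can be sketched as follows (a simplified illustration, not the tool's actual code; the function names are made up):

```python
import os
import tempfile

def calibrate(mount_point: str, probe_entries: int = 1000) -> float:
    """Estimate how many bytes of directory inode one entry consumes.

    Creates a throwaway probe directory (hence the need for r/w access),
    fills it with entries, and measures how much the inode grew.
    """
    probe = tempfile.mkdtemp(dir=mount_point)
    empty_size = os.stat(probe).st_size
    for i in range(probe_entries):
        open(os.path.join(probe, f"f{i:07d}"), "w").close()
    grown_size = os.stat(probe).st_size
    # Remove the probe; the inode itself stays large, but rmdir still works.
    for name in os.listdir(probe):
        os.unlink(os.path.join(probe, name))
    os.rmdir(probe)
    return max(grown_size - empty_size, 1) / probe_entries

def estimate_entries(path: str, bytes_per_entry: float) -> int:
    """Guess a directory's entry count from its inode size alone, no readdir."""
    return int(os.stat(path).st_size / bytes_per_entry)
```

A single `stat()` call then gives a rough entry count for any directory, which is exactly what makes the scan cheap compared to enumerating millions of entries.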

“One of my roles in the previous team was Head of Storage Department, and we had many storage clusters totaling 300 PB of raw disk space. One of the frequent issues for our customers was accumulating many files in a single flat directory, typically caused by cache files or object storage emulation, which would eventually cause visible performance degradation. The exact moment directory lookups become heavily impacted depends on several factors, such as storage performance and the filesystem in use. Still, we typically observe issues when there are more than 1M files in a single directory. We initially identified such issues with regular BSD and Linux system tools. However, it was painfully obvious that many core tools were never designed to cope with modern high-IOPS and high-IO-depth systems,” Dinko Korunic, the author of the tool, told Help Net Security.

While many tools scan the filesystem (find, du, ncdu, etc.), none of them uses heuristics to avoid expensive lookups, since they are designed to be fully accurate. This tool instead uses heuristics to alert on issues without getting stuck on problematic folders.

Findlargedir will not follow symlinks, and requires read/write permission to calibrate each filesystem, calculating a directory-inode-size-to-entry-count ratio that lets it estimate the number of entries in a directory without actually counting them. While this method only approximates the actual number of entries, it is good enough to quickly scan for offending directories.
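Put together, a scan in that spirit might look like the following (an illustrative sketch, not findlargedir's implementation; it reuses a precomputed bytes-per-entry ratio from calibration):

```python
import os

def scan(root: str, bytes_per_entry: float, threshold: int = 100_000):
    """Walk a tree, flagging suspect directories by inode size alone.

    Directories whose estimated entry count exceeds the threshold are
    reported and NOT descended into, so the scan never gets stuck
    enumerating a blackhole directory. Symlinks are not followed.
    """
    suspects = []
    for dirpath, dirnames, _ in os.walk(root, followlinks=False):
        keep = []
        for d in dirnames:
            full = os.path.join(dirpath, d)
            # lstat: look at the directory inode itself, not a symlink target.
            est = os.lstat(full).st_size / bytes_per_entry
            if est >= threshold:
                suspects.append((full, int(est)))  # flag it, don't list it
            else:
                keep.append(d)
        dirnames[:] = keep  # prune flagged directories from the walk
    return suspects
```

Pruning flagged directories before descending is what keeps the scan fast: the one directory you most want to avoid reading is exactly the one that would freeze a naive `find`-style traversal.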

The tool is available for free on GitHub.
