Linux | 2 June 2026

`grep` uses Boyer-Moore-Horspool to skip most input without reading it character by character

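For fixed-string searches, GNU `grep` uses a Boyer-Moore-style algorithm (described here in its simpler Horspool form) rather than a naive left-to-right scan. For a pattern of length m, it precomputes a skip table: for each possible byte value, how far the search window can safely shift when that byte appears at the window's last position. The pattern is compared against the window from right to left, so a byte that does not occur in the pattern lets grep shift by a full m bytes. In the best case it inspects roughly n/m of the n input bytes, and the savings are largest for long patterns whose bytes are rare in the input. The skipping reduces CPU work, not I/O: the file still has to be read from disk or the page cache, so the gain shows up mainly when the data is already cached or the search is CPU-bound.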
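To make the skip table concrete, here is a minimal Python sketch of the Horspool variant. It is an illustration only, not GNU grep's implementation (which is written in C and heavily optimised); the function names `build_skip_table` and `horspool_search` are hypothetical. It also counts byte comparisons so the skipping effect can be observed.

```python
def build_skip_table(pattern: bytes) -> list[int]:
    """Horspool shift table: for each byte value, how far the window may move."""
    m = len(pattern)
    table = [m] * 256            # bytes absent from the pattern allow a full shift of m
    for i in range(m - 1):       # the last pattern byte is excluded so every shift is >= 1
        table[pattern[i]] = m - 1 - i
    return table


def horspool_search(text: bytes, pattern: bytes) -> tuple[int, int]:
    """Return (index of the first match or -1, number of byte comparisons made)."""
    m, n = len(pattern), len(text)
    if m == 0:
        return 0, 0
    skip = build_skip_table(pattern)
    comparisons = 0
    pos = 0                      # start of the current window in text
    while pos <= n - m:
        j = m - 1                # compare the window against the pattern right to left
        while j >= 0:
            comparisons += 1
            if text[pos + j] != pattern[j]:
                break
            j -= 1
        if j < 0:
            return pos, comparisons
        # shift by the table entry for the byte under the window's last position
        pos += skip[text[pos + m - 1]]
    return -1, comparisons


if __name__ == "__main__":
    haystack = b"a" * 1000 + b"needle"
    index, comparisons = horspool_search(haystack, b"needle")
    print(index, comparisons, len(haystack))
```

On input like this, where most bytes do not occur in the pattern, the comparison count comes out well below the input length, which is the effect described above. Shorter patterns, or patterns made of bytes that are common in the input, shrink the shifts and the advantage.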

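This is why `grep -F` (fixed string) is often faster than `grep` with a regex: a fixed string lets the algorithm build its skip table directly from the literal pattern. A regex generally needs a DFA/NFA engine, which examines the input byte by byte. GNU grep softens this by extracting literal substrings that every match must contain and searching for those first, but a regex with no required literal cannot use the skip table at all.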

# Fixed-string grep uses the skip table fully
grep -F "specific literal string" largefile.log

# ripgrep (rg) combines SIMD-accelerated literal search with Rust's regex engine and is often faster still
rg "pattern" largefile.log

# Count matching lines without printing them: less output to write
grep -c "pattern" largefile.log

`grep -F` is usually the fastest GNU grep mode when you are searching for a literal string.

Linux, algorithms, tooling, systems