KMP (Knuth-Morris-Pratt) is an efficient string matching algorithm. It uses information from previous matches to avoid restarting from scratch, achieving O(n+m) time complexity through the 'prefix table' (or next array).

-
Click 'Start' to view animation

Core Concepts

  • Prefix table records 'longest proper prefix which is also suffix' for each position
  • On mismatch, use prefix table to skip positions that can't possibly match
  • Pattern doesn't restart from beginning, jumps to position indicated by prefix table

Example: 'ABAB' has longest proper prefix-suffix 'AB', length 2.

Time and Space Complexity

Phase Complexity Description
Build Prefix Table O(M) M is pattern length
Matching Phase O(N) N is text length
Total Time O(N+M) Much better than naive O(N×M)
Space Complexity O(M) Store prefix table

Applications

  • Text editor find functionality
  • Search engine keyword matching
  • DNA sequence pattern search
  • Network intrusion detection (signature matching)
  • Data compression algorithms
Text Search Pattern Matching Bioinformatics Network Security

Implementation

def build_next(pattern):
    """构建前缀表(Next数组)"""
    m = len(pattern)
    next_arr = [0] * m
    j = 0
    
    for i in range(1, m):
        while j > 0 and pattern[i] != pattern[j]:
            j = next_arr[j - 1]
        if pattern[i] == pattern[j]:
            j += 1
        next_arr[i] = j
    
    return next_arr


def kmp_search(text, pattern):
    """KMP字符串匹配"""
    if not pattern:
        return []
    
    next_arr = build_next(pattern)
    results = []
    j = 0
    
    for i in range(len(text)):
        while j > 0 and text[i] != pattern[j]:
            j = next_arr[j - 1]
        if text[i] == pattern[j]:
            j += 1
        if j == len(pattern):
            results.append(i - j + 1)
            j = next_arr[j - 1]
    
    return results


# 使用示例
text = "ABABDABACDABABCABAB"
pattern = "ABABCABAB"
positions = kmp_search(text, pattern)
print(f"匹配位置: {positions}")
# 输出: 匹配位置: [10]
import java.util.*;

public class KMP {
    
    /**
     * 构建前缀表(Next数组)
     */
    private static int[] buildNext(String pattern) {
        int m = pattern.length();
        int[] next = new int[m];
        int j = 0;
        
        for (int i = 1; i < m; i++) {
            while (j > 0 && pattern.charAt(i) != pattern.charAt(j)) {
                j = next[j - 1];
            }
            if (pattern.charAt(i) == pattern.charAt(j)) {
                j++;
            }
            next[i] = j;
        }
        
        return next;
    }
    
    /**
     * KMP字符串匹配
     */
    public static List<Integer> kmpSearch(String text, String pattern) {
        List<Integer> results = new ArrayList<>();
        if (pattern.isEmpty()) return results;
        
        int[] next = buildNext(pattern);
        int j = 0;
        
        for (int i = 0; i < text.length(); i++) {
            while (j > 0 && text.charAt(i) != pattern.charAt(j)) {
                j = next[j - 1];
            }
            if (text.charAt(i) == pattern.charAt(j)) {
                j++;
            }
            if (j == pattern.length()) {
                results.add(i - j + 1);
                j = next[j - 1];
            }
        }
        
        return results;
    }
    
    public static void main(String[] args) {
        String text = "ABABDABACDABABCABAB";
        String pattern = "ABABCABAB";
        List<Integer> positions = kmpSearch(text, pattern);
        System.out.println("匹配位置: " + positions);
        // 输出: 匹配位置: [10]
    }
}