Z algorithm (Linear time pattern searching Algorithm)
Last Updated : 30 Mar, 2026
Given two strings text (the text) and pattern (the pattern), consisting of lowercase English alphabets, find all 0-based starting indices where the pattern occurs as a substring in the text.
In the Naive String Matching algorithm, we compare the pattern with every substring of the text of the same size one by one, which takes O(n × m) time, where n is the length of the text and m is the length of the pattern. This becomes inefficient for large inputs. The Z Algorithm improves this by preprocessing the string and efficiently matching prefixes, allowing pattern searching in O(n + m) time.
Core Idea
Create a string that is concatenation of 3 things, pattern, a separator (that should be present in pattern) and text
Now we consider the above created string and at any point (starting from second index), if we find a substring that is also a prefix and is of same length as pattern, then the pattern is present at that point.
Z-Array to Track Prefix Matches
For a string s of length n, the Z-array stores:
z[i] = length of the longest substring starting from i which is also a prefix of s
Example: s = "aabxaab".
The Z-array is: z = [0, 1, 0, 0, 3, 1, 0]
Z[0] is always defined as 0 because it would represent comparing the entire string with itself, which is trivial and not useful. Hence, by convention, we set: z[0]=0
Z[1] = 1, Only the first character 'a' matches with the prefix.
Z[2] = 0, 'b' does not match the first character of the prefix 'a'.
Z[3] = 0, 'x' does not match the first character of the prefix 'a'.
Z[4] = 3, The substring "aab" matches the prefix "aab".
Z[5] = 1, Only 'a' matches with the first character of the prefix.
Z[6] = 0, 'b' does not match the first character of the prefix 'a'.
Calculation of Z Array
The naive way checks, for each index i, how many characters from s[i...] match the prefix s[0...], which can take O(n²) time. The Z-algorithm improves this and computes all values in O(n) time.
While computing, we maintain a window [l, r] called the Z-box, which stores the rightmost substring that matches the prefix.
l = start of the match
r = end of the match
s[l...r] matches s[0...(r-l)]
This helps reuse previous results instead of comparing again. When processing index i, there are two possibilities:
If i > r (outside the Z-box): Compare characters from scratch and update [l, r].
If i ≤ r:
Let k be the position corresponding to i within the prefix (k = i - l).
Use the value Z[k] as a reference.
If Z[k] is strictly less than the remaining length in [l, r], assign Z[i] = Z[k].
Otherwise, begin comparing characters beyond the current window to extend the match.
After extending, update the window [l, r] if a longer match was found.
The key idea is to preprocess a new string formed by combining the pattern and the text, separated by a special delimiter (e.g., $) that doesn’t appear in either string. This avoids accidental overlaps.
We construct a new string as: s = pattern + '$' + text
We then compute the Z-array for this combined string. The Z-array at any position i tells us the length of the longest prefix of the pattern that matches the substring of the text starting at that position (adjusted for offset due to the pattern and separator).
So, whenever we find a position i such that: Z[i] == length of pattern
It means the entire pattern matches the text at a position: match position = i - (pattern length + 1)
C++
#include<iostream>#include<vector>usingnamespacestd;// Z-function to compute Z-arrayvector<int>zFunction(string&s){intn=s.length();vector<int>z(n);intl=0,r=0;for(inti=1;i<n;i++){if(i<=r){intk=i-l;// Case 2: reuse the previously computed valuez[i]=min(r-i+1,z[k]);}// Try to extend the Z-box beyond rwhile(i+z[i]<n&&s[z[i]]==s[i+z[i]]){z[i]++;}// Update the [l, r] window if extendedif(i+z[i]-1>r){l=i;r=i+z[i]-1;}}returnz;}// Function to find all occurrences of pattern in textvector<int>search(string&text,string&pattern){strings=pattern+'$'+text;vector<int>z=zFunction(s);vector<int>pos;intm=pattern.size();for(inti=m+1;i<z.size();i++){if(z[i]==m){// pattern match starts here in textpos.push_back(i-m-1);}}returnpos;}intmain(){stringtext="aabxaabxaa";stringpattern="aab";vector<int>matches=search(text,pattern);for(intpos:matches)cout<<pos<<" ";return0;}
Java
importjava.util.ArrayList;importjava.util.Arrays;publicclassGfG{// Z-function to compute Z-arraystaticArrayList<Integer>zFunction(Strings){intn=s.length();ArrayList<Integer>z=newArrayList<>();for(inti=0;i<n;i++){z.add(0);}intl=0,r=0;for(inti=1;i<n;i++){if(i<=r){intk=i-l;// Case 2: reuse the previously computed valuez.set(i,Math.min(r-i+1,z.get(k)));}// Try to extend the Z-box beyond rwhile(i+z.get(i)<n&&s.charAt(z.get(i))==s.charAt(i+z.get(i))){z.set(i,z.get(i)+1);}// Update the [l, r] window if extendedif(i+z.get(i)-1>r){l=i;r=i+z.get(i)-1;}}returnz;}// Function to find all occurrences of pattern in textstaticArrayList<Integer>search(Stringtext,Stringpattern){Strings=pattern+'$'+text;ArrayList<Integer>z=zFunction(s);ArrayList<Integer>pos=newArrayList<>();intm=pattern.length();for(inti=m+1;i<z.size();i++){if(z.get(i)==m){// pattern match starts here in textpos.add(i-m-1);}}returnpos;}publicstaticvoidmain(String[]args){Stringtext="aabxaabxaa";Stringpattern="aab";ArrayList<Integer>matches=search(text,pattern);for(intpos:matches)System.out.print(pos+" ");}}
Python
defzFunction(s):n=len(s)z=[0]*nl,r=0,0foriinrange(1,n):ifi<=r:k=i-l# Case 2: reuse the previously computed valuez[i]=min(r-i+1,z[k])# Try to extend the Z-box beyond rwhilei+z[i]<nands[z[i]]==s[i+z[i]]:z[i]+=1# Update the [l, r] window if extendedifi+z[i]-1>r:l=ir=i+z[i]-1returnzdefsearch(text,pattern):s=pattern+'$'+textz=zFunction(s)pos=[]m=len(pattern)foriinrange(m+1,len(z)):ifz[i]==m:# pattern match starts here in textpos.append(i-m-1)returnposif__name__=='__main__':text='aabxaabxaa'pattern='aab'matches=search(text,pattern)forposinmatches:print(pos,end=' ')
C#
usingSystem;usingSystem.Collections.Generic;publicclassGfG{// Z-function to compute Z-arraystaticList<int>zFunction(strings){intn=s.Length;List<int>z=newList<int>(newint[n]);intl=0,r=0;for(inti=1;i<n;i++){if(i<=r){intk=i-l;// Case 2: reuse the previously computed// valuez[i]=Math.Min(r-i+1,z[k]);}// Try to extend the Z-box beyond rwhile(i+z[i]<n&&s[z[i]]==s[i+z[i]]){z[i]++;}// Update the [l, r] window if extendedif(i+z[i]-1>r){l=i;r=i+z[i]-1;}}returnz;}// Function to find all occurrences of pattern in textstaticList<int>search(stringtext,stringpattern){strings=pattern+'$'+text;List<int>z=zFunction(s);List<int>pos=newList<int>();intm=pattern.Length;for(inti=m+1;i<z.Count;i++){if(z[i]==m){// pattern match starts here in textpos.Add(i-m-1);}}returnpos;}publicstaticvoidMain(){stringtext="aabxaabxaa";stringpattern="aab";List<int>matches=search(text,pattern);foreach(intposinmatches)Console.Write(pos+" ");}}
JavaScript
functionzFunction(s){letn=s.length;letz=newArray(n).fill(0);letl=0,r=0;for(leti=1;i<n;i++){if(i<=r){letk=i-l;// Case 2: reuse the previously computed valuez[i]=Math.min(r-i+1,z[k]);}// Try to extend the Z-box beyond rwhile(i+z[i]<n&&s[z[i]]===s[i+z[i]]){z[i]++;}// Update the [l, r] window if extendedif(i+z[i]-1>r){l=i;r=i+z[i]-1;}}returnz;}functionsearch(text,pattern){lets=pattern+'$'+text;letz=zFunction(s);letpos=[];letm=pattern.length;for(leti=m+1;i<z.length;i++){if(z[i]===m){// pattern match starts here in textpos.push(i-m-1);}}returnpos;}// Driver Codelettext='aabxaabxaa';letpattern='aab';letmatches=search(text,pattern);console.log(matches.join(" "));
Output
0 4
Time Complexity: O(n + m), where n is the length of the text and m is the length of the pattern, since the combined string and Z-array are processed linearly. Auxiliary Space: O(n + m), used to store the combined string and the Z-array for efficient pattern matching.
Advantages of Z-Algorithm
Linear Time Complexity for pattern matching.
Uses prefix comparison, avoiding re-evaluation of matched characters.
Easier to code than KMP; works directly with prefix matches.
Useful for preprocessing in multiple string problems beyond pattern matching.
Real-Life Applications
Search Tools in Text Editors (e.g., VsCode, Sublime)
Plagiarism Detection Systems (detect repeated blocks)