Implement strStr()
leetcode: https://leetcode.com/problems/implement-strstr/
Description:
Implement strStr().
Given two strings needle
and haystack
, return the index of the first occurrence of needle
in haystack
, or -1
if needle
is not part of haystack
.
Clarification:
What should we return when needle
is an empty string? This is a great question to ask during an interview.
For the purpose of this problem, we will return 0 when needle
is an empty string. This is consistent to C’s strstr() and Java’s indexOf().
Idea:
Solution1:
Step1: use loop to traverse the original string, and get the substring from the current index to the current index + the length of needed string, if equal, then return current index, otherwise, return -1;
Code:
class Solution {
public int strStr(String haystack, String needle) {
for(int i=0; i<haystack.length(); i++){
if(i+needle.length() <= haystack.length()){
if(haystack.substring(i,i+needle.length()).equals(needle)){
return i;
}
}
}
return -1;
}
}
Solution2: KMP Algorithm
(1)KMP Algorithm is used to match string;
(2)The idea of it is that when there is a string mismatch, you can know a part of the text content that has been matched before, you can use this information to avoid doing the matching from scratch. Use next Array to record these content. (Next array is a Prefix Table).
(3)Prefix: all substrings starts with first character that don’t include last character in the string.
Suffix: all substrings ends with the last character that don’t include first character in the string.
Example:
abcd
prefix: a,ab,abc
suffix: d,cd,bcd
(4) build the Prefix Table: find the longest length of (prefix = suffix)
Example: the string is aabaaf
a: 0
aa: 1(prefix=a, suffix =a)
aab:0(prefix=a,aa; suffix =b,ab)
aaba:1(prefix:a,aa,aab ; suffix: a,ba,aba; so the a=a, length is 1)
aabaa: 2 (prefix:a,aa,aab,aaba; suffix:a,aa,baa,abaa; so aa =aa, the length is 2)
aabaaf: 0
The Prefix Table of aabaaf is :
aabaaf
010120
(5) when find the confict character(two character isn’t same), we find the longest length of (prefix = suffix) of the character before this character. Then, set the number of this length as the new start index of match
(6) Get the prefix Array
Step1:initialization, set two pointer(one for the end of suffix, one for the prefix and record the length of (suffix = prefix)), initialize the array
Step2: when the two characters that two pointer pointed are not equal, make the prefix index return to the character that they are equal, if can’t find the same character until the begining of the string, then make index as the beginning of the string.
Step3: if two characters are equal, then make the suffix and prefix pointers point to next
Step4: Update the array, give the array with current length of prefix.
public void GetArray(String s){
int j=0; //set the prefix pointer
int i=1; //set the suffix pointer
int[] next = new int[s.length()];
char[] result = s.toCharArray();
for(;i<s.length();i++){
while(j>0 && result[j]!=result[i]){
j = next[j-1];
}
if(result[i] == result[j]){
j++;
}
next[i]=j;
}
}
Step1: build the prefix Table Array of KMP for the needle string
Step2: Compare the needle string with original string with prefix table
Code:
class Solution {
public int strStr(String haystack, String needle) {
if (needle.length() == 0){
return 0;
}
char[] s1 = haystack.toCharArray();
char[] s2 = needle.toCharArray();
// the prefix table array
int[] result = new int[needle.length()];
// build the prefix table array for needle string
// set two pointers, j is prefix pointer(it also used to record the length of (prefix = suffix)) and i is suffix pointer
int j= 0;
for(int i=1; i<s2.length;i++){
// if two character aren't equal, move back the j, until the character it pointed equal to the i pointed or it back to the begining of the string
while(j>0 && s2[j]!=s2[i]){
j = result[j-1];
}
// j also record the length, and when two characters are euqal, move two pointers to the next(i has moved in the loop)
if(s2[i] == s2[j]){
j++;
}
result[i]=j;
}
// compare the original string and the needle string
// set the pointer for needle
int first = 0;
for (int second = 0; second < s1.length; second++){
// compare the needle and original string
while(first>0 && s1[second] != s2[first]){
first = result[first-1];
}
if(s1[second] == s2[first]){
first++;
}
// when compare all the needle string, and because now the first is equal to the length of needle, so get the index by minius it.
if(first == s2.length){
return (second-first+1);
}
}
return -1;
}
}