[Leetcode 1065] Index Pairs of a String

原题说明

Given a text string and words (a list of strings), return all index pairs [i, j] so that the substring text[i]…text[j] is in the list of words.

Example 1:

Input: text = “thestoryofleetcodeandme”, words = [“story”,”fleet”,”leetcode”]
Output: [[3,7],[9,13],[10,17]]

Example 2:

Input: text = “ababa”, words = [“aba”,”ab”]
Output: [[0,1],[0,2],[2,3],[2,4]]
Explanation: 
Notice that matches can overlap, see “aba” is found in [0,2] and [2,4].

Note:

All strings contains only lowercase English letters.
It’s guaranteed that all strings in words are different.
1 <= text.length <= 100
1 <= words.length <= 20
1 <= words[i].length <= 50
Return the pairs [i,j] in sorted order (i.e. sort them by their first coordinate in case of ties sort them by their second coordinate).

解题思路

这道题可以直接用string find函数来做，但是需要分析一下时间复杂度。取决于具体的函数实现，比如CPP的find函数
没有用KMP实现，所以最坏的情况复杂度是O(M * N)，这样带入本题，时间复杂度是O(M * sum(len(word))). 其中M是text的长度，sum(len(word))是words中word的长度之和。
如果用字典树Trie来实现，则当M < sum(len(word))时，时间复杂度可以优化。
首先建立基于words的字典树trie，然后在text中以每一个位置i为起点向后遍历，并判断往后每一个位置j是否在字典树中，若在则加入要返回的结果rets中。

示例代码 (cpp)

struct Trie {
    vector<Trie*> children = vector<Trie*>(26, nullptr);
    bool is_find = false;
};
class Solution {
    Trie* constructTrie(const vector<string>& words) {
        Trie* trie = new Trie();
        for (const auto& word : words) {
            Trie* cur = trie;
            for (const auto ch : word) {
                if (cur->children[ch - 'a'] == nullptr) {
                    cur->children[ch - 'a'] = new Trie();
                }
                cur = cur->children[ch - 'a'];
            }
            cur->is_find = true;
        }
        return trie;
    }
public:
    vector<vector<int>> indexPairs(string text, vector<string>& words) {
        const Trie* const trie = constructTrie(words);
        vector<vector<int>> rets;
        for (int i = 0; i < text.size(); ++i) {
            const Trie* cur = trie;           
            for (int j = i; j < text.size() && cur != nullptr; ++j) {
                cur = cur->children[text[j] - 'a'];
                if (cur && cur->is_find) {
                    rets.push_back({i, j});
                }
            }
        }
        return rets;
    }
};

示例代码 (java)

class Solution {
    public int[][] indexPairs(String text, String[] words) {
        /*initializing tire and put all word from words into Trie.*/
        Trie trie=new Trie();
        for(String s:words){
            Trie cur=trie;
            for(char c:s.toCharArray()){
                if(cur.children[c-'a']==null){
                    cur.children[c-'a']=new Trie();
                }
                cur=cur.children[c-'a'];
            }
            cur.end=true;       /*mark there is a word*/
        }
        
        /*if text is "ababa", check "ababa","baba","aba","ba","a" individually.*/
        int len=text.length();
        List<int[]> list=new ArrayList<>();
        for(int i=0;i<len;i++){
            Trie cur=trie;
            char cc=text.charAt(i);
            int j=i;   /*j is our moving index*/
            
            while(cur.children[cc-'a']!=null){ 
                cur=cur.children[cc-'a'];
                if(cur.end){   /*there is a word ending here, put into our list*/
                    list.add(new int[]{i,j});
                }
                j++;
                if(j==len){  /*reach the end of the text, we stop*/
                    break;
                }
                else{
                    cc=text.charAt(j);  
                }
            }
        }
        /*put all the pairs from list into array*/
        int size=list.size();
        int[][] res=new int[size][2];
        int i=0;
        for(int[] r:list){
            res[i]=r;
            i++;
        }
        return res;
    }
}
class Trie{
    Trie[] children;
    boolean end;   /*indicate whether there is a word*/
    public Trie(){
        end=false;
        children=new Trie[26];
    }
}

示例代码 (python)

class Solution:
    def indexPairs(self, text: str, words: List[str]) -> List[List[int]]:
        trie={}
        for w in words:
            cur=trie
            for c in w:
                if c not in cur.keys():
                    cur[c]={}
                cur=cur[c]
            cur['#']=True
        
        res=[]
        lens=len(text)
        for i in range(lens):
            cur=trie
            cc=text[i]
            j=i
            while cc in cur.keys():
                cur=cur[cc]
                if '#' in cur.keys():
                    res.append([i,j])
                j+=1
                if j==lens:
                    break
                else:
                    cc=text[j]
        return res

复杂度分析

时间复杂度: O(M^2 + sum(len(word))), M为text的长度
空间复杂度: O(26^max(len(words)))

视频讲解

我们在Youtube上更新了视频讲解，欢迎关注！