Skip to content

17.13. Re-Space

Description

Oh, no! You have accidentally removed all spaces, punctuation, and capitalization in a lengthy document. A sentence like "I reset the computer. It still didn't boot!" became "iresetthecomputeritstilldidntboot''. You'll deal with the punctuation and capi­talization later; right now you need to re-insert the spaces. Most of the words are in a dictionary but a few are not. Given a dictionary (a list of strings) and the document (a string), design an algorithm to unconcatenate the document in a way that minimizes the number of unrecognized characters. Return the number of unrecognized characters.

Note: This problem is slightly different from the original one in the book.

 

Example:


Input: 

dictionary = ["looked","just","like","her","brother"]

sentence = "jesslookedjustliketimherbrother"

Output:  7

Explanation:  After unconcatenating, we got "jess looked just like tim her brother", which containing 7 unrecognized characters.

Note:

  • 0 <= len(sentence) <= 1000
  • The total number of characters in dictionary is less than or equal to 150000.
  • There are only lowercase letters in dictionary and sentence.

Solutions

Solution 1

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
class Solution:
    def respace(self, dictionary: List[str], sentence: str) -> int:
        s = set(dictionary)
        n = len(sentence)
        dp = [0] * (n + 1)
        for i in range(1, n + 1):
            dp[i] = dp[i - 1] + 1
            for j in range(i):
                if sentence[j:i] in s:
                    dp[i] = min(dp[i], dp[j])
        return dp[-1]
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
class Solution {
    public int respace(String[] dictionary, String sentence) {
        Set<String> dict = new HashSet<>(Arrays.asList