Skip to content Skip to sidebar Skip to footer

How Do I Check If A String Is Entirely Made Of The Same Substring?

I have to create a function which takes a string, and it should return true or false based on whether the input consists of a repeated character sequence. The length of the given s

Solution 1:

There’s a nifty little theorem about strings like these.

A string consists of the same pattern repeated multiple times if and only if the string is a nontrivial rotation of itself.

Here, a rotation means deleting some number of characters from the front of the string and moving them to the back. For example, the string hello could be rotated to form any of these strings:

hello (the trivial rotation)
elloh 
llohe 
lohel 
ohell 

To see why this works, first, assume that a string consists of k repeated copies of a string w. Then deleting the first copy of the repeated pattern (w) from the front of the string and tacking it onto the back will give back the same string. The reverse direction is a bit trickier to prove, but the idea is that if you rotate a string and get back what you started with, you can apply that rotation repeatedly to tile the string with multiple copies of the same pattern (that pattern being the string you needed to move to the end to do the rotation).

Now the question is how to check whether this is the case. For that, there’s another beautiful theorem we can use:

If x and y are strings of the same length, then x is a rotation of y if and only if x is a substring of yy.

As an example, we can see that lohel is a rotation of hello as follows:

hellohello
   ^^^^^

In our case, we know that every string x will always be a substring of xx (it’ll appear twice, once at each copy of x). So basically we just need to check if our string x is a substring of xx without allowing it to match at the first or halfway character. Here’s a one-liner for that:

function check(str) {
    return (str + str).indexOf(str, 1) !== str.length;
}

Assuming indexOf is implemented using a fast string matching algorithm, this will run in time O(n), where n is the length of the input string.

Hope this helps!

Solution 2:

You can do it by a capturing group and backreference. Just check it's the repetition of the first captured value.

functioncheck(str) {
  return/^(.+)\1+$/.test(str)
}

console.log(check('aa')) //trueconsole.log(check('aaa')) //trueconsole.log(check('abcabcabc')) //trueconsole.log(check('aba')) //falseconsole.log(check('ababa')) //false

In the above RegExp:

  1. ^ and $ stands for start and end anchors to predict the position.
  2. (.+) captures any pattern and captures the value(except \n).
  3. \1 is backreference of first captured value and \1+ would check for repetition of captured value.

Regex explanation here

For RegExp debugging use: https://regex101.com/r/pqlAuP/1/debugger

Performance : https://jsperf.com/reegx-and-loop/13

Solution 3:

Perhaps the fastest algorithmic approach is building a Z-function in linear time:

The Z-function for this string is an array of length n where the i-th element is equal to the greatest number of characters starting from the position i that coincide with the first characters of s.

In other words, z[i] is the length of the longest common prefix between s and the suffix of s starting at i.

C++ implementation for reference:

vector<int> z_function(string s){
    int n = (int) s.length();
    vector<int> z(n);
    for (int i = 1, l = 0, r = 0; i < n; ++i) {
        if (i <= r)
            z[i] = min (r - i + 1, z[i - l]);
        while (i + z[i] < n && s[z[i]] == s[i + z[i]])
            ++z[i];
        if (i + z[i] - 1 > r)
            l = i, r = i + z[i] - 1;
    }
    return z;
}

JavaScript implementation Added optimizations - building a half of z-array and early exit

functionz_function(s) {
  var n = s.length;
  var z = Array(n).fill(0);
  var i, l, r;
  //for our task we need only a half of z-arrayfor (i = 1, l = 0, r = 0; i <= n/2; ++i) {
    if (i <= r)
      z[i] = Math.min(r - i + 1, z[i - l]);
    while (i + z[i] < n && s[z[i]] == s[i + z[i]])
      ++z[i];

      //we can check condition and return hereif (z[i] + i === n && n % i === 0) returntrue;
    
    if (i + z[i] - 1 > r)
      l = i, r = i + z[i] - 1;
  }
  returnfalse; 
  //return z.some((zi, i) => (i + zi) === n && n % i === 0);
}
console.log(z_function("abacabacabac"));
console.log(z_function("abcab"));

Then you need to check indexes i that divide n. If you find such i that i+z[i]=n then the string s can be compressed to the length i and you can return true.

For example, for

string s= 'abacabacabac'with length n=12`

z-array is

(0, 0, 1, 0, 8, 0, 1, 0, 4, 0, 1, 0)

and we can find that for

i=4
i+z[i] = 4 + 8 = 12 = n
and
n % i = 12 % 4 = 0`

so s might be represented as substring of length 4 repeated three times.

Solution 4:

I read the answer of gnasher729 and implemented it. The idea is that if there are any repetitions, then there must be (also) a prime number of repetitions.

function* primeFactors (n) {
    for (var k = 2; k*k <= n; k++) {
        if (n % k == 0) {
            yield k
            do {n /= k} while (n % k == 0)
        }
    }
    if (n > 1) yield n
}

functioncheck (str) {
    var n = str.length
    primeloop:
    for (var p of primeFactors(n)) {
        var l = n/p
        var s = str.substring(0, l)
        for (var j=1; j<p; j++) {
            if (s != str.substring(l*j, l*(j+1))) continue primeloop
        }
        returntrue
    }
    returnfalse
}

A slightly different algorithm is this:

function check (str) {
    var n = str.length
    for (var p of primeFactors(n)) {
        var l = n/p
        if (str.substring(0, n-l) == str.substring(l)) returntrue
    }
    returnfalse
}

I've updated the jsPerf page that contains the algorithms used on this page.

Solution 5:

Assume the string S has length N and is made of duplicates of the substring s, then the length of s divides N. For example, if S has length 15, then the substring has length 1, 3, or 5.

Let S be made of (p*q) copies of s. Then S is also made of p copies of (s, repeated q times). We have therefore two cases: If N is prime or 1, then S can only be made of copies of the substring of length 1. If N is composite, then we only need to check substrings s of length N / p for primes p dividing the length of S.

So determine N = the length of S, then find all its prime factors in time O (sqrt (N)). If there is only one factor N, check if S is the same string repeated N times, otherwise for each prime factor p, check if S consists of p repeations of the first N / p characters.

Post a Comment for "How Do I Check If A String Is Entirely Made Of The Same Substring?"