This isn't really off-topic, but it doesn't need much discussion, either.
The combinatorics problem is:
Let W be an arbitrary word of length N over an alphabet S.
Define a factor of W as a pair (i, j) s.t. 1<= i < j <= N.
The number of factors of W is N choose 2 = (N! / ((N-2)! * 2!))
Define a substring M of W as a contiguous sequence of symbols from W s.t. |M| <= |W|, or, equivalently, a subsequence of W whose start point and end point are given by a factor of W.
Note that the number of substrings can be smaller than the number of factors, due to repeats.
What is the maximum number of substrings for a word of length N and alphabet S?
I think I solved this for binary alphabets, and that pretty much satisfies me so far.
The binary words with the maximum number of substrings are probably De Bruijn sequences, which contain a different binary sequence of length lg N starting at every position. N naturally is a power of 2.
Ignoring words of lengths other than powers of 2, if a word is a De Bruijn sequence, then every start position i is the start of (N - i - lg N) unique substrings. So the number of unique substrings of length at least lg N is (N - lg N) choose 2.
For substrings of length less than lg N, every bit sequence will appear at least once, so the total number is 2^0 + 2^1 + ... + 2^(lg N - 1) = N
So the number of substrings is ((N - lg N) choose 2) + N, which is O(N^2) .... (= close enough for CS)
I didn't spend much time checking the math, but it sounds good to me, and O(N^2) is the part I care about.
Feel free to post solutions to other cases or correct my solution for the binary case.