Using .Net Regex Balancing Groups to Match Words in Fibonacci Lengths
\b (?<A>\w)+\s+ (?<A>\w)+\s+ (?<-A>\w)+ (?(A)(?!)) \b
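The first example matches three words, where the third word is exactly as long as the first two combined. The pattern is written for RegexOptions.IgnorePatternWhitespace, so the literal spaces in it are only there for layout. Besides the whitespace (\s+) and word boundaries, the interesting bits read: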
(?<A>\w)+ – Capture the first and second words, pushing each letter onto the A stack. It might have been more correct to write (?:\w(?<A>))+, so that I don’t push a value I never use, but I think (?<A>\w) is clearer. I could also have compressed it to ((?<A>\w)+\s+){2}, but that would have been even less readable.
(?<-A>\w)+ – On the third word, pop one capture from the A stack for each letter, so this matches at most as many letters as the first two words pushed. An alternative syntax here would be (\w(?<-A>))+.
(?(A)(?!)) – Fail the match if there are still letters in the A stack. This one is a must, in case the third word is shorter than the first two combined.
\b – Last but not least, I need to make sure I match at word boundaries.
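The push and pop bookkeeping in the list above can be mimicked outside the regex engine. This is only a rough Python sketch, not the .NET implementation: Python's re module has no balancing groups, so a plain list stands in for the A stack. The function name third_word_balances is made up for illustration, and each argument is assumed to be a non-empty word made of \w characters.

```python
def third_word_balances(first: str, second: str, third: str) -> bool:
    """Mimic the A stack: push per letter of words 1 and 2, pop per letter of word 3."""
    stack = []                      # stands in for the capture stack of group A
    for letter in first + second:   # (?<A>\w)+ twice: one push per letter
        stack.append(letter)
    for _ in third:                 # (?<-A>\w)+ : one pop per letter
        if not stack:               # nothing left to pop: the group cannot take
            return False            # another letter, and \b would fail mid-word
        stack.pop()
    return not stack                # (?(A)(?!)) : fail if anything is left


print(third_word_balances("a", "bc", "def"))    # True: 1 + 2 == 3
print(third_word_balances("a", "bc", "de"))     # False: one letter left on the stack
print(third_word_balances("a", "bc", "defg"))   # False: third word is too long
```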
Note that the first regex doesn’t check whether the second word is at least as long as the first. If this is a problem for you, it can be checked using the following pattern. I won’t explain it too much, but the idea is to push two captures to the B stack for each letter removed from the A stack:
\b
(?<A>\w)+             # First word
\s+
(?<B-A>(?<B>)\w)+     # Second word - consume all A's, push an extra B
(?(A)(?!))            # Make sure A is finished
(?<B>\w)*             # Add more B's (or none)
\s+
(?<-B>\w)+            # Third word
(?(B)(?!))
\b
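To see why the B stack ends up holding exactly as many captures as the first two words have letters, here is a small Python sketch of the counting. It is an illustration under assumptions, not how the .NET engine works internally: two integers stand in for the depths of the A and B stacks, the function name balances_with_order_check is hypothetical, and all three words are assumed to be non-empty.

```python
def balances_with_order_check(first: str, second: str, third: str) -> bool:
    """Mimic the A and B stack depths of the two-stack pattern."""
    a = len(first)        # A depth after (?<A>\w)+ on the first word
    b = 0                 # B depth
    consumed = 0          # letters of the second word consumed so far
    # (?<B-A>(?<B>)\w)+ : each letter pops one A and pushes two B's
    while a > 0 and consumed < len(second):
        a -= 1
        b += 2
        consumed += 1
    if a > 0:             # (?(A)(?!)) : the second word was shorter than the first
        return False
    b += len(second) - consumed   # (?<B>\w)* : one B per remaining letter
    # (?<-B>\w)+ followed by (?(B)(?!)) : the third word must empty B exactly
    return len(third) == b


print(balances_with_order_check("ab", "abc", "abcde"))   # True: 2 <= 3 and 2 + 3 == 5
print(balances_with_order_check("abc", "ab", "abcde"))   # False: second word is shorter
```

With a first word of length m and a second of length n (n at least m), the B depth comes to 2m + (n - m) = m + n, which is what the third word has to pop.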
Finally, here’s a regex to capture a sequence of words, where the length of each word is equal to the sum of the lengths of the two previous words:
\b
(?:
    (?<A>\w)+          # First word - push each letter
    \s+
    (?=
        (?<A>\w)+      # Second word - push each letter
        \s+
        (?<-A>\w)+     # Third word - pop letters
        (?(A)(?!))
        \b
    )                  # Look-ahead
)+
\w+\s+\w+\b            # Capture last two words (already looked at them)
This might look a bit strange: I match a single word at a time, and use the look-ahead to check the following two words. The result is a sequence of words whose lengths make a Fibonacci series.
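As a cross-check of what the pattern should find, the same word-by-word scan can be sketched in plain Python. This is a stand-in for the regex, not a translation of it: it assumes the text is made of whitespace-separated words containing only \w characters, and the function name fibonacci_length_runs is invented for this example.

```python
def fibonacci_length_runs(text: str) -> list[list[str]]:
    """Find runs of words where each length is the sum of the two before it."""
    words = text.split()
    runs = []
    i = 0
    while i + 2 < len(words):
        j = i
        # Same test as the look-ahead: words j and j+1 together are as long as word j+2
        while (j + 2 < len(words)
               and len(words[j]) + len(words[j + 1]) == len(words[j + 2])):
            j += 1
        if j > i:
            runs.append(words[i:j + 2])   # matched words plus the last two
            i = j + 2                     # resume after the match, like a regex scan
        else:
            i += 1                        # no match starting here, try the next word
    return runs


print(fibonacci_length_runs("a b cd efg hijkl x"))
# [['a', 'b', 'cd', 'efg', 'hijkl']]
```

Like the regex, the sketch needs at least three words for a match and extends each run as far as the lengths allow.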
See Also:
Regex Balancing Group in Depth – An excellent introduction to balancing groups and the basic concepts and tools used in .Net regular expressions.