Pattern Matching Strategy

danielkcchan
NiCd Battery

I am using a regular expression (regex) in the column split function of Paxata. My conjecture is that it returns the shortest match rather than the longest. Let me explain with an example.

Regex: .*(AAABBB|BBB).*

Intention: to match either pattern "AAABBB" or pattern "BBB" anywhere in a string (i.e. in the values of a column).

Outcome: a value containing "AAABBB" is matched as "BBB", and "BBB" is returned in the new split column.

I was hoping that a longer match would take priority over a shorter one. I even tried placing the patterns in descending order of length within the regex, but it did not help.

Is this the expected behavior, or is there a control somewhere that I missed?
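
For reference, the behavior described above can be reproduced outside Paxata. The sketch below uses Python's `re` module purely for illustration, and it assumes Paxata's engine follows the usual backtracking semantics (this is an assumption, not something confirmed for Paxata). The sample value `xxAAABBByy` is hypothetical. In such engines the leading `.*` is greedy: it consumes as much of the string as possible, so the capture group starts as late as it can and only "BBB" is left for it. The order of alternatives inside the group does not change that. Making the leading quantifier lazy (`.*?`) lets the group start at the earliest position instead.

```python
import re

# Hypothetical column value containing the longer pattern.
value = "xxAAABBByy"

# Greedy leading ".*": it backtracks only as far as needed, so the group
# begins at the last position where any alternative can match.
greedy = re.match(r".*(AAABBB|BBB).*", value)

# Lazy leading ".*?": it expands only as far as needed, so the group
# begins at the first position where any alternative can match.
lazy = re.match(r".*?(AAABBB|BBB).*", value)

print(greedy.group(1))  # BBB
print(lazy.group(1))    # AAABBB
```

If the column split function accepts lazy quantifiers, the second form may return "AAABBB" as intended; whether it does would need to be checked in Paxata itself.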
