RegExp

How to find any sequence of 3+ repeating non-alphabetic characters?

5 Replies

Sorry for the confusion, but the characters do not have to be consecutive.
I want to find the rows in which the same non-alphabetic character occurs more than 3 times.
FirstName    Flag
E?E!E+       valid
             valid
E?E?E        valid
E?E?E?       valid
             valid
E?E?E?E?     invalid
Eren????     invalid
Eren!$%^     valid
$$$$Eren     invalid
$Er$$n       valid
$$$Eren      valid
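
For illustration only, the rule behind the table above (the same non-alphabetic character occurring more than 3 times, not necessarily consecutively) can be sketched in Python. This is not the expression language used in this thread; the function name `flag_name` and the use of Python's `re` module are assumptions, and the tool's own regexp function may handle backreferences differently.

```python
import re

# One non-letter is captured, and each ".*\1" requires the same character
# to appear again later in the string, so the pattern needs 4 occurrences
# in total, with anything (or nothing) in between.
REPEATED_NON_ALPHA = re.compile(r"([^A-Za-z]).*\1.*\1.*\1", re.DOTALL)


def flag_name(value: str) -> str:
    """Return "invalid" if one non-alphabetic character occurs more than 3 times."""
    return "invalid" if REPEATED_NON_ALPHA.search(value) else "valid"


# Sample values taken from the table in the post.
for name in ["E?E!E+", "E?E?E?", "E?E?E?E?", "Eren????", "Eren!$%^", "$$$$Eren", "$$$Eren"]:
    print(name, flag_name(name))
```

Note that digits and spaces also count as non-alphabetic under this character class.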
0 Kudos

I would like to find non-alphabetic characters that repeat more than 3 times.

"aaaa" is valid.
"+++" is valid.
"????" is invalid.

Based on your answers, I wrote this, but I still need to modify it:

if(len(regexp(@Test column@, "[^A-Za-z]\\1{3,}","")) <> len(@Test column@),"invalid","valid")

Can you help me modify this?


0 Kudos

@ernzdl This should give you what you are looking for.

Based on the image below, the regular expression identifies values that contain 2 or more consecutive repeated characters. For your use case, you would modify it to identify values that contain 4 or more consecutive repeated characters:

regexp(@value@, "(\\w)\\1{3,}","")

To identify the affected rows, we compare the length of the string before and after the regexp replacement. If the lengths differ (meaning the value contains repeated consecutive characters), we flag the row as "Invalid".
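
As a rough sketch of the before/after length comparison described above, here is the same idea in Python. `re.sub` stands in for the tool's regexp replace function (an assumption), `flag_by_length` is a hypothetical name, and the character class is narrowed to non-alphabetic characters as the original question asks.

```python
import re

# One non-letter followed by 3 or more copies of itself: a run of 4 or more.
RUN_OF_FOUR = re.compile(r"([^A-Za-z])\1{3,}")


def flag_by_length(value: str) -> str:
    """Flag a value as "Invalid" if removing runs of 4+ changes its length."""
    stripped = RUN_OF_FOUR.sub("", value)
    # If any run was removed, the stripped string is shorter than the original.
    return "Invalid" if len(stripped) != len(value) else "Valid"


for value in ["aaaa", "+++", "????", "Eren$$$", "Eren$$$$"]:
    print(value, flag_by_length(value))
```

This only catches consecutive repeats, which is what the length comparison in this reply targets.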

Image: https://us.v-cdn.net/6030933/uploads/editor/8a/luzmza3idiho.png

I hope this helps.
0 Kudos

Hello Eren, 

To find 3 or more consecutive non-alphabetic characters, you can use the following regular expression: if(regexp(@COL_NAME@, ".*[^A-Za-z]{3}.*", "true"),"true","false"). This flags values that contain consecutive non-alphabetic characters as true. Once this is done, you can apply a filter to find the rows that contain consecutive non-alphabetic characters.
0 Kudos

I came up with this:
LEN(STR(@FIRST NAME@)) - LEN(REGEXP(STR(@FIRST NAME@),"[^a-zA-Z]+",""))
But this only counts how many non-alphabetic characters I have in total. I would like to find the ones that repeat more than 3 times. For example, "Eren$$$" is allowed, but "Eren$$$$" is not. I want to find the rows where the same non-alphabetic character repeats more than 3 times.
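
To illustrate the difference between a total count and a per-character count, here is a minimal Python sketch that avoids regular expressions. The names `max_non_alpha_repeat` and `is_allowed` are hypothetical, and the limit of 3 comes from the examples in this post. Occurrences are counted anywhere in the value, not only consecutive ones.

```python
import string
from collections import Counter


def max_non_alpha_repeat(value: str) -> int:
    """Return the highest occurrence count of any single non-alphabetic character."""
    counts = Counter(ch for ch in value if ch not in string.ascii_letters)
    # default=0 covers values that contain letters only (or nothing at all).
    return max(counts.values(), default=0)


def is_allowed(value: str, limit: int = 3) -> bool:
    """A value is allowed if no non-alphabetic character occurs more than `limit` times."""
    return max_non_alpha_repeat(value) <= limit


print(is_allowed("Eren$$$"))   # True
print(is_allowed("Eren$$$$"))  # False
```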
0 Kudos