
Find the uniqueness of concatenated columns

NiCd Battery
Hi everyone!

Let's say I have a dataset with ten columns, and I want to create a key column by concatenating some of them. My goal is to find the combination of columns that gives the most unique key values. Is there a way to check whether any combination is fully unique?

Eren
Blue LED
Hi Eren,

Here's how I would go about this. First, find the three or four columns with the most unique values. Using either our one-click profiling capability or filtergrams, determine which columns have the highest cardinality relative to the size of the dataset, that is, the fewest duplicate values. For example, if your dataset has 10,000 rows, look for the columns whose number of unique values is as close to 10,000 as possible.
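If it helps to see the cardinality check outside the tool, here is a rough sketch in plain Python. It assumes the dataset has been exported as a list of dictionaries; the function name `column_cardinality` and the sample rows are made up for illustration and are not part of the product.

```python
def column_cardinality(rows):
    """Return (column, distinct_count, distinct_ratio) tuples, most distinct first."""
    if not rows:
        return []
    total = len(rows)
    stats = []
    for col in rows[0]:
        # Count the distinct values that appear in this column
        distinct = len({row[col] for row in rows})
        stats.append((col, distinct, distinct / total))
    # Highest distinct count first; ties keep the original column order
    stats.sort(key=lambda s: s[1], reverse=True)
    return stats


# Hypothetical sample data, for illustration only
rows = [
    {"first": "Ann", "last": "Lee", "city": "Oslo", "year": 2020},
    {"first": "Ann", "last": "Kim", "city": "Oslo", "year": 2021},
    {"first": "Bob", "last": "Lee", "city": "Rome", "year": 2020},
    {"first": "Cat", "last": "Kim", "city": "Oslo", "year": 2020},
]

ranking = column_cardinality(rows)
for col, distinct, ratio in ranking:
    print(f"{col}: {distinct} distinct values ({ratio:.0%} of rows)")
```

A ratio close to 100% means the column is nearly unique on its own, so it is a good candidate for the key.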

Once you have determined which columns meet those criteria, create a key column by concatenating the values from the selected columns. Then create a filtergram on this new key column: if no duplicate values appear in it, you have successfully created a unique key.
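The same idea can be sketched in plain Python, again assuming the data is a list of dictionaries. The helper names (`build_key`, `duplicate_keys`, `smallest_unique_combination`) and the sample rows are hypothetical. One assumption worth flagging: the sketch joins values with a separator, because plain concatenation can make different rows collide ("ab" + "c" and "a" + "bc" both give "abc"). Pick a separator that does not occur in your data.

```python
from collections import Counter
from itertools import combinations


def build_key(row, columns, sep="|"):
    """Concatenate the chosen columns of one row into a single key string."""
    return sep.join(str(row[c]) for c in columns)


def duplicate_keys(rows, columns, sep="|"):
    """Return {key: count} for every key that occurs more than once."""
    counts = Counter(build_key(r, columns, sep) for r in rows)
    return {key: n for key, n in counts.items() if n > 1}


def smallest_unique_combination(rows, columns, max_size=4):
    """Try combinations from smallest to largest; return the first fully unique one."""
    for size in range(1, max_size + 1):
        for combo in combinations(columns, size):
            if not duplicate_keys(rows, combo):
                return combo
    return None  # no combination up to max_size is unique


# Hypothetical sample data, for illustration only
rows = [
    {"first": "Ann", "last": "Lee", "city": "Oslo", "year": 2020},
    {"first": "Ann", "last": "Kim", "city": "Oslo", "year": 2021},
    {"first": "Bob", "last": "Lee", "city": "Rome", "year": 2020},
    {"first": "Cat", "last": "Kim", "city": "Oslo", "year": 2020},
]

dupes = duplicate_keys(rows, ["first", "city"])
best = smallest_unique_combination(rows, ["first", "last", "city", "year"])
print("Duplicates for first+city:", dupes)
print("Smallest unique combination:", best)
```

Trying every combination gets expensive quickly as the number of columns grows, which is why narrowing the candidates to the few highest-cardinality columns first is a sensible step.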

Hope this makes sense,

Nenshad

NiCd Battery
Thank you!