Let's say I have a dataset with ten columns, and I want to create a new key column by concatenating some of them. My goal is to pick the combination of columns that gives the most unique values. How can I tell whether any combination produces a fully unique key?
Here's how I would go about doing this. You want to find the 3-4 columns that have the most unique values. Either via our one-click profiling capability or using filtergrams, determine which columns have the highest cardinality relative to the size of the dataset and the fewest duplicate values. For example, if your dataset has 10,000 rows, look for the columns whose number of unique values is as close to 10,000 as possible.
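If you want to sanity-check the same idea outside the tool, here is a minimal sketch in plain Python. It is not how the profiling feature works internally; it only illustrates the "unique values relative to row count" measure. The sample rows, the column names, and the `uniqueness_ratios` helper are all made up for illustration.

```python
# Hypothetical sample data; the column names are illustrative only.
rows = [
    {"first": "Ann", "last": "Lee", "city": "Oslo", "year": 2020},
    {"first": "Ann", "last": "Kim", "city": "Oslo", "year": 2021},
    {"first": "Bob", "last": "Lee", "city": "Rome", "year": 2020},
    {"first": "Cy", "last": "Lee", "city": "Oslo", "year": 2020},
]


def uniqueness_ratios(rows):
    """Return {column: distinct values / row count}, highest ratio first."""
    total = len(rows)
    ratios = {
        col: len({row[col] for row in rows}) / total
        for col in rows[0]
    }
    # A ratio of 1.0 means every row has a different value in that column.
    return dict(sorted(ratios.items(), key=lambda kv: kv[1], reverse=True))


ratios = uniqueness_ratios(rows)
for col, ratio in ratios.items():
    print(f"{col}: {ratio:.2f}")
```

The columns at the top of that list are the best candidates to combine into a key.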
Once you have determined which columns meet those criteria, you can create a key column by concatenating the values from the selected columns. Then create a filtergram on this new key column: if no duplicate values appear in the filtergram, you have successfully created a unique key.
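To answer the "does any combination give uniqueness" part directly, the check can also be scripted. The sketch below is one possible approach, again in plain Python with hypothetical data and helper names (`make_key`, `duplicate_keys`, `smallest_unique_combos`). It assumes a `|` separator between values, which you would need to replace with a character that never occurs in your data; without a separator, "ab" + "c" and "a" + "bc" would produce the same key.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sample data; the column names are illustrative only.
rows = [
    {"first": "Ann", "last": "Lee", "city": "Oslo", "year": 2020},
    {"first": "Ann", "last": "Kim", "city": "Oslo", "year": 2021},
    {"first": "Bob", "last": "Lee", "city": "Rome", "year": 2020},
    {"first": "Cy", "last": "Lee", "city": "Oslo", "year": 2020},
]

SEP = "|"  # assumed delimiter; choose one that does not appear in the data


def make_key(row, cols):
    """Concatenate the chosen column values of one row into a key string."""
    return SEP.join(str(row[c]) for c in cols)


def duplicate_keys(rows, cols):
    """Return {key: count} for every key that occurs more than once."""
    counts = Counter(make_key(row, cols) for row in rows)
    return {key: n for key, n in counts.items() if n > 1}


def smallest_unique_combos(rows, candidates, max_size=4):
    """Try combinations of 1..max_size columns; return the first size that works."""
    for size in range(1, max_size + 1):
        found = [
            combo
            for combo in combinations(candidates, size)
            if not duplicate_keys(rows, combo)
        ]
        if found:
            return found
    return []  # no combination up to max_size is unique


combos = smallest_unique_combos(rows, ["first", "last", "city", "year"])
print(combos)
```

An empty result from `duplicate_keys` corresponds to a filtergram with no duplicate values. Trying every combination gets expensive as the number of columns grows, so it helps to pass in only the high-cardinality candidates you identified earlier.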