How do I do sampling on a specific subset of my dataset?

DC Motor
A customer asked about this in April 2017. She has a 45-million-row dataset and wants to sample it, but only from the rows where Column Name = "Value".

To do this, first create a new dataset that is a subset of the 45-million-row dataset, and then perform the sampling on that subset.

Step 1: Use a Filtergram to select the values that define the subset you want to sample. For example, I want to select all the data where HQ STATE = CA.

Image: https://us.v-cdn.net/6030933/uploads/editor/xa/s4uw2a1132ky.png

Step 2: Create and publish a Lens that stores this view of the dataset so it can be reused.

Image: https://us.v-cdn.net/6030933/uploads/editor/w6/ldy6v9h37e1c.png

Step 3: Once this dataset has been created, bring it into a new Paxata Project and use the Sampling tool.
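
For anyone who wants to see the same filter-first, sample-second logic outside the tool, here is a minimal Python sketch. It is not Paxata code and does not use any Paxata API. The function name sample_subset, the COMPANY column, and the tiny in-memory rows are made up for illustration; only the HQ STATE = CA condition comes from the example above.

```python
import random


def sample_subset(rows, column, value, k, seed=None):
    """Keep only rows where row[column] == value, then draw up to k of them at random."""
    # Filter step: build the subset, analogous to the Filtergram selection.
    subset = [row for row in rows if row.get(column) == value]
    # Sampling step: sample without replacement from the subset only.
    rng = random.Random(seed)
    return rng.sample(subset, min(k, len(subset)))


# Hypothetical stand-in for the large dataset.
rows = [
    {"COMPANY": "A", "HQ STATE": "CA"},
    {"COMPANY": "B", "HQ STATE": "NY"},
    {"COMPANY": "C", "HQ STATE": "CA"},
    {"COMPANY": "D", "HQ STATE": "TX"},
    {"COMPANY": "E", "HQ STATE": "CA"},
]

sampled = sample_subset(rows, "HQ STATE", "CA", k=2, seed=0)
print(sampled)
```

Because the filter runs before the sample, every sampled row satisfies the condition. Sampling the full dataset first and filtering afterwards would give an unpredictable number of matching rows.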
1 Reply
DC Motor
Correct.