Why when using different seeds do we get identical results?

GiorgioC
NiCd Battery

Hi all,

 

In a previous question posted in the community, I asked how to clone a project and start it again with different random seeds. While the code below works, it does not actually start the projects with different random seeds (x, below).

 

Any idea why the seed is not changing, so that the runs generate identical results?

 

Thank you,

Giorgio

 

@christian 

 

 

#Import Libraries
import datarobot as dr
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_style('ticks')
sns.set_context('poster')
#Connect to API
dr.Client(config_path='~/Desktop/drconfig.yaml')

 

#Load project in:

project = dr.Project.get('Your_Project_id')
partitioning = dr.StratifiedCV(holdout_pct = 20, reps = 10)

 

#Clone loaded project and start autopilot with different random seeds:
for x in range(3):
  new_project = project.clone_project(new_project_name='tes_Etanercpet_stress_test_Stratified_random_seed_'+str(x))
  new_project.set_target(target="CDAI_response_status", advanced_options= dr.AdvancedOptions(seed = x), mode='auto', partitioning_method = partitioning, worker_count = -1)


3 Replies
IraWatt
Laser

Hey @GiorgioC,

What particular part of the project were you wanting to see the difference in? I'm not sure whether the project seed is passed to all other processes. For instance, the partitioning of Holdout, Validation and Training also has its own seed parameter: API Reference — DataRobot Python Client 2.28.0 documentation (readthedocs-hosted.com)

GiorgioC
NiCd Battery

Hi @IraWatt ,

 

I am basically trying to change the random seed as you would in the GUI from advanced options. It should create different CV partitions, thus creating different results even if the input is the same.  

christian
Data Scientist

Hi Giorgio

 

What Ira says above is correct: the partitioning also has its own seed parameter. I think the advanced option route is useful if you didn't pass the partitioning explicitly. 

So a solution would be instead of 

 

partitioning = dr.StratifiedCV(holdout_pct = 20, reps = 10)

for x in range(3):
  new_project = project.clone_project(new_project_name='test'+str(x))
  new_project.set_target(target="target", advanced_options= dr.AdvancedOptions(seed = x), mode='auto', partitioning_method = partitioning, worker_count = -1)

 

use something like this with the partitioning inside the loop and a seed passed to the partitioning

 

for x in range(3):
  partitioning = dr.StratifiedCV(holdout_pct = 20, reps = 10, seed = x)
  new_project = project.clone_project(new_project_name='test'+str(x))
  new_project.set_target(target="target", mode='auto', partitioning_method = partitioning, worker_count = -1)
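
For anyone wondering why the partitioning seed matters so much, here is a small standard-library sketch. It is not DataRobot's implementation; `stratified_folds` is a made-up helper that just mimics the idea of a seeded stratified fold assignment. It assumes the partitioning object falls back to a fixed default seed when none is passed, which would match the behaviour seen in the original question.

```python
import random


def stratified_folds(labels, reps, seed):
    """Hypothetical helper: assign each row to one of `reps` folds,
    shuffling within each class so folds stay class-balanced."""
    rng = random.Random(seed)  # all randomness comes from this seed
    folds = [None] * len(labels)

    # Group row indices by class label.
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    # Shuffle each class separately, then deal rows out round-robin.
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        for position, idx in enumerate(indices):
            folds[idx] = position % reps
    return folds


labels = [0] * 50 + [1] * 50  # toy binary target

# Partition seed left at the same value for every "project":
# the fold assignment never changes, whatever else varies.
same = [stratified_folds(labels, reps=5, seed=0) for _ in range(3)]

# Partition seed driven by the loop variable: folds differ per run.
different = [stratified_folds(labels, reps=5, seed=x) for x in range(3)]

print(same[0] == same[1] == same[2])
print(different[0] == different[1])
```

With identical folds and identical input data, the models see exactly the same training and validation rows in every clone, so matching scores are what you would expect. Moving the seed onto the partitioning is what changes the rows each fold contains.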