Hello
I have a question about Target Leakage.
DataRobot detected three features with target leakage in my model.
Two of them were removed from the final feature list, but one was not.
They are also shown with different suffixes on the "Data" display.
Removed: these features are removed and have a suffix like "[Target Leakage]".
Not removed: this feature is not removed and doesn't have a suffix.
What is the difference?
Hi @cookie_yamyam,
Thank you for the question.
DataRobot checks for target leakage during EDA2 by calculating an ACE importance score (Gini Norm metric) for each feature with regard to the target. Features that exceed the moderate-risk threshold (0.85) are flagged; features that exceed the high-risk threshold (0.975) are removed.
I think that if DataRobot did not mark the third feature as 'Target Leakage', its ACE score is probably below 0.85.
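To make the two thresholds concrete, here is a minimal sketch of the rule described above. It is not DataRobot code: the function name `leakage_status`, the constant names, and the example scores are made up for illustration, and treating "exceed" as strictly greater than the threshold is an assumption.

```python
# Thresholds quoted in the answer above. Everything else here
# (names, labels, example scores) is hypothetical, not DataRobot's API.
MODERATE_RISK = 0.85
HIGH_RISK = 0.975


def leakage_status(gini_norm: float) -> str:
    """Map an ACE importance score (Gini Norm) to a leakage outcome.

    Assumes "exceed" means strictly greater than the threshold.
    """
    if gini_norm > HIGH_RISK:
        return "removed"      # high risk: dropped from the feature list
    if gini_norm > MODERATE_RISK:
        return "flagged"      # moderate risk: flagged but kept
    return "not flagged"      # below both thresholds


# Made-up scores, one for each outcome.
for name, score in [("feature_a", 0.99), ("feature_b", 0.90), ("feature_c", 0.60)]:
    print(name, leakage_status(score))
```

Under this reading, a feature that stays in the feature list has a score of 0.975 or lower, and one that is not flagged at all has a score of 0.85 or lower.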
More information on this can be found in the DataRobot documentation on target leakage.
Regards,
Vinay