Skip to content

Conversation

@fverac
Copy link
Collaborator

@fverac fverac commented Jul 14, 2023

Summary of changes:

  • Add enable_missing arg to most CATE estimators.
    _OrthoLearner and subclasses
    Metalearners
    OrthoForest models
  • Note of the estimators that accept missing values, most only accept missing in W.
  • Fix bug where DoWhyWrapper did not work with DMLOrthoForest when discrete_treatment=True

@fverac fverac force-pushed the fverac/enable_missing_values branch from e07c5c2 to ef790e5 Compare July 14, 2023 16:24
Copy link
Collaborator

@kbattocchi kbattocchi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, but please add at least one test exercising this functionality.

@fverac fverac force-pushed the fverac/enable_missing_values branch from 90f6fff to 00dd506 Compare July 18, 2023 20:47
@fverac fverac linked an issue Jul 19, 2023 that may be closed by this pull request
Copy link
Collaborator

@kbattocchi kbattocchi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great.

@esbraun
Copy link

esbraun commented Jul 24, 2023

Greetings! Per the office hours call - may we let the user opt in to nan's for X in the non parametric DML, the doubly robust learner, S learner, T learner and X learner? All of these models potentially allow for a CATE model able to handle nan's.

@fverac fverac force-pushed the fverac/enable_missing_values branch from 9812ef4 to e34e85f Compare August 15, 2023 14:58
@kbattocchi kbattocchi marked this pull request as ready for review August 21, 2023 22:19
@kbattocchi kbattocchi marked this pull request as draft August 21, 2023 22:20
Copy link
Collaborator

@kbattocchi kbattocchi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These updated changes look good to me.

@fverac fverac changed the title enable nans in W add arg to allow missing values in W and sometimes X Aug 24, 2023
@fverac fverac marked this pull request as ready for review September 29, 2023 19:33
Signed-off-by: Fabio Vera <[email protected]>
Signed-off-by: Fabio Vera <[email protected]>
@fverac fverac merged commit 25c3b3b into main Sep 29, 2023
@fverac fverac deleted the fverac/enable_missing_values branch September 29, 2023 21:37
kbattocchi pushed a commit that referenced this pull request Oct 31, 2023
* enable nans in W

Signed-off-by: Fabio Vera <[email protected]>

* linting

Signed-off-by: Fabio Vera <[email protected]>

* add tests for ests that handle missing in W

Signed-off-by: Fabio Vera <[email protected]>

* allow missing in X for some ortholearner subclasses

Signed-off-by: Fabio Vera <[email protected]>

* refactor keyword arg to be bool only, add more tests

Signed-off-by: Fabio Vera <[email protected]>

* linting

Signed-off-by: Fabio Vera <[email protected]>

* enable missing for metalearners and orf, fix dowhywrapped discretetreat dmlorf

Signed-off-by: Fabio Vera <[email protected]>

* update arg name to allow_missing, add docstrings

Signed-off-by: Fabio Vera <[email protected]>

* add warning when missing values detected

Signed-off-by: Fabio Vera <[email protected]>

* dummy commit

Signed-off-by: Fabio Vera <[email protected]>

* dummy commit revert

Signed-off-by: Fabio Vera <[email protected]>

---------

Signed-off-by: Fabio Vera <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Can imputation of X, W before cross-fitting induce bias or overconfidence? Best practices to handle NaN/missing values in W or X features?

4 participants