Issues with - in Variable Names #113

Open
allicamm opened this issue Oct 24, 2020 · 2 comments

Comments

@allicamm

Hi Brandon,

I'm having an issue using pdp for a dataset with dashes in variable names.
When I run this line of code:
partial(model, train = training_final, pred.var = 'marital_status_Married-civ-spouse', plot = TRUE)

It looks like some code in pdp is dropping the quotes around the name, so the variable name gets cut off at the first dash:

Error in eval(expr, envir, enclos) :
object 'marital_status_Married' not found
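
The error message matches what base R does when a hyphenated name is parsed as code rather than kept as a single name. The sketch below uses only base R and a made-up three-row data frame; it illustrates the general mechanism and is not taken from pdp's source, so whether pdp hits exactly this path is an assumption.

```r
# Toy data frame (hypothetical values) with a hyphenated, non-syntactic column name
df <- data.frame(`marital_status_Married-civ-spouse` = c(0, 1, 1),
                 check.names = FALSE)

nm <- "marital_status_Married-civ-spouse"

# Parsing the bare string yields a subtraction call, not a single name
bare <- parse(text = nm)[[1]]
is.call(bare)    # the hyphens were read as the minus operator
all.vars(bare)   # the name has been split into three pieces

# Evaluating the call fails on the first fragment; capture the message
msg <- tryCatch(eval(bare, df), error = function(e) conditionMessage(e))

# Two ways to keep the name intact when evaluating against the data
by_symbol   <- eval(as.name(nm), df)
by_backtick <- eval(parse(text = paste0("`", nm, "`"))[[1]], df)
```

Either `as.name()` or backticks keeps the hyphens as part of the name, which is why quoting matters wherever a column name is turned into an expression.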

Obviously this could be fixed on my end by changing the variable names before creating my model, but I figured this might be an issue others run into as well.

Thanks!
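
For anyone who needs the renaming workaround mentioned above in the meantime, here is a minimal base R sketch. The data frame `training_demo`, its values, and the second dummy column are hypothetical stand-ins for `training_final`; the rename has to happen before the model is fit so that the model and the training data agree on names.

```r
# Hypothetical stand-in for the training data, with hyphenated dummy columns
training_demo <- data.frame(
  `marital_status_Married-civ-spouse` = c(1, 0, 1),
  `marital_status_Never-married`      = c(0, 1, 0),
  age                                 = c(39, 23, 51),
  check.names = FALSE
)

# Replace every hyphen with an underscore
clean <- gsub("-", "_", names(training_demo), fixed = TRUE)

# Guard against two columns collapsing to the same name
stopifnot(!anyDuplicated(clean))

names(training_demo) <- clean
```

`make.names(names(training_demo), unique = TRUE)` is a base R alternative; it replaces invalid characters with dots instead of underscores.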

@bgreenwell
Owner

Thanks @allicamm, I'll try to fix this in the next release!

@bgreenwell
Owner

@allicamm Looks like the issue is in plotPartial() (which relies on lattice graphics and is the default plotting engine whenever plot = TRUE). However, partial() and autoplot() work fine:

library(ggplot2)
library(pdp)
library(xgboost)

trn <- vip::gen_friedman(seed = 101)
X <- data.matrix(subset(trn, select = -y))
y <- trn$y

# Add hyphens to feature names
colnames(X) <- paste0(colnames(X), "-", "test")

# Fit a quick model
fit <- xgboost(X, label = y, nrounds = 50)

# Works
pd <- partial(fit, pred.var = "x1-test", train = X, type = "regression")

# Works
autoplot(pd)
partial(fit, pred.var = "x1-test", train = X, type = "regression", plot = TRUE, plot.engine = "ggplot2")

# Fails
plotPartial(pd)
partial(fit, pred.var = "x1-test", train = X, type = "regression", plot = TRUE)  # plot.engine = "lattice" (this is the default)

Might be tough to fix, but I'll work on it soon. Thanks again for pointing out the issue!
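
Since lattice's plotting functions are driven by formulas, one possible direction for a fix is to backtick-quote column names whenever a formula is built from strings. The sketch below is base R only and uses a hypothetical `pd_demo` data frame shaped like partial dependence output (a predictor column plus `yhat`); it is not plotPartial()'s actual code, and whether plotPartial() builds its formula this way is an assumption.

```r
# Hypothetical partial dependence output: one predictor column plus yhat
pd_demo <- data.frame(`x1-test` = c(0.1, 0.5, 0.9),
                      yhat      = c(10, 14, 19),
                      check.names = FALSE)

pred_var <- "x1-test"

# Pasting the raw name into a formula splits it at the hyphen
f_bare <- as.formula(paste("yhat ~", pred_var))
vars_bare <- all.vars(f_bare)

# Wrapping the name in backticks keeps it as a single variable
f_quoted <- as.formula(paste0("yhat ~ `", pred_var, "`"))
vars_quoted <- all.vars(f_quoted)

# Only the quoted formula refers to columns that exist in the data
ok_bare   <- all(vars_bare %in% names(pd_demo))
ok_quoted <- all(vars_quoted %in% names(pd_demo))
```

Here `ok_bare` is `FALSE` and `ok_quoted` is `TRUE`, which is the difference between the failing and working calls above if the name is indeed pasted into a formula.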
