Thank you, I spent hours searching tutorials for this model and none of them worked for me.
Exceptional video, thank you so much ❤
Good stuff, ty!
Thanks for the awesome video. Your dependent variable ViolentCrimePerPop has a skewed distribution, but you used the default loss function, which I think assumes a normal distribution. I think a negative log-likelihood objective, which is the loss function mainly used for XGBoost classification, could improve your RMSE. I wonder if you could answer my comment.
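One way to act on this comment: for a skewed, non-negative target, the objective can be swapped from the default squared error to a Tweedie or gamma objective, both of which `xgboost` supports. A minimal sketch on toy data (the skewed outcome here is just a stand-in for the video's target, not the real dataset):

```r
library(xgboost)

# Toy data with a right-skewed, non-negative target
set.seed(42)
X <- matrix(rnorm(200), ncol = 2)
y <- rexp(100, rate = 1)  # skewed, positive outcome

dtrain <- xgb.DMatrix(data = X, label = y)

# "reg:squarederror" is the default objective; "reg:tweedie" (or "reg:gamma",
# which requires strictly positive labels) can fit skewed targets better
model <- xgb.train(
  params = list(objective = "reg:tweedie", tweedie_variance_power = 1.5),
  data = dtrain,
  nrounds = 50
)

pred <- predict(model, X)
```

Whether this actually lowers RMSE depends on the data; it is worth comparing both objectives on a held-out set.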
Thank you for this video. It helped me in immeasurable ways. Please, how can I get the R2?
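R² can be computed directly from the model's predictions and the held-out labels; a quick sketch (the `pred` and `y_test` vectors below are hypothetical stand-ins for the video's test-set output):

```r
# Hypothetical held-out labels and model predictions
set.seed(1)
y_test <- rnorm(50)
pred   <- y_test + rnorm(50, sd = 0.3)

# R^2 = 1 - SS_residual / SS_total
r2 <- 1 - sum((y_test - pred)^2) / sum((y_test - mean(y_test))^2)
r2
```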
Good tutorial. Thanks for sharing. I have a question on the feature importance. How can we get Feature importance from XGBoost? Can you add that xgb.importance object for our reference?
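For reference, `xgboost`'s own `xgb.importance()` returns per-feature Gain, Cover, and Frequency from a fitted booster; a minimal sketch on toy data:

```r
library(xgboost)

# Toy regression where f1 drives the outcome
set.seed(7)
X <- matrix(rnorm(300), ncol = 3, dimnames = list(NULL, c("f1", "f2", "f3")))
y <- 2 * X[, 1] + rnorm(100, sd = 0.1)

model <- xgboost(data = X, label = y, nrounds = 25,
                 objective = "reg:squarederror", verbose = 0)

imp <- xgb.importance(model = model)  # data.table: Feature, Gain, Cover, Frequency
print(imp)
# xgb.plot.importance(imp)  # optional bar chart
```

Here `f1` should come out on top by Gain, since it is the only informative feature.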
Hi Spencer, would it be possible for you to do a model with the AdaBoost algorithm for a continuous quantitative variable? Is it necessary to transform a numeric target variable into categories to apply this algorithm? Thanks, your content is wonderful for us!
Hey Spencer! Sorry to be commenting so long after this post haha. This is very impressive; your walkthrough was so much more in depth than other xgboost-for-regression guides I've found. I've got a question for you: I have 14 predictor variables, most are binary, some are continuous, and then there are three categorical. I'm worried about one of those categorical variables because it has 14 levels. Will including that variable (after I've changed all categorical variables to numeric) mess anything up with the model? I was thinking I should just not include it, but it looks like your data has multiple categorical variables that also have more than a few levels. For background, I'm only using the xgboost decision tree for variable selection and an insight into variable importance. I will be plugging the recommended variables into a LR model for interpretability purposes. Let me know what you think! Great content! I'm glad I found your page!
Fantastic video. BTW, how do I calculate the AUC of this model?
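Worth noting that AUC is defined for classification, not for a regression model like the one in the video; it only applies if the outcome is (or is binarized into) two classes. Under that assumption, a sketch with the `pROC` package (assuming it is installed; the scores and labels below are simulated):

```r
library(pROC)

# Simulated binary labels correlated with a model score
set.seed(3)
score <- runif(200)
label <- rbinom(200, 1, prob = score)

roc_obj <- roc(response = label, predictor = score, quiet = TRUE)
auc_value <- as.numeric(auc(roc_obj))
auc_value
```

For the regression model itself, RMSE or R² are the natural metrics instead.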
Hey Spencer, Thanks a lot for the video. Really liked it. I wanted to point out, though, that you encoded the 'county' and 'state' columns as numeric in your data pre-processing stage. This seems like an incorrect way to encode these columns, as XGBoost will see them as ordinal values rather than nominal data. This can result in a brittle model that over-fits easily. Hope this helps. Please keep creating more content; the work is much appreciated!
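One standard alternative to integer-encoding nominal columns is one-hot encoding via base R's `model.matrix()`; a sketch on a hypothetical frame (these are not the video's actual columns, just illustrative names):

```r
# Hypothetical data frame with nominal columns like 'state' and 'county'
df <- data.frame(
  state  = factor(c("NJ", "NY", "NJ", "PA")),
  county = factor(c("A", "B", "C", "A")),
  pop    = c(1.2, 3.4, 2.2, 0.9)
)

# One-hot encode the factors instead of casting them to integers,
# so XGBoost does not treat them as ordered values.
# (~ . - 1 drops the intercept; later factors still use k-1 contrasts.)
X <- model.matrix(~ . - 1, data = df)
colnames(X)
```

The resulting numeric matrix can be passed straight into `xgb.DMatrix()`.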
Hi Spencer Pao: Thank you for your video. I have two questions. 1. How can I know which independent attributes are important in the regression? 2. Why do other people use the following code for GBDT, and why is yours so different from theirs? bst_model <- xgb.train(params = xgb_params, data = train_matrix, nrounds = 1000, watchlist = watchlist, eta = 0.001, max.depth = 6, gamma = 0, subsample = 1, colsample_bytree = 1, missing = NA).
Hi, I tried running your code; however, when I ran xgb_tune I got this error: "Error: Please make sure that the outcome column is a factor or numeric. The class(es) of the column: 'tbl_df', 'tbl', 'data.frame'". What do I do now?
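That error message suggests the outcome was passed as a one-column tibble rather than a plain vector, which happens when a tibble is subset with `[, "col"]`. A sketch of the difference (the column name `y` is hypothetical):

```r
library(tibble)

dat <- tibble(y = c(1.5, 2.0, 3.2), x = c(0.1, 0.4, 0.9))

# dat[, "y"] keeps the tibble class ('tbl_df', 'tbl', 'data.frame'),
# which is exactly what the error message complains about
still_tibble <- dat[, "y"]

# dat[["y"]] (or dplyr::pull(dat, y)) returns a plain numeric vector,
# which tuning helpers accept as an outcome
outcome_vec <- dat[["y"]]

class(outcome_vec)  # "numeric"
```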
Can you also show how to get Gini for the model?
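A sketch of one common way to compute it: the normalized Gini coefficient used for model ranking (for a binary outcome it equals 2·AUC − 1). The function names here are my own, not from the video:

```r
# Normalized Gini: rank observations by predicted score and compare the
# cumulative share of the actual outcome against the diagonal
gini <- function(actual, pred) {
  ord <- order(pred, decreasing = TRUE)
  actual <- actual[ord]
  n <- length(actual)
  cum_share <- cumsum(actual) / sum(actual)
  sum(cum_share) / n - (n + 1) / (2 * n)
}
normalized_gini <- function(actual, pred) gini(actual, pred) / gini(actual, actual)

# Example: scores that perfectly separate the two classes give Gini = 1
set.seed(9)
a <- rbinom(100, 1, 0.3)
p <- a * 0.7 + runif(100) * 0.3
normalized_gini(a, p)
```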
mark
@navronaman