EswarAditya5/Project_Flu_Shot_Learning

Project_Flu_Shot_Learning

About the data:

The data for this competition comes from the National 2009 H1N1 Flu Survey (NHFS).

In their own words:

The National 2009 H1N1 Flu Survey (NHFS) was sponsored by the National Center for Immunization and Respiratory Diseases (NCIRD) and conducted jointly by NCIRD and the National Center for Health Statistics (NCHS), Centers for Disease Control and Prevention (CDC). The NHFS was a list-assisted random-digit-dialing telephone survey of households, designed to monitor influenza immunization coverage in the 2009-10 season.

The target population for the NHFS was all persons 6 months or older living in the United States at the time of the interview. Data from the NHFS were used to produce timely estimates of vaccination coverage rates for both the monovalent pH1N1 and trivalent seasonal influenza vaccines.

The project involves the following key steps:

Data Collection: The dataset is provided by DrivenData: https://www.drivendata.org/competitions/66/flu-shot-learning/page/210/

Data Preprocessing: Cleaning and preprocessing the dataset to handle missing values, outliers, and inconsistencies. This step also involves transforming categorical variables into numerical representations, normalizing numeric features, and splitting the dataset into training and testing subsets.
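The preprocessing step can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the tiny data frame, its column names, and the choice of median/most-frequent imputation are assumptions made for the example, and outlier handling is left out.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Made-up rows standing in for the survey data (illustrative columns only).
df = pd.DataFrame({
    "h1n1_concern": [1.0, 3.0, np.nan, 2.0, 0.0, 3.0, 1.0, 2.0],
    "age_group": ["18 - 34 Years", "65+ Years", "35 - 44 Years", np.nan,
                  "65+ Years", "18 - 34 Years", "35 - 44 Years", "65+ Years"],
    "h1n1_vaccine": [0, 1, 0, 0, 1, 0, 0, 1],
})

X = df.drop(columns="h1n1_vaccine")
y = df["h1n1_vaccine"]

numeric_cols = ["h1n1_concern"]
categorical_cols = ["age_group"]

# Numeric features: fill missing values with the median, then standardize.
numeric_steps = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])

# Categorical features: fill missing values with the mode, then one-hot encode.
categorical_steps = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("encode", OneHotEncoder(handle_unknown="ignore")),
])

preprocess = ColumnTransformer(
    [
        ("num", numeric_steps, numeric_cols),
        ("cat", categorical_steps, categorical_cols),
    ],
    sparse_threshold=0.0,  # always return a dense array
)

# Split first, then fit the transformers on the training rows only.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)
X_train_prepared = preprocess.fit_transform(X_train)
X_test_prepared = preprocess.transform(X_test)
```

Fitting the imputers, scaler, and encoder on the training subset only keeps information from the test rows out of the preprocessing statistics.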

Model Training: A logistic regression model is trained using the relevant libraries. Training adjusts the model's parameters (its coefficients) on the training data so as to minimize the discrepancy between the predicted and actual class labels in the training set.
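A minimal training sketch, assuming scikit-learn's LogisticRegression and synthetic data from make_classification in place of the survey features. The sample sizes and the max_iter value are arbitrary choices for the example, not the project's settings. By default this estimator fits its coefficients by minimizing an L2-regularized log loss.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic binary classification data (a stand-in for the prepared features).
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Fit the model; max_iter is raised so the solver has room to converge.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# The learned parameters: one coefficient per feature plus an intercept.
print("coefficients shape:", model.coef_.shape)
print("intercept shape:", model.intercept_.shape)
print("training accuracy:", model.score(X_train, y_train))
```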

Model Evaluation: The trained logistic regression model is assessed with classification metrics (accuracy, precision, recall, F1-score, the ROC curve, AUC, and the confusion matrix) to gauge how well it discriminates between the classes.
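The metrics listed above could be computed along these lines. This is a sketch on synthetic data, assuming scikit-learn's metrics module; the numbers it prints say nothing about the project's actual results.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
    roc_auc_score,
    roc_curve,
)
from sklearn.model_selection import train_test_split

# Synthetic data and a fitted model, so the example runs on its own.
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Hard labels feed the threshold-based metrics;
# positive-class probabilities feed the ROC curve and AUC.
y_pred = model.predict(X_test)
y_score = model.predict_proba(X_test)[:, 1]

metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred),
    "recall": recall_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
    "roc_auc": roc_auc_score(y_test, y_score),
}
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")

# Rows are actual classes, columns are predicted classes.
cm = confusion_matrix(y_test, y_pred)
print(cm)

# Points of the ROC curve, ready to plot as tpr against fpr.
fpr, tpr, thresholds = roc_curve(y_test, y_score)
```

Accuracy, precision, recall, and F1 depend on the 0.5 decision threshold used by predict, whereas the ROC curve and AUC summarize performance across all thresholds.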

Contributing:

Contributions to this project are welcome. If you would like to contribute, please get in touch using the contact details below.

License:

This project is licensed under the GNU General Public License (GPL).

Acknowledgments

  • The dataset used in this project is sourced from DrivenData.
  • The logistic regression model is implemented using the scikit-learn library.

Contact

If you have any questions or suggestions regarding this project, please feel free to contact me at eadityar@gmail.com
