CSCI 499 Special topics in Advanced Data Science at CSU Chico (Spring 2019)

Advanced Data Science

Course Overview

This class is a structured, collaborative study of advanced topics in Data Science. During the semester, students will apply the data analytics lifecycle to a research topic of their choosing. Students will select appropriate predictive analytical methods for their topics and evaluate their social and ethical implications. Individual work will complement peer collaboration as students explore issues of visualizing and communicating data to each other and to the public.

The class repository is maintained on GitHub under MoreDataScience and hosted at MoreDataScience.github.io/CSCI499-Spring2019/.

Office Hours

OCNL 220, Tuesdays 11am - 3pm, or by appointment

Learning Modules

| Topic | Activities | Due Date | Lead |
| --- | --- | --- | --- |
| Collaboration and Version Control with R projects | Create a GitHub repository for your project portfolio and have it host a public site that includes your code and blog. Review how to maintain version control and collaborate using Git. Join the class Slack channel and use it for all out-of-class communication, including questions and requests for assistance. As the first entry for your blog, identify a /r/dataisbeautiful post that interests you and write a short critique of it. As the first commit to your code, identify a research topic and edit your README.md to provide a brief explanation, including where you expect to find the source(s) of your data. Finally, submit a Pull Request to edit this document with the topic area you choose to lead and a link to your site. | February 8 | Kevin Buffardi |
| Ethics and Data Science in Society | Resources: Weapons of Math Destruction chapters 3, 6; Podcast: "Science Vs - Gentrification: what's really happening". Add at least one blog entry to communicate your thoughts on the ethics and societal impact of data science and how they apply to your topic. | February 15 | Grant Esparza |
| Data Analytics Lifecycle | Resources: R for Data Science chapters 4 and 8. Commit code to your project that organizes "where your data (and analysis) lives" and begin exploring your data by identifying what needs to be cleaned and what questions you might be able to gain insight into with the information available. Follow best practices, as guided by the reading. Write a blog entry that documents what you've done so far, including where you found the data and what you've discovered about the dataset. Make sure you provide enough detail that what you have done can be replicated. | March 15 | Lizz Arriaza |
| Regression models and Classification | Resources: Introduction to Statistical Learning chapters 2-4. Explore how to apply the information to your project and write a blog entry that explains and justifies your decisions, again with enough detail that someone can replicate your work. | March 29 | Eduardo Gomez |
| Resampling and Tree-based methods | Resources: Introduction to Statistical Learning chapters 5, 8. As in the previous module, apply the reading to your project and write a blog entry about it. Continue to analyze your data and document your process while you commit and push your new versions. | April 5 | Jerry Tucay |
| Information Visualization | Resources: Edward Tufte keynote (video); The Shneiderman Information Visualization Mantra (video); Learning Data Visualization (via Lynda) chapter 5: Visual Display. Using the principles you learned from this module's materials, create at least one visualization of your data that provides useful insights. In your blog, post your visualizations and discuss the design decisions you made that help communicate those insights. | April 26 | Eisley Adoremos |
| Peer Review and Replication | Before meeting, make sure your project code and documentation are all committed to your project. During class, we will perform a pull request review so that a peer can verify that they can replicate, review, and critique your results. | May 3 | Kevin Buffardi |

Projects

| Name | Topic repo |
| --- | --- |
| Eisley Adoremos | How Much Better are NBA Players Today Compared to the Past? |
| Eduardo Gomez | Crime in the United States |
| Grant Esparza | Public Perception of Tech Companies Following Security Leaks |
| Lizz Arriaza | World Travel |
| Jerry Tucay | Sales Forecasting |
