Changes for Third Edition #57

Open
13 tasks done
christophergandrud opened this issue May 30, 2015 · 1 comment

christophergandrud commented May 30, 2015

Please list possible changes for the third edition here.

  • Convert book source to RMarkdown

  • Use rio for data importing and exporting.

  • Follow system vs. package naming conventions.

  • repmis no longer supports downloading data from Dropbox due to changes in the Dropbox API. Remove this discussion and the package from the book.

  • Remove ZeligBayesian as it is no longer needed by Zelig. Also remove example in 9.3.5 with the package. Changes in Zelig have broken the example.

  • R Markdown Notebooks

  • Discuss databases

  • Google R style guide link update

  • Update broken Dropbox link (though maybe just use a different data example). See: Dropbox Link for FinRegulator Data #78

  • installr

  • Discuss Jupyter notebooks

  • Use pacman for package installation in Front Matter. Used xfun::pkg_attach2() instead, as xfun is installed with knitr.

  • Use here

christophergandrud commented Sep 16, 2018

The 3rd edition will include a number of updates to:

  • reflect new R capabilities,

  • address URL "link rot",

  • discuss Jupyter notebooks and their use (and abuse).

It will also reflect a number of experiences that I (and others) have had using these tools in the years since the 2nd edition, working and teaching in academia and industry.

I will convert the current Rnw source to bookdown.

Required R Packages

The book itself should practice what it preaches, i.e. be reproducible. This chapter instructs readers on what R packages (and other ancillary software) to install in order to complete the examples and reproduce the book. There have been a number of improvements to the R ecosystem that make reproducing the book easier, and there are more modern packages that replace the functionality of those included in the 2nd edition.

  • Use pacman for package installation [NOTE: actually discussed pacman and pkg_attach2 from xfun.]

  • Use tinytex rather than a full LaTeX installation

  • Remove ZeligBayesian and Zelig as they are no longer needed to demonstrate the capabilities discussed in Chapter 9.

  • Remove repmis. It no longer supports downloading data from Dropbox due to changes in the Dropbox API. Many of its capabilities for handling data input/output are now better handled by rio. Remove this discussion and the package from the book and largely replace with rio.

  • Give examples from installr for Windows dependencies such as Rtools.

  • Follow system vs. package naming conventions.

Chapter 1

  • Include more detailed examples of using reproducible research in industry settings based on my recent experiences. It is particularly important for onboarding new team members, avoiding effort duplication (reintroducing previously tested features), and addressing new data governance concerns (e.g. GDPR).

  • Discuss tinytex in addition to a full LaTeX install.

  • Update recommended books to include Xie et al. (2018) R Markdown: The Definitive Guide, Kross (2018) The Unix Workbench, and Kitzes et al. (2017) The Practice of Reproducible Research.

Chapter 2

Chapter 3

  • Update Figure 3.1 with updated Startup Console

  • Remove discussion of Rtex (this seems to be rarely if ever used in the wild).

  • (new) Appendix section on Jupyter notebooks with IRkernel. Also discuss notebooks generally (e.g. experience in the machine learning industry and "The First Notebook War").

Chapter 4

  • Expand naming files section.

  • Discuss how to use here for more stable file path management across systems.
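
A minimal sketch of the here approach mentioned above, assuming the here package is installed. The folder and file names are hypothetical and only illustrate the pattern:

```r
library(here)

# Build a path relative to the project root rather than the current
# working directory. "data", "raw", and "example.csv" are hypothetical.
data_path <- here("data", "raw", "example.csv")

# The same call works unchanged on Windows, macOS, and Linux, so scripts
# do not need setwd() or hard-coded absolute paths.
print(data_path)
```

here looks for a project root marker (for example an .Rproj file or a .git directory) and falls back to the working directory when it finds none, so the printed path depends on where the code is run.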

Chapter 5

  • Remove discussion of Dropbox Public folder as this is no longer supported

  • Update data download link to: https://www.dropbox.com/s/130c5ol3o2jjmgk/public.fin.msm.model.csv?dl=1

  • Use import from rio, though it doesn't hash data like source_data in repmis does. So, consider keeping the repmis discussion. Also include a discussion of mirroring external data sources to avoid breaking code if a dependency breaks, e.g. due to "link rot".
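
One way the mirroring idea could look in practice is sketched below, assuming rio is installed. The helper name import_mirrored() and the local mirror path are hypothetical, not something the book or rio provides:

```r
library(rio)

# Hypothetical helper: read from a local mirror of an external data set,
# downloading the file only when the mirror does not exist yet.
import_mirrored <- function(url, mirror_path) {
  if (!file.exists(mirror_path)) {
    dir.create(dirname(mirror_path), recursive = TRUE, showWarnings = FALSE)
    download.file(url, destfile = mirror_path, mode = "wb")
  }
  # rio::import() picks the reader based on the file extension
  import(mirror_path)
}

# Example call (not run here because it needs network access);
# the mirror location "data/public.fin.msm.model.csv" is an assumption:
# fin <- import_mirrored(
#   "https://www.dropbox.com/s/130c5ol3o2jjmgk/public.fin.msm.model.csv?dl=1",
#   "data/public.fin.msm.model.csv"
# )
```

Once the mirror exists, later runs never touch the external URL, so the analysis keeps working even if the link rots. Unlike source_data in repmis, this sketch does not verify a file hash.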

Chapter 6

  • Reiterate avoiding links to external data sets; provide a mirror when possible.

  • Update URLs that have fallen to link rot.

  • Update list of Data APIs and Feeds.

Chapter 7

  • Add discussion of str.

  • (new) tibble examples

  • (new) Examples with the easier pivot_longer() and pivot_wider() functions instead of the more awkward gather() and spread() functions for pivoting data frames.
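
A rough illustration of the pivot_longer() and pivot_wider() pair, assuming tidyr 1.0.0 or later. The data frame is made up for the example:

```r
library(tidyr)

# Hypothetical wide data: one row per country, one column per year
wide <- data.frame(
  country = c("A", "B"),
  `2017` = c(1.2, 3.4),
  `2018` = c(1.5, 3.1),
  check.names = FALSE
)

# Wide to long: year columns become a "year" key and a "value" column
long <- pivot_longer(
  wide,
  cols = -country,
  names_to = "year",
  values_to = "value"
)

# Long back to wide: one column per distinct value of "year"
wide_again <- pivot_wider(long, names_from = year, values_from = value)
```

The argument names (names_to, values_to, names_from, values_from) state the direction of the reshape, which is the main readability gain over the key and value arguments of gather() and spread().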

Chapter 9

  • Highlight that a real benefit of knitr for longer documents is typically in producing tables and figures. Execution of data collection/cleaning/analysis often makes more sense with makefiles in these contexts.

  • Remove ZeligBayesian as it is no longer needed by Zelig. Also remove example in 9.3.5 with the package. Changes in Zelig have broken the example. Consider replacing with example using brms.

Chapter 10

  • Update data URLs.

  • Update caterpillar plot for new posterior densities from brms.

Chapter 11

  • knitr engines tikz and d3 (could add the latter as an example in Chapter 13).

(old) Chapter 12

  • Removed chapter as content is better covered by Xie's Bookdown book.

Chapter 12

  • Remove discussion of discontinued Dropbox Public folder hosting.
