lilybradley edited this page Sep 25, 2014 · 3 revisions

Welcome to the questions-answers-faqs wiki!

Q: What is the best way to find continuous NHANES variables?

A: Go to http://wwwn.cdc.gov/nchs/nhanes/search/default.aspx and enter a pertinent search term.

  • If you want only a single 2-year cycle, select it with the “show results for this cycle only” option.
    • For example, try selecting 2005-2006 to keep from getting inundated with all the variables in all cycles.
  • Some creativity may be necessary.
    • For example, entering “A1c” returns the interview questions that mention “A1c” but not the A1c lab values. Entering “hemoglobin” does return LBXGH, because “glycohemoglobin” appears in the LBXGH variable label. Having found LBXGH in 2005-2006, you can then search “LBXGH” across all cycles to discover that A1c values are included in every cycle from 1999 to 2012.
  • This is a very handy tool CDC has provided. After working with NHANES for many years, I only recently learned about this.

Q: When will 2011-2012 dietary data be released?

A: _______

Q: When will 2011-2012 Hepatitis C follow-up questionnaire microdata be released?

  • The 2009-2010 microdata were released in April of 2012 and the 2007-2008 microdata were released in July of 2010.

A: ______

Q: What does this mean:

Combination of these data with other cycles is recommended when doing analysis. These data should not be used with sample weights to make national estimates due to small sample size and a response rate below 50%. (Only 35 out of 107 eligible completed the questionnaire.)

  • Does that mean that in order to make national estimates, you have to pool multiple cycles? If pooled data are still insufficient for national estimates, then what is the analytic purpose of the data?
  • Thank you! NHANES is an incredible resource.

A: ______

Q: In 2001-2002, why are there two race/ethnicity variables, RIDRETH1 and RIDRETH2?

A: RIDRETH1: This race/ethnicity variable is derived by combining responses to questions on race and Hispanic origin. The recode category “5” (Other race – including multiracial) includes the following groups: those with a single racial/ethnic identity other than Mexican American, Other Hispanic, Non-Hispanic White, and Non-Hispanic Black; those who report more than one racial identity (multiracial); and those with missing values on race/ethnicity.

  • RIDRETH2 is the race/ethnicity recode that can be linked to the NHANES III race/ethnicity variable. Those who indicated more than one race (multiracial) and then selected a main race as black (non-Hispanic) or white (non-Hispanic) were recoded into those respective categories. The recode category “4” (Other race – including multiracial) includes all remaining single race responses, those who indicated more than one race but did not select a main race, those who indicated a verbatim response to non-specific multiracial heritage (e.g., multiracial, Mulatto), and those with missing values on race.

Q: Where can I find the A1C blood test in the NHANES data for 2001-2006?

A: Under Laboratory Data, the variable is LBXGH. Attached is a screenshot from the 2005-2006 laboratory variable list.

[Screenshot: 2005-2006 NHANES laboratory variable list]

Q: Where is the Two Hour Glucose Tolerance Test (OGTT) in the Laboratory Tests for the years 2001-2004 and 2001-2012 NHANES data?

A: _____

Q: Is the label change the only "data" change that should have happened with this revision?

  • The 2011-2012 BioPro_G.xpt file downloaded on 3/4/2014 differs from the version downloaded on 1/31/2014.
  • The ONLY difference that SAS Proc Compare found between the 2 datasets was that the label for LBXSCK has been changed from “Creatinine Phosphokinase(CPK) (IU/L)” to “Creatine Phosphokinase(CPK) (IU/L)”.
  • The February 2014 announcement says that BIOPRO_G "data and documentation have been changed". The documentation now contains a section on LBXSCK that was missing before ("9. Creatine phosphokinase") but gives no explanation of what the February 2014 revisions were, as sometimes appears.
  • However, the data seem unchanged other than the one label. There are still 6,549 observations and 38 variables.
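
A comparison like the one described in this question could be set up roughly as follows. This is only a sketch: the file paths and librefs are hypothetical, and the member name BIOPRO_G inside each transport file is an assumption to check against the downloaded copies.

```sas
/* Hypothetical paths to the two downloaded copies of the transport file */
libname old xport '/nhanes/2014-01-31/BIOPRO_G.xpt';
libname new xport '/nhanes/2014-03-04/BIOPRO_G.xpt';

/* Compare values and attributes observation by observation.
   Label differences are reported among the common variables
   with differing attributes. */
proc compare base=old.biopro_g compare=new.biopro_g;
run;

/* Confirm the number of observations and variables in each copy */
proc contents data=old.biopro_g; run;
proc contents data=new.biopro_g; run;
```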

A: ______

Q: To use the mixtran SAS macro, should I rename the dietary recall variables so that they have the same names on each day, and then append day 1 and day 2?

  • In the NHANES 2007-2008 and 2009-2010 cycles, the variables are named differently in day 1 and day 2.
  • I will use the mixtran SAS macro, but my first questions are about setting up the data.

A: Yes, that approach is the correct one. But check the description of each variable to be sure that you are not combining two variables with different meanings.
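
The rename-and-append step might be sketched as below, assuming the macro is given one record per person per recall day. The dataset names (dr1tot, dr2tot, recalls), the new variable names (tkcal, tvara, day), and the choice of energy and vitamin A as example nutrients are all illustrative assumptions, not part of the original question.

```sas
/* Day 1 recall: keep the person ID and the nutrients of interest,
   and give them day-neutral names */
data day1;
  set dr1tot (keep=seqn dr1tkcal dr1tvara);
  day = 1;                                  /* recall-day indicator */
  rename dr1tkcal = tkcal dr1tvara = tvara;
run;

/* Day 2 recall: same variables, renamed to match day 1 */
data day2;
  set dr2tot (keep=seqn dr2tkcal dr2tvara);
  day = 2;
  rename dr2tkcal = tkcal dr2tvara = tvara;
run;

/* Append the two days, giving one record per person per day */
data recalls;
  set day1 day2;
run;

proc sort data=recalls;
  by seqn day;
run;
```

Before renaming, compare the labels of each day 1 and day 2 pair (for example with PROC CONTENTS) to confirm that they describe the same quantity.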

Q: Are there other steps preceding running the macro?

  • Now that I have the data organized, I have entered all the mandatory parameters into the mixtran SAS macro.
  • But I get nothing – no output, no log messages, no datasets created.

Q: Where are the public use 2005-2006 cardiovascular fitness data and documentation?

  • I’ve looked at the tutorials and have found the documentation easy to locate for 2000-2001 and 2003-2004, but haven’t been able to find it for 2005-2006.

A: _____

Q: When I analyze associations between an exposure and metabolic syndrome itself (a derived variable with fasting and non-fasting parameters), am I still required to use the fasting weights?

  • I would like to confirm the appropriate weight to use to look at metabolic syndrome in NHANES.
  • Many of the individual parameters in the definition of metabolic syndrome come from the morning fasting subsample, and analysis of these variables requires the morning fasting subsample weight.
  • I wanted to confirm my assumption before unnecessarily restricting the sample.

A: That is correct. Use the fasting weights for all metabolic syndrome analyses with objectively measured biomarkers in NHANES.

A: This analysis of MetS used fasting weights, which are the weights for the components with the smallest sample size and thus the weights that are recommended. The online supplement contains the sample sizes by year/race/gender that were used.
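
As a rough illustration of using the fasting subsample weight, the sketch below cross-tabulates an exposure against metabolic syndrome for a single 2-year cycle. The dataset name (metsdata) and the variables exposure and mets are hypothetical, and it assumes the fasting weight WTSAF2YR and the design variables SDMVSTRA and SDMVPSU have already been merged onto the analysis file.

```sas
/* Survey-weighted cross-tabulation using the fasting subsample weight */
proc surveyfreq data=metsdata;
  strata sdmvstra;                       /* masked variance stratum */
  cluster sdmvpsu;                       /* masked variance PSU */
  weight wtsaf2yr;                       /* fasting subsample 2-year weight */
  tables exposure*mets / row chisq;      /* row percents, Rao-Scott chi-square */
run;
```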

Q: Where are income and poverty variables for 2011-2012? [2013/12/16]

  • I found several variables for income and poverty in the 2009-2011 demographic and questionnaire files but cannot find any variables for the 2011-2012 cycle.

A: NHANES 2011-2012 has not yet released the poverty income ratio variable (indfmpir), although it was collected.

A: The release of these data has been delayed.

Q: Does NHANES-III have income/poverty variable(s)?

A: NHANES III has a poverty income ratio variable (DMPPIR).

A: There are also several other income variables, such as “HFF18” for total family 12 month income < $20,000, “HFF19R” for total family 12 month income group (ph1), and “HFF20R” for total family income, last month, group (ph1). They are all in the Household Adult File.

Q: Are serum glucose and/or serum vitamin D available variables for 2011-2012 NHANES?

  • If so, how do I find the data to analyze?
  • If I use SPSS, can the data be converted to SPSS format, or do I need to enter all of the data myself?

Q: Is anyone aware of sleep related questions (including hours of sleep) which were asked to NHANES III participants?

A: NHANES III has a few questions about insomnia symptoms (but not about sleep duration). An analysis of these data is in this paper

Q: NHANES accelerometer data file with errors?

A: The pam_perday.sas7bdat file should be okay. The error occurred in the last part of the PAXMSTR.SAS code, which summarizes the data in pam_perday into the person-level data set. For those who have the erroneous code, the error was a missing “by seqn” statement in the final merge of accelerometer data and demographic data.
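
For readers checking their own copy, a merge with the BY statement in place might look like the sketch below. The dataset names (pam_person, demo, pam_demo) are hypothetical and are not taken from PAXMSTR.SAS. Without the BY statement, SAS pairs records by their position in each data set, which can attach demographic data to the wrong participant.

```sas
/* Both inputs must be sorted by the participant ID before merging */
proc sort data=pam_person; by seqn; run;
proc sort data=demo;       by seqn; run;

data pam_demo;
  merge pam_person (in=in_pam) demo;
  by seqn;              /* match records on participant ID */
  if in_pam;            /* keep only participants with accelerometer data */
run;
```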

Q: Has anyone performed this analysis for Vitamin D?

  • To analyze the 2003-2004 and 2005-2006 NHANES cycles combined, I need to control for energy-adjusted vitamin intake.
  • To do so, I regressed the log-transformed vitamin intake on the log-transformed energy intake (both averaged over 2 diet recall days) and output the residuals to include as covariates in models. See the code for vitamin A below.

    /* Regress log vitamin A intake on log energy intake and save the residuals */
    proc reg data=nhanes;
      model logvita = logtkcal;
      output out=nhanes1 residual=vitar;
    run;

  • If I cannot access the vitamin D intake data, how might I obtain the residuals from this same regression for vitamin D?

A: NHANES may not have information on vitamin D intake, though it does measure vitamin D levels in the blood and uses these levels to define adequate vitamin D levels.

  • Note that NHANES is a complex sample survey that requires the use of sample design variables and respondent sampling weights to obtain correct answers. Use of PROC REG does not account for the sample design. Use PROC SURVEYREG instead.

A: 2007-2008 is the earliest survey cycle in which vitamin D was included among the nutrients in the dietary data. http://www.cdc.gov/nchs/nhanes/nhanes2007-2008/DR1IFF_E.htm
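
The PROC SURVEYREG suggestion above could be applied to the vitamin A regression roughly as follows. This is a sketch under several assumptions: the analysis file nhanes already carries SDMVSTRA, SDMVPSU, and the two-day dietary weight WTDR2D; the combined weight name wtdr4yr is made up for illustration; and halving the 2-year weight reflects the usual guidance for pooling two 2-year cycles.

```sas
/* Build a 4-year weight for the pooled 2003-2004 and 2005-2006 cycles */
data nhanes_w;
  set nhanes;
  wtdr4yr = wtdr2d / 2;
run;

/* Same regression as before, now accounting for the survey design */
proc surveyreg data=nhanes_w;
  strata sdmvstra;
  cluster sdmvpsu;
  weight wtdr4yr;
  model logvita = logtkcal;
  output out=nhanes1 residual=vitar;   /* residuals for later models */
run;
```

The weighted coefficients, and so the residuals, will generally differ from the unweighted PROC REG results.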