clear
capture log close
set more off
capture log using NHANES_transform.log, replace

********************************************************************************
** Data Transformation for Data Analysis - Sample Stata Program:              **
** Created for the CSP Summer Seminar in 2005                                 **
** This program, NHANES_transform.do, makes the file                          **
** NHANES_transformed.dta                                                     **
********************************************************************************

use NHANES

** Declare the survey design (pre-Stata 9 svyset syntax).
** In Stata 9 and later the equivalent is:
**   svyset sdmvpsu [pweight=wtmec2yr], strata(sdmvstra)
svyset [pweight=wtmec2yr], strata(sdmvstra) psu(sdmvpsu)

** Data Transformations
** Get data ready to analyse Patient Factors Associated with A1c & hisugar.

** Transform gender variable to a zero/one variable (1 = female)
gen female = (riagendr - 1)
** Error check
tab female riagendr, missing

** Rename and label education variable; code "refused" and "don't know" as missing
gen educ3 = dmd140
recode educ3 7 9 = .
label define educ3l 1 "<12yr" 2 "12yr/GED" 3 ">12yrs"
label values educ3 educ3l
** Error check
tab educ3 dmd140, m

** Rename the age in years variable "ridageyr" to something more user-friendly
gen age = ridageyr

** Race
** Group race into White, African-American, Mexican-American, Other
** (category 5 is folded into "Other")
gen race = ridreth2
recode race 5=4
label define racel 1 "White" 2 "African-American" 3 "Mexican-American" 4 "Other"
label values race racel
** Error check
tab race ridreth2, missing

** Collapse and label income variable
gen income = indhhinc
recode income 77 99 12=. 01 02 03 04 13=1 05 06 07=2 08 09 10=3 11=4
** \$ keeps Stata from reading $20 etc. as a global macro inside the quotes
label define income 1 "<\$20,000" 2 "\$20,000-44,999" 3 "\$45,000-74,999" 4 ">=\$75,000"
label values income income

** Make labels for DM questions
gen haveDM = diq010
recode haveDM 2=0 7 9=.
label define haveDM 1 "Yes" 0 "No" 3 "Borderline"
label values haveDM haveDM

gen whenDM = diq040q
recode whenDM 99999=.

** No general question re high sugar sx. We'll use the question which asks
** if they have had numbness in the past 3 months.
gen numb = diq100
recode numb 7=. 9=. 2=0
** Error check
tab numb diq100, m

** Create a three-level variable for no meds, oral meds ONLY, or insulin
*diq050 are you taking insulin now?
*diq070 are you taking oral meds now?
for var diq050 diq070 : recode X 7 9=.
gen meds3 = .
replace meds3 = 1 if diq050==2 & diq070==2
replace meds3 = 2 if diq070==1 & diq050==2
replace meds3 = 3 if diq050==1
** Labels follow the coding above: 2 = oral meds without insulin, 3 = any insulin
label define meds3 1 "no meds" 2 "oral meds only" 3 "insulin"
label values meds3 meds3
** Error checks
tab diq050 meds3, missing
tab diq070 meds3, missing

** HbA1c
gen a1c = lbxgh

** Make an HbA1c variable with 3 categories
gen a1c_3 = a1c
recode a1c_3 1/8=1 8.1/9.5=2 9.6/30=3
lab var a1c_3 "HbA1c 3 Categories"
lab define a1c3l 1 "<= 8%" 2 ">8%-9.5%" 3 ">9.5%"
label values a1c_3 a1c3l
** Error check
sum a1c if a1c_3==1
sum a1c if a1c_3==2
sum a1c if a1c_3==3

** Make label for overall physical activity variable
gen phys_act = paq180
recode phys_act 7 9=.
label define pa ///
    1 "sits during the day and does not walk about very much" ///
    2 "stands or walks about a lot during the day, but does not have to carry or lift things very often" ///
    3 "lifts light load or has to climb stairs or hills often" ///
    4 "does heavy work or carries heavy loads"
label values phys_act pa

** Make vigorous activity variable
gen vigorous = pad200
recode vigorous 2=0 7 9=.
label define vigorous ///
    1 ">=10 min vigorous activity in past 30 days" ///
    0 "no vigorous activity in past 30 days" ///
    3 "unable to do vigorous activity"
label values vigorous vigorous

** Make moderate activity variable
gen moderate = pad320
recode moderate 2=0 7 9=.
label define moderate ///
    1 ">=10 min moderate activity in past 30 days" ///
    0 "no moderate activity in past 30 days" ///
    3 "unable to do moderate activity"
label values moderate moderate

** Make t.v./computer variable
gen tv_comp = pad575
label var tv_comp "hours a day spent watching tv or on computer--all ages"
recode tv_comp 6=0 77 99=.
label define tv 0 "<1 hour" 5 "over five hours"
** Attach the label just defined to the tv/computer variable
label values tv_comp tv

** Dichotomize immunization variables
** A value of 1 or greater indicates presence of antibody
for var lbxme lbxru lbxvar \ new measles rubella varicella : ///
    gen Y=X \ replace Y=1 if Y>=1 & Y!=. \ replace Y=0 if Y<1
** Error check
for var measles rubella varicella \ var lbxme lbxru lbxvar : bysort X : sum Y

** Duration. Subtract age when told had DM from current age.
gen duration = age - diq040q if diq040q!=99999
** Error check
l age diq040q duration in 1/100 if duration!=.

** Give huq010 a better name
gen healthst = huq010
recode healthst 7=. 9=.
label define healthst 1 "excellent" 2 "very good" 3 "good" 4 "fair" 5 "poor"
label values healthst healthst

** Look at health now vs. one year ago
gen health_change = huq020
recode health_change 7 9=.
label define healthch 1 "Better" 2 "Worse" 3 "About the same"
label values health_change healthch

** Routine place for health care
gen routine_hc = huq030
recode routine_hc 2=0 7 9=.
label define routine ///
    1 "I have place for routine health care" ///
    0 "No place for routine health care" ///
    3 ">1 place for routine health care"
label values routine_hc routine

** Type of place most often visited for health care
gen type_hc = huq040
recode type_hc 7 9=.
label define type ///
    1 "Clinic or health center" ///
    2 "Doctor's office or HMO" ///
    3 "Hospital emergency room" ///
    4 "Hospital outpatient department" ///
    5 "Some other place"
label values type_hc type

** # times received health care over past year
gen times_outpatient = huq050
recode times_outpatient 77 99=.
label define times 0 "None" 1 "1" 2 "2 to 3" 3 "4 to 9" 4 "10 to 12" 5 "13 or more"
label values times_outpatient times

** How long since last health care visit
rename huq060 since_last_visit
recode since_last_visit 7 9=.
label define since 1 "<=6 months" 2 ">6 months, <=1 year" 3 ">1 year, <=3 years" 4 ">3 years" 5 "Never"
label values since_last_visit since

** Overnight hospital patient in last year
rename huq070 inpatient
recode inpatient 7 9=. 2=0

** # times overnight hospital patient in last year
rename hud080 times_inpatient
recode times_inpatient 77 99=.
label define timesinpt 1 "Once" 2 "Twice" 3 "3 Times" 4 "4 Times" 5 "5 Times" 6 ">=6 times"
label values times_inpatient timesinpt

** Seen mental health professional
rename huq090 mental_health
recode mental_health 7 9=. 2=0

** Medical conditions: copy each yes/no item to a readable name,
** set refused/don't know to missing, and recode "no" to 0
for var mcq010 mcq090 mcq160a mcq160b mcq160c mcq160d mcq160e mcq160f ///
        mcq160g mcq160h mcq160i mcq160j mcq160k mcq030 mcq040 mcq050 ///
        mcq100 mcq110 mcq120b mcq170k ///
    \ new asthma chickenpox arthritis chf coronaryheart angina heartattack ///
        stroke emphysema goiter thyroid overweight chronbronch asthma_still ///
        asthma_attack ed_for_asthma hypertension hypertension_meds ///
        ge3ear_infect bronch_still : ///
    gen Y=X \ recode Y 7 9=. 2=0

** Age diagnosed with asthma
rename mcq020 when_asthma
recode when_asthma 99999=.

** # school days missed due to injury/illness
gen schooldays_missed = mcq150q
recode schooldays_missed 77777 99999=.

** # work days missed due to illness/maternity
gen workdays_missed = mcq245b
recode workdays_missed 77777 99999=.

** Age at diagnosis for each condition
for var mcq180* \ new age_arthritis age_heart_failure age_coronary_heart_dz ///
        age_angina age_heart_attack age_stroke age_emphysema age_chron_bronch : ///
    gen Y=X \ recode Y 77777 99999=.

gen bmi = bmxbmi

** ADL
label define ADL 1 "no difficulty" 2 "some difficulty" 3 "much difficulty" 4 "unable to do"
for var pfq060* : recode X 7 9=. \ label values X ADL

** Drop the original variables that have been replaced by the new ones
drop riagendr ridageyr ridreth2-indhhinc lbxgh-huq050 mcq010 ///
    mcq030-demo_mcqmerge demo_bmxmerge demo_bpxmerge

notes: Dataset NHANES_transformed.dta was made using NHANES.dta and NHANES_transform.do
notes: New variables have all been error checked

save NHANES_transformed, replace
capture log close