Introduction
Asthma is a highly prevalent disease among the U.S. population, especially among school-age children. According to Magzamen and Tager, it is the main reason for non-injury hospitalization and missed school days (583). Asthma is considered a spectrum disease because of the multiplicity of its manifestations: both the symptoms and the severity can differ significantly. The problem, however, is that the etiology can differ as well.
The disease can be caused by genetic, allergic, or environmental factors. Moreover, some population groups face a higher risk of asthma morbidity than others. Magzamen and Tager claim that the risk is consistently higher among poor, urban black children (583). Therefore, investigating the factors that predict asthma is essential for future prevention practices.
The project's primary objective is to identify the links between different factors and the risk of asthma morbidity. To that end, the project will address two research questions concerning the predictors of asthma prevalence and of emergency department visits. First, the project will identify which individual and environmental factors best predict asthma prevalence. To answer this question, a set of variables such as gender, race, family economic status, and home location will be investigated.
Second, the project will identify which factors best predict ER visits and whether they are the same factors that predict asthma prevalence. For this purpose, the same variables used in question 1 will be tested again to identify the differences. The derived information is expected to shed light on asthma etiology and the main risk factors for American children.
The current survey study is based on data from a population of American middle school children. Data collection was conducted in Oakland, CA, in 20 middle schools from the Oakland Unified School District (OUSD). A total of 4,017 students completed a short self-reported survey about the presence of an asthma diagnosis or symptoms and their treatment experience. Additionally, they were asked to provide their grade, sex, race/ethnicity, language spoken at home, the language of the survey (English or Spanish), and home address (Magzamen and Tager 584).
Geolocations of the self-reported addresses were mapped to obtain additional variables like NO2 concentration at home location and whether the home is located near a freeway. Other demographic variables were derived from census data to identify physical environmental or demographic predictors associated with asthma morbidity.
Proposed Methods
Because there are numerous independent variables, a machine learning algorithm will be used to select the variables to include in a multiple regression analysis. First, the census data and the survey data need to be merged, using addresses to match data points. Some addresses will likely have no match because of data entry mistakes, and these data points should be omitted. This data-sorting step is especially challenging because the addresses in the census and survey data appear in different orders. Thus, all rows need to be aligned appropriately during the merge.
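The address-matching step can be illustrated with a minimal Python sketch. It is not the project's actual code: the field names (`address`, `asthma`, `median_income`), the normalization rules, and the sample records are all assumptions made for illustration. The sketch indexes census rows by a normalized address, so row order in either file does not matter, and it sets aside survey rows with no census match.

```python
import re

def normalize_address(raw):
    """Lowercase, drop punctuation, and collapse whitespace so that
    trivially different spellings of one address compare equal."""
    cleaned = re.sub(r"[^a-z0-9\s]", "", raw.lower())
    return " ".join(cleaned.split())

def merge_by_address(survey_rows, census_rows, key="address"):
    """Inner-join survey and census records on the normalized address.

    Census rows are indexed by address first, so the order of rows in
    either input is irrelevant. Survey rows without a census match are
    returned separately so they can be inspected before being omitted.
    """
    census_index = {normalize_address(r[key]): r for r in census_rows}
    merged, unmatched = [], []
    for row in survey_rows:
        match = census_index.get(normalize_address(row[key]))
        if match is None:
            unmatched.append(row)
            continue
        # Copy every census field except the join key onto the survey row.
        extra = {k: v for k, v in match.items() if k != key}
        merged.append({**row, **extra})
    return merged, unmatched

# Hypothetical records, for illustration only.
survey = [
    {"address": "12 Oak St.", "asthma": 1},
    {"address": "99 Elm Ave", "asthma": 0},
    {"address": "5 Pine Rd", "asthma": 0},
]
census = [
    {"address": "99 elm ave", "median_income": 41000},
    {"address": "12 OAK ST", "median_income": 38000},
]

merged, unmatched = merge_by_address(survey, census)
```

Returning the unmatched rows instead of silently dropping them makes it possible to check how many records are lost to data entry errors before they are excluded.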
After the data are cleaned and organized, the H2O automatic machine learning package in R will be used to find the predictors most strongly related to asthma prevalence. H2O is chosen for this task because it can tune model parameters automatically to produce the most relevant model. The list of significant variables will then be extracted and analyzed in a multiple regression analysis. The same procedure will be applied separately to identify the best predictors of emergency room visits for children with asthma.
Multiple regression models will be used to determine the importance of each predictor. The next step is to compare the lists of predictors for asthma prevalence and ER visits and to identify whether the two lists differ significantly. For each predictor, the regressions with the two response variables should be analyzed to determine which predictors differ and which are the same for asthma prevalence and ER visits.
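One possible way to compare the two predictor lists is sketched below in Python. The proposal does not name a comparison statistic, so the use of Spearman's rank correlation here is an assumption, and the predictor names and orderings are hypothetical. The sketch reports which predictors appear in only one list and how similarly the shared predictors are ordered.

```python
def rank_positions(ranked):
    """Map each predictor name to its 1-based position in a ranked list."""
    return {name: i + 1 for i, name in enumerate(ranked)}

def compare_rankings(rank_a, rank_b):
    """Compare two importance rankings (most important first, no ties).

    Returns the predictors unique to each list and Spearman's rho
    computed over the predictors that both lists share.
    """
    in_a, in_b = set(rank_a), set(rank_b)
    shared_a = [p for p in rank_a if p in in_b]  # shared, in order of list A
    shared_b = [p for p in rank_b if p in in_a]  # shared, in order of list B
    pos_a, pos_b = rank_positions(shared_a), rank_positions(shared_b)
    n = len(shared_a)
    if n < 2:
        rho = None  # rank correlation is undefined for fewer than two items
    else:
        d_squared = sum((pos_a[p] - pos_b[p]) ** 2 for p in shared_a)
        rho = 1 - 6 * d_squared / (n * (n * n - 1))
    return {
        "only_a": [p for p in rank_a if p not in in_b],
        "only_b": [p for p in rank_b if p not in in_a],
        "spearman_rho": rho,
    }

# Hypothetical rankings, for illustration only.
asthma_rank = ["income", "race", "near_freeway", "no2"]
er_rank = ["near_freeway", "income", "heating_type", "race"]

result = compare_rankings(asthma_rank, er_rank)
```

A rho close to 1 would suggest that the shared predictors matter in a similar order for both responses, while a value near 0 or below would point to different drivers for prevalence and ER visits.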
Because the data set includes some variables that are redundant or have missing values, the machine learning algorithm will be run both with and without these variables to determine how much the results differ. It will then be decided whether they should be included in the final model or omitted. After the H2O AutoML analysis is completed, the most suitable algorithms for the current data will be selected.
These models will then be provided to the researcher. For this purpose, the code for one or several of the most suitable machine learning algorithms from H2O will be handed over. The code should also be accompanied by comments on how to apply the model so that the researcher can reproduce the analysis in the future if needed.
The outcome of the project will include a ranked list of predictors of asthma prevalence as well as of ER visits. The predictors will be ranked according to their importance for both response variables. There will also be conclusive statements comparing the same list of predictor variables across both responses. In addition, the researcher will be provided with the code and methods for using the machine learning algorithms. These can be used in a possible follow-up study to test for changes in asthma prevalence and ER visits if the predictor variables shift.
The results of the study will include the correlations between the response and independent variables. For question 1, the response variable is the presence of asthma [0 (no asthma), 1 (asthma)]. The list of potential predictors will be analyzed to identify their impact on the response variable. Many scholars identify demographic, economic, environmental, and healthcare-related factors as being of particular importance for predicting asthma morbidity (Hansel et al. 797).
Therefore, the variables analyzed for this question will include the following: race, household income, language spoken at home, parents' education level, family composition, parents' employment status, healthcare access, housing construction type, renter- or owner-occupied housing, location near a highway, heating type, and exposure to pollutants. For question 2, the response variable will be ER visits due to wheezing in the last 12 months [1 (ER visit), 2 (No ER visit)]. Because one of the research objectives is to compare these predictors, the variables should be the same as in question 1.
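The two response variables use different codings (0/1 for asthma, 1/2 for ER visits), so the ER variable may need to be recoded before both are modeled in the same way. The Python sketch below shows one way to do this. The 1/2 mapping comes from the coding stated above, while the handling of other values and the sample responses are assumptions for illustration.

```python
# Survey code -> binary indicator (1 = ER visit, 0 = no ER visit).
ER_CODES = {1: 1, 2: 0}

def recode_er_visit(value):
    """Return 1 for an ER visit, 0 for no visit, and None for any
    missing or unexpected survey code."""
    return ER_CODES.get(value)

# Hypothetical raw survey values, including a blank and an invalid code.
responses = [1, 2, 2, None, 1, 9]

recoded = [recode_er_visit(v) for v in responses]
complete = [v for v in recoded if v is not None]
```

With this recoding, a positive coefficient means a higher likelihood of the outcome for both the asthma and the ER visit models, which keeps the two sets of regression results directly comparable.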
The regression model outputs will be used to identify the correlation between each variable and asthma morbidity. This information will help answer the research questions and will provide the grounds for ranking the predictors according to their importance. Relevant interactions are expected between income level, race, and asthma prevalence, as reported in the literature (Magzamen and Tager; Hansel et al.).
Additionally, living conditions are expected to contribute to the severity of the disease, increasing the rate of ER visits. Parents' education and healthcare accessibility are also expected to have a significant impact. The resulting analytical conclusions can be used to predict the epidemiologic situation and to inform early prevention practices.
References
Hansel, Nadia N., et al. Predicting Future Asthma Morbidity in Preschool Inner-City Children. Journal of Asthma, vol. 48, no. 8, 2011, pp. 797-803.
Magzamen, Sheryl, and Ira B. Tager. Factors Related to Undiagnosed Asthma in Urban Adolescents: A Multilevel Approach. Journal of Adolescent Health, vol. 46, no. 6, 2010, pp. 583-591.