Buy N Sell is a business that purchases houses in disrepair at a low price, fixes them, and resells them at a higher price to make a profit. It relies on investors to fund the house purchases and finance the repairs, a structure that gives it greater flexibility. To retain their interest, the business has to consistently produce profits that remain attractive compared to other investment options. Accordingly, the firm considers any project that generates less than $20,000 in profit to be a failure. However, as Trim (2018) notes, so-called “house flipping” is a risky affair, as it relies on correctly identifying attractive homes, purchasing them below market price, and avoiding overcapitalization. The company therefore wants to determine which house characteristics can help it predict whether a project will reach its target profit figure.
The project’s primary objective is to identify the relationship between various aspects of a house and its ability to generate a profit for the business. Its first research question is which factors best predict the overall cost of the repairs needed to make the house suitable for resale. The second is which characteristics of a home best determine its purchase and sale prices. Using this information, the company will be able to quickly and efficiently choose the most suitable locations for its operations and focus on high-profit projects.
The data for the project were drawn from the records of the last three years of Buy N Sell’s operations. They contain general information about 62 houses, such as their purchase and sale prices, repair expenses, and measurements such as size and number of bedrooms. Additionally, the company’s experts have been collecting detailed information about the houses they worked on, including the number of light fixtures and the quality of the flooring and landscaping. In the future, they will continue doing so and feeding the data into the model to improve its accuracy.
The H2O package for R will be used to create the machine learning models that will conduct the analysis. It can build models using a variety of algorithms, which makes it suitable for solving a diverse range of problems. Another advantage of H2O is its automatic preprocessing of data, which saves time and effort that can be applied elsewhere. That said, some preparation will still be required, as not every variable can be used effectively in its original form. First, it will be necessary to determine whether each variable should be classified as continuous or categorical.
The characteristics that are expressed with small integers, namely the numbers of bedrooms, bathrooms, and garages, should be treated as categorical. Factors that are based on a subjective evaluation, notably the quality of the flooring, should also be categorical, as that is how most people already rate them. On the other hand, the approximate cost of the appliances, the number of light fixtures that need to be replaced, and the owners’ offered sale price should be continuous. The size of the house should be continuous, as prices are often determined on a per-square-foot basis. Its age should be separated into distinct categories using data discretization (Miner et al., 2017) to broadly evaluate its attractiveness and the degree of repairs likely needed. Lastly, the target variables, namely the recommended purchase and resale prices as well as the repair costs, should be continuous.
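The variable typing and age discretization described above can be illustrated with a minimal sketch. It is written in plain Python purely for illustration, whereas the project itself plans to use H2O in R. The field names, the age cut points, and the category labels are all hypothetical assumptions; the report does not specify them.

```python
from bisect import bisect_right

# Hypothetical cut points (in years) and labels; the report does not state them.
AGE_EDGES = [10, 30, 60]
AGE_LABELS = ["new", "recent", "older", "historic"]


def discretize_age(age_years):
    """Map a house age in years to one of the assumed age categories."""
    if age_years < 0:
        raise ValueError("age must be non-negative")
    # bisect_right returns how many edges are <= age_years, i.e. the bin index.
    return AGE_LABELS[bisect_right(AGE_EDGES, age_years)]


def prepare_house(record):
    """Return a new record with each variable typed as the text proposes."""
    return {
        # Small integer counts are stored as strings so they act as categories.
        "bedrooms": str(record["bedrooms"]),
        "bathrooms": str(record["bathrooms"]),
        "garages": str(record["garages"]),
        # Subjective rating, already categorical.
        "flooring_quality": record["flooring_quality"],
        # Continuous measurements stay numeric.
        "size_sqft": float(record["size_sqft"]),
        "appliance_cost": float(record["appliance_cost"]),
        # Age is discretized into a category.
        "age_group": discretize_age(record["age_years"]),
    }
```

Sorted cut points with a binary search keep the bin boundaries in one place, so they can be adjusted once the company's experts agree on meaningful age ranges.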
The problem considered in this report is one of estimating a set of numbers given particular inputs. The starting values are fixed, and the program is asked to predict the results using the relationships it finds between the variables rather than to optimize the inputs. Per Boehmke and Greenwell (2020), predicting a continuous outcome from known inputs places the problem in the regression category of supervised learning. A secondary aim is to separate the most significant relationships between the variables and set aside those that exert little to no influence. By examining the variables that matter most, the user can then determine the factors that influence the desired outcomes the most and consider them in future endeavors.
To create the model, the H2O package will be used to develop and train several different models based on the various algorithms available. The models will first be trained independently on the same portion of the data set, which may produce meaningfully different results. Next, cross-validation will be used with the remaining parts of the sample to ensure that the models are valid and produce reasonably accurate results. Following this procedure, two stacked ensembles will be created: one that contains all the models and one that has the best model from each algorithm class. After training these ensembles, H2O will produce a leaderboard of the models that have generated the best results and a list of variables ranked by their ability to predict profit. The project’s authors are particularly interested in the relationship between profit and the number of bedrooms as well as the age, cost, and size of the house.
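To make the cross-validation step more concrete, the sketch below shows one common variant, k-fold splitting, in plain Python. It is an illustration only: H2O assigns folds itself when cross-validation is enabled, and the function name, the choice of five folds, and the seed are assumptions rather than details from the project. The row count of 62 matches the number of houses in the data set.

```python
import random


def k_fold_indices(n_rows, k, seed=0):
    """Yield (train_idx, valid_idx) pairs that split range(n_rows) into k folds."""
    if not 2 <= k <= n_rows:
        raise ValueError("k must be between 2 and n_rows")
    indices = list(range(n_rows))
    # Shuffle with a fixed seed so the split is reproducible.
    random.Random(seed).shuffle(indices)
    # Deal the shuffled rows into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for held_out, valid in enumerate(folds):
        # Every fold except the held-out one is used for training.
        train = [row for m, fold in enumerate(folds) if m != held_out for row in fold]
        yield train, valid


# Example: 62 houses split into 5 folds (an assumed fold count).
for train_rows, valid_rows in k_fold_indices(62, 5):
    print(len(train_rows), len(valid_rows))
```

Each house is held out exactly once, so every model is scored on data it did not see during training, which matters with a sample as small as 62 rows.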
With the factors that determine profit identified, the project can proceed to evaluate their relationship with profit in more specific terms. A generalized linear model will be used to find the degree to which changes in the predictive variables correspond to differences in the profit made. Profit is defined as the sale price minus the purchase and repair costs, and the regression will combine these three quantities into a single target. The analysis results can then be applied to any future dataset, providing an estimate of the profit that can be made. As such, the target requested by the company will have been achieved, and the results will be provided to the customer.
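The profit definition and the idea of a linear fit can be sketched as follows. This is a simplified illustration in plain Python, not the planned H2O model: it fits ordinary least squares with a single predictor, which corresponds to the simplest Gaussian case of a generalized linear model. The function names are hypothetical, and only the $20,000 threshold comes from the report.

```python
TARGET_PROFIT = 20_000  # dollars; projects below this count as failures


def profit(sale_price, purchase_price, repair_cost):
    """Profit is the sale price minus the purchase and repair costs."""
    return sale_price - purchase_price - repair_cost


def meets_target(sale_price, purchase_price, repair_cost):
    """Return True when the project reaches the company's profit target."""
    return profit(sale_price, purchase_price, repair_cost) >= TARGET_PROFIT


def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x for one predictor."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two or more paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("predictor has no variation")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, x):
    """Estimate profit for a new value of the predictor."""
    return intercept + slope * x
```

Here the slope would read as the estimated change in profit per unit change in the predictor, for example per additional square foot. The full model would include several predictors and categorical terms, which this sketch deliberately leaves out.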
The authors will prepare a report for Buy N Sell that details the findings, explaining the structure of the project and the conclusions that were drawn. They will also provide the code so that the company can reproduce the results independently. To help Buy N Sell find its way around the program and be able to run and modify it, the authors will annotate it extensively, explaining each code segment. Detailed instructions on how the model should be used will also accompany it. If necessary, the authors will conduct on-site training and help integrate the model into Buy N Sell’s software framework. Should issues nevertheless arise, the authors will provide support on request for one year following the submission of the completed project to the company.
References
Boehmke, B., & Greenwell, B. M. (2020). Hands-on machine learning with R. CRC Press.
Miner, G., Yale, K., & Nisbet, R. (2017). Handbook of statistical analysis and data mining applications (2nd ed.). Elsevier Science.
Trim, A. (2018). Real estate dangers and how to avoid them: A guide to making smarter decisions as a buyer, seller and landlord. Wiley.