Data mining is a crucial step in forming new knowledge or so-called Knowledge Discovery in Databases. To be more exact, this process is based on extracting useful information from databases. However, to complete data mining, it is necessary to transform the data according to the techniques that are to be used in the process. Overall, it is impossible to efficiently complete the mining step without altering data to extract potentially valuable patterns.
To begin with, transforming coded and text data is essential to completing the data mining procedure. Because this step relies on techniques designed to identify valuable information efficiently, the data must first be transformed to suit those methods (Alasadi & Bhaya, 2017). More precisely, data mining is designed to sort information and discard what is not needed at this stage of forming new knowledge (Alasadi & Bhaya, 2017). The procedure is recognized for its focus on finding values that can be used in other processes of Knowledge Discovery in Databases (Alasadi & Bhaya, 2017). Therefore, data mining can handle numerous data types, provided that the information from databases is properly structured for further analysis (Alasadi & Bhaya, 2017). Overall, many types of coded or text data are suitable for data mining if they are adapted adequately during data transformation.
To conclude, data transformation is necessary because data mining employs numerous techniques for efficiently identifying and structuring valuable information. To apply these methods productively, data should be appropriately adapted before the process itself. Without transforming the data, this step might not bring fruitful outcomes, as the selected information would not be valuable for further knowledge collection.
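As a rough illustration of what adapting coded and text data can look like in practice, the sketch below rescales a numeric column and converts a text column into indicator columns. The function names, the sample values, and the choice of these two particular transformations are assumptions made for the example; they are not taken from Alasadi and Bhaya (2017).

```python
def min_max_scale(values):
    """Rescale numeric values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(labels):
    """Turn text categories into 0/1 indicator columns."""
    categories = sorted(set(labels))
    rows = [[1 if label == c else 0 for c in categories] for label in labels]
    return rows, categories


# Hypothetical sample data: one numeric column and one text column.
ages = [20, 30, 40]
cities = ["Paris", "Lima", "Paris"]

scaled_ages = min_max_scale(ages)
encoded_cities, city_columns = one_hot(cities)

print(scaled_ages)
print(city_columns, encoded_cities)
```

After this step both columns are numeric and on comparable scales, which is the form many mining techniques expect as input.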
Reference
Alasadi, S. A., & Bhaya, W. S. (2017). Review of data preprocessing techniques in data mining. Journal of Engineering and Applied Sciences, 12(16), 4102-4107.
With the advancement of technology, the use of data mining has increased dramatically. Data mining refers to the process of analyzing data to identify previously unrecognized patterns or trends that support decision-making (Bourgeois et al., 2019). Both large and small companies use this technology to gather and assess information about their performance and customers’ satisfaction before making new investments.
Discussion
The shift in internet usage has led to Web 2.0, in which websites and other applications have become user-interactive. Bloggers can now gather information from different websites or companies and post it on social media, while readers can comment and share their views freely. Although Web 2.0 can be used in data mining, it has several disadvantages.
Too much information is posted daily; hence, there is data overload. With Web 2.0, people with different opinions post a lot of content on the internet. This data can confuse readers and may not be reliable. The freedom that Web 2.0 gives users to comment on posted content allows a company's rivals to give negative feedback (Bourgeois et al., 2019). Therefore, the information gathered may not be reliable for making decisions. There could also be an issue of forgery and hacking crimes (Gallaugher, 2015). This happens when an individual impersonates someone else or tries to access data in order to use it maliciously. Such crimes are possible when Web 2.0 is used in data mining. For example, a person may create a Facebook account with company details and post irrelevant information. This is likely to attract negative comments and may ruin the reputation of the affected company.
Conclusion
In conclusion, although Web 2.0 can be a good way of obtaining data through user comments, its disadvantages are significant. For instance, information overload can be confusing and may make the data unreliable. The freedom to express views allows rivals to post negative comments that may complicate decision-making. Lastly, with the use of Web 2.0, forgery and hacking crimes can occur, threatening the privacy of confidential information.
References
Bourgeois, D., Smith, J. L., Wang, S., & Mortati, J. (2019). Information systems for business and beyond. Open Textbooks.
Data mining is a popular method of software analysis used to find patterns in large datasets. Today, various companies use this technology to build marketing strategies, manage credit risk, detect fraud, filter spam, or even gauge users’ sentiment. Many specialists are concerned with the security and privacy problems that data mining involves. Two significant problems are highlighted: data anonymization and the validation of external sources (Bhuiyan et al., 2018). Both issues are noted based on different companies’ experiences.
The first problem concerns keeping secret the identity of the people involved in data mining. The anonymization process contributes to a more reliable systematic analysis because the risk of data misuse is minimized. For example, this technique can be applied in hospitals and for insurance coding and billing (Bhuiyan et al., 2018). Even though there are various methods to prevent information fraud, the problem remains substantial because hackers can apply de-anonymization techniques. The second issue is the validation of external sources, which companies often overlook. This process requires substantial budget allocations and seems insignificant at first glance. However, it is vital to ensure high-quality data protection. Therefore, the second problem is also aggravated by the irresponsibility of stakeholders and company administrators.
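To make the anonymization idea more concrete, here is a minimal sketch that replaces a direct identifier with a salted hash and coarsens an exact age into a band. The record layout, the salt value, and the helper names are hypothetical, and the approach shown is only one simple option; as noted above, de-anonymization techniques exist, so a scheme like this should not be assumed to guarantee anonymity on its own.

```python
import hashlib


def pseudonymize(identifier, salt):
    """Replace a direct identifier with a shortened salted SHA-256 digest."""
    digest = hashlib.sha256((salt + identifier).encode("utf-8")).hexdigest()
    return digest[:12]


def generalize_age(age, width=10):
    """Coarsen an exact age into a band such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


# Hypothetical records; the names and values are invented for illustration.
records = [
    {"name": "Alice Example", "age": 34, "diagnosis": "A"},
    {"name": "Bob Example", "age": 58, "diagnosis": "B"},
]

SALT = "replace-with-a-secret-value"  # assumption: kept secret by the data owner

anonymized = [
    {
        "id": pseudonymize(r["name"], SALT),
        "age_band": generalize_age(r["age"]),
        "diagnosis": r["diagnosis"],
    }
    for r in records
]

for row in anonymized:
    print(row)
```

The same name always maps to the same pseudonym, so records can still be linked for analysis without exposing the name itself.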
Data Mining Myths
One of the major myths regarding data mining is that it can replace domain knowledge. The market is unpredictable, and without a domain expert, a company will have few chances to advance. Data mining tools provide a condensed view of the data that a specialist must still interpret (Raiker, 2019). Another myth is that only huge companies need to implement data mining. This tool can be used efficiently by any company regardless of its status or size. Even a small amount of money can be used efficiently to analyze particular issues (Raiker, 2019). It is always much more convenient to work with specific, converted information than with the whole database.
When people talk about data mining, they usually mean analyzing vast amounts of information and various data. This information helps organizations solve multiple problems and tasks, predict trends, reduce possible risks, and find new opportunities. Data mining involves searching for patterns, relationships, anomalies, and deviations to solve a particular problem. In data mining, helpful information is created or found, which can play an essential role in that search.
Data mining is an exciting and diverse process that includes many components, some of which are even confused with the mining itself. For example, statistics is an essential element of data mining (Lu, 2021). Data mining and machine learning fall into data science, but they have different principles of operation, despite some similarities. Data mining involves several vital steps or stages (Lu, 2021). One can distinguish the search for the necessary information, the preparation of data, the evaluation of information, and the provision of a solution.
The process of data mining provides people with many means of solving problems in the digital age. Information mining has many advantages, among which the following can be distinguished (Lu, 2021). This process helps companies collect reliable information from a massive amount of data. Data mining is more efficient and cost-effective than other data processing applications. It helps businesses make profitable production and operational adjustments and works with both new and legacy systems. Data mining allows companies to make informed decisions and identify credit risks and fraud. In addition, this analysis method allows data processing specialists to analyze massive volumes of data easily and quickly, initiate automated forecasts of behavior and trends, and detect hidden patterns. Thus, it is possible to conclude that data mining is a convenient and effective way of processing information, which has many advantages.
Reference
Lu, Z. (2021). Research on the application of computer data mining technology in the era of big data. In Journal of Physics: Conference Series (Vol. 1744, No. 4, p. 042118). IOP Publishing.
Ethnography refers to the study of specific cultures. The study of cultures is of great importance because it enhances our understanding of them. It is against this background that the study of cultures occupies a central role in society. Anthropology involves the study of human behavior, including the history and cultures of people. However, anthropology takes a general approach to the whole of human culture, behavior, and experience. Ethnography, on the other hand, focuses on a specific aspect of culture. Normally, ethnography involves the selection and study of a specific culture.
This is considered vital since it brings more accuracy and authenticity to the whole field of anthropology. In this case, knowledge that could not have been achieved through anthropology can be achieved by ethnography. Therefore, ethnography is considered a better way of understanding human behavior, specifically culture. Hence, ethnography complements anthropology in many ways. Ethnography works in many ways; the collection of information and the process of study involve several methods and parameters. One common method used in ethnography is data mining. Data mining aids ethnography since it is through it that the necessary information is obtained, analyzed, and evaluated. This paper aims to take a close look at the concept of ethnography. To succeed in this endeavor, the paper will also analyze data mining and its essence. The paper will refer to several articles and sources in the discussion of the whole concept.
Culture and Psychology
Ethnography plays a key role in the study of psychological behaviors. These behaviors are shaped by culture and human experience (Atkinson & Hammersley 2007). It is against this background that, through data mining, ethnography collects and evaluates information to bring out its essence. Psychological behaviors are those features that emanate from the state of mind of students at any given time. The psychological state of an individual at any given time determines the effect of whatever activity is under way. Psychological behaviors might take the shape of personality traits and symptoms of mental disorder. Under normal circumstances, psychological behaviors involve several conditions. Most of these conditions represent a malfunction of the various mental faculties. Examples of psychological behaviors include anxiety, attitude, and motivation (Havemeyer 2007).
Various studies have indicated that anxiety is a major cause of poor performance among students from certain cultural backgrounds. Anxiety contributes substantially to low performance by students. However, this does not mean that only anxiety causes poor performance, since virtually all psychological behaviors tend to harm students' performance. Fears of all kinds and mental and psychological disorders all have a profound effect on students' performance. Yet as far as the learning process is concerned, performance is the most important of all the aspects (Fetterman 2009). The performance of the students is an indicator of the success or failure of the whole program. When a student’s performance is affected, the program loses its significance.
Children who have psychological disorders lack the concentration and focus that are necessary for learning (Keong 2006). Under normal circumstances, the process of learning requires a lot of focus and attention at the same time. There is therefore a very clear relationship between concentration and the learning process. Without adequate concentration and attention, the learning process is rendered ineffective. As a result, the role played by psychological factors such as anxiety is great and cannot be underestimated. Such factors are not limited to the learning process alone (Larose et al 2007). Their impact goes beyond the learning environment. Under normal circumstances, these students have problems in almost all other areas of life. For instance, the social lives of such students are also affected by psychological factors of this kind.
Conclusion
Ethnography plays an important role in the field of anthropology. Ethnography complements the process of anthropology. However, ethnography is more effective, successful, and specific than anthropology. This is because ethnography focuses on a specific subject in its study. Under normal circumstances, ethnography involves the study of cultures. This is done in a way in which a specific culture is selected for the study. In this way, the process is more objective and successful on many counts. Ethnography involves several methodologies and aspects. For the whole process to be successful, several parameters are needed. This is how the process of ethnography becomes successful on many counts. As a result, ethnography involves the process of data mining. Data mining refers to the process in which the data is sought for study. Ethnography cannot operate without data mining. Data mining is the secret of ethnography's success. Through data mining, ethnography establishes the information needed for the analysis and understanding of human culture. This works in a manner where the data involving the selected culture is sought and analyzed thoroughly to be used in drawing conclusions. The paper has fully discussed the concept of ethnography. Since there are several related aspects, the paper has focused on several parameters. This was done to navigate through all the necessary factors of ethnography.
References
Atkinson, P. & Hammersley, M. (2007). Ethnography: principles in practice. Washington: Taylor & Francis.
Fetterman, D. (2009). Ethnography: step-by-step. New York: SAGE.
Havemeyer, L. (2007). Ethnography. Washington: LLC.
Keong, W. (2006). Advances in knowledge discovery and data mining: 10th Pacific-Asia conference. New York: PAKDD.
Larose et al. (2006). Data mining the Web: uncovering patterns in Web content, structure, and usage. Washington: Wiley-Interscience.
In recent times, the relatively new discipline of data mining has been a subject of widely published debate in mainstream forums and academic discourses, not only due to the fact that it forms a critical constituent in the more general process of Knowledge Discovery in Databases (KDD), but also due to the increased realization that this discipline can be applied in a number of areas to enhance decision making processes, efficiency, and competitiveness in contemporary organizations (Kusiak, 2006).
The basic concept behind the emergence of data mining, and which has contributed immensely to its admissibility as one of the increasingly used strategies in business establishments as well as scientific and research undertakings, is that by automatically sifting through large volumes of information which may primarily appear irrelevant, it should be possible for interested parties to extract nuggets of useful knowledge which can then be used to drive their agenda forward (Adams, 2010).
Goth (2010) observes that the emergence of data mining has been primarily informed by the rapid growth in data warehouses as well as the recognition that this heap of operational data can be potentially exploited as an extension of both business and scientific intelligence.
The present paper seeks to critically discuss the discipline of data mining with a view to illuminate knowledge about its origins, concepts, applications, and the legal and ethical issues involved in this particular field.
Definition & History of Data Mining
Although data mining as a concept has been defined differently across diverse media, this report will adopt the simple definition given by Payne & Trumbach (2009), that “…data mining is the set of activities used to find new, hidden or unexpected patterns in data” (pp. 241-242).
The purpose of data mining, as observed by these authors, is to extract information that would not be readily established by searching databases of raw data alone. Through data mining, organizations are now able to combine data from incongruent sources, both internal and external, from across a multiplicity of platforms with a view to assist in a variety of business applications.
At its most elemental, data mining utilizes proven procedures, including modeling techniques, statistical investigation, machine learning, and database technology, among others, to seek patterns and fine relationships in the sifted data, with the main objective of deducing rules and intricate relationships that permit the extrapolation of future outcomes (Payne & Trumbach, 2009; Adams, 2010).
Researchers and practitioners are in agreement that the capability of both generating and collecting data from a wide variety of sources has greatly impacted the growth trajectories of data mining as a discipline.
This capability, according to Adams (2010) and Chen (2006), was precipitated by a number of variables, which can be categorized into the following:
increased computerization of business, scientific, and government transactions with the view to increase efficiency and productivity,
extensive usage of electronic cameras, scanners, publication devices, and internationally recognized bar codes for most business-related products,
advances in data gathering instruments ranging from scanned documents and image platforms to global positioning and remote sensing systems,
the development and popularization of the World Wide Web and the internet as widely accepted global information systems.
This explosive growth in stored or ephemeral data brought us to the information age, which was, and continues to be, characterized by an imperative need to develop new techniques, procedures and automated tools that can astutely assist us in transforming and making sense of the huge quantities of data collected via the above stated protocols (Goth, 2010).
To dig a bit deeper into the history of data mining, research has been able to establish that the term ‘data mining’, which was introduced in the 1990s, has its origins in three interrelated family lines. It is important to note that the convergence of these family lines to develop a unique discipline in the context of data mining certainly gives it its scientific foundation (Adams, 2010).
This notwithstanding, extant research (Adams, 2010; Chen, 2006) demonstrates that the longest of these family lines to be credited with the gradual development of data mining as a fully-fledged discipline is classical statistics.
Researchers are in agreement that it would not have been possible to develop the field of data mining in the absence of statistics as the latter provides the foundation of most technologies on which the former is built, such as “regression analysis, standard distribution, standard deviation, standard variance, discriminant analysis, and confidence intervals” (Goth, 2010, p. 14).
All these concepts, according to this author, are used to study data and data relationships – central aspects in any data mining exercise.
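A few of the statistical building blocks named above (standard deviation, confidence intervals, and regression analysis) can be sketched in a handful of lines. The sample data, the function names, and the use of a normal approximation with the 1.96 multiplier for a 95% interval are assumptions chosen for illustration, not details given by the cited authors.

```python
import math
import statistics


def describe(sample):
    """Return the mean, sample standard deviation, and an approximate 95% CI."""
    mean = statistics.mean(sample)
    stdev = statistics.stdev(sample)
    # Assumption: normal approximation, so the half-width is 1.96 standard errors.
    half_width = 1.96 * stdev / math.sqrt(len(sample))
    return mean, stdev, (mean - half_width, mean + half_width)


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical data that follows y = 2x + 1 exactly.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]

mean_y, stdev_y, interval_y = describe(ys)
slope, intercept = fit_line(xs, ys)

print(mean_y, stdev_y, interval_y)
print(slope, intercept)
```

The fitted line describes a relationship between two variables, which is the kind of data relationship a mining exercise then tries to discover at scale.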
The second longest family line that has contributed immensely to the emergence of data mining as a fully-fledged field is artificial intelligence, or simply AI. Extant research demonstrates that the AI discipline, which is built upon heuristics as opposed to statistics, endeavors to apply human-thought-like processing to statistical challenges while using computer processing power as the appropriate medium (Talia & Trunfio, 2010).
It is important to mention that since this approach was tied to the availability of computers and supercomputers to undertake the heuristics, it was not practical until the early 1980s, when computers started trickling into the market at reasonable prices (Goth, 2010).
The third family line to have influenced the field of data mining is what is generally known as machine learning or, better still, the amalgamation of statistics and AI (Adams, 2010). Here, it is of importance to note that while AI could not have been viewed as a commercial success during the formative years, its techniques and strategies were largely co-opted by machine learning.
It is also important to note that machine learning, while able to take full advantage of the ever-improving price/performance ratios provided by computers in the 1980s and 1990s, found usage in more applications because the entry price was lower than that of AI, not to mention that it was largely considered an evolved facet of AI, as it was effectively able to blend AI heuristics with complex statistical analysis (Chen, 2006).
Review of how Data Mining is used Today and how it could be used in the Future
Presently, there exists broad consensus that data mining is mostly based on machine learning techniques; that is, it is fundamentally perceived as the adaptation of machine learning techniques and concepts to a wide variety of areas, such as business and scientific applications (Adams, 2010).
Therefore, the present-day data mining can only be described as the amalgamation of historical and recent developments, particularly in statistics, artificial intelligence, and machine learning, with a view to developing a software program that can run on a standard computer to, among other things, make diverse decisions based on the data under study, use statistical concepts and applications to establish various relationships among the data, and also use more advanced artificial intelligence heuristics and algorithms to achieve its major goal (Talia & Trunfio, 2010).
Extant research demonstrates that the major objective of current data mining applications is to sift through huge volumes of data to extract nuggets of useful data, which can then be used to establish previously hidden trends or patterns.
Today, more than ever before, data mining is used in the business arena to boost corporate profits by improving customer relations and targeting new customers (Cary et al, 2003).
According to these authors, “…AT&T Wireless was able to increase it’s subscriber base by 20% in less than a year when it contracted with a data-mining company to identify customers that would likely to be interested in AT&T’s new flat-fee wireless service” (p. 158).
The AT&T story demonstrates that visions of achieving good returns continue to drive businesses toward embracing data mining technology.
Data mining is bound to be used along the same lines in the future to enable enterprises to make critical decisions from a knowledge-oriented perspective. Successive studies have demonstrated that most business organizations fail to wade through the harsh economic waters of modern times due to their perceived inability to base their most important decisions on knowledge and evidence (Adams, 2010; Goth, 2010).
However, it is now evident that data mining can be used to bring organizations closer to a knowledge-based economy, which basically translates into the use of knowledge to generate economic benefits.
Chen (2006) observes that a knowledge-based economy necessitates data mining processes to become more goal-oriented with the view to generating an enabling environment where more tangible results can be achieved.
Consequently, data mining should be used in the future not only to facilitate the uncovering of concealed knowledge beneath the ocean of data readily found in a multiplicity of mediums and applications, but also to ensure that it makes important contributions to the knowledge-based economy with the express intention of coming up with more tangible business and scientific outcomes (Chen, 2006; Adams, 2010).
Types of Data Mining Applications
There exists a multiplicity of data mining applications which can be used in diverse situations and environments depending on the major objective for usage. Some data mining applications, according to Chen (2006), are simple to use and may be offered for free, while others are complex and require a sizeable investment to operationalize.
This section will discuss some data mining applications based on the sector of practice, and will mainly focus on the banking and finance, retail, and healthcare sectors of the economy.
Data Mining Applications in the Banking & Finance Sector
Most banking institutions have over the years employed a multiplicity of data mining applications to model and predict credit fraud, to assess borrower risk, to undertake trend analysis, and to evaluate profitability, as well as to assist in the initiation and management of direct marketing activities (Seifert, 2004).
In equal measure, most finance and credit companies have over the years employed a variety of neural networks and other data mining applications “…in stock-price forecasting, in option trading, in bond rating, in portfolio management, in commodity price prediction, in mergers and acquisitions, as well as in forecasting financial disasters” (p. 191).
Here, it can be noted that the Neural Applications Corporation has developed an effective application known as NETPROPHET, which is increasingly being used by finance companies to make stock predictions by illustrating the real and predicted stock values depending on the type of data that has been keyed into the system (Groth, 1999; Chen, 2006).
The banking sector is continuously faced with fraud cases, and data mining applications such as HNC Falcon have assisted institutions in monitoring payment-card applications, decreasing fraud cases by almost 75 percent while increasing applications for payment card accounts by as much as 50 percent on a yearly basis (Groth, 1999).
The importance of banks developing data mining applications that could be used in cross-selling and the maintenance of customer loyalty has been well documented in the literature.
These applications, according to Groth (1999), mainly assist banking institutions to model the behavior of their customers in such a manner that the resulting relationships could be used to establish the needs and demands of their customers, as well as make objective predictions into the future.
RightPoint software, Security First, and BroadVision are some of the vendors primarily interested in integrating predictive technologies with consumer interaction points to ensure customer needs and demands are efficiently dealt with (Groth, 1999; Chen, 2006), in addition to using predictive technologies to integrate one-to-one marketing strategies into their clients’ banking sites (Adams, 2010).
According to Groth (1999), “…the RightPoint Real-Time Marketing Suite takes data-mining models and leverages them within real-time interactions with customers” (p. 194). This application is unique in that it is designed to develop, manage and deliver one-to-one marketing initiatives for high-end industries that heavily depend on direct customer interaction to undertake business (Goth, 2010).
As a general observation, it is important to note that the majority of the data mining applications used in the banking and finance sector attempt to ensure that each customer interaction seizes the prospect of enhancing customer satisfaction, loyalty, motivation, and profit-generation potential (Talia & Trunfio, 2010; Zhang & Segall, 2008).
Data Mining Applications in Retail
Intense competition and slim profit margins have obliged retailers to embrace data warehousing strategies earlier than other sectors. As observed by Groth (1999), “…retailers have seen improved decision-support processes lead directly to improved efficiency in inventory management and financial forecasting” (p. 198).
It is a well known fact that expansive retail and supermarket chains are in possession of huge quantities of point-of-sale data that is not only information-rich, but could be employed using appropriate data mining applications to improve the stated decision-support strategies, improve efficiency in financial predictions and inventory management, and analyze customer shopping patterns (Seifert, 2004).
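One simple way to analyze customer shopping patterns in point-of-sale data is to count how often pairs of products appear in the same transaction. The sketch below computes that share (often called support) for every product pair; the baskets, product names, and function name are invented for illustration, and the sources cited here do not prescribe this particular technique.

```python
from collections import Counter
from itertools import combinations


def pair_support(baskets):
    """Return the share of baskets that contain each pair of products."""
    counts = Counter()
    for basket in baskets:
        # Sort and de-duplicate so each pair is counted once per basket.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    total = len(baskets)
    return {pair: n / total for pair, n in counts.items()}


# Hypothetical point-of-sale transactions.
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "milk"],
]

support = pair_support(baskets)

# List the pairs from most to least frequently bought together.
for pair, share in sorted(support.items(), key=lambda item: -item[1]):
    print(pair, share)
```

Pairs with high support are candidates for decisions such as shelf placement or joint promotions, although a real analysis would also need to account for how popular each product is on its own.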
In the retail sector, the AREAS Property Valuation product from HNC Software, as well as SABRE Decision Technologies, serve as good examples of how data mining applications can be used to perform valuations, projection and forecasting, customer purchasing behavior analysis, and customer retention analysis, with the underlying purpose of increasing profitability, enhancing customer experience, and making better and more informed business decisions (Zhang & Segall, 2008).
In evaluating customer profitability in the retail sector, a software vendor called Dovetail Solutions has developed a data mining application known as Value, Activity, and Loyalty™ (VAL™), with a view to utilizing transactional business data from retailers to synthesize information about customer activity and processes, churn rate, and anticipated future purchases (Groth, 1999; Chen, 2006).
Data Mining Applications in the Medical Field
The vast amount of data available within the healthcare industry, including the associated data collected via medical research, biotechs, and the pharmaceutical industry, has provided fertile ground for data mining applications to grow. It is public knowledge that data mining has been employed expansively in the medical industry.
For example, we are aware that the vendor NeuroMedical Systems ingeniously employed neural networks to create a pap smear diagnostic aid, while both Vysis Rochester Cancer Center and the Oxford Transplant Center continue to employ a data mining application known as KnowledgeSEEKER, which utilizes a decision tree technology, to assist in various research undertakings (Groth, 1999; Adams, 2010; Chen, 2006).
It is important to note that these applications are beneficial in the medical sector as they enable health practitioners to come up with accurate diagnoses even without subjecting patients to physical examination (Koh & Tan, 2008).
Governments and other interested health agencies can utilize data mining applications, such as MapInfo, KnowledgeSEEKER, and LEADERS, among others, to: demonstrate average costs of health services; show efficiency of a particular prescription over time; reveal efficacy rates of diverse pathogens over time; develop superior diagnosis and treatment protocols; show patient location in order to deliver superior health services; or assist healthcare insurers to detect fraud (Koh & Tan, 2008; Wen-Chung et al, 2010).
Legal & Ethical Issues in Data Mining
As is the case in other disciplines, the field of data mining is faced with a complex set of legal and ethical issues which need to be addressed for the applications to succeed.
In the legal arena, it is important to evaluate how organizations should employ data mining applications while remaining focused on protecting the private information of their customers so as to avoid customer dissatisfaction or even being subjected to legal action by customers who may feel that the organizations intruded into their privacy (Cary et al, 2003; Wen-Chung et al, 2010).
In terms of ethical issues, it is a well-known fact that the spread of personal information, as is the case in many data mining applications, can lead to elevated risks of customer identity theft (Cary et al, 2003).
According to Payne & Trumbach (2009), data mining processes bring to the fore a scenario where “…the consumer loses aspects of privacy as all of their basic demographic information, personal interests, correspondence and activities are stored in databases and available to be combined together” (p. 243).
Such a scenario has obvious ethical ramifications since this information can be used to the disadvantage of the customers.
Another ethical concern arises from the fact that consumers lose control over what happens to their personal information held in large databases, implying that such information can be used to the disadvantage of the providers if it falls into the wrong hands (Payne & Trumbach, 2009; Cary et al., 2003; McGraw, 2010).
What’s more, customers who provide personal information to organizations face a more ominous challenge that may entail potential discrimination based on the personal information they either provide or refuse to provide to the organizations.
Another important factor to consider when evaluating ethical concerns in data mining is that there is no clear line distinguishing whether it is indeed necessary for an organization to use the private information of its customers to enhance its profitability or whether such information should be used solely to improve customer satisfaction and maintain consumer trust (Payne & Trumbach, 2009).
Lastly, it is well known that a number of data mining processes may yield incorrect conclusions, which may be costly to the organization as well as to the customers (McGraw, 2010).
Conclusion
This discussion has brought to the fore important aspects of data mining, its current and future uses, as well as perceived limitations in terms of legal and ethical constraints.
The general consensus among academics and practitioners is that data mining represents the new frontier of growth, particularly in nurturing mutually fulfilling customer relationships, ensuring that customers’ needs and demands are satisfactorily met, and helping organizations to forecast and predict future growth and decision patterns (Adams, 2010; Kusiak, 2006; Wen-Chung et al., 2010).
The task is therefore for developers to continue investing heavily in effective and efficient data mining applications to ensure that such tools are able to achieve what they were originally intended to achieve. Consequently, research and development into these applications and tools is of primary importance.
Reference List
Adams, N.M. (2010). Perspectives on data mining. International Journal of Market Research, 52(1), 11-19. Retrieved from Business Source Premier Database
Cary, C., Wen, H.J., & Mahatanankoon, P. (2003). Data mining: Consumer privacy, ethical policy, and systems development practices. Human Systems Management, 22(4), 157-168. Retrieved from Business Source Premier Database
Chen, Z. (2006). From data mining to behavior mining. International Journal of Information Technology & Decision Making, 5(4), 703-711. Retrieved from Business Source Premier Database
Goth, G. (2010). Turning data into knowledge. Communications of the ACM, 53(11), 13-15. Retrieved from Business Source Premier Database
Groth, R. (1999). Data mining: Building competitive advantage. Upper Saddle River, NJ: Prentice Hall
Koh, H.C., & Tan, G. (2008). Data mining applications in healthcare. Journal of Healthcare Information Management, 19(2), 64-72
Kusiak, A. (2006). Data mining: Manufacturing and service applications. International Journal of Production Research, 44(18/19), 4175-4191. Retrieved from Business Source Premier Database
McGraw, D. (2010). Data identifiability and privacy. American Journal of Bioethics, 10(9), 30-31. Retrieved from Academic Search Premier Database
Payne, D., & Trumbach, C.C. (2009). Data mining: Proprietary rights, people and proposals. Business Ethics: A European Review, 18(3), 241-252. Retrieved from Business Source Premier Database
Talia, D., & Trunfio, P. (2010). How distributed data mining tasks can thrive as knowledge services. Communications of the ACM, 53(7), 132-137. Retrieved from Business Source Premier Database
Wen-Chung, S., Chao-Tung, Y., & Shian-Shyong, T. (2010). Performance-based data distribution for data mining applications on grid computing environments. Journal of Supercomputing, 52(2), 171-198. Retrieved from Academic Search Premier Database
Zhang, Q., & Segall, R.S. (2008). Web mining: A survey of current research, techniques, and software. International Journal of Information Technology & Decision Making, 7(4), 683-720. Retrieved from Business Source Premier Database
The operation of organisations requires an immense wealth of information, which makes data warehouses and data mining valuable in modern business environments. Fundamentally, an enterprise data warehouse, also termed a data warehouse, refers to a database that is deployed for data analysis and reporting (Inmon 5).
Data warehouses are meant to provide a store for historical and recent data that is used to produce and distribute the information needed when preparing senior management plans, such as the assembly of periodic and yearly reports for comparison purposes.
Data contained in the warehouses is normally uploaded from various operational systems of an organisation, including sales and marketing, among others. In contrast to a data warehouse, data mining refers to “the computational process of discovering patterns in large data sets involving the methods of intersection of artificial intelligence, machine learning, statistics, and data systems” (Haughton et al. 290).
The main aim of putting in place a data mining system in an organisation is to provide a means of retrieving reports from data stores, with the overall objective of converting them into forms that can be arranged appropriately for further use.
Considering that both data warehousing and data mining can make a great deal of data available to an organisation, which upon analysis can be used to make vital decisions or to take corrective measures, it seems imperative for organisations in all nations to place central emphasis on data warehousing and data mining.
This paper conducts a comparative analysis of the extent to which data warehousing and data mining have been deployed in the UAE and Europe. Before that comparison, a brief history of data mining and data warehousing is considered.
Research Project Questions
Data warehousing and data mining are highly effective in providing business support solutions. Following from this assertion, the two research questions addressed by this paper are:
In which industry are data warehousing and data mining most beneficial? As revealed later, data mining and data warehousing have been applied in science and engineering, surveillance, medical fields, and in the business field in both the UAE and the international arena. However, they are most beneficial in the business field.
Is the application of data mining and data warehousing the preserve of some organisations, and to what extent are they utilised within the UAE and other parts of the world? In response, the paper argues that data mining and data warehousing are used in all kinds of organisations, with UAE organisations using them the most.
History of data warehouse and data mining
Data warehouse
The concept of data warehousing was first introduced at a practical level in the 1980s. During this time, IBM researcher Barry Devlin and his colleague Paul Murphy developed the concept with the intention of providing a model for the flow of various forms of organisational data from operational systems into decision support environments.
According to Inmon, “the concept attempted to address the various problems associated with this flow, mainly the high cost associated with it” (13).
The development of the concept of data warehousing is attributed to the fact that, without a data warehouse, large amounts of redundancy were needed to support a myriad of decision support environments.
As information systems within large organisations become complex, and as a single organisation divides into several parts to enhance service delivery and specialisation, it becomes crucial for each department within an organisation to have its own information database and buffer systems.
Consequently, “…in large corporations, it was typical for multiple decision support environments to operate independently” (Zhu and Davidson 67). For instance, in a business set up, an organisation can implement a data warehouse plan that monitors clients’ purchases, as shown below.
[Figure: a data warehouse plan that monitors clients’ purchases. Source: Browning and Mundy]
However, even though different departments held different information, each department’s operations depended on the information held by other departments. Data stored by various departments within an organisation is also similar in one way or another.
For this reason, Barry Devlin and his colleague Paul Murphy found it necessary to develop a “process of gathering, cleaning and integrating data from various sources, usually from long-term existing operational systems, which could be replicated for each environment” (Zhu and Davidson 71).
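The gathering, cleaning, and integrating steps named in the quotation above can be illustrated with a minimal Python sketch. The two source systems, their field names, and all records are hypothetical and are not taken from any of the cited works; the sketch only shows the general shape of such a process.

```python
# Hypothetical extracts from two operational systems (illustrative data only).
sales_system = [
    {"customer": " Alice ", "amount": "120.50"},
    {"customer": "bob", "amount": "80"},
]
billing_system = [
    {"client_name": "ALICE", "total": 30.0},
    {"client_name": "Carol", "total": None},  # incomplete record
]


def clean(name, amount):
    """Normalise a customer name and amount; return None if the record is unusable."""
    if name is None or amount is None:
        return None
    return name.strip().title(), float(amount)


def integrate(*sources):
    """Merge cleaned (name, amount) records from every source into one store."""
    store = {}
    for records in sources:
        for name, amount in records:
            cleaned = clean(name, amount)
            if cleaned is None:
                continue  # drop records that fail cleaning
            key, value = cleaned
            store[key] = store.get(key, 0.0) + value
    return store


# Gather: map each system's own field names onto a common (name, amount) shape.
gathered_sales = [(r["customer"], r["amount"]) for r in sales_system]
gathered_billing = [(r["client_name"], r["total"]) for r in billing_system]

warehouse = integrate(gathered_sales, gathered_billing)
print(warehouse)
```

In this sketch the differently spelled entries for the same customer end up under one key, and the incomplete record is discarded, which is the kind of consistency that separate departmental systems would not provide on their own.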
Although the 1980s marked the period in which the concept of data warehousing was developed into a form that could be utilised within an organisation, the initial formulations of the concept can be traced as early as the 1960s. The pyramid below shows the beneficiaries of data warehousing and the extent to which they use the technology.
In the 1960s, General Mills came up with the terms facts and dimensions as they are applied in data warehousing. A decade later, IRI together with ACNielsen spearheaded the invention of data marts utilised in retail sales.
In the same decade, Bill Inmon attempted to define the term data warehouse. In 1975, Sperry Univac introduced the concept of MAPPER. It refers to “data management and reporting systems that include the world’s first 4GL, which formed the first platform specifically designed for building information centres (a forerunner of contemporary enterprise data warehousing platform)” (O’Brien and Marakas 93).
This work paved the way for Teradata, which in 1983 designed and tested a database management system meant to enhance decision support. The development of data warehouses for decision support systems acquired principal focus in the 1990s, when Ralph Kimball introduced the Red Brick Warehouse.
This database management system was designed specifically for data warehousing. Bill Inmon developed this work further when he designed software for data warehouse development in 1991. Later, in the year 2000, Daniel Linstedt released Data Vault.
Data Mining
Data mining has established central roles in knowledge engineering and artificial intelligence discourses. It refers to “the process of discovering meaningful correlations, patterns, and trends by sifting through large amounts of data stored in databases” (Haughton et al. 290) following the procedure below.
Its roots lie in statistics and machine learning, although it is applied mainly in computer science. The subject of data mining has attracted immense interest over the last two decades, although its study started four decades ago. Data mining started as a statistical analysis tool advocated by two main companies: SPSS and SAS (O’Brien and Marakas 31).
Nevertheless, some of the techniques applied in data mining over the past four decades remain relevant today. They include regression analysis and cluster analysis, among others. Modern statistical routines used in data mining have greatly improved on these earlier applications.
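As an illustration of the regression analysis mentioned above, the following minimal Python sketch fits a straight line to a handful of points by ordinary least squares. The variable names and the figures for advertising spend and sales are invented for the example; statistical packages such as those from SPSS and SAS perform far more elaborate versions of the same calculation.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both left unnormalised).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: advertising spend and resulting sales.
ad_spend = [1.0, 2.0, 3.0, 4.0]
sales = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(ad_spend, sales)
print(slope, intercept)
```

Because the invented points lie exactly on a line, the fitted slope and intercept reproduce that line; with real business data the fit would only approximate the trend.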
New approaches to data mining such as heuristics, fuzzy logic, and neural networks began to gain substantive scholarly attention in the 1980s. With the improved computational power of computers, more extensive and more powerful analyses could be done through interactive classification.
Workshops on data mining, alternatively called knowledge discovery in databases, were first held in the 1990s, and such conferences have multiplied over the last decade. In short, data mining can be traced to three primary sources: statistics, artificial intelligence, and machine learning. Data mining can be accomplished through simple strategies such as graphing and charting.
A good example of enhancing data mining through charting is the case of scatter graphs. In an organisation, data mining has broad applications including identification of customer patterns, finding associations between various demographic traits of clients, and identification of the most loyal customers, among other tasks. Arguably, multiple departments within an organisation need such information.
Comparison and evaluation of data warehousing and data mining in the UAE vs. International level
Different nations across the globe, including the UAE, have received and applied data warehousing and data mining in a variety of fields.
Data mining has been applied in science and engineering, surveillance, medical fields, and in the business field in both the UAE and the international arena. However, this section discusses the application of data mining and data warehousing in business only since other applications are beyond the scope of the paper.
In commercial applications, data warehousing and data mining are very helpful in analysing a company’s performance over time, which is often available in the company’s records in the form of inactive data. The purpose of the analysis is principally to unveil hidden trends and patterns in business performance.
This task is accomplished in data mining through the deployment of data mining software, which makes use of “advanced pattern recognition algorithms to sift through large amounts of data to assist in discovering previously unknown strategic business information” (O’Brien and Marakas 107). Data mining has been used throughout the world, including in the UAE, to perform several business-related tasks.
The tasks include market analysis in the effort to identify potential new products, determine the causes of manufacturing problems, prevent withdrawal of some customers from consuming an organisation’s products, acquire new customers, generate customer profiles for purposes of evaluation, and determine the appropriate marketing strategies to sell across both existing and potential customers.
Several global organisations have invested heavily in data warehousing and data mining to enhance their performance in the dynamic business environment. For instance, “Wal-Mart processes over 20 million point-of-sale transactions every day” (O’Brien and Marakas 141), with the transaction information held in a centralised database.
Every department within Wal-Mart has to extract information that is necessary to perform its operations from the database. Although information extracted this way is crucial for the making of decisions that would help to better the performance of every department within the company, such raw data is not of any use without having some sort of software to analyse the data.
The need to analyse such data underlines the significance of data mining software within any company having a large pool of customers and with global operations.
For the case of Wal-Mart, the use of data mining techniques to conduct an analysis of the information contained in the centralised database enables the company to develop necessary campaigns for marketing and even predict loyalties developed by customers in relation to a myriad of brands offered by the company with better precision.
Data mining techniques are also helpful in building and managing customer relationships across organisations worldwide, including in the UAE. By analysing the data contained in a data warehouse with data mining techniques, UAE organisations have ceased maintaining a centralised call centre for contacting potential customers.
Rather, the focus shifts to targeting the customers who, judging from an analysis of their purchasing patterns, have the highest probability of responding to a given offer.
Additionally, according to Zhu and Davidson, analysing the data contained in a data warehouse with statistical and other data mining tools means that “more sophisticated methods may be used to optimise resources across campaigns so that one may predict to which channel and to which offer an individual is most likely to respond (across all potential offers)” (88).
Once analysis has determined the response patterns of potential clients for an organisation’s products, mailings can be automated to solicit responses from those potential customers. This practice is well developed and used not only in the UAE but also in the global arena.
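The idea of contacting only the customers most likely to respond can be sketched in a few lines of Python. The customer identifiers and counts below are hypothetical, and the past response rate is used here as a deliberately simple stand-in for the more sophisticated predictive models that Zhu and Davidson describe.

```python
# Hypothetical campaign history: customer id -> (offers sent, responses received).
history = {
    "C001": (10, 4),
    "C002": (8, 1),
    "C003": (5, 3),
    "C004": (12, 0),
}


def response_rate(offers_sent, responses):
    """Share of past offers that the customer responded to."""
    return responses / offers_sent if offers_sent else 0.0


def top_prospects(records, k):
    """Return the k customers with the highest past response rate."""
    ranked = sorted(
        records,
        key=lambda customer: response_rate(*records[customer]),
        reverse=True,
    )
    return ranked[:k]


# Only these customers would receive the next automated mailing.
mailing_list = top_prospects(history, 2)
print(mailing_list)
```

A real system would estimate the probability of response from many attributes held in the warehouse rather than from a single ratio, but the selection step, ranking customers and contacting only the most promising, would look much the same.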
Real world examples for the application of data warehousing and data mining used by the UAE organisations
Data warehousing and data mining systems of the kind used by UAE organisations are found in a variety of places across the globe. For instance, they are used in India in health care settings. One such practical application is the use of data mining as a decision-making tool for evaluating treatment choices for various ailments such as fibroids (Campbell 201).
In the installation of such software, the main question that arose was whether data mining could indeed be applied to the treatment process for fibroids as a means of predicting possible treatment choices.
If the answer was yes, hospital technologists in India then asked which data mining techniques would be most appropriate for predicting the outcomes of various treatment options. To address these queries, an archive of data on the available treatment options and on patients’ responses after undergoing each procedure was generated and maintained.
Based on the characteristics of the patients, analysis of the data warehouse helps to predict the most viable treatment option for each patient’s fibroid treatment needs in India.
The extension of this data mining and data warehousing approach in India to the diagnosis and treatment of other complex ailments has seen the nation rise to become one of those housing the most successful treatment facilities in the world, with many cases requiring advanced treatment being referred to India.
Oracle software: how it supports the business
The Oracle Company makes and sells various software applications for business support across the globe. Oracle software comprises a myriad of functional modules built on an RDBMS, which functions as the back end.
The functional modules within Oracle’s business support software include Oracle HRMS (human resource management systems), Oracle CRM (customer relationship management), Oracle Financials, Oracle Projects, and Oracle Procurement, among others.
The first business support software released by Oracle was Oracle Financials in the 1980s. Later, the company released a myriad of business support software, including data warehouse management systems and data mining software.
The Oracle Data Mining module (ODM) is essentially an option of the Oracle relational database management system (RDBMS). It comprises data analysis and data management algorithms designed to support business by easing tasks such as feature selection, classification of data, regression analysis, detection of various anomalies, making predictions, and other specialised analytics necessary when making organisational decisions.
According to Davenport, Oracle data mining software is pivotal in “providing a means of creation, management, and operational deployment of data mining models inside the database” (214). Decision making within an organisation requires close scrutiny and analysis of quantitative data that helps to explain trends in business operations and in the industry generally.
To help an organisation realise these functions, regression analysis support and other modules are incorporated into the ODM. The various business support modules are implemented such that “the implementations are integrated right into the oracle database kernel operating natively on stored data in the relational database tables” (Davenport 215).
From these databases, different departments of an organisation can tap the necessary data, which upon its analysis helps in making vital decisions.
Use of Carrefour Technology in Europe and in the UAE
Carrefour is one of the leading global distributors of various commodities. The company has various classes of distribution networks such as supermarkets and hypermarkets coupled with convenience and hard discount stores in America, Europe, the UAE, and other regions in the world. The organisation has been endeavouring to maintain its leadership within a competitive grocery industry.
The company planned to achieve this goal through “gaining control over its marketing process and more effectiveness in leveraging its business intelligence- with the ultimate aim of strengthening customer loyalty” (IBM Para 4).
The solution to the challenges facing Carrefour was arrived at by strategically deciding to partner with IBM and other partners to design and implement a system for in-house promotion throughout the supermarkets and hypermarkets owned by the organisation.
This strategy proved very effective in helping to plan and execute more focused marketing plans while ensuring rapid assessment of campaign feedback. It was essential to the quick evaluation of the company’s marketing efforts. How customers respond to a given promotional campaign may be measured along various dimensions.
The most significant measure is the change in an organisation’s sales levels. Given that Carrefour has many retail stores across the globe, gathering all such information requires the creation of computer-enabled interfaces.
Faced with the need for rapid analysis of sales data, Carrefour had little option other than to deploy data mining and data warehousing technology, whose software components included an IBM Supermarket Application coupled with IBM DB2.
Following the success of its technology in helping to establish better relationships with customers and suppliers, and in helping to plan short but effective marketing campaigns, the Carrefour technology has been adopted by many European and UAE organisations, in both the retail industry and national organisations.
Pros, Cons, Strengths, and Weaknesses with respect to data warehousing and data mining systems
The goal of using data management systems is to make decision-making processes more rapid. However, irrespective of how helpful such systems may be in achieving this purpose, every system has pros and cons, strengths and weaknesses. In the case of a data warehouse, copies of information gathered from transactions within organisations are maintained.
This case provides an essential opportunity for an organisation to “congregate data from multiple sources into a single database so that a single query engine can be used to present data” (Inmon 52).
Data warehouse also makes it possible to keep historical records of transactions even though systems for the transaction may fail to do so. This data is utilised in making future forecasts, which help to inform critical activities of a business, including the sale and forecasting of production patterns in the future.
Through a data warehouse, it also becomes possible for an organisation to come up with a means of integration of data from a number of sources in the effort to create an overall means for decision making in an entire enterprise.
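The benefit of bringing several sources into one database so that a single query engine can serve them may be sketched with Python’s built-in sqlite3 module. The table layout, the two source systems, and the sales figures are assumptions made for illustration and do not describe any particular organisation’s warehouse.

```python
import sqlite3

# An in-memory database stands in for the warehouse in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (source TEXT, year INTEGER, region TEXT, amount REAL)"
)

# Hypothetical records from two separate transaction systems.
store_system = [(2012, "North", 100.0), (2013, "North", 150.0)]
online_system = [(2013, "North", 50.0), (2013, "South", 70.0)]

# Load both sources into the single warehouse table, tagging each row's origin.
for source, rows in (("store", store_system), ("online", online_system)):
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?, ?)",
        [(source, year, region, amount) for (year, region, amount) in rows],
    )

# One query now covers historical and recent data from every source.
totals = conn.execute(
    "SELECT year, SUM(amount) FROM sales GROUP BY year ORDER BY year"
).fetchall()
print(totals)
```

Keeping the 2012 rows alongside the 2013 rows mirrors the point made above about historical records: the warehouse retains them even if the transaction systems themselves no longer do.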
It also enables an organisation to provide data and information that is consistent since the results of the analysis are representative of both current and historical trends from the analysis of data using data mining tools.
Furthermore, data warehousing is essential in “restructuring data so that it makes sense to the business users, restructuring the data so that it delivers an excellent query performance, even for complex analytic queries, without impacting the operational systems, and adding value to operational business applications, notably customer relationship (CRM) systems” (Inmon 55).
Consequently, through data warehousing, it is possible to provide data models for all data of interest irrespective of the sources of the data within an organisation. Although data warehousing has all these merits, it has a central weakness in that data is usually stored in a ‘soft’ form.
The link between data warehousing and data mining is that it is easier to mine data that is properly warehoused, meaning that the effectiveness of data mining depends on data warehousing. Consequently, data mining has the demerit that it cannot be effective without an integrated organisational information database.
In case databases are corrupted, or information contained in them is lost through malicious acts of some employees of an organisation or unauthenticated access to an organisation’s database systems, it implies that data mining software loaded in an organisation’s systems would have no data source from which to carry out analysis.
In addition, as Haughton et al. reckon, “if a data mining query has to run through terabytes of data spread across multiple databases, which sit on different physical networks, the query is not efficient and that getting results will take a long time” (301). This demerit puts a pre-condition to data systems developers since data warehouse systems must be developed such that they are able to interconnect with other databases.
Market analysis for data warehousing and data mining systems
The market for data warehousing and data mining systems is immense in the UAE and internationally across many industries. For instance, in the discipline of human resources, data mining can be used to identify the employees who are most productive for an organisation based on their historical efficiency and outputs, as recorded and maintained in the organisation’s integrated database (data warehouse).
This means that the market is substantial among organisations that want an overall means of evaluating and interrelating data from their different departments, integrating it, and then analysing the integrated data in order to reach more conclusive decisions.
The market potential for data warehousing and data mining software has grown markedly following the high demand for decision support systems in engineering and general science. Such areas include education, genetics, bioinformatics, and mechanical and electrical engineering, as well as the discipline of medicine.
For instance, with regard to Zhu and Davidson, “in human genetics, sequence mining addresses the important goal of understanding the mapping relationship between the inter-individual variations in human DNA sequence and the variability in disease susceptibility” (109).
The main aim of using data mining in human genetics is to help determine how alterations in gene structure influence the risk of contracting ailments such as tumours. This is significant for designing and implementing strategies for the prevention, diagnosis, and treatment of ailments. Arguably, the subject of immunology transcends physical boundaries between nations.
Consequently, data mining software such as the software for the multifactor dimensionality reduction would gain positive market reception in the UAE and internationally, including Europe. In fact, European nations have been committing hefty amounts of money to investments in the new immunology and diseases diagnosis technologies in the past three decades.
From a business dimension, the UAE has been noted as a significant business hub within the Middle East. Organisations operating in the region largely depend on information gathering, integration, and processing to make decisions for vibrant growth.
These organisations are able to withstand global market dynamics, so that the global financial crunch has had less impact on UAE organisations than on organisations operating in Europe. Growth and the ability to develop resilience to business dynamics depend on how risk-aware an organisation is.
Analysing an organisation’s susceptibility to risk depends on accurately establishing trends in business performance from historical data. Indeed, this goal cannot be realised without analysis tools such as data mining techniques.
As argued before, these tools and techniques cannot operate free of data. Since the goal of any organisation is to ensure its long-term operation in the effort to return value to its owners, data warehousing and data mining possess an incredibly high potential not only in the UAE but also globally.
Main suppliers’ systems
Data warehousing and data mining systems can be designed in-house within an organisation. However, such an attempt would attract hefty costs. The alternative is hiring the service of an external supplier of the systems. Such suppliers include IBM and Oracle, among others.
It is important to note that, with globalisation and the diversification of markets, it is no longer necessary to restrict the search for suppliers of an organisation’s information systems to those operating within a particular trade region or country.
Rather, consideration of global dimensions is vital while evaluating the best information systems for procurement. Important aspects for considering when seeking the suppliers to procure data warehousing and data mining systems include the system availability, quality, reliability, and system security, among others.
Trends, perspectives, recommendations, and conclusion
Many industries in all business fields operating in a competitive business world recognise the relevance of developing information-based organisations to help in decision supports.
The trend in the corporate world has been to seek mechanisms for ensuring that decisions affecting an organisation’s business are based on data accumulated by the organisation over time, so that forecasts can be based, for instance, on data on the consumption patterns of certain customers in particular markets.
To remain competitive through specific market campaigns that reduce the threat posed by competitors and new entrants, organisations must integrate all data handled by their different departments into a single source from which all departments can draw.
The paper discussed this matter as one of the notable factors that favour positive reception of data warehousing technology both in the UAE and in the international arena. The trend in data mining has been the deployment of three leading data mining techniques.
These are clustering, neural networks, and association rules and decision trees. They have been vital in alleviating the challenge of an organisation having too much data but too little information.
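Of the techniques listed above, association rules lend themselves to a short illustration. The Python sketch below computes the two standard measures of a rule, support and confidence, over a handful of invented shopping baskets; the items and transactions are hypothetical and serve only to show how raw transaction data becomes usable information.

```python
# Hypothetical shopping baskets, one set of items per transaction.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in the itemset."""
    items = set(itemset)
    return sum(items <= basket for basket in baskets) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Estimated probability of the consequent given the antecedent."""
    both = set(antecedent) | set(consequent)
    return support(both, baskets) / support(antecedent, baskets)


# Rule under test: customers who buy bread also buy milk.
rule_support = support({"bread", "milk"}, transactions)
rule_confidence = confidence({"bread"}, {"milk"}, transactions)
print(rule_support, rule_confidence)
```

Rules whose support and confidence exceed chosen thresholds are the ones a retailer would typically act on, for example when deciding which products to promote together.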
Although an organisation can hold a tremendous amount of data in its warehouse that documents every transaction, such data is worthless if a means of its analysis is not available so that appropriate decisions based on the statistical data inferences can be made. In this perspective, the paper argued that data mining is the aspect that enables an organisation to conduct an analysis of data to reveal business trends.
Works Cited
Browning, Dave, and Joy Mundy. Data Warehouse Design Considerations, 2000. Web.
Campbell, Kevin. Exploration of Classification Techniques as a Treatment Decision Support Tool for Patients with Uterine Fibroids: Proceedings of International Workshop on Data Mining for HealthCare Management. New York: PAKDD, 2010. Print.
Davenport, Henry. “Competing on analytics.” Harvard Business Review 3.1 (2006): 213-217. Print.
Haughton, Dominique, et al. “A review of software packages for data mining.” The American Statistician 57.4 (2003): 290–309. Print.
IBM. Carrefour strengthens customer loyalty and its brand with a new promotions strategy, 2013. Web.
Inmon, William. Building the Data Warehouse. New York, NY: John Wiley and Sons, 2005. Print.
O’Brien, Jonhston, and George Marakas. Management Information Systems. New York, NY: McGraw-Hill/Irwin, 2011. Print.
Zhu, Xingquan, and Ian Davidson. Knowledge Discovery and Data Mining: Challenges and Realities. New York, NY: Hershey, 2007. Print.
This project proposal highlights the problem of handling large volumes of data and provides an efficient data mining solution. It is designed with communication service delivery problems in mind and offers a solution that fits them as closely as possible. The proposal starts with the basic concepts of data mining, related terms, the company background and the business problem. Later sections describe the existing problem with the system, followed by the proposed solution and a discussion. The proposal ends with a conclusion and references.
Data Mining
Data mining is a commonly used term in computing. It is the process of sorting through huge amounts of data to find the relevant items; in large organizations, ERP systems are usually used to sort the data. Data mining is also commonly known as knowledge discovery: data is analyzed from numerous perspectives and summarized into useful information for further processing. Many companies apply data mining techniques to make their systems more effective and to save time. Data mining is also widely used in data maintenance, which helps an organization to organize its resources and capital properly. In other words, data mining is the process of finding relationships among dozens of fields, and a database system suffers if data mining techniques are not properly applied in a given domain. Data mining is widely used by large firms with a strong consumer focus. It enables organizations to identify relationships among internal factors such as cost, positioning and staff skills, and among external factors such as economic indicators, competition and customer demographics. It also helps to determine how changes in internal factors affect sales.
Terms
Data: Data is raw material, usually found in bulk quantity. There are three types of data: operational or transactional data, non-operational data and metadata.
Information: The patterns, relationships and associations among all this data can provide information.
Knowledge: Information can be converted into knowledge when it is related to an organization’s previous facts, figures and statistics (Han & Kamber, 2000).
Data warehouses: A data warehouse is a central place for storing data, and it makes it easy to integrate new data with existing data. Data warehousing is commonly defined as a process of proper data management and retrieval. It embodies the idea of storing all of a large organization’s data centrally.
Knowledge discovery in databases (KDD): KDD is a commonly used term in the database field. It is the non-trivial extraction of previously unknown information from large databases.
Company Background
PCCW operates the largest communication network in Hong Kong, and PCCW Limited (PCCW) is one of the best communication companies in Hong Kong (PCCW, 2008). HKT Group Holdings Limited, Hong Kong’s premier telecommunications provider, is a renowned, world-class player in information and communication technologies. PCCW also attracts great interest from foreign investors. It holds a remarkable position in the market and employs a total of 16,200 people. Its headquarters is located in Hong Kong, and it maintains a presence in Europe, the Middle East, Africa, the Americas, mainland China and many other parts of Asia.
HKT has gained great popularity in the telecommunication business. HKT Group Holdings Limited (HKT) was founded in 2008 with the aim of providing telecommunication services, media and IT solutions. Renowned as Hong Kong’s first quadruple-play provider, PCCW/HKT offers a wide range of media content and services across four platforms: fixed-line, broadband Internet, TV and mobile. The group has achieved great success worldwide and holds the following credentials: Hong Kong’s leading telecoms player, a genuine quadruple-play experience, expertise in ICT solutions, and expansion into international markets. It offers the following services:
Voice Services, Data Services, Internet Services, Mobile Service, Equipment Solutions, ICT Solutions, Contact Center Services, Telecom and IPTV Solutions, Interconnect Services, TSCM Services. PCCW is also known for its outsourcing flagship.
Business process
The business process starts with setting up a wireless telecommunication network using different routers and switches. A number of employees and tools are used to provide a fault-free network. The basic problem that arises is the efficient management of bulk data. Because the company provides telecommunication and IT services, the main problems are data redundancy and data consistency, and both are major hurdles to providing valuable services. There is also a huge amount of customer data to deal with.
Existing Problem
Problem question: How can a large amount of customer data and service information be handled in order to provide a speedy communication system?
PCCW is a large organization, and its primary responsibility is to provide the best communication network to all its customers. There is a high risk of losing customers if the network fails because of heavy traffic. The company offers telecommunication services as well as IPTV solutions, which make it possible to integrate satellite systems. The main problems in providing speedy connections are data redundancy, the time spent searching for a particular record, and noise distortion over large networks. It is essential that the service provided to all customers be cost-effective, speedy and based on a fault-free network.
Goals and objectives
The main objective of using data mining techniques is to reduce data redundancy over a large network. They also help the company to make better use of its resources and to allocate them effectively. Data mining techniques provide solid facts and data that support decision making, and they open a path to better growth.
Proposed solution
Numerous data mining techniques suit the above scenario, and many of them provide excellent data handling over large networks if applied properly. Distortion and clustering techniques are proposed to solve the problem stated above. Distortion techniques are designed with the changing needs of data exploration in mind: they support the exploration process by emphasizing details while preserving an overview of the complete data. Their main objective is to show some portions of the data at a high level of detail alongside others at a lower level of detail. For multidimensional data sets, a dynamic projection method is widely used to change the projections. Distortion techniques can contribute substantially to solving PCCW’s problem. The PCCW team needs to implement a structure with strong data handling. Because they use an ERP solution for data handling, the proposed solution requires them to identify the relationships between fields and then define a structure that links high-level data with lower-level detail based on the attributes of the data. Once the two levels are linked, a query for any particular item returns results that match the requirements, and a great deal of time is saved. Browsing is very difficult over large networks that hold a bulk quantity of data. Interactive filtering and the division of large data into smaller groups, together with the relationships between fields, could solve this problem to a large extent.
Clustering is a process widely used for partitioning data sets into meaningful classes for further processing. It is commonly described as unsupervised classification, that is, classification of data without predefined classes. Clustering divides a large volume of data into small groups of similar items. Clustering techniques in the data mining domain can be classified from numerous perspectives, and clustering plays a pivotal role in data mining applications. In the past few years it has become a significant topic in databases, pattern recognition, neural networks and computer graphics. Clustering can solve the identified problem: PCCW has a wide network, so dividing the data into small groups according to its metadata and storing it in a central repository would be beneficial and would save time. ERP solutions provide a well-defined data structure, but numerous companies still use other software alongside ERP because of the integration problems associated with ERP solutions. PCCW needs a well-defined, integrated data structure for effective service. If clustering is applied, the data will be stored in different groups; whenever a particular item is searched for, the crawler or pointer will first check its metadata and then enter the matching group. In this way the data redundancy problem can be solved, because dividing data by metadata does not allow the same entry with the same metadata to be stored twice. Without data redundancy in the PCCW structure, finding a particular record will take far less time. The results and deliverables of this approach may vary as the number of customers increases day by day. The proposed solution is significant for handling large volumes of data in large databases.
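The metadata-first lookup described above can be illustrated with a minimal sketch. It is not PCCW's system and not a statistical clustering algorithm: it is simple key-based grouping, and the class name `MetadataStore` and the fields `service`, `customer_id` and `plan` are hypothetical names chosen only for illustration.

```python
from collections import defaultdict


class MetadataStore:
    """Groups records by a metadata key and rejects duplicate entries."""

    def __init__(self):
        # group key (hypothetical "service" field) -> {customer id -> record}
        self._groups = defaultdict(dict)

    def add(self, record):
        """Store a record; return False if the same entry already exists."""
        group = self._groups[record["service"]]
        if record["customer_id"] in group:
            return False  # same entry with the same metadata: not stored twice
        group[record["customer_id"]] = record
        return True

    def find(self, service, customer_id):
        """Check the metadata first, then search only inside that group."""
        return self._groups.get(service, {}).get(customer_id)

    def group_sizes(self):
        """Number of stored records in each group."""
        return {service: len(records) for service, records in self._groups.items()}


# Illustrative records only; the values are made up.
store = MetadataStore()
store.add({"customer_id": 101, "service": "broadband", "plan": "100M"})
store.add({"customer_id": 102, "service": "mobile", "plan": "4G"})
duplicate_added = store.add({"customer_id": 101, "service": "broadband", "plan": "100M"})
```

Because a search touches only the group that matches the metadata, the work per lookup does not grow with the total number of records, which is the time saving the proposal aims for.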
Sample Process Model
Deliverables | Details: Before | Details: After
Time required | 1-2 minutes | 40-50 seconds
Project Load | Uneven & distracted | Organized & Balanced
Data Placement | Uneven & unorganized | Well utilized
Searching Time | 1-2 minutes | 40-50 seconds
Discussion
Using both techniques in the PCCW environment has many advantages, because PCCW is a very large network that holds a bulk quantity of data. The complexity of large data becomes harder to manage as the number of customers grows, and there is also a risk of data being mishandled. To solve this problem, it is necessary to identify the exact problem first and then apply the proposed technique in order to get sound results. An alternative approach is to apply a neural network in such an environment. Every algorithm and proposed solution has advantages and disadvantages. The choice of solution depends on the environment, the requirements and how well the solution fits the problem.
Conclusion
PCCW is a leading firm in Hong Kong that offers telecommunication services. It has a huge customer base, and the rate at which new customers arrive is also very high. PCCW runs a wide network and has held its leading position for many years. It faces problems in handling large volumes of data. The proposed data mining techniques help considerably in establishing a fault-tolerant network and in allocating resources and staff properly. The proposal justifies and offers a solution to the identified problem. Management can make decisions on the basis of the unbiased data obtained with the aid of the proposed model.
References
Han, J., & Kamber, M. (2000). Data Mining: Concepts and Techniques. Morgan Kaufmann.
Prefix Clustering, (2006), C-BGP and prefix clustering.
Levi’s, the renowned name in jeans, is feeling the heat of competition from a number of other brands that came onto the scene well after Levi’s but today appear to be approaching its market with a variety of measures to attract customers. What is worrying for Levi’s is that these companies are succeeding in their efforts and giving it a tough time. The company wants to make intelligent decisions based on the facts and circumstances prevailing in the market. Data mining is one process that the company intends to adopt for this purpose.
Data mining
Data mining is the process through which a company or an organization extracts useful information by compiling data from various sources, analyzing it, and identifying key components that can be probed further to predict future trends for the market and the company. In fact, as the name itself suggests, this process involves mining data from almost every source that has a direct or indirect link with the company’s business operations.
The collected data is then sifted with the help of available analysis techniques to narrow it down to usable data. Subsequently, the company has to take a number of decisions about the range of products, pricing strategies, marketing communication, promotion strategies, and so on. For example, a product can be priced in many ways depending on the cost of manufacturing, variations in the cost of raw materials, the cost of reaching customers, the prevailing economic conditions, the income levels of the market segment, etc. Product pricing also depends on the presence of competitors in the market and on the rules and regulations of the land.
Similarly, propagating a brand and investing in a brand identity also depend on the potential of the market as assessed with the help of data mining.
Brand promotion
Biswas et al. (1998) state that technological advancements have made it possible to collect and store data quite easily. The problem is the abundance of data: it becomes quite challenging to analyze this data effectively and efficiently using automated mechanisms in order to better understand, characterize and validate known phenomena and trends and to discover new and interesting phenomena. This appears to be the case with Levi’s. Levi’s is an internationally renowned name for jeans. It all began in 1853 when a Bavarian immigrant named Levi Strauss opened a wholesale dry goods business.
He remained a small-time businessman for the next twenty years, until he patented the process of putting rivets in pants for strength. This gave birth to the world’s first jeans[1]. The rest is history. For well over a century, Levi’s continued its dominance over denim, the cloth that was initially meant for sailors but gradually became a fashion statement. Today, a number of well-known jeans brands are vying for the market space; therefore, Levi’s needs to:
Make efforts to retain the existing customer base
Tie up with major retail outlets to make the brand prominently available. Nowadays retail outlets have their own loyal customer base.
Work out the maximum benefit that can be passed on to the customers as well as the retailers, which might imply constraints on the company’s profit margin.
Look for volume sales as well. A strategic tie-up with high-end stores definitely gives a value proposition to the brand and its customers, but an association with high-end stores alone might alienate the major consumer base, which relies on retailers like Wal-Mart, Tesco, etc. Today we are living in a competitive era, which implies the more the merrier. As has been pointed out in the article in San Francisco, last year the company’s profit plunged 32% to $151 million on sales of $4.25 billion. The article also points out the need for a pair of denim pants that sells for under $25.
Data mining will certainly help Levi’s to make informed decisions and to understand the socio-cultural and economic profile of its customer base. Data mining can be carried out by:
Analyzing the sales figures of previous years and comparing targets with actual achievements.
Analyzing the variations in sales figures, i.e. the increase or decrease over the years.
Taking a look at the motivation levels of its employees. A satisfied and motivated workforce is a big boost for the competitive strategies of the company.
Comparing the effect of factors like pricing, brand promotion, salary hikes, etc. on production and sales.
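The first two analyses in the list above can be sketched in a few lines. This is only an illustration under stated assumptions: the figures are invented placeholders, not Levi’s data, and the function names `target_gaps` and `yearly_change_pct` are hypothetical.

```python
# Hypothetical figures in millions of dollars, keyed by year index (illustrative only).
sales = {1: 410.0, 2: 425.0, 3: 400.0}
targets = {1: 400.0, 2: 450.0, 3: 420.0}


def target_gaps(actual, target):
    """Actual minus target for each year; a negative value means the target was missed."""
    return {year: actual[year] - target[year] for year in sorted(actual)}


def yearly_change_pct(actual):
    """Percentage change in sales from each year to the next, rounded to 2 decimals."""
    years = sorted(actual)
    return {
        later: round((actual[later] - actual[earlier]) / actual[earlier] * 100, 2)
        for earlier, later in zip(years, years[1:])
    }


gaps = target_gaps(sales, targets)
changes = yearly_change_pct(sales)
```

With these placeholder numbers, year 1 beats its target while years 2 and 3 fall short, and sales rise into year 2 before dropping in year 3. Real figures would come from the company’s own records.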
But the company would do itself a world of good if it took a lead from other available indicators, such as the general profile of jeans customers. Similarly, it would help the company to start exploring the customer base outside the developed markets. For instance, the Asian region in general, and markets like China and India in particular, are major consumer markets for such items. Questions might be raised about such strategies by way of:
Questioning the dilution of brand value
Doubting the success of strategies in somewhat closed and protected economies like China
Questioning the viability of investing huge amounts in generating demand for Levi’s
But the company needs to answer such criticism by taking a broad look at the prevailing scenario of consumerism and its compulsions for a sustainably profitable business. For example:
Brand value depends upon its recognition by people. Trying to give a premium look to a product like jeans might not be a wise move in these times of market-driven economies.
China is gradually opening up its economy to the outside world and the billion-strong consumer base is a big attraction for companies nowadays.
Investing in brand awareness certainly helps a brand to remain in public memory, which translates into sales in the long run. The advertising and promotion done by companies like Coke and Pepsi even during off-peak seasons such as winter is an example.
In order to make a mark in the developing markets, Levi’s will certainly have to adopt a competitive, penetrative pricing strategy instead of a skimming strategy. Online retailing by supermarkets is a new trend, which can work to the company’s advantage if it is able to create awareness and desire among prospective customers. For this, marketing communication strategies prove quite handy. Of late Levi’s has certainly realized this and ventured into a promotion campaign. It needs to continue the momentum.
References
Levis’. About LevisStore.com. Web.
Biswas, Gautam; Weinberg, Jerry B.; and Fisher, Douglas H. (1998). ITERATE: A Conceptual Clustering Algorithm for Data Mining. IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, vol. 28, no. 2.