What is the significance of outliers?

In data mining, outlier detection aims to find patterns in data that do not conform to expected behavior. It is used extensively in many application domains, for example intrusion detection in computer networks. Outliers can be classified into three categories. Attributes of data objects should be divided into two groups.

However, if we then change the final value so that we have friends with the ages of 23, 25, 27, and 70, the average age is now 36.25. This is quite a large increase, even though the majority of our friends are under 30 (mind the change in scale of the graphic).
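The effect of the single 70-year-old friend can be checked with a short sketch. It uses Python's standard `statistics` module on the four ages from the example. The median is our addition and not part of the original example; it is included only to show how little one extreme value moves it compared with the average.

```python
from statistics import mean, median

# Ages of the four friends from the example, including the one outlier (70).
ages = [23, 25, 27, 70]

average_age = mean(ages)    # pulled upward by the single value of 70
median_age = median(ages)   # midpoint of the two middle values, 25 and 27

print(f"average age: {average_age}")
print(f"median age:  {median_age}")
```

Three of the four friends are younger than the average, which is why the average alone is a weak description of a typical friend here.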

In this case, we have much less confidence that the average is a good representation of a typical friend and we may need to do something about this. Being able to identify outliers can help to determine what is typical within the data and what are exceptions. Identifying outliers can also help to determine what we should focus on in our analysis.
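One simple way to separate typical values from exceptions is the 1.5 x IQR rule of thumb (Tukey's fences). The post does not name a specific method, so treat the following as a minimal sketch under that assumption. The helper name `find_outliers` and the multiplier `k=1.5` are illustrative choices, not something taken from the original text.

```python
from statistics import quantiles

def find_outliers(values, k=1.5):
    """Return the values that fall outside Tukey's fences (hypothetical helper)."""
    # quantiles(..., n=4) returns the three quartile cut points Q1, Q2, Q3.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1                      # spread of the middle half of the data
    lower = q1 - k * iqr               # anything below this is flagged
    upper = q3 + k * iqr               # anything above this is flagged
    return [v for v in values if v < lower or v > upper]

ages = [23, 25, 27, 70]
print(find_outliers(ages))
```

With a sample this small, the result depends on how the quartiles are interpolated; the sketch uses `method="inclusive"`, which treats the data as the whole population. On larger datasets the choice matters much less.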

Sometimes what we wish to discuss is not what is common or typical, but what is unexpected. If results are extraordinarily good, it may be helpful to understand why a particular value is so much better than the rest - is there something that can be learned from this situation that can be applied elsewhere? It can be helpful to try to understand the cause of these peaks.

Did we start a new ad campaign on that day? Do these peaks always happen when we start an ad campaign? Are there some ad campaigns that have been associated with higher peaks than others? What can we learn from this? When presenting the information, we can add annotations that highlight the outliers and provide a brief explanation to help convey the key implications of the outliers.

If something is particularly poor, it may alert us that there is an issue that needs to be addressed.

For example, if you run four stores and in a quarter three are doing well in sales and one is not, this may be something to look into. Is this consistent performance for the store? Was there something happening in the local neighborhood, such as construction on the street where it is located, that could have contributed to the lower sales? Are there practices that are implemented in the other stores that could be adopted here? Or, is it that this is a brand new store and it is still building up its customer base?

Not all outliers are created equal! Understanding where they come from gives us insight into how to manage them. Visualizing data gives an overall sense of its spread. If outliers appear frequently in a dataset, the data generation technique or the data supplier may be error prone.

A good dashboard should help you get to the root of the occurrence. The reason behind an outlier may or may not be straightforward, so finding it can be time consuming.

A good BI dashboard should help you discover more about outliers. Data scientists at Nalashaa are passionate about finding insights in data and the root cause behind data behavior. We help clients build dashboards with analytics insights that provide a 360-degree view of their data.

Vinay Mehendiratta is a data scientist with extensive experience in transportation, industrial engineering and B2B marketing. He works with teams to identify the inherent need of the business and how data analytics can help, particularly in predictive and preventive maintenance.
