So, what is Data Mining?
23 June 2017
Have you ever wondered how supermarkets choose where to place items on their shelves? Or how Netflix picks the programmes you want to watch? Or how financial institutions can predict the markets to make money (or so they hope)?
This is the world of data mining. It is the process of analysing large amounts of data to find patterns, and then using those patterns to turn raw data into knowledge you can act on.
One type of data mining is Market Basket Analysis, and supermarket shopping is a good example of it. A typical high street chain has huge amounts of data on historic shopping baskets. A data mining application would analyse this data and find association rules between products: patterns where products are bought together that one would not normally suspect. Take nappies and beer. You would not usually think of them as belonging together, but where the data shows they are bought together, it makes sense to keep them in nearby aisles.
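To make the idea of an association rule a little more concrete, here is a minimal sketch in Python. The baskets are made up for illustration, and `pair_rules` and `min_support` are hypothetical names rather than part of any particular product. For each pair of items it works out the support (how often the pair appears), the confidence (how often the second item appears when the first does) and the lift (how much more often than chance).

```python
from collections import Counter
from itertools import combinations

# Made-up baskets for illustration only; not real supermarket data.
baskets = [
    {"nappies", "beer", "crisps"},
    {"nappies", "beer"},
    {"bread", "milk"},
    {"nappies", "beer", "milk"},
    {"bread", "beer"},
    {"nappies", "milk"},
]


def pair_rules(baskets, min_support=0.3):
    """Return (lhs, rhs, support, confidence, lift) for frequent item pairs."""
    n = len(baskets)
    # How many baskets contain each single item.
    item_counts = Counter(item for basket in baskets for item in basket)
    # How many baskets contain each pair of items.
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # too rare to be worth reporting
        # Each frequent pair gives two candidate rules: a -> b and b -> a.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            lift = confidence / (item_counts[rhs] / n)
            rules.append((lhs, rhs, support, confidence, lift))
    # Strongest associations first.
    return sorted(rules, key=lambda rule: rule[4], reverse=True)


rules = pair_rules(baskets)
for lhs, rhs, support, confidence, lift in rules:
    print(f"{lhs} -> {rhs}: support={support:.2f} "
          f"confidence={confidence:.2f} lift={lift:.2f}")
```

In this toy data the nappies and beer pair comes out on top, with a lift above 1, meaning the two are bought together more often than chance alone would suggest. Real applications usually rely on algorithms such as Apriori or FP-Growth to cope with millions of baskets and larger item sets.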
Another example is Clustering. This means giving a data mining application a large set of data and getting it to look through the items and group ‘similar’ ones together. It can be used in many different contexts. A particularly powerful example is helping a marketing team gain a better understanding of their customer database, so they can make their services more relevant to the individuals within it.
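As a rough illustration of how ‘similar’ items can be grouped, the sketch below uses k-means, one common clustering technique among many. The customer figures are invented, the `kmeans` function is a hypothetical helper written for this post, and the starting centroids are chosen deterministically (first point, then the farthest remaining point) purely to keep the example repeatable.

```python
import math

# Invented customers: (shop visits per month, average basket value in pounds).
customers = [
    (2, 15), (3, 18), (1, 12),      # occasional, small baskets
    (12, 80), (10, 95), (11, 85),   # frequent, large baskets
]


def kmeans(points, k, iterations=20):
    """Group points into k clusters; returns (labels, centroids)."""
    # Deterministic start: the first point, then the point farthest
    # from every centroid chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    labels = []
    for _ in range(iterations):
        # Assign each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda i: math.dist(p, centroids[i]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we are done
        labels = new_labels
        # Move each centroid to the mean of the points assigned to it.
        for i in range(k):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:
                centroids[i] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centroids


labels, centroids = kmeans(customers, k=2)
print(labels)
print(centroids)
```

Here the occasional shoppers end up in one group and the frequent, big-basket shoppers in the other. In practice you would normally scale the features first so that no single measurement dominates the distance calculation.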
One of the most powerful forms of data mining is prediction, i.e. predicting the future! How can this be accomplished, one may ask? First of all, you need historic data on the subject matter at hand, for example customers defaulting on their mortgages. A bank could take historical data on its customers, some of whom went on to default, including anything pertinent to default, such as whether a customer ever missed a payment. The big question here is what data to include. The answer? As much as possible. If there is even a remote possibility that a piece of data relates to a customer defaulting, then add it. Modern data mining applications are generally very good at ‘understanding’ which items of data are important, and so give little weight to the non-important stuff.
Once you give a data mining application this historic data, it learns the patterns of customers who defaulted and uses that knowledge to predict who might default in the future. It will also be able to give you a confidence rating as to whether a customer will default, so you can assess the risk.
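The learn-then-predict loop described above can be sketched with a small logistic regression, one of many possible prediction techniques and not necessarily what any given bank or product uses. The history below is entirely made up: each row holds a number of missed payments, a loan-to-income ratio, and whether the customer ran into trouble with their mortgage (the `defaulted` value). The names `train` and `predict` are hypothetical.

```python
import math
from statistics import mean, pstdev

# Made-up history: (missed_payments, loan_to_income_ratio, defaulted).
history = [
    (0, 2.0, 0), (0, 2.5, 0), (1, 2.0, 0), (0, 3.0, 0), (1, 3.0, 0),
    (2, 4.5, 1), (3, 4.0, 1), (2, 5.0, 1), (4, 3.5, 1), (3, 5.0, 1),
]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(rows, epochs=2000, learning_rate=0.1):
    """Fit a logistic regression by gradient descent; returns a predict function."""
    columns = list(zip(*[row[:-1] for row in rows]))
    means = [mean(column) for column in columns]
    stdevs = [pstdev(column) for column in columns]

    def scale(features):
        # Put every feature on a comparable scale before learning.
        return [(v - m) / s for v, m, s in zip(features, means, stdevs)]

    inputs = [scale(row[:-1]) for row in rows]
    targets = [row[-1] for row in rows]
    weights = [0.0] * len(columns)
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(inputs, targets):
            # Difference between the predicted probability and what happened.
            error = sigmoid(sum(w * v for w, v in zip(weights, x)) + bias) - y
            for j, v in enumerate(x):
                grad_w[j] += error * v
            grad_b += error
        weights = [w - learning_rate * g / len(inputs)
                   for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / len(inputs)

    def predict(features):
        # Estimated probability (between 0 and 1) of the outcome for a new customer.
        return sigmoid(sum(w * v for w, v in zip(weights, scale(features))) + bias)

    return predict


predict = train(history)
print(f"3 missed payments, ratio 5.0: {predict((3, 5.0)):.2f}")
print(f"0 missed payments, ratio 2.0: {predict((0, 2.0)):.2f}")
```

The number that `predict` returns plays the role of the confidence rating: close to 1 for a customer who looks like past problem cases, close to 0 for one who does not. With only ten invented rows this is a toy; a real model would need far more data and a separate set of customers to test it against.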
Congratulations, you are now a wizard. You have predicted the future!
Data mining is already part of our lives, and has many applications, from stock market and currency trading to sports results and even attempts to predict earthquakes.
Whether it’s Netflix helping you choose which programmes you watch, or a personalised website giving you only relevant information, data mining is here to stay.