So, what is Data Mining?

23 June 2017 | About a 2 minute read
Tags: Analyst, application, apps, Big data, data, developer, lead, Learning, machine, market, mining, process, Product, tech

Have you ever wondered how supermarkets choose where to place items on their shelves? Or how Netflix picks the programmes you want to watch? Or how financial institutions can predict the markets to make money (or so they hope)?

This is the world of data mining. It is the process of analysing large amounts of data to find patterns. These patterns are then used for analysis, turning raw data into knowledge.

One type of Data Mining is Market Basket Analysis. A good example of this is supermarket shopping. A typical high street chain has huge amounts of data on historic shopping baskets. A Data Mining application would analyse this data and find association rules between products: patterns where products are bought together that one would not normally expect. The often-cited example is nappies and beer. They are not something you’d usually think of as belonging together, but if the data shows they are frequently bought in the same basket, it makes sense to keep them in nearby aisles.
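To make the idea of an association rule concrete, here is a minimal sketch in Python. The baskets are made up for illustration, and `pair_rules` is a hypothetical helper, not part of any particular Data Mining product. It scores pairs of products by support (how often they appear together), confidence (how often the second item appears given the first) and lift (how much more often than chance they appear together).

```python
from collections import Counter
from itertools import combinations

def pair_rules(baskets, min_support=0.3):
    """Return (antecedent, consequent, support, confidence, lift) for frequent pairs."""
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # ignore duplicates within one basket
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # too rare to be worth acting on
        # Each frequent pair gives two candidate rules: a -> b and b -> a.
        for x, y in ((a, b), (b, a)):
            confidence = count / item_counts[x]
            lift = confidence / (item_counts[y] / n)
            rules.append((x, y, support, confidence, lift))
    # Strongest associations (highest lift) first.
    return sorted(rules, key=lambda rule: rule[4], reverse=True)

# Made-up shopping baskets, purely for illustration.
baskets = [
    ["nappies", "beer", "bread"],
    ["nappies", "beer"],
    ["bread", "milk"],
    ["nappies", "beer", "milk"],
    ["bread", "milk", "beer"],
]

rules = pair_rules(baskets)
for antecedent, consequent, support, confidence, lift in rules:
    print(f"{antecedent} -> {consequent}: support={support:.2f}, "
          f"confidence={confidence:.2f}, lift={lift:.2f}")
```

A lift above 1 suggests the two products turn up together more often than they would by chance. Real applications work on millions of baskets and look at larger item sets, usually with a dedicated algorithm such as Apriori rather than this brute-force pair count.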

Another example is Clustering. This is the concept of giving a data mining application a large set of data and getting it to look through the items and group ‘similar’ ones together. This can be used in many different contexts. A particularly powerful example is helping a marketing team gain a better understanding of their customer database, and make their services more relevant to the individuals within it.
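As a rough sketch of what finding ‘similar’ items can look like, here is k-means, one common clustering technique, written in plain Python. The customer figures (visits per month, average spend) are invented, and the `kmeans` function and its deterministic starting points are simplifications chosen for this example, not how any specific tool does it.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain k-means; returns one cluster label per point."""
    # Deterministic start: the first point, then repeatedly the point
    # farthest from the centroids chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )

    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda i: math.dist(p, centroids[i]))
            for p in points
        ]
        # Move each centroid to the mean of the points assigned to it.
        for i in range(k):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:
                centroids[i] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels

# Invented customers: (visits per month, average spend per visit).
customers = [(2, 15.0), (3, 18.0), (2, 20.0), (12, 95.0), (11, 110.0), (13, 100.0)]
labels = kmeans(customers, k=2)
print(labels)
```

Here the occasional low spenders end up in one group and the frequent high spenders in the other. In practice you would normally scale the columns first, so that one large-valued measure such as spend does not dominate the distance calculation.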

One of the most powerful forms of Data Mining is prediction, i.e. predicting the future! How can this be accomplished, one may ask? First of all, you need historic data on the subject matter at hand, for example, customers foreclosing on their mortgage. A bank could take historical data on its customers, some of whom foreclosed on their mortgages. This needs to include any data pertinent to foreclosure, such as whether a customer ever missed a payment. The big question here is what data to include. The answer? As much as possible. If there is even a remote possibility that an item of data relates to a customer foreclosing, then add it. Modern Data Mining applications are very good at ‘understanding’ which items of data are important, and at discounting the unimportant ones.

Once you give a data mining application this historic data, it learns the patterns of customers foreclosing, and uses that knowledge to predict who might foreclose in the future. It will also be able to give you a confidence rating as to whether a customer will foreclose, so you can assess the risk.
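The learn-then-predict loop described above can be sketched with logistic regression, one of many possible prediction techniques. Everything here is an assumption made for illustration: the two input columns (missed payments and loan-to-income ratio), the tiny made-up history, and the function names. The probability the model returns plays the role of the confidence rating.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(rows, labels, lr=0.1, epochs=10000):
    """Fit logistic regression by batch gradient descent; returns (weights, bias)."""
    n, d = len(rows), len(rows[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(rows, labels):
            # Difference between predicted probability and actual outcome.
            error = sigmoid(bias + sum(w * v for w, v in zip(weights, x))) - y
            grad_b += error
            for j in range(d):
                grad_w[j] += error * x[j]
        # Step against the average gradient over all historic rows.
        bias -= lr * grad_b / n
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
    return weights, bias

def foreclosure_probability(model, customer):
    """Estimated probability (0 to 1) that this customer forecloses."""
    weights, bias = model
    return sigmoid(bias + sum(w * v for w, v in zip(weights, customer)))

# Made-up history: (missed payments, loan-to-income ratio) and whether
# the customer went on to foreclose (1) or not (0).
rows = [(0, 2.0), (0, 3.0), (1, 2.5), (0, 4.0),
        (3, 4.5), (4, 5.0), (2, 5.5), (5, 4.0)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

model = train_logistic(rows, labels)
for customer in [(0, 2.0), (5, 4.0)]:
    print(customer, round(foreclosure_probability(model, customer), 2))
```

A customer with no missed payments should come out with a low probability and one with several missed payments with a high one. A real model would be trained on far more customers and columns, and checked against data it has not seen before anyone relied on its predictions.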

Congratulations, you are now a wizard. You have predicted the future!

Data mining is already part of our lives, and has many applications, from stock market and currency trading to sports results and even predicting earthquakes.

Whether it’s Netflix helping you choose which programmes to watch, or a personalised website showing you only relevant information, data mining is here to stay.
