
Data Mining Methods

Data mining is the process of using statistical analysis and machine learning to reveal hidden patterns or anomalies in large datasets, which helps organisations make important decisions and predict what is likely to happen. It takes data such as structured records, images, video and text, trains a model on it, then deploys and serves that model to produce actionable insights and application events.

The main techniques include classification, which is used to organise data into different classes or categories: a model is trained on labelled data and then used to predict the class of new, unseen records, as in the sketch below.
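As a minimal sketch of the idea (the source does not name a library, so this assumes Python with scikit-learn and uses the built-in iris dataset as stand-in labelled data):

```python
# Classification sketch: train on labelled data, then predict classes.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)                          # features and class labels
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = DecisionTreeClassifier().fit(X_train, y_train)     # learn from labelled examples
print(model.predict(X_test[:5]))                           # predicted classes for unseen rows
print(model.score(X_test, y_test))                         # accuracy on held-out data
```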

Regression is used to predict numbers or continuous values based on relationships in the data: it finds the function or model that best fits the data so it can make accurate predictions, as the sketch below shows.
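A minimal sketch, again assuming scikit-learn and using made-up numeric data purely for illustration:

```python
# Regression sketch: fit a line to noisy data and predict a continuous value.
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.arange(10).reshape(-1, 1)                    # a single numeric feature
y = 3.0 * X.ravel() + np.random.normal(0, 1, 10)    # continuous target with noise

model = LinearRegression().fit(X, y)                # find the best-fitting function
print(model.predict([[12]]))                        # predict the value for a new input
```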

Clustering is used to group similar data points together, uncovering patterns or structures in the data without any predefined classes or labels; the sketch below illustrates this.
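A minimal sketch, assuming scikit-learn and synthetic unlabelled points generated only for the example:

```python
# Clustering sketch: group unlabelled points into k clusters with k-means.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=200, centers=3, random_state=0)   # labels are ignored

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print(kmeans.labels_[:10])          # cluster assignment for each point
print(kmeans.cluster_centers_)      # the discovered group centres
```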

There are also association rule mining, anomaly detection, time series analysis, neural networks, decision trees, ensemble methods and, lastly, text mining.

https://www.qlik.com/us/data-analytics/data-mining
