Data Mining

What is it?
Data mining is the systematic exploration of data for the purpose of extracting key patterns or findings.

How did it begin?
Organizations in the late 1990s typically have large stores of data available, thanks to advances in information technology and lower data storage costs. The question is how best to use data that has grown in both volume and complexity. Companies that can uncover useful information more quickly and efficiently to help run their businesses have a distinct competitive advantage.

The revolution?
The progressing computerization of professional and private life, paired with a sharp increase in the memory, processing and networking capabilities of today's computers, makes it more feasible than ever to gather and analyze vast amounts of data. For the first time, people all around the world are connected to each other electronically through the Internet, making huge amounts of online data available at an astonishing and still increasing pace.

How is data mining involved in the revolution?
Sparked by these innovations, we are currently witnessing the rapid growth of a new industry: data mining. Companies and governments have begun to realize the power of computer-automated tools for systematically gathering and analyzing data. For example, medical institutions have begun to use data-driven decision tools for diagnostic and prognostic purposes; financial companies have begun to analyze their customers' behavior in order to maximize the effectiveness of marketing efforts; the government now routinely applies data mining techniques to discover national threats and patterns of illegal activity in intelligence databases; and an increasing number of factories apply machine learning methods to optimize process control.

How does it involve neural networks?
Neural networks are well suited for data mining tasks because of their ability to model complex, multi-dimensional data. As the amount of available data has grown, so has the dimensionality of the problems to be solved, which limits many traditional techniques such as manual examination of the data and some statistical methods. Many techniques and algorithms can be used for data mining, and some of them work well in combination, but neural networks offer the following desirable qualities:

  • Automatic search of all possible interrelationships among key factors
  • Automatic modeling of complex problems without prior knowledge of the level of complexity
  • Ability to extract key findings much faster than many other tools

We have found that the process of organizing the data for neural networks can be invaluable in its own right. The rigor it demands is in itself sufficient to reveal findings in the data.
Although neural networks are quite adept at finding the hidden patterns in data, they do not directly reveal their findings to the developer. Examination of the final model is necessary to extract the key relationships uncovered.

How does it work?
Neural networks use a set of processing elements (or nodes) loosely analogous to neurons in the brain, hence the name. These nodes are interconnected in a network that can identify patterns in data as it is exposed to that data.

How do neural networks differ from traditional computing programs?
In a sense, the network learns from experience just as people do. This distinguishes neural networks from traditional computing programs that simply follow instructions in a fixed sequential order.
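
To make "learning from experience" concrete, here is a minimal sketch of a single node that adjusts its weights each time it gets an example wrong. The text does not say which training method is used, so this example substitutes the classic perceptron learning rule. The example data (logical OR), the learning rate, the number of passes and the function names are all assumptions chosen for illustration.

```python
def predict(xs, weights, bias):
    """Weighted sum of the inputs, followed by a simple threshold."""
    total = bias + sum(x * w for x, w in zip(xs, weights))
    return 1 if total > 0 else 0


def train(examples, rate=0.1, passes=20):
    """Perceptron learning rule (an assumed method, not one named in the text)."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(passes):
        for xs, target in examples:
            # error is 0 when the node is right, +1 or -1 when it is wrong
            error = target - predict(xs, weights, bias)
            for i, x in enumerate(xs):
                weights[i] += rate * error * x
            bias += rate * error
    return weights, bias


# Hypothetical training examples: two inputs and the desired output (logical OR).
examples = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]

weights, bias = train(examples)
print([predict(xs, weights, bias) for xs, _ in examples])  # prints [0, 1, 1, 1]
```

Nothing in the code states the OR rule explicitly. The node arrives at it only by being shown the examples repeatedly, which is the contrast with a program that follows fixed instructions.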

What does it look like?
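The structure of a neural network looks something like the following:

[Diagram: a three-layer network with inputs X1 through X5 at the bottom, a hidden layer in the middle, and outputs Z1 and Z2 at the top.]
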
The bottom layer is the input layer, in this case with 5 inputs labeled X1 through X5. In the middle is the hidden layer, with a variable number of nodes. It is the hidden layer that performs much of the work of the network. The output layer in this case has two nodes, Z1 and Z2, representing the output values we are trying to determine from the inputs. For example, we may be trying to predict sales (output) based on past sales, price and season (input).

What about the hidden layer?
Each node in the hidden layer is fully connected to the inputs. That means what is learned in a hidden node is based on all the inputs taken together. This hidden layer is where the network learns interdependencies in the model. The following diagram shows in more detail what goes on inside a hidden node.

[Diagram: a single hidden node receiving inputs X1 through X5, each multiplied by its weight W1 through W5.]

Simply speaking, a weighted sum is performed: X1 times W1 plus X2 times W2, and so on through X5 times W5. This weighted sum is computed for each hidden node and each output node, and it is how interactions are represented in the network.
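
As a small illustration, the weighted sum for one node might be computed as in the sketch below. The input and weight values are made up for the example, and the function name weighted_sum is hypothetical.

```python
# Hypothetical values for inputs X1..X5 and weights W1..W5 of one node.
inputs = [1.0, 2.0, 3.0, 4.0, 5.0]
weights = [0.5, -1.0, 0.25, 0.0, 2.0]


def weighted_sum(xs, ws):
    """Return X1*W1 + X2*W2 + ... + Xn*Wn."""
    if len(xs) != len(ws):
        raise ValueError("each input needs exactly one weight")
    return sum(x * w for x, w in zip(xs, ws))


total = weighted_sum(inputs, weights)
print(total)  # prints 9.25
```

A weight of 0.0 (W4 here) means that input has no influence on this node, while a large weight such as W5 makes the input dominate the sum.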

Each summation is then transformed using a nonlinear function before the value is passed on to the next layer.
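
Putting these pieces together, a full pass through a network with 5 inputs and 2 outputs could be sketched as follows. The text does not name the nonlinear function, so the sigmoid used here is an assumption (it is one common choice). The choice of three hidden nodes and all the weight and input values are likewise made up for illustration.

```python
import math


def sigmoid(s):
    """Assumed nonlinear function: squashes any sum into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-s))


def layer(values, weight_rows):
    """Each row of weights belongs to one node: weighted sum, then sigmoid."""
    return [
        sigmoid(sum(v * w for v, w in zip(values, row)))
        for row in weight_rows
    ]


# Hypothetical weights: 3 hidden nodes, each connected to all 5 inputs.
hidden_weights = [
    [0.2, -0.1, 0.4, 0.0, 0.3],
    [-0.5, 0.1, 0.0, 0.2, -0.3],
    [0.1, 0.1, 0.1, 0.1, 0.1],
]
# Hypothetical weights: 2 output nodes, each connected to all 3 hidden nodes.
output_weights = [
    [1.0, -1.0, 0.5],
    [-0.5, 0.5, 1.0],
]

x = [1.0, 0.5, -1.0, 2.0, 0.0]       # inputs X1..X5 (made-up values)
hidden = layer(x, hidden_weights)    # values passed on from the hidden layer
z = layer(hidden, output_weights)    # outputs Z1 and Z2
print(z)
```

Because every hidden node sees all five inputs and every output node sees all hidden nodes, each output depends on the inputs taken together rather than one at a time.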