In predictive modeling, our goal is to predict quantitative or categorical answers to specific questions, within the limits of the von Neumann architecture. The prediction is based on historical data. At a high level, the prediction (that is, the output) can be of two types:
- Discrete output.
- Continuous output.
A prediction based on historical data can be either a quantity or a category. For example, from historical price data we can predict the price of an apartment. Similarly, we can try to categorize a tumor as malignant or benign based on features such as size and age. In both cases we are trying to predict a value. In the first case we get a continuous value such as 100K or 120K. In the second case we get a discrete value such as benign or malignant.
So we can say that classification is about predicting a label (a discrete output) and regression is about predicting a quantity (a continuous output).
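To make the difference concrete, here is a minimal sketch in Python. The data is made up for illustration, and the nearest-neighbour rule is just a simple stand-in for a real model (the function names `predict_quantity` and `predict_label` are hypothetical too). The point is only that the same historical-data idea gives a number in one case and a label in the other.

```python
from collections import Counter
from statistics import mean

# Hypothetical historical data, invented for this illustration.
apartments = [(50, 100_000), (60, 120_000), (80, 160_000), (100, 200_000)]  # (area, price)
tumors = [(1.0, "benign"), (1.5, "benign"), (2.0, "benign"),
          (4.0, "malignant"), (4.5, "malignant"), (5.0, "malignant")]       # (size, label)

def k_nearest_outputs(history, x, k):
    """Outputs of the k historical examples whose input is closest to x."""
    ranked = sorted(history, key=lambda pair: abs(pair[0] - x))
    return [output for _, output in ranked[:k]]

def predict_quantity(history, x, k=2):
    """Regression: average the outputs of the nearest examples (continuous)."""
    return mean(k_nearest_outputs(history, x, k))

def predict_label(history, x, k=3):
    """Classification: majority label among the nearest examples (discrete)."""
    return Counter(k_nearest_outputs(history, x, k)).most_common(1)[0][0]

print(predict_quantity(apartments, 70))  # a quantity
print(predict_label(tumors, 4.2))        # a label
```

Note that the regression function can return a price that never appeared in the history, while the classifier can only ever return one of the labels it has already seen.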
Function Approximation in Supervised Learning:
Predictive modeling can be framed as the mathematical problem of approximating a mapping function (f) from input variables (X) to output variables (y). This is called the problem of function approximation.
The modeling algorithm searches for the best fit to the historical data, and that best fit is its approximation of the mapping function.
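As a rough sketch of what "finding the best fit" can mean, the snippet below approximates f with a straight line chosen by ordinary least squares. This is one possible choice of algorithm and of "best", assumed here for illustration. The data is invented, and the names `fit_line` and `squared_error` are hypothetical helpers.

```python
from statistics import mean

# Hypothetical historical data: X = apartment area, y = price (slightly noisy).
areas = [50, 60, 80, 100]
prices = [101_000, 119_000, 161_000, 199_000]

def fit_line(xs, ys):
    """Least-squares fit of y ~ a*x + b; returns the approximated mapping function."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    return lambda x: slope * x + intercept

def squared_error(f, xs, ys):
    """Total squared difference between the predictions f(x) and the observed y."""
    return sum((f(x) - y) ** 2 for x, y in zip(xs, ys))

f_hat = fit_line(areas, prices)          # our approximation of f
print(f_hat(70))                         # predicted price for an unseen area
print(squared_error(f_hat, areas, prices))
```

Here "best" means the line with the smallest total squared error on the historical data. Any other straight line, for example a hand-picked guess like `lambda x: 2000 * x`, gives an error at least as large on the same data.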
To be continued…