Machine Learning is a subset of artificial intelligence in which computers learn from data without being explicitly programmed. That is, a person does not have to write the instructions for each task by hand. We must see it as a project and, as such, it requires processes or stages to reach the proposed objective.
Next, we will go through the stages to take into account if we want to take the path of Machine Learning:
Stage 1: Define the objective
It is vital to understand the problem to be solved. We must be realistic about our objective given the characteristics of the company, as well as the data that we will have available. Recall that data is the main input of Machine Learning.
The following questions are typical at this stage:
- What exactly do we want to do?
- How exactly can we do it?
- Is what I want possible given the data I have?
Stage 2: Data collection
After defining our goal, we proceed to data collection. Remember that our Machine Learning algorithm needs data to learn from.
This stage turns out to be easy if we are dealing with a company whose collection process is orderly. It is important to understand the strengths and limitations of the data, because the data rarely matches the problem to be solved exactly.
Also, remember that our own systems will not necessarily hold all the data needed to solve the problem. In some cases we will have to resort to external sources, which can be free of charge, and in other cases we will have to buy the data. Sometimes the data simply cannot be obtained because it does not exist.
Stage 3: Prepare the data
Once we have the data, we continue with preprocessing, usually known as data cleaning and formatting.
The objective of this stage is to manipulate and convert the data in ways that produce better results.
Typical examples of data preparation are: removing or imputing missing values, categorizing or encoding the values of the variables, and normalizing or scaling the numerical values so that they are comparable.
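The preparation steps listed above can be sketched in a few lines of plain Python. This is only a minimal illustration under stated assumptions: the records, the column names (age, income, city) and the helper functions impute_mean, min_max_scale and one_hot are all hypothetical, and mean imputation and min-max scaling are just one possible choice for each step.

```python
from statistics import mean

# Hypothetical raw records; None marks a missing value.
rows = [
    {"age": 25, "income": 30000.0, "city": "Lima"},
    {"age": None, "income": 52000.0, "city": "Quito"},
    {"age": 40, "income": None, "city": "Lima"},
    {"age": 31, "income": 41000.0, "city": "Bogota"},
]

def impute_mean(rows, column):
    """Replace missing values in a numeric column with the column mean."""
    fill = mean(r[column] for r in rows if r[column] is not None)
    return [{**r, column: fill if r[column] is None else r[column]} for r in rows]

def min_max_scale(rows, column):
    """Rescale a numeric column to the range [0, 1]."""
    values = [r[column] for r in rows]
    low, high = min(values), max(values)
    span = (high - low) or 1.0  # avoid dividing by zero on a constant column
    return [{**r, column: (r[column] - low) / span} for r in rows]

def one_hot(rows, column):
    """Replace a categorical column with one 0/1 indicator per category."""
    categories = sorted({r[column] for r in rows})
    encoded = []
    for r in rows:
        new = {k: v for k, v in r.items() if k != column}
        for c in categories:
            new[f"{column}={c}"] = 1 if r[column] == c else 0
        encoded.append(new)
    return encoded

prepared = rows
for col in ("age", "income"):
    prepared = impute_mean(prepared, col)    # infer missing values
    prepared = min_max_scale(prepared, col)  # make columns comparable
prepared = one_hot(prepared, "city")         # encode the categorical variable
```

After these steps every value is numeric and the numeric columns share the same 0 to 1 range, which is the form most algorithms expect.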
Stage 4: Choice of the algorithm
Once we have preprocessed the data, it is up to us to choose the most appropriate algorithm for the problem we wish to solve.
This is where we must opt for a supervised or an unsupervised learning algorithm, depending on whether or not our data includes the answers (labels) we want the model to predict. Supervised learning algorithms include: Linear Regression, Logistic Regression, Decision Tree Regression, K-Nearest Neighbors (KNN), Support Vector Machine (SVM), Decision Tree Classification, etc.
Unsupervised learning algorithms include: K-Means Clustering, Hierarchical Clustering, Principal Component Analysis, etc.
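To make the difference between the two families concrete, here is a small sketch in plain Python. It contrasts a 1-nearest-neighbor classifier (supervised, it needs labels) with a tiny one-dimensional K-Means with two clusters (unsupervised, it only sees values). The data and the function names nearest_neighbor_label and two_means are hypothetical and chosen only for illustration.

```python
# Supervised: each example comes with a label, and the model learns to predict it.
labeled = [(1.0, "small"), (1.5, "small"), (8.0, "large"), (9.0, "large")]

def nearest_neighbor_label(x, examples):
    """1-nearest-neighbor: copy the label of the closest labeled example."""
    return min(examples, key=lambda e: abs(e[0] - x))[1]

# Unsupervised: there are no labels; the algorithm groups similar values itself.
unlabeled = [1.0, 1.5, 8.0, 9.0]

def two_means(values, iterations=10):
    """A tiny 1-D K-Means with k=2, started from the smallest and largest value."""
    centers = [min(values), max(values)]
    for _ in range(iterations):
        groups = ([], [])
        for v in values:
            closest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[closest].append(v)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers

predicted_label = nearest_neighbor_label(2.0, labeled)
cluster_centers = two_means(unlabeled)
```

The first function can only work because someone provided the labels; the second discovers two groups without being told what they mean.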
Stage 5: Train the model
Once the algorithm is chosen, we proceed to split the preprocessed data. A percentage of the data, commonly 70% of the total, will be used as training data.
That is, it is the portion to which we will apply the selected algorithm. With this data we will seek to achieve the objective initially set.
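As a rough sketch of this stage, the code below shuffles a data set, keeps 70% for training and fits a straight line to that portion by ordinary least squares. The data is synthetic, and train_test_split and fit_line are hypothetical helper names written for this example; Linear Regression is used only because it is the simplest algorithm from the list in stage 4.

```python
import random

def train_test_split(samples, train_fraction=0.7, seed=0):
    """Shuffle a copy of the samples and split it into two parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed for a repeatable split
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def fit_line(samples):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Synthetic data generated from y = 2x + 1 (no noise, to keep the sketch simple).
data = [(float(x), 2.0 * x + 1.0) for x in range(20)]

train, validation = train_test_split(data)  # 70% / 30%
slope, intercept = fit_line(train)          # the model only sees the training part
```

The remaining 30% is deliberately left untouched so that it can be used later to check the model on data it has never seen.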
Stage 6: Validate the model
Now that we have the trained model, the next step is to validate it. We will do this with the remaining data, the portion we did not use for training, which we will call validation data.
We will run the trained model on the validation data and evaluate the results obtained.
We must be aware that the model may work well on the training data but not on the validation data (the overfitting problem).
Therefore, we will go back and repeat training (stage 5) and validation (stage 6) until our model fits both partitions well (training data and validation data). All this with the purpose of gaining confidence in our model.
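One simple way to spot overfitting is to compare the error on the training data with the error on the validation data. The sketch below does this with a model that merely memorizes its training points (a 1-nearest-neighbor regressor). The data, the tolerance of 0.1 and the helper names memorizer, mean_squared_error and looks_overfit are assumptions made for this illustration, not a standard rule.

```python
def mean_squared_error(model, samples):
    """Average squared difference between predictions and true values."""
    return sum((model(x) - y) ** 2 for x, y in samples) / len(samples)

def memorizer(train):
    """1-nearest-neighbor regressor: returns the target of the closest training point."""
    def predict(x):
        return min(train, key=lambda s: abs(s[0] - x))[1]
    return predict

def looks_overfit(train_error, validation_error, tolerance=0.1):
    """Flag a model whose validation error is clearly worse than its training error."""
    return validation_error - train_error > tolerance

# Hypothetical noisy observations around y = x + 1.
train = [(0.0, 1.5), (2.0, 2.5), (4.0, 5.5), (6.0, 6.5)]
validation = [(0.9, 1.9), (3.1, 4.1), (5.2, 6.2)]

model = memorizer(train)
train_error = mean_squared_error(model, train)            # perfect on memorized points
validation_error = mean_squared_error(model, validation)  # worse on unseen points
overfit = looks_overfit(train_error, validation_error)
```

A large gap between the two errors is the signal to go back, adjust the model or the data, and train again.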
Stage 7: Prediction
Once our model has overcome the problem of overfitting, the next step is to make predictions, which we obtain by feeding new data into our model.
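A minimal sketch of the prediction step follows. It assumes a linear model whose parameters were already obtained in training, and it applies to the new data the same scaling that was used on the training data. The values stored in scaler and model are hypothetical placeholders, not results from a real training run.

```python
# Hypothetical artifacts kept from the earlier stages: the scaling bounds
# computed on the training data and the fitted parameters of a linear model.
scaler = {"low": 0.0, "high": 50.0}
model = {"slope": 4.0, "intercept": 1.0}

def scale(x, scaler):
    """Apply the same min-max scaling that was used during data preparation."""
    return (x - scaler["low"]) / (scaler["high"] - scaler["low"])

def predict(x_raw):
    """Prepare one new raw value and run it through the trained model."""
    return model["slope"] * scale(x_raw, scaler) + model["intercept"]

new_inputs = [0.0, 25.0, 50.0]  # data the model has never seen
predictions = [predict(x) for x in new_inputs]
```

The important point is that new data must go through the same preparation as the training data before it reaches the model.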
These are the Machine Learning process steps in detail.
