A machine learning (ML) model must not memorize the training data. Instead, it should learn well from the given training data so that it can generalize well to new, unseen data.
The default settings of an ML model may not work well for every type of problem that we try to solve. We need to adjust these settings manually to get better results. Here, "settings" refer to hyperparameters.
What’s a hyperparameter in an ML model?
The user manually defines a hyperparameter value before the training process; the model does not learn its value from the data during training. Once defined, its value stays fixed until the user changes it.
We need to distinguish between a hyperparameter and a parameter.
A parameter learns its value from the given data, and its value depends on the values of the hyperparameters. A parameter value is updated during the training process.
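To make the distinction concrete, here is a minimal sketch using scikit-learn's LogisticRegression (the model and dataset choices are ours for illustration):

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

# C and max_iter are hyperparameters: the user fixes them before training
model = LogisticRegression(C=0.5, max_iter=1000)
model.fit(X, y)

# coef_ and intercept_ are parameters: their values are learned from the data
print(model.coef_)
print(model.intercept_)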
Here is an example of how different hyperparameter values affect the Support Vector Machine (SVM) model.
from sklearn.svm import SVC
clf_1 = SVC(kernel='linear')
clf_2 = SVC(kernel='poly', degree=3)
clf_3 = SVC(kernel='poly', degree=1)
Both the clf_1 and clf_3 models perform linear SVM classification (a degree-1 polynomial kernel is linear), while the clf_2 model performs non-linear classification. In this case, the user can perform both linear and non-linear classification tasks by changing the value of the kernel hyperparameter in the SVC() class.
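As a quick sketch, the three classifiers above can be trained and compared on a toy dataset (the dataset choice is ours; a non-linear dataset such as make_moons makes the effect of the kernel visible):

from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split

# A non-linearly separable toy dataset
X, y = make_moons(n_samples=200, noise=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Reusing clf_1, clf_2 and clf_3 defined above
for name, clf in [('linear', clf_1), ('poly, degree 3', clf_2), ('poly, degree 1', clf_3)]:
    clf.fit(X_train, y_train)
    print(name, clf.score(X_test, y_test))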
What’s hyperparameter tuning?
Hyperparameter tuning is an iterative process of optimizing a model's performance by finding the optimal values for its hyperparameters without causing overfitting.
Sometimes, as in the SVM example above, the choice of some hyperparameter values depends on the type of task (linear or non-linear) that we want to solve. In that case, the user can simply set kernel='linear' for linear classification and kernel='poly' for non-linear classification. It is an easy choice.
However, for a hyperparameter such as C, which takes continuous values, the user needs to use advanced search methods to select a good value.
Before discussing search methods, we need to understand two important definitions: hyperparameter search space and hyperparameter distribution.
Hyperparameter search space
The hyperparameter search space contains the set of possible hyperparameter value combinations defined by the user. The search will be limited to this space.
The search space can be n-dimensional, where n is a positive integer.
The number of dimensions in the search space equals the number of hyperparameters (e.g., a three-dimensional search space has 3 hyperparameters).
The search space is defined as a Python dictionary that contains hyperparameter names as keys and lists of candidate values for those hyperparameters as values.
search_space = {'hyparam_1': [val_1, val_2],
                'hyparam_2': [val_1, val_2],
                'hyparam_3': ['str_val_1', 'str_val_2']}
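For example, a three-dimensional search space for the SVC model used earlier might look like this (the candidate values are ours for illustration):

search_space = {'C': [0.1, 1, 10],
                'kernel': ['linear', 'poly'],
                'degree': [2, 3]}
# 3 x 2 x 2 = 12 possible combinations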
Hyperparameter distribution
The underlying distribution of a hyperparameter is also important because it determines how likely each value is to be tested during the tuning process. There are four popular types of distributions.
- Uniform distribution: All possible values within the search space have an equal chance of being chosen.
- Log-uniform distribution: A logarithmic scale is applied to uniformly distributed values. This is useful when the range of a hyperparameter is large.
- Normal distribution: Values are distributed around a mean of zero with a standard deviation of 1 (the standard normal distribution).
- Log-normal distribution: A logarithmic scale is applied to normally distributed values. This is also useful when the range of a hyperparameter is large.
The choice of distribution also depends on the type of value that the hyperparameter takes. A hyperparameter can take discrete or continuous values. A discrete value may be an integer or a string, while a continuous value is always a floating-point number.
from scipy.stats import randint, uniform, loguniform, norm

# Define the parameter distributions
param_distributions = {
    'hyparam_1': randint(low=50, high=75),       # discrete uniform over integers 50-74
    'hyparam_2': uniform(loc=0.01, scale=0.19),  # continuous uniform over [0.01, 0.2]
    'hyparam_3': loguniform(0.1, 1.0)            # log-uniform over [0.1, 1.0]
}
# norm(loc=0, scale=1) can be used in the same way for a normal distribution
- randint(low=50, high=75): Selects random integers between 50 and 74 (the upper bound is exclusive)
- uniform(loc=0.01, scale=0.19): Selects floating-point numbers uniformly between 0.01 and 0.2, i.e., from loc to loc + scale (continuous uniform distribution)
- loguniform(0.1, 1.0): Selects values between 0.1 and 1.0 on a log scale (log-uniform distribution)
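To see what these distributions actually produce, each frozen scipy.stats distribution can be sampled with its .rvs() method. A minimal sketch:

from scipy.stats import randint, uniform, loguniform

# Draw 5 samples from each distribution
print(randint(low=50, high=75).rvs(size=5, random_state=0))       # integers in [50, 74]
print(uniform(loc=0.01, scale=0.19).rvs(size=5, random_state=0))  # floats in [0.01, 0.2]
print(loguniform(0.1, 1.0).rvs(size=5, random_state=0))           # log-uniform in [0.1, 1.0]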
Hyperparameter tuning methods
There are many different types of hyperparameter tuning methods. In this article, we will focus on only three methods, which fall under the exhaustive search category. In an exhaustive search, the search algorithm exhaustively searches the entire search space. The three methods in this category are manual search, grid search and random search.
Manual search
There is no search algorithm behind manual search. The user simply sets some values based on intuition and observes the results. If the result isn't good, the user tries another value, and so on. Because the user learns from previous attempts and sets better values in future attempts, manual search falls under the informed search category.
There is no clear definition of the hyperparameter search space in manual search. This method can be time-consuming, but it may be useful when combined with other methods such as grid search or random search.
Manual search becomes difficult when we have to search over two or more hyperparameters at once.
An example of manual search is that the user can simply set kernel='linear' for linear classification and kernel='poly' for non-linear classification in an SVM model.
from sklearn.svm import SVC
linear_clf = SVC(kernel='linear')
non_linear_clf = SVC(kernel='poly')
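Here is a minimal sketch of manual search for the C hyperparameter, trying a few values one at a time and inspecting the results (the candidate values and dataset are our choices for illustration):

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Try one value, look at the score, then try another value, and so on
for C in [0.1, 1, 10]:
    score = cross_val_score(SVC(C=C, kernel='linear'), X, y, cv=5).mean()
    print(f"C={C}: mean CV accuracy = {score:.3f}")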
Grid search
In grid search, the search algorithm tests all possible hyperparameter combinations defined in the search space. Therefore, it is a brute-force method. It is time-consuming and requires more computational power, especially as the number of hyperparameters increases (the curse of dimensionality).
To use this method effectively, we need a well-defined hyperparameter search space. Otherwise, we will waste a lot of time testing unnecessary combinations.
However, the user doesn't need to specify the distribution of the hyperparameters.
The search algorithm doesn't learn from previous attempts (iterations) and therefore doesn't try better values in future attempts. As a result, grid search falls under the uninformed search category.
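Here is a minimal sketch of grid search using scikit-learn's GridSearchCV (the search space and dataset are our choices for illustration):

from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# 4 x 2 = 8 combinations, each evaluated with 5-fold cross-validation
param_grid = {'C': [0.1, 1, 10, 100],
              'kernel': ['linear', 'rbf']}

grid = GridSearchCV(SVC(), param_grid, cv=5)
grid.fit(X, y)

print(grid.best_params_)  # the best combination found
print(grid.best_score_)   # its mean cross-validated accuracy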
Random search
In random search, the search algorithm tests randomly chosen hyperparameter combinations in each iteration. As with grid search, it doesn't learn from previous attempts and therefore doesn't try better values in future attempts. Therefore, random search also falls under the uninformed search category.
Random search is much better than grid search when there is a large search space and we have no prior idea about good hyperparameter values. It is also considered computationally efficient.
When we provide the same size of hyperparameter space to grid search and random search, we can't see much difference between the two. We have to define a bigger search space in order to take advantage of random search over grid search.
There are two ways to increase the size of the hyperparameter search space, as shown in the sketch after this list.
- By increasing the dimensionality (adding new hyperparameters)
- By widening the range of hyperparameters
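As an illustration of the two options (the values are ours):

# Original space: one dimension, three values
space_small = {'C': [0.1, 1, 10]}

# Option 1: increase dimensionality by adding a new hyperparameter
space_higher_dim = {'C': [0.1, 1, 10],
                    'kernel': ['linear', 'poly', 'rbf']}

# Option 2: widen the range of an existing hyperparameter
space_wider = {'C': [0.001, 0.01, 0.1, 1, 10, 100, 1000]}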
It is recommended to define the underlying distribution for each hyperparameter. If not defined, the algorithm will use the default, which is the uniform distribution, in which all combinations have the same probability of being chosen.
There are two important hyperparameters in the random search method itself!
- n_iter: The number of iterations, i.e., the size of the random sample of hyperparameter combinations to test. Takes an integer. This trades off runtime against the quality of the output. We need to define this to allow the algorithm to test a random sample of combinations.
- random_state: We need to define this hyperparameter to get the same output across multiple function calls.
The main drawback of random search is that it produces high variance across multiple function calls with different random states.
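Here is a minimal sketch of random search using scikit-learn's RandomizedSearchCV; the distributions and dataset are our choices for illustration, and fixing random_state makes the result reproducible:

from scipy.stats import loguniform
from sklearn.datasets import load_iris
from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# C is drawn from a log-uniform distribution because its useful
# range spans several orders of magnitude
param_distributions = {'C': loguniform(1e-2, 1e2),
                       'kernel': ['linear', 'rbf']}

random_search = RandomizedSearchCV(SVC(), param_distributions,
                                   n_iter=20,        # test 20 random combinations
                                   random_state=42,  # reproducible across calls
                                   cv=5)
random_search.fit(X, y)

print(random_search.best_params_)
print(random_search.best_score_)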
This is the end of today's article.
Please let me know if you have any questions or feedback.
See you in the next article. Happy learning to you!
Designed and written by:
Rukshan Pramoditha
2025-08-22