Meta-labeling

Meta-labeling, also known as corrective AI, is a machine learning (ML) technique used in quantitative finance to enhance the performance of investment and trading strategies. It was developed in 2017 by Marcos López de Prado at Guggenheim Partners and Cornell University.[1] The core idea is to separate the decision of trade direction (side) from the decision of trade sizing, addressing the inefficiencies of learning both simultaneously. The side decision involves forecasting market movements (long, short, neutral), while the size decision focuses on risk management and profitability. Meta-labeling serves as a secondary decision-making layer that evaluates the signals generated by a primary predictive model. By assessing the confidence and likely profitability of those signals, it allows investors and algorithms to dynamically size positions and suppress false positives.[1]

Motivation

Meta-labeling is designed to improve precision without sacrificing recall. As noted by López de Prado, attempting to model both the direction and the magnitude of a trade with a single algorithm can result in poor generalization. Separating these tasks gives the strategy greater flexibility and robustness.
Applications

Meta-labeling has been applied in a variety of financial ML contexts, including the filtering of trade signals and the sizing of positions.
General architecture

Meta-labeling decouples two core components of systematic trading strategies: directional prediction and position sizing. The process involves training a primary model to generate trade signals (e.g., buy, sell, or hold) and then training a secondary model to determine whether each signal is likely to lead to a profitable trade. The secondary model outputs a probability that is interpreted as the confidence in the forecast, which can be used to adjust the position size or to filter out unreliable trades.[1][2] Meta-labeling is typically implemented as a three-stage process:[2][3]
Stage 1: Forecasting side

Figure 1: Primary model architecture.[2]

Figure 1 presents the architecture of a primary model, which focuses on forecasting the side of the trade. This model (M1) takes input data, such as open-high-low-close prices, and determines the side of the position to take: a negative number indicates a short position and a positive number a long position, with the output ranging between −1 and 1 (the closer it is to −1 or 1, the stronger the model's conviction). When training the model, the labels are −1 and 1, based on the direction of forward returns over some predefined investment horizon. The researcher may apply a recall check by setting a minimum threshold (τ, "tau") that the initial output must exceed to qualify as a short or long position; if the threshold is not met, no side forecast is made and any open positions are closed. The primary model output is therefore one of three possible side forecasts: −1, 0, or 1. The primary model also generates evaluation data that the secondary model can use to improve its size forecasts, such as rolling accuracy, F1, recall, precision, and AUC scores.

Stage 2: Filtering out false positives

Figure 2: General meta-labeling architecture.[2]

The next stage filters out false positives by applying a secondary machine learning model (M2), a binary classifier trained to determine whether a trade will be profitable. The model takes as input four general groupings of data, including the primary model's side forecasts and its evaluation data.
The output of the secondary model is a value between −1 and 1 (if using a tanh activation), indicating the strength of the conviction that a short or long position will be profitable, or between 0 and 1 (if using a sigmoid activation) when one only wants to know whether the trade makes money. This output allows trades that are likely to lead to losses to be filtered out. One can stop at this point, or use the outputs of the secondary model as inputs to a position-sizing algorithm (M3), which can further enhance strategy performance by translating the secondary model's output probability into a position size. Higher confidence scores result in larger allocations, while lower confidence leads to reduced or zero exposure.

Stage 3: Optimizing position sizes

Position sizing methods (M3)

Various algorithms have been proposed for transforming predicted probabilities into trade sizes, ranging from fixed sizing rules to methods such as ECDF and SOPS that estimate their parameters directly from the training data.[3]
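The following is a minimal end-to-end sketch of the three stages, assuming synthetic features, a logistic-regression primary model, a random-forest secondary model, and an illustrative threshold τ = 0.1; none of these choices come from the cited papers:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Toy data: X holds market features, r holds forward returns.
X = rng.normal(size=(1000, 5))
r = 0.01 * X[:, 0] + rng.normal(scale=0.02, size=1000)

# Stage 1: primary model M1 forecasts the side from the sign of forward returns.
m1 = LogisticRegression().fit(X, np.where(r > 0, 1, -1))
raw_side = 2 * m1.predict_proba(X)[:, 1] - 1   # rescale probability to [-1, 1]
tau = 0.1                                      # recall threshold (illustrative)
side = np.where(np.abs(raw_side) > tau, np.sign(raw_side), 0)

# Stage 2: secondary model M2 meta-labels each taken signal: 1 if the
# primary call was profitable, 0 otherwise. Inputs combine the original
# features with M1's conviction (a stand-in for its evaluation data).
taken = side != 0
meta_y = (side[taken] * r[taken] > 0).astype(int)
meta_X = np.column_stack([X[taken], raw_side[taken]])
m2 = RandomForestClassifier(n_estimators=100, random_state=0).fit(meta_X, meta_y)
p_profit = m2.predict_proba(meta_X)[:, 1]      # in-sample, for illustration only

# Stage 3: M3 maps M2's confidence to a position size; a simple linear
# ramp above 0.5 stands in for the sizing methods mentioned above.
size = np.clip((p_profit - 0.5) * 2, 0.0, 1.0)
position = side[taken] * size
```

In a real backtest the three models would be fit and evaluated on separate, chronologically split samples; the in-sample reuse here only keeps the example short.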
Model calibration

Each machine learning algorithm used in meta-labeling tends to produce outputs with a different characteristic distribution; for example, some are approximately normally distributed, whereas others exhibit a pronounced U-shape, concentrating probabilities near the extremes.[4] Because of these varying distributions, simply summing the outputs of different models can inadvertently lead to uneven weighting of signals, biasing trade decisions. To address this, model calibration techniques are used to adjust the predicted probabilities towards frequentist probabilities, ensuring that model outputs reflect true likelihoods more accurately. Two common calibration techniques are:

- Platt scaling, which fits a sigmoid function mapping raw model outputs to calibrated probabilities.
- Isotonic regression, which fits a non-decreasing step function to the outputs; it is more flexible than Platt scaling but needs more data to avoid overfitting.
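As an illustration, scikit-learn's CalibratedClassifierCV implements both techniques; the secondary model and the synthetic data below are assumptions made for the example:

```python
import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
X_meta = rng.normal(size=(500, 6))                               # secondary-model inputs
y_meta = (X_meta[:, 0] + rng.normal(size=500) > 0).astype(int)   # 1 = profitable trade

# method="sigmoid" applies Platt scaling; method="isotonic" fits an
# isotonic regression. Either way the calibration map is learned on
# held-out folds (cv=5), not on the data that trained the model itself.
base = RandomForestClassifier(n_estimators=100, random_state=0)
calibrated = CalibratedClassifierCV(base, method="isotonic", cv=5)
calibrated.fit(X_meta, y_meta)
p_calibrated = calibrated.predict_proba(X_meta)[:, 1]
```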
Transforming predictions into frequentist probabilities is important because it yields probabilistic outputs that are directly interpretable as the actual likelihood of an event occurring. Such calibration significantly enhances the effectiveness of fixed position sizing methods, reducing maximum drawdowns and increasing risk-adjusted returns. Calibration has less impact on position sizing methods that estimate their parameters directly from the training data, such as ECDF and SOPS, suggesting that it is a critical step mainly for fixed methods that rely heavily on raw model outputs.
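A minimal sketch of the ECDF idea, assuming the size is taken to be the empirical rank of a new probability within the training-set probabilities; the function name and interface are hypothetical, and the cited papers define the exact procedure:

```python
import numpy as np

def ecdf_position_size(p_train, p_new):
    """Map a new predicted probability to a size in [0, 1] by ranking it
    against the distribution of probabilities seen in training.

    p_train: the secondary model's predicted probabilities on the
             training set, which estimate the ECDF.
    p_new:   the probability for the trade being sized.
    """
    p_train = np.sort(np.asarray(p_train))
    # Fraction of training probabilities at or below p_new = empirical CDF.
    return np.searchsorted(p_train, p_new, side="right") / len(p_train)

# A probability near the top of the training distribution receives close
# to full size; one near the median receives about half size.
train_probs = np.random.default_rng(2).uniform(0.3, 0.9, size=1000)
print(ecdf_position_size(train_probs, 0.85))
print(ecdf_position_size(train_probs, 0.60))
```

Because the sizing map is estimated from the training probabilities themselves, it adapts to each model's output distribution, which is why calibration matters less here than for fixed methods.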
Meta-labeling architectures

Various model architectures exist, each tailored to different aspects and complexities of trading strategy development.[8]

Discrete long and short

Recognizing that the factors driving long and short positions can differ significantly, this architecture splits meta-labeling into two specialized secondary models: one optimized for long positions and another for short positions.

Components:
- Primary model: generates directional trade signals.
- Two secondary models: one trained on the primary model's long signals and one trained on its short signals.
Separate feature sets may be employed to reflect the distinct informational drivers of market rallies versus sell-offs.
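A minimal sketch of the split, assuming synthetic data and random-forest secondary models; all names and data are illustrative:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(3)
X = rng.normal(size=(1000, 5))
side = rng.choice([-1, 1], size=1000)          # primary model's directional calls
profitable = ((X[:, 0] * side + rng.normal(size=1000)) > 0).astype(int)

# One specialized secondary model per direction; each could also use its
# own feature set to capture the different drivers of rallies and sell-offs.
long_mask, short_mask = side == 1, side == -1
m2_long = RandomForestClassifier(random_state=0).fit(X[long_mask], profitable[long_mask])
m2_short = RandomForestClassifier(random_state=0).fit(X[short_mask], profitable[short_mask])

def meta_confidence(x, s):
    """Route a new observation to the secondary model matching its side."""
    model = m2_long if s == 1 else m2_short
    return model.predict_proba(x.reshape(1, -1))[0, 1]
```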
Sequential meta-labeling (SMLA)

The SMLA introduces multiple layers of secondary models. Each secondary model's inputs include the outputs and evaluation statistics of the previous secondary models, and this iterative process incrementally improves accuracy.

Components:
- Primary model: predicts the initial trade direction.
- Sequential secondary models: each subsequent model receives the outputs and evaluation statistics of the preceding models as additional inputs.
Final predictions reflect the accumulated insights and error corrections of the preceding models.
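A minimal sketch, assuming logistic-regression layers and passing only each layer's predicted probability forward; the evaluation statistics named above are omitted for brevity:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(4)
X = rng.normal(size=(1000, 5))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=1000) > 0).astype(int)  # 1 = profitable

# Each layer sees the original features plus every earlier layer's
# predicted probability, so later layers can correct earlier errors.
layers, inputs = [], X
for depth in range(3):
    model = LogisticRegression().fit(inputs, y)
    p = model.predict_proba(inputs)[:, 1]
    layers.append(model)
    inputs = np.column_stack([inputs, p])

final_confidence = p   # output of the last secondary model in the chain
```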
Conditional meta-labeling (CMLA)

The CMLA partitions the data by market state or regime and applies specialized secondary models tailored to each condition, explicitly recognizing that trading strategy performance varies significantly across market conditions.

Components:
- Primary model: provides the base directional signals.
- Condition-specific secondary models: each trained only on data from its designated market state or regime.
The models' outputs are merged into a final decision function.
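A minimal sketch, assuming a volatility-style regime flag and random-forest secondary models; both are illustrative assumptions:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(5)
X = rng.normal(size=(1000, 5))
profitable = ((X[:, 0] + rng.normal(size=1000)) > 0).astype(int)
# Illustrative regime flag, e.g. 0 = low volatility, 1 = high volatility.
regime = (np.abs(X[:, -1]) > 1).astype(int)

# One secondary model per market state, each fit only on its regime's data.
models = {s: RandomForestClassifier(random_state=0).fit(X[regime == s],
                                                        profitable[regime == s])
          for s in (0, 1)}

def conditional_confidence(x, s):
    """Route a new observation to the model for its current regime s."""
    return models[s].predict_proba(x.reshape(1, -1))[0, 1]
```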
Ensemble meta-labeling

Ensemble methods combine multiple model predictions to achieve better performance than individual models by balancing bias and variance. Two prominent ensemble architectures are bagging and boosting.

1. Bagging meta-labeling

Employs bootstrap aggregation (bagging), training multiple secondary models on bootstrapped samples of the data to mitigate variance and overfitting.

Components:
- Primary model: generates the initial directional signals.
- Multiple secondary models: each trained on a different bootstrapped sample of the data.
Predictions are combined via majority voting or weighted aggregation.
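A minimal sketch in which scikit-learn's BaggingClassifier stands in for explicitly training the separate secondary models; the data and base learner are illustrative:

```python
import numpy as np
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(6)
X = rng.normal(size=(1000, 5))
profitable = ((X[:, 0] + rng.normal(size=1000)) > 0).astype(int)

# Each of the 25 secondary models is fit on a bootstrapped sample;
# predict_proba averages their votes, reducing variance and overfitting.
bagged_m2 = BaggingClassifier(DecisionTreeClassifier(), n_estimators=25,
                              bootstrap=True, random_state=0)
bagged_m2.fit(X, profitable)
confidence = bagged_m2.predict_proba(X)[:, 1]
```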
2. Boosting meta-labeling

Sequentially trains secondary models, each aiming to correct the mistakes of its predecessor. This approach is particularly effective at addressing bias and under-fitting.

Components:
- Primary model: provides the initial trade signals.
- Sequentially trained secondary models: each fitted to the errors of the ensemble built so far.
The final output combines the sequential error corrections into a single enhanced prediction.
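A minimal sketch in which gradient boosting stands in for the sequence of error-correcting secondary models; the data and hyperparameters are illustrative:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(7)
X = rng.normal(size=(1000, 5))
profitable = ((X[:, 0] + rng.normal(size=1000)) > 0).astype(int)

# Gradient boosting fits each new tree to the residual errors of the
# ensemble built so far, directly targeting bias and under-fitting.
boosted_m2 = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1,
                                        random_state=0)
boosted_m2.fit(X, profitable)
confidence = boosted_m2.predict_proba(X)[:, 1]
```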
Inverse meta-labeling

Inverse meta-labeling reverses the standard process: important features identified by the secondary model are first used to refine and improve the primary model. This iterative improvement cycle helps create a more effective primary model before meta-labeling is applied.

Components:
- Primary model: provides the base directional signals.
- Initial secondary model: identifies the features that best explain when the primary model's signals are profitable.
- Adjusted primary model: retrained using the features highlighted by the secondary model.
- Revised secondary model: fitted to the signals of the adjusted primary model.
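A minimal sketch of one improvement cycle, assuming the feature importances of a random-forest secondary model guide the retraining; all model choices and the number of retained features are illustrative:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(8)
X = rng.normal(size=(1000, 8))
r = 0.01 * X[:, 0] + 0.005 * X[:, 3] + rng.normal(scale=0.02, size=1000)
side_labels = np.where(r > 0, 1, -1)

# Initial primary model and initial secondary model (meta-labels mark
# the observations where the primary call was correct).
m1 = LogisticRegression().fit(X, side_labels)
correct = (m1.predict(X) == side_labels).astype(int)
m2 = RandomForestClassifier(random_state=0).fit(X, correct)

# Use the secondary model's feature importances to select the inputs that
# best explain when the primary model is right, then retrain M1 on them.
top = np.argsort(m2.feature_importances_)[-4:]
m1_adjusted = LogisticRegression().fit(X[:, top], side_labels)

# A revised secondary model is then fit to the improved primary signals.
correct2 = (m1_adjusted.predict(X[:, top]) == side_labels).astype(int)
m2_revised = RandomForestClassifier(random_state=0).fit(X, correct2)
```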
Performance

Empirical studies using synthetic data and simulated trading environments have shown that meta-labeling improves strategy performance: it increases the Sharpe ratio, reduces maximum drawdown, and leads to more stable returns over time.[2][3]

Open-source code for experiment replication

Open-source code replicating the experiments that show how meta-labeling improves the performance statistics of trading strategies is available on GitHub.
References