Research Article
Corresponding author: Farai Nyika ( farai.nyika@mancosa.co.za ) Academic editor: Marina Sheresheva
© 2025 Gladys Fernandes-Gondoza, Ronney Ncwadi, John Manuel Fernandes, Farai Nyika.
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Fernandes-Gondoza G, Ncwadi R, Fernandes JM, Nyika F (2025) Predicting Currency Crises in Emerging Markets: A Case Study on South Africa Using Artificial Neural Networks. BRICS Journal of Economics 6(2): 91-116. https://doi.org/10.3897/brics-econ.6.e141556
Purpose: In this paper, we study the potential of using Artificial Neural Network (ANN) models to predict currency crises in emerging markets, with a specific focus on the South African economy. South Africa’s rand is one of the most volatile currencies in the world and is prone to crises.
Methodology: We built two ANN models, where Model 1 uses ten economic indicators and Model 2 uses four. These models were assessed for statistical significance (using probit analysis), and their performance in predicting major South African currency crises (e.g., 1998, 2001, and 2008) was tested with both in-sample and out-of-sample data.
Results: The first model was much more accurate than Model 2 in predicting early warning signs nearly two years before the currency crises occurred. Model 1’s higher accuracy is attributed to its inclusion of a greater number of economic variables. Both models occasionally produced false positives, though overall, they were very accurate in predicting crises.
Originality: Our paper highlights the importance of ANNs in capturing nonlinear patterns in economic data, demonstrating their strength as early warning tools for financial crises. We recommend that ANN methods continue to be researched and advanced to further reduce false positives and improve predictive performance.
Keywords: Currency crises, Early Warning System, Artificial Neural Networks, South Africa, Emerging markets.
Currency crises are not only a problem of the past, but also of the present and future. Since the fall of the Bretton Woods system in 1973, emerging and developing markets have experienced an increased frequency of crises, which continue to pose a threat today (
South Africa, as the economic powerhouse of Africa, is an interesting case to examine. The country operates a floating exchange rate system, which makes it, in theory, less vulnerable to currency crises (
South Africa as an emerging market economy (EME) is gaining global importance due to its strategic partnership with the BRICS+ association. Since EMEs have historically been more susceptible to crises (
South Africa needs an alternative early warning system (EWS) model, which this study develops and evaluates. This model should accurately detect the early signs of crisis events before they occur, allowing authorities to implement optimal policies to prevent or mitigate their impact. One tool that has garnered significant interest from the International Monetary Fund (IMF), policymakers, and academics is the Artificial Neural Network (ANN) approach (
The architecture and functionality of ANNs bear resemblance to those of the human brain, albeit with reduced processing speed, computational capacity, and storage capabilities (
A neural network is a realization of a nonlinear mapping from $\mathbb{R}^I$ to $\mathbb{R}^K$, i.e.
$f_{NN} : \mathbb{R}^I \rightarrow \mathbb{R}^K$ (1)
where I and K are, respectively, the dimensions of the input space and the desired output space. $f_{NN}$ is usually a complex function of a set of nonlinear functions, one for each neuron in the network. The artificial neuron receives a vector of input signals,
$\mathbf{z} = (z_1, z_2, \ldots, z_I)$ (2)
either from the environment or from other artificial neurons. Each input signal $z_i$ is associated with a weight $v_i$ that strengthens or depletes the signal. The artificial neuron computes the net signal and uses an activation function ($f_{AN}$) to compute the output signal, given the net input. This is shown in Figure
ANN models can be divided into three or more layers: an input layer, an output layer, and one or more intermediate (hidden) layers between them (see Figure
The initial phase in the development of an artificial neural network involves the judicious selection of appropriate indicators for input variables. This critical step is essential to ensure the accuracy and precision of the probability prediction of the output (
A critical consideration in artificial neural network design is the determination of the optimal number of hidden layers and the appropriate quantity of neurons within each of these layers. While the utilization of multiple hidden layers is permissible, such architectural decisions necessitate computational resources that may exceed the capabilities of standard computing systems (
Our study therefore uses only one hidden layer. Regarding the number of neurons in the hidden layer, note that too many neurons will affect the learning period of the model, making it longer; this can result in over-fitted data and poor model performance (
Several neural networks have been developed. For example, the most successful applications in terms of business analyses were the multi-layered feed-forward neural network used for prediction and classification (
Once all indicators have been selected, all signals from all neurons in the input layer, their corresponding weights that are set to small random values and a bias neuron are sent to the neurons in the hidden layer. Two processes occur in the hidden layer, first the summation of the input signals followed by using the activation function to fire a signal to the output layer. Thus, all incoming signals (net input signal into the hidden neuron), are summed as,
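$net_j = v_{0j} + \sum_{i=1}^{I} x_i v_{ij}$ (3)
where $x_i$ is the signal from input neuron $i$, $v_{ij}$ is the weight on the connection from input neuron $i$ to hidden neuron $j$, and $v_{0j}$ is the bias (weight) attached to each hidden neuron (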
An activation function receives the net signal and bias and thus determines the output (or firing strength). In general, activation functions are monotonically increasing mappings, where,
$f_{AN}(-\infty) = 0$ or $f_{AN}(-\infty) = -1$ (4)
and
$f_{AN}(\infty) = 1$ (5)
This excludes the linear function (
$f_{AN}(\mathit{net}) = \dfrac{1}{1 + e^{-\lambda\,\mathit{net}}}$ (6)
The sigmoid function is a continuous version of the ramp function with, fAN (net) ∈ (0,1). The parameter λ controls the steepness of the function and is usually equated to one (
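As a minimal sketch of the artificial neuron described above, the following Python code computes the net signal as a weighted sum of the inputs and passes it through a logistic sigmoid with steepness λ. The function names `logistic_sigmoid` and `neuron_output`, and the example values, are illustrative assumptions and do not come from the study.

```python
import math


def logistic_sigmoid(net: float, steepness: float = 1.0) -> float:
    """Logistic sigmoid 1 / (1 + exp(-steepness * net)), with values in [0, 1]."""
    x = steepness * net
    # Branch on the sign so that exp() never receives a large positive argument.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def neuron_output(signals, weights, bias=0.0, steepness=1.0):
    """Weighted sum of the input signals plus a bias, passed through the sigmoid."""
    net = bias + sum(z * v for z, v in zip(signals, weights))
    return logistic_sigmoid(net, steepness)


# Hypothetical example: two input signals whose weighted sum is zero.
example_output = neuron_output([1.0, 2.0], [0.5, -0.25])
```

With λ = 1 the function reduces to the standard logistic curve; larger values of `steepness` push the output closer to a step function around a net input of zero.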
An artificial neural network model, very much like the human brain, can be trained to make its prediction more accurate and powerful. The artificial neuron learns the best values for vij and the threshold from the given data by adjusting weights and threshold values until a certain criterion/s is/are satisfied (
In the feed-forward step, all signals travel from the input layer, together with one bias neuron, to the hidden layer with their respective initial weights. In the hidden layer, all input signals are summed as in Equation 3, transformed to values between 0 and 1 via the logistic sigmoid activation function, and then sent to the output layer. The logistic sigmoid activation function in Equation 6 can be rewritten simply as,
$f(x_j) = \dfrac{1}{1 + e^{-x_j}}$ (7)
where xj is the activation signal from the hidden neuron (
(8)
where wjk is the weight of input signals from the hidden layer and wok is the bias on output unit (
(9)
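The feed-forward step can be sketched in a few lines of Python. The example below assumes a network with ten input neurons, nineteen hidden neurons and one output neuron, small random starting weights, and a logistic sigmoid in both layers. The helper names (`init_layer`, `layer_forward`, `forward`), the weight range and the random inputs are assumptions made for illustration, not the study's implementation.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_layer(n_in, n_out, rng, scale=0.5):
    """Return (weights, biases) drawn uniformly from [-scale, scale]."""
    weights = [[rng.uniform(-scale, scale) for _ in range(n_out)] for _ in range(n_in)]
    biases = [rng.uniform(-scale, scale) for _ in range(n_out)]
    return weights, biases


def layer_forward(inputs, weights, biases):
    """For each receiving neuron: sum the weighted signals plus bias, then apply the sigmoid."""
    outputs = []
    for j in range(len(biases)):
        net = biases[j] + sum(inputs[i] * weights[i][j] for i in range(len(inputs)))
        outputs.append(sigmoid(net))
    return outputs


def forward(x, v, v0, w, w0):
    """Input layer -> hidden layer -> output layer."""
    hidden = layer_forward(x, v, v0)
    output = layer_forward(hidden, w, w0)
    return hidden, output


rng = random.Random(0)
v, v0 = init_layer(10, 19, rng)   # input-to-hidden weights and hidden biases
w, w0 = init_layer(19, 1, rng)    # hidden-to-output weights and output bias
x = [rng.uniform(0.0, 1.0) for _ in range(10)]  # hypothetical normalized inputs
hidden, output = forward(x, v, v0, w, w0)
```

Because the sigmoid is applied at the output neuron as well, `output[0]` always lies strictly between 0 and 1 and can be read as a crisis probability.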
The next step is to compare the network’s output (Yk) with the target (Tk) and all errors from this comparison are backpropagated from the output to the input layer via a hidden layer as follows:
$E = \dfrac{1}{2}\sum_{k}(T_k - Y_k)^2$ (10)
As the error (E) is a function of weights, the network will minimize this error by adjusting the connection weight of each neuron in the entire layer.
The last step in the process of back propagation is the gradual adjustment of the weights associated with each connection among the neurons. The idea is to gradually adjust the error in a backward direction. To do this, the model updates the weights on connections between output neurons and hidden neurons and between hidden neurons and input neurons. The new connection weight between output and hidden neurons can be given by
$w_{jk}(\text{new}) = w_{jk}(\text{old}) + \Delta w_{jk}$ (11)
where ∆wjk is the weight correction term.
The back-propagation algorithm is the optimizing technique using the steepest descent that requires a step size or learning rate () to be specified (
(12)
Equation 12 indicates that the error will be minimized by adjusting the connection weight (
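To make the weight adjustment concrete, the sketch below performs a single steepest-descent update of the hidden-to-output weights for one output neuron. It assumes a squared-error criterion of the form E = ½(T − Y)² and a sigmoid output, so that the correction term is the learning rate times (T − Y)·Y·(1 − Y) times the hidden signal. The name `learning_rate`, the numerical values and the helper functions are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def output_and_error(hidden, w, w0, target):
    """Network output Y and squared error E = 0.5 * (T - Y)^2 for one observation."""
    y = sigmoid(w0 + sum(z * wj for z, wj in zip(hidden, w)))
    return y, 0.5 * (target - y) ** 2


def update_output_weights(hidden, w, w0, target, learning_rate):
    """One gradient-descent step: new weight = old weight + correction term."""
    y, _ = output_and_error(hidden, w, w0, target)
    delta = (target - y) * y * (1.0 - y)  # negative derivative of E w.r.t. the net input
    new_w = [wj + learning_rate * delta * z for wj, z in zip(w, hidden)]
    new_w0 = w0 + learning_rate * delta
    return new_w, new_w0


# Hypothetical hidden-layer signals, starting weights and target.
hidden = [0.2, 0.7, 0.5]
w, w0 = [0.1, -0.3, 0.25], 0.05
target = 1.0

_, error_before = output_and_error(hidden, w, w0, target)
w_new, w0_new = update_output_weights(hidden, w, w0, target, learning_rate=0.5)
_, error_after = output_and_error(hidden, w_new, w0_new, target)
```

In this example the output starts below the target, so every weight attached to a positive hidden signal is increased and the error after the update is smaller than before.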
ANNs are commonly referred to as ‘black-box models’ that do not have interpretability, which is a result of their complex internal structures and large number of parameters. It is not easy to make sense of how an ANN justifies a decision and prediction (
In constructing the artificial neural network model, firstly, an output neuron had to be determined. The output neuron was the currency crises in South Africa. This study defines a crisis as the actual depreciation or devaluation of a currency that takes place coupled with interventions by the central bank to counteract this depreciation (
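$Crisis_t = \begin{cases} 1 & \text{if } EMP_t > \mu_{EMP} + \delta\,\sigma_{EMP} \\ 0 & \text{otherwise} \end{cases}$ (13)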
That is to say, a crisis is indicated if the value of the EMP index exceeds the mean of the EMP time series (µEMP) plus the standard deviation of the EMP time series (σEMP) multiplied by a weight (δ) (
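The threshold rule can be illustrated with a short Python sketch. The function name `crisis_flags`, the use of the population standard deviation, the default weight of 1.5 and the sample series are all assumptions made for illustration; the study's own value of δ is not reproduced here.

```python
from statistics import mean, pstdev


def crisis_flags(emp, delta=1.5):
    """Flag periods where the EMP index exceeds mean + delta * standard deviation."""
    mu = mean(emp)
    sigma = pstdev(emp)  # population standard deviation (an assumption)
    threshold = mu + delta * sigma
    flags = [1 if value > threshold else 0 for value in emp]
    return flags, threshold


# Hypothetical EMP index values with one pronounced spike.
emp_series = [0.1, -0.2, 0.0, 0.3, -0.1, 4.0, 0.2, -0.3]
flags, threshold = crisis_flags(emp_series)
```

Raising `delta` makes the definition stricter, so fewer periods are labelled as crises; lowering it has the opposite effect.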
The next step was to select the set of variables for the input neurons to ensure the precision of the prediction of output. No specific methods have been recommended in the literature on how to select the input variables for this model. This study employed the noise-to-signal ratio (NSR) to select the set of input variables. Ten variables were chosen as leading indicators of a currency crisis in South Africa. Using fewer input variables is superior to using too many, as the inclusion of noise in the data set degrades the performance of the artificial neural network model (
On the other hand, Hashemi (2019) found that using many input neurons creates a model superior to that with fewer input neurons. For this reason, the present study created a second artificial neural network model using the statistically significant variables obtained by applying a simple probit model to the ten chosen leading indicators. Tables
| No | Variables |
| 1 | ratio of budget deficits to GDP |
| 2 | change in the international liquidity position |
| 3 | growth rate of merchandise exports |
| 4 | growth rate of merchandise imports |
| 5 | growth rate of the ratio of domestic credit to GDP |
| 6 | growth rate of bank deposits |
| 7 | growth rate of foreign debt |
| 8 | changes in the gold price |
| 9 | change in the domestic interest rate |
| 10 | inflation differential to the USA |
| No | Variables |
| 1 | change in the international liquidity position |
| 2 | growth rate of foreign debt |
| 3 | change in the domestic interest rate |
| 4 | inflation differential to the USA |
All variables were stationary, and no multicollinearity existed when models were constructed. To provide statistical distribution and equal proportional contributions, and also remove any biases in the forecasting model,
(14)
Source:
Concerning the number of hidden neurons, it has been noted that if the number of hidden neurons increases, there is a tendency to overfitting, which means that the model mimics a “parrot” and will thus fail in its predictions. At the same time, too few hidden neurons decrease the ability of the model to manage a more complex data set. As one hidden layer is sufficient for the model to solve all problems (
The ANN models used a backpropagation learning algorithm to determine model performance (
| No. | TRAINING INFORMATION | MODEL 1 | MODEL 2 |
| 1 | Type of network | Multi-layer perceptron | Multi-layer perceptron |
| 2 | Number of layers | 3 | 3 |
| 3 | Number of hidden layers | 1 | 1 |
| 4 | Number of input neurons | 10 | 4 |
| 5 | Number of hidden neurons | 19 | 19 |
| 6 | Number of output neurons | 1 | 1 |
| 7 | Activation function | Logistic Sigmoid | Logistic Sigmoid |
| 8 | Performance function | Mean square error | Mean square error |
| 9 | Training algorithm | Back-propagation | Back-propagation |
| 10 | Starting weights and biases | Random | Random |
| 11 | Number of iterations | 30 000 | 30 000 |
| 12 | Training error | 0.126 | 0.148 |
| 13 | Learning rate | 0.010 | 0.010 |
| 14 | Momentum factor | 0.8 | 0.8 |
Training began with both models randomly assigning the initial connection weights between the input-layer neurons (including the bias neuron) and the neurons in the hidden layer. Feed-forward propagation ensured that signals flowed in one direction only, from the input layer through the hidden layer to the output layer. Within each hidden neuron, the models summed the incoming signals and mapped the result into the range 0 to 1 by means of the logistic sigmoid activation function. The initial weights connecting the hidden neurons and the hidden-layer bias neuron to the output neuron were also randomly assigned. The models then transmitted the transformed signals from the hidden neurons to the output neuron, applying the logistic sigmoid activation function once more. The learning stage commenced by comparing the value of the output neuron with its designated target. The mean square error was propagated backwards from the output layer to the input layer through the hidden layer using the backpropagation learning technique, which adjusted the connection weights of all neurons. The models stopped learning when they reached either the minimum error or the maximum number of iterations.
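The training procedure described above can be sketched as a small, self-contained Python routine. The default hyperparameters follow the training information table (19 hidden neurons, learning rate 0.010, momentum 0.8, at most 30 000 iterations); everything else is an assumption for illustration, including the per-observation (online) updates, the `target_error` stopping value, the weight range and the toy data set. It should not be read as the study's actual implementation.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(x, v, v0, w, w0):
    hidden = [sigmoid(v0[j] + sum(x[i] * v[i][j] for i in range(len(x))))
              for j in range(len(v0))]
    y = sigmoid(w0 + sum(h * wj for h, wj in zip(hidden, w)))
    return hidden, y


def mse(data, v, v0, w, w0):
    return sum((t - forward(x, v, v0, w, w0)[1]) ** 2 for x, t in data) / len(data)


def train(data, n_hidden=19, learning_rate=0.01, momentum=0.8,
          max_iterations=30000, target_error=1e-3, seed=0):
    rng = random.Random(seed)
    n_in = len(data[0][0])
    # Random starting weights and biases.
    v = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_in)]
    v0 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    w = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    w0 = rng.uniform(-0.5, 0.5)
    # Previous weight changes, needed for the momentum term.
    dv = [[0.0] * n_hidden for _ in range(n_in)]
    dv0 = [0.0] * n_hidden
    dw = [0.0] * n_hidden
    dw0 = 0.0
    history = [mse(data, v, v0, w, w0)]
    for _ in range(max_iterations):
        for x, t in data:
            hidden, y = forward(x, v, v0, w, w0)
            delta_out = (t - y) * y * (1.0 - y)
            # Hidden error terms use the output weights before they are updated.
            delta_hid = [delta_out * w[j] * hidden[j] * (1.0 - hidden[j])
                         for j in range(n_hidden)]
            for j in range(n_hidden):
                dw[j] = learning_rate * delta_out * hidden[j] + momentum * dw[j]
                w[j] += dw[j]
                dv0[j] = learning_rate * delta_hid[j] + momentum * dv0[j]
                v0[j] += dv0[j]
                for i in range(n_in):
                    dv[i][j] = learning_rate * delta_hid[j] * x[i] + momentum * dv[i][j]
                    v[i][j] += dv[i][j]
            dw0 = learning_rate * delta_out + momentum * dw0
            w0 += dw0
        history.append(mse(data, v, v0, w, w0))
        # Stop at the minimum error or, via the loop bound, at the maximum iterations.
        if history[-1] <= target_error:
            break
    return (v, v0, w, w0), history


# Hypothetical toy data (logical OR), used only to show that the error falls.
toy_data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]
params, history = train(toy_data, n_hidden=3, max_iterations=2000)
```

The returned `history` holds the mean square error before training and after each pass over the data, so comparing its first and last entries shows how far training reduced the error.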
To establish the most statistically significant input variables, the study employed a popular method using connection weights and the feed-forward back-propagation algorithm (
(15)
Where Rij is the relative importance of the variable xi with respect to the output neuron j, H is the number of neurons in the hidden layer, Wik is the synaptic connection weight between the input neuron i and the hidden neuron k and Wkj is the synaptic weight between the hidden neuron k and the output neuron j (
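As a hedged illustration of a connection-weight measure of this kind, the sketch below sums, for each input, the products of its input-to-hidden and hidden-to-output weights across the H hidden neurons, and then rescales the absolute values so that the contributions add up to 100%. This normalization is one common variant and is an assumption here; it may differ from the exact form of the study's equation. The function name and the weights are hypothetical.

```python
def relative_importance(w_input_hidden, w_hidden_output):
    """Share of each input in the summed products of connection weights.

    w_input_hidden[i][k] is the weight from input i to hidden neuron k;
    w_hidden_output[k] is the weight from hidden neuron k to the output neuron.
    """
    raw = [abs(sum(w_ik * w_kj for w_ik, w_kj in zip(row, w_hidden_output)))
           for row in w_input_hidden]
    total = sum(raw)
    return [r / total for r in raw]


# Hypothetical weights: three inputs, two hidden neurons, one output neuron.
w_ih = [[0.5, -0.2],
        [0.1, 0.4],
        [-0.3, 0.3]]
w_ho = [1.0, -0.5]
shares = relative_importance(w_ih, w_ho)
```

In this toy example the first input receives the largest share (about 52%) and the second the smallest (about 9%).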
Table
Average Contributions of Input Variables to Output Variable for Artificial Neural Networks Model 1
| NO. | VARIABLES | CONTRIBUTION |
| 1 | ratio of budget deficits to GDP | 22.68% |
| 2 | change in the international liquidity position | 13.38% |
| 3 | growth rate of merchandise exports | 4.33% |
| 4 | growth rate of merchandise imports | 12.76% |
| 5 | growth rate of the ratio of domestic credit to GDP | 16.23% |
| 6 | growth rate of bank deposits | 3.08% |
| 7 | growth rate of foreign debt | 12.37% |
| 8 | changes in the gold price | 4.58% |
| 9 | change in the domestic interest rate | 3.31% |
| 10 | inflation differential to the USA | 7.28% |
The contributions of the input variables in the reduced model (Model 2) were calculated. As shown in Table
Average Contributions of Input Variables to Output Variable for Artificial Neural Networks Model 2
| NO. | VARIABLES | CONTRIBUTION |
| 1 | change in the international liquidity position | 53.13% |
| 2 | growth rate of foreign debt | 18.47% |
| 3 | change in the domestic interest rate | 2.64% |
| 4 | inflation differential to the USA | 25.76% |
Both models were employed to simulate and evaluate their ability to predict the probability of the South African currency crises of July 1998, December 2001, and October 2008. To analyze the probability predictions, this section was divided into in-sample probability predictions (Figure
Figure
Regarding the second in-sample crisis of December 2001, Model 1 emitted warning signals from as early as the latter part of 1999. The signal strength reached approximately 58% in December 1999, which falls within the 24-month crisis window. After a decline to around 9% in the early months of 2000, the signals intensified again, reaching heights of approximately 70% in late 2000 and early 2001. By December 2001, the predicted probability had risen to 82%. Model 2 also sent signals indicating the December 2001 crisis, but the signal strengths were not as strong as those from Model 1. As December 2001 approached, Model 2’s predicted probability was approximately 62%, which is 20% lower than that of Model 1.
Despite these models’ abilities to correctly point to the two in-sample crises, many signals are also emitted outside the 24-month crisis window. This indicates that both models sent many false alarms.
Figure
In contrast, Model 2 (Figure
The primary objective of this study was to develop an early warning system (EWS) model capable of effectively predicting financial crises, rather than merely identifying tranquil periods. To achieve this, the study utilized various cut-off probabilities (20%, 30%, 40%, and 50%) as thresholds for crisis prediction. To evaluate the performance of the EWS model in predicting South African financial crises, both in-sample and out-of-sample, six assessment methods were employed (
(i) Percentage of Observations Called Correctly: Measures the overall accuracy of the model.
(ii) Percentage of Pre-Crisis Periods Called Correctly: Assesses the model’s ability to accurately identify periods leading up to crises.
(iii) Percentage of Tranquil Periods Called Correctly: Evaluates the model’s effectiveness in distinguishing tranquil periods from those with potential crisis indicators.
(iv) False Alarms as a Percentage of Total Alarms: Measures the rate of incorrect crisis predictions.
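(v) Quadratic Probability Score (QPS): Measures the average closeness of the predicted probabilities to the realized outcomes.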
(vi) Global Score Bias (GSB): Measures the overall bias in the model’s predictions.
For a more detailed explanation of these methods, please refer to the appendix. The results of the performance evaluation are presented in Table
| Threshold (Pr*) | Assessment method | In-sample 1993/02-2004/12, M1 | In-sample 1993/02-2004/12, M2 | Out-of-sample 2005/01-2008/10, M1 | Out-of-sample 2005/01-2008/10, M2 | Out-of-sample 2008/11-2017/03, M1 | Out-of-sample 2008/11-2017/03, M2 |
| 20% | % of observations correctly called | 59% | 55% | 61% | 52% | 34% | 56% |
| 20% | % of pre-crisis periods correctly called | 56% | 60% | 96% | 52% | 23% | 57% |
| 20% | % of tranquil periods correctly called | 40% | 48% | 81% | 48% | 32% | 48% |
| 20% | % of false alarms of total alarms | 60% | 60% | 41% | 43% | 32% | 22% |
| 20% | QPS | 0.643 | 0.839 | 1.000 | 0.957 | 0.520 | 1.140 |
| 20% | GSB | 0.113 | 0.189 | 0.417 | 0.004 | 0.080 | 0.520 |
| 30% | % of observations correctly called | 69% | 65% | 59% | 43% | 27% | 32% |
| 30% | % of pre-crisis periods correctly called | 44% | 18% | 80% | 12% | 7% | 15% |
| 30% | % of tranquil periods correctly called | 18% | 9% | 67% | 19% | 12% | 16% |
| 30% | % of false alarms of total alarms | 44% | 54% | 41% | 57% | 38% | 27% |
| 30% | QPS | 0.364 | 0.434 | 0.739 | 0.783 | 0.120 | 0.340 |
| 30% | GSB | 0.006 | 0.017 | 0.160 | 0.185 | 0.002 | 0.024 |
| 40% | % of observations correctly called | 63% | 65% | 54% | 43% | 25% | 26% |
| 40% | % of pre-crisis periods correctly called | 20% | 6% | 56% | 4% | 0% | 5% |
| 40% | % of tranquil periods correctly called | 14% | 3% | 48% | 10% | 0% | 12% |
| 40% | % of false alarms of total alarms | 57% | 50% | 42% | 67% | 0% | 43% |
| 40% | QPS | 0.336 | 0.378 | 0.365 | 0.870 | 0.080 | 0.200 |
| 40% | GSB | 0.006 | 0.061 | 0.009 | 0.306 | 0.005 | 0.010 |
| 50% | % of observations correctly called | 66% | 66% | 50% | 48% | 25% | 26% |
| 50% | % of pre-crisis periods correctly called | 16% | 4% | 36% | 4% | 0% | 1% |
| 50% | % of tranquil periods correctly called | 6% | 1% | 33% | 0% | 0% | 0% |
| 50% | % of false alarms of total alarms | 43% | 33% | 44% | 0% | 0% | 0% |
| 50% | QPS | 0.350 | 0.392 | 0.200 | 0.870 | 0.080 | 0.120 |
| 50% | GSB | 0.028 | 0.077 | 0.024 | 0.378 | 0.005 | 0.006 |
Both models have performed well on the whole, as indicated by the small values of the QPS and GSB in which a value of zero indicates perfection and a value of 2 indicates model failure (
The study has developed and estimated an artificial neural network (ANN) model as an alternative EWS for predicting currency crises in South Africa. Two models were constructed: a comprehensive model (Model 1) incorporating ten input variables and a reduced model (Model 2) using only four statistically significant variables. The purpose of estimating the reduced model was to investigate the impact of the number of input variables on model performance, addressing a common point of contention in the literature. Performance evaluations confirmed that models with more input variables outperform those with fewer. The study also identified the key contributors to South Africa’s currency crises: the ratio of budget deficits to GDP, followed by the growth rate of the ratio of domestic credit to GDP, and then changes in the international liquidity position.
The ANN models demonstrated their effectiveness by accurately predicting currency crises up to 24 months in advance, both in-sample and out-of-sample. Both models successfully identified all three notable currency crises in South Africa during the study period. However, Model 1, with its larger number of input variables, was able to provide earlier warning signals, up to 24 months prior to crisis occurrences, compared to Model 2. Additionally, both models correctly predicted the absence of currency crises after October 2008, a period during which South Africa did not report any such events. A particularly noteworthy feature of the ANN model was its ability to predict the probability of the out-of-sample GFC without requiring prior training with a target output. This demonstrates the model’s capacity to identify emerging trends and patterns that may indicate potential crises. While the ANN models demonstrated promising results, they also generated a considerable number of false alarms during the 1998 and 2001 periods. This suggests a potential limitation of the models, as they were unable to effectively differentiate between currency crises and other economic vulnerabilities. Further research is needed to explore potential relationships between various economic vulnerabilities and their impact on currency crises. Overall, the study concludes that ANNs offer a promising alternative to traditional EWS tools for predicting currency crises in South Africa. The models’ ability to provide early warnings and accurately identify the key contributing factors highlights their potential value in risk management and policymaking.
Fernandes-Gondoza, G: Conceptualization, Formal analysis, Data curation, Investigation, Methodology, Writing – original draft, review & editing. Ncwadi, R: Project administration, Supervision, Validation, Writing – original draft. Fernandes, J: Data curation, Software, Resources, Methodology, Data Visualization, Writing – original draft. Nyika, F: Writing – Review & editing, Updating of manuscript literature.
Artificial Intelligence (AI) tools were not used to write this manuscript.
All authors declare no conflicts of interest in this article.
No funding was used in the writing of this article.
In evaluating the performance of the EWS models, the following assessment criteria are used: the percentage of observations correctly called; the percentage of pre-crisis periods correctly called; the percentage of tranquil periods correctly called; and the percentage of false alarms to total alarms. These assessment methods rely on the two-by-two matrix in Table
| • The percentage of observations called correctly = (A + D)/(A + B + C + D) | (16) |
| • The percentage of pre-crisis periods called correctly = A/(A + C) | (17) |
| • The percentage of tranquil periods called correctly = D/(B + D) | (18) |
| • The false alarms as a percentage of the total alarms = B/(A + B) | (19) |
| • The quadratic probability score, $QPS = \frac{1}{T}\sum_{t=1}^{T} 2(P_t - R_t)^2$ | (20) |
| • The global score bias, $GSB = 2(\bar{P} - \bar{R})^2$, where $\bar{P}$ and $\bar{R}$ are the sample means of $P_t$ and $R_t$ | (21) |
The higher the values in Equations 16, 17 and 18, the better the model performs. In contrast, in Equation 19, the lower the ratio, the better the indicator. The QPS and GSB assess the average closeness of the realisation of a crisis (Rt) and the probability prediction of a crisis (Pt) during the signalling horizon. Recall that the event scores 1 if it is a crisis and 0 otherwise. The QPS statistic lies in the range between zero and two. The closer the test statistic is to zero, the better the performance quality of the composite index. The performance quality of the composite index can be affected by the sample size: the larger the sample size, the more robust the test statistics (Diebold & Rudebusch, 1989).
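The assessment measures can be computed directly from a series of predicted probabilities and realized outcomes. The Python sketch below assumes the usual layout of the two-by-two matrix: A counts alarms followed by a crisis, B alarms not followed by a crisis, C missed crises, and D correctly silent tranquil periods. Under that layout, the share of tranquil periods called correctly is D/(B + D). The rule that an alarm is issued when the probability strictly exceeds the cut-off, the function names and the sample data are assumptions for illustration.

```python
def confusion_counts(probabilities, realizations, cutoff):
    """Count the cells A, B, C, D of the two-by-two signal matrix."""
    a = b = c = d = 0
    for p, r in zip(probabilities, realizations):
        alarm = p > cutoff
        if alarm and r == 1:
            a += 1      # alarm, crisis follows
        elif alarm:
            b += 1      # alarm, no crisis (false alarm)
        elif r == 1:
            c += 1      # no alarm, crisis follows (missed crisis)
        else:
            d += 1      # no alarm, no crisis
    return a, b, c, d


def ratio(numerator, denominator):
    """Return the ratio, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def evaluate(probabilities, realizations, cutoff):
    a, b, c, d = confusion_counts(probabilities, realizations, cutoff)
    n = len(probabilities)
    mean_p = sum(probabilities) / n
    mean_r = sum(realizations) / n
    return {
        "observations_correct": ratio(a + d, a + b + c + d),
        "pre_crisis_correct": ratio(a, a + c),
        "tranquil_correct": ratio(d, b + d),
        "false_alarm_share": ratio(b, a + b),
        "qps": sum(2.0 * (p - r) ** 2 for p, r in zip(probabilities, realizations)) / n,
        "gsb": 2.0 * (mean_p - mean_r) ** 2,
    }


# Hypothetical predicted probabilities and realized pre-crisis indicators.
probs = [0.1, 0.6, 0.8, 0.3, 0.2, 0.7]
actual = [0, 1, 1, 0, 1, 0]
scores = evaluate(probs, actual, cutoff=0.5)
```

Re-running `evaluate` with cut-offs of 0.2, 0.3, 0.4 and 0.5 gives one set of scores per threshold. The QPS and GSB do not depend on the cut-off because they are computed from the probabilities themselves.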
The ten leading indicators used in this study were selected using the signal approach. The performance of an indicator in predicting a crisis can be shown in the value of its noise-to-signal ratio (NSR). The NSR can be presented by taking the ratio of the percentage of bad signals over the percentage of good signals (
$NSR = \dfrac{B/(B + D)}{A/(A + C)}$ (22)
In Equation 22, an NSR ≥ 1 indicates that an indicator emits excessive noise and thus contributes less to the probability prediction of a currency crisis. The desired outcome is an NSR below 1.
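A brief Python sketch of the noise-to-signal calculation follows. It assumes the usual signal-approach cell definitions (A: good signals, B: false signals, C: missed crises, D: correctly silent tranquil periods), so that the share of bad signals is B/(B + D) and the share of good signals is A/(A + C). The function name and the counts are hypothetical.

```python
def noise_to_signal_ratio(a, b, c, d):
    """NSR = share of bad signals / share of good signals."""
    bad_share = b / (b + d)    # false signals among tranquil periods
    good_share = a / (a + c)   # correct signals among pre-crisis periods
    return bad_share / good_share


# Hypothetical counts for one indicator.
nsr = noise_to_signal_ratio(a=2, b=1, c=1, d=2)
keep_indicator = nsr < 1.0
```

An indicator with these counts has an NSR of 0.5, which is below 1, so it would be retained as a leading indicator under the rule described above.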