In this section, we present results from applying several statistical models to explore the features of COVID-19 in Canada. The primary purpose is to demonstrate that different modeling strategies can be used to analyze the COVID-19 data. We hope these studies shed light on the complex features and the development of COVID-19 in Canada. When interpreting the results, readers should keep in mind the associated model assumptions, some of which may be untestable.

Objective

As an alternative to the SIR model, we explore using a neural network (NN) to predict the cumulative number of infected cases in Ontario, British Columbia, Quebec, and Alberta. We fit the model to data from March 18 to October 25, 2020 (https://coronavirus.1point3acres.com/en) and use it to predict the period from October 26 to November 1, 2020.
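As a quick check on the sizes of the two windows, the following sketch counts the days in the fitting and prediction periods stated above. It assumes both periods include their end dates; the variable names are illustrative only.

```python
from datetime import date

# Dates taken from the text above.
fit_start, fit_end = date(2020, 3, 18), date(2020, 10, 25)
pred_start, pred_end = date(2020, 10, 26), date(2020, 11, 1)

# Both windows are assumed to include their end dates, hence the +1.
n_fit = (fit_end - fit_start).days + 1
horizon = (pred_end - pred_start).days + 1

print(n_fit, horizon)
```

Under that assumption, the fitting window holds one observation per day per province, and the prediction window matches the 7-day horizon mentioned in the figure captions.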

Assumption and Model

To deal with nonlinear time series data (see DATA VISUALIZATION), we use a neural network (NN) model, an important method in machine learning, for prediction. A neural network consists of three elements: the input layer, the hidden layer(s), and the output layer, as shown in the following figure. We use the R function nnetar to construct the neural network model: the time series of the cumulative number of confirmed cases from March 18 to October 25, 2020 is supplied as the input, and the output layer gives the predicted value for a day in the period from October 26 to November 1, 2020.

A Single Neural Network
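To make the three-layer structure concrete, here is a minimal Python sketch of a feed-forward network that takes lagged values of a series as inputs, passes them through one hidden layer, and produces a one-step prediction; multi-step predictions are obtained by feeding each prediction back in as an input. This is not the nnetar implementation and involves no training: the number of lags `p`, the weights, and the scaled series are arbitrary placeholders chosen only to show the data flow.

```python
import math

def make_lagged(series, p):
    """Turn a series into (inputs, targets): p consecutive values predict the next one."""
    X = [series[i:i + p] for i in range(len(series) - p)]
    y = [series[i + p] for i in range(len(series) - p)]
    return X, y

def forward(x, W1, b1, w2, b2):
    """One hidden layer of sigmoid units followed by a linear output unit."""
    hidden = [1.0 / (1.0 + math.exp(-(sum(w * v for w, v in zip(row, x)) + b)))
              for row, b in zip(W1, b1)]
    return sum(w * h for w, h in zip(w2, hidden)) + b2

def forecast(series, p, h, params):
    """Iterated forecasting: each prediction is appended and reused as an input."""
    history = list(series)
    preds = []
    for _ in range(h):
        nxt = forward(history[-p:], *params)
        preds.append(nxt)
        history.append(nxt)
    return preds

# Hypothetical scaled series and arbitrary (untrained) weights, for illustration only.
series = [0.10, 0.12, 0.15, 0.19, 0.24, 0.30, 0.37]
p = 3
W1 = [[0.5, -0.2, 0.8], [-0.3, 0.6, 0.4]]  # 2 hidden units x 3 lagged inputs
b1 = [0.0, 0.1]
w2 = [0.7, 0.5]
b2 = 0.05
params = (W1, b1, w2, b2)

X, y = make_lagged(series, p)                     # training pairs a fitting routine would use
preds = forecast(series, p, h=7, params=params)   # 7-day-ahead iterated predictions
print(len(X), len(preds))
```

In practice the weights would be estimated from the lagged pairs `(X, y)` rather than set by hand, and the series would be rescaled before fitting.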

Findings and Discussion

The following figures present the fitted and predicted cumulative number of cases (in red) together with the reported cumulative number of confirmed cases (in blue) for the four provinces. A red solid curve shows the fitted values for the period from March 18 to October 25, 2020; its deviation from the blue curve indicates how well the NN model fits the reported data. A red dashed curve shows the predicted cumulative number of cases for the period from October 26 to November 1, 2020, and dashed black curves indicate the 95% prediction regions.
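The text does not state how the 95% prediction regions were computed. One common approach for a nonlinear model is to simulate many future sample paths, adding a resampled residual at each step, and then take empirical quantiles day by day. The sketch below illustrates that idea only; the one-step predictor `step`, the history, and the residuals are hypothetical stand-ins for a fitted network and its in-sample errors.

```python
import random

def simulate_paths(history, step, residuals, h, n_paths, seed=0):
    """Simulate future paths: one-step prediction plus a resampled residual, fed back in."""
    rng = random.Random(seed)
    paths = []
    for _ in range(n_paths):
        hist = list(history)
        path = []
        for _ in range(h):
            nxt = step(hist) + rng.choice(residuals)
            path.append(nxt)
            hist.append(nxt)
        paths.append(path)
    return paths

def quantile(sorted_vals, q):
    """Linear-interpolation quantile of an already sorted list."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])

def prediction_region(paths, level=0.95):
    """Per-day lower and upper bounds from the empirical quantiles of the simulated paths."""
    alpha = (1.0 - level) / 2.0
    lower, upper = [], []
    for day_values in zip(*paths):
        vals = sorted(day_values)
        lower.append(quantile(vals, alpha))
        upper.append(quantile(vals, 1.0 - alpha))
    return lower, upper

def step(hist):
    """Hypothetical one-step predictor (last value plus last increment), not a fitted NN."""
    return hist[-1] + (hist[-1] - hist[-2])

history = [100.0, 110.0, 121.0, 133.0]   # hypothetical cumulative counts
residuals = [-3.0, -1.0, 0.0, 1.0, 3.0]  # hypothetical in-sample residuals
paths = simulate_paths(history, step, residuals, h=7, n_paths=500)
lower, upper = prediction_region(paths)
print(len(lower), len(upper))
```

Because each simulated value is reused as an input, errors accumulate along a path, which is why such regions typically widen as the horizon grows.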

For ease of visualization, here we use connecting curves instead of isolated points to display the reported or predicted cumulative numbers of cases for the four provinces.

The analysis here supplies an alternative prediction method to the previously discussed SIR model. Although the implementation of the NN model is straightforward, one needs to be aware of its limitations, such as the uncertainty in determining the number of hidden layers. While it is difficult or even impossible to know the exact number of infected cases in the past or the future, for reasons such as limited testing capacity and asymptomatic infections, it is clear that the predicted trends can differ considerably from model to model. Studying the development of COVID-19 from different angles with different models may help enhance our understanding of the pandemic.
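When predictions from different models are compared against reported counts, simple error summaries can make the differences explicit. The sketch below computes the root mean squared error and the mean absolute percentage error; the numbers are invented for illustration and are not results from this study.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error: penalizes large deviations more heavily."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mape(actual, predicted):
    """Mean absolute percentage error, in percent (requires nonzero actual values)."""
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical reported counts and two hypothetical sets of model predictions.
reported = [100.0, 200.0, 400.0]
model_a = [110.0, 190.0, 400.0]
model_b = [100.0, 200.0, 460.0]

for name, pred in [("model_a", model_a), ("model_b", model_b)]:
    print(name, round(rmse(reported, pred), 2), round(mape(reported, pred), 2))
```

The two measures can rank models differently, since one works on the absolute scale and the other on the relative scale, so it is worth reporting more than one.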

ONTARIO

The comparison of the fitted cumulative number of infected cases using the NN model (in red) versus the reported cumulative infections (in blue) in Ontario. A red dashed curve represents the prediction for the next 7 days and dashed black curves indicate 95% prediction regions.

ALBERTA

The comparison of the fitted cumulative number of infected cases using the NN model (in red) versus the reported cumulative infections (in blue) in Alberta. A red dashed curve represents the prediction for the next 7 days and dashed black curves indicate 95% prediction regions.

BRITISH COLUMBIA

The comparison of the fitted cumulative number of infected cases using the NN model (in red) versus the reported cumulative infections (in blue) in British Columbia. A red dashed curve represents the prediction for the next 7 days and dashed black curves indicate 95% prediction regions.

QUEBEC

The comparison of the fitted cumulative number of infected cases using the NN model (in red) versus the reported cumulative infections (in blue) in Quebec. A red dashed curve represents the prediction for the next 7 days and dashed black curves indicate 95% prediction regions.