Is Prophet Really Better than ARIMA for Forecasting Time Series Data? (2023)

Is Prophet Really Better than ARIMA for Forecasting Time Series Data? (1)

When you want to forecast the time series data in R, you typically would use a package called forecast, with which you can use models like ARIMA. But then, beginning of this year, a team at Facebook released Prophet, which utilizes a Bayesian based curve fitting method to forecast the time series data. The cool thing about Prophet is that it doesn’t require much prior knowledge or experience of forecasting time series data since it automatically finds seasonal trends beneath the data and offers a set of ‘easy to understand’ parameters. Hence, it allows non statisticians to start using it and get reasonably good results that are often equal or sometimes even better than the ones produced by the experts.

We have added a support of Prophet in Exploratory and written an introductory post about it before.

An Introduction to Time Series Forecasting with Prophet Package in ExploratoryWe are adding Time Series Forecasting with the revolutionary amazing ‘Prophet’ package with Exploratory v3.3! 🎉blog.exploratory.io

Then, you would ask, should we always use Prophet, and not use forecast or more precisely ARIMA or other type of forecasting models at all? Is Prophet always outperform forecast better?

Recently, we’ve actually got this question from one of our customers who found that ‘Prophet’ was not performing well for their data.

So, I decided to compare the two, Prophet vs. forecast.

I’m going to use Exploratory’s out-of-the-box Prophet-based time series forecasting feature for Prophet while I use the model extension framework in Exploratory to bring in ‘forecast’ package by writing an R script.

I have a monthly aggregated data of US airline flights from 2005 to 2007. The goal is to use the data for 2005 and 2006 to build the forecasting model and forecast the number of the flights for 12 months period in 2007, and compare the forecasted values against the actual 2007 data.

Is Prophet Really Better than ARIMA for Forecasting Time Series Data? (2)

I’m going to work with ‘forecast’ package with these two relatively new packages called ‘sweep’ and ‘timetk’, which make it possible to work with forecast in a tidy way.

I needed to install the following 3 packages in Exploratory before I start.

  • forecast : The time series forecasting package.
  • sweep : Package to handle time series forecasting in a tidy way.
  • timetk : We use this package to convert between regular data.frame and time series data format called ‘ts’.

From Exploratory’s Project List page, click R Package menu.

Is Prophet Really Better than ARIMA for Forecasting Time Series Data? (3)

Click Install tab.

Is Prophet Really Better than ARIMA for Forecasting Time Series Data? (4)

Type in “forecast” and click Install button, which will install the forecast package.

Is Prophet Really Better than ARIMA for Forecasting Time Series Data? (5)

Install sweep and timetk packages the same way.

Let’s add a custom model script that adds ARIMA model to Exploratory through forecast package.

If you want to know more details on how to use the model extension framework please take a look at this document page.

Create a new custom R script inside a project by clicking + icon for Scripts menu.

Is Prophet Really Better than ARIMA for Forecasting Time Series Data? (6)

Then type in a name for the script, and click Create button.

Is Prophet Really Better than ARIMA for Forecasting Time Series Data? (7)

Here is the R script to bring forecast’s ARIMA model in with the model extension framework inside Exploratory.

# Time series forecasting by forecast
# as Exploratory Custom Model Function
# https://docs.exploratory.io/user-defined-model-function.html
library(forecast)
library(sweep)
library(timetk)
build_forecast_model <- function(formula, data, freq = 12, periods = 5, ...) { # default freq 12 for monthly data
training_data <- data
lhs_cols <- all.vars(lazyeval::f_lhs(formula)) # predicted variable
rhs_cols <- all.vars(lazyeval::f_rhs(formula)) # date column
all_cols <- c(lhs_cols, rhs_cols)

training_data <- training_data[colnames(training_data) %in% all_cols]
start_date <- training_data[[rhs_cols]][[1]]
start <- year(start_date) + yday(start_date)/365.25 # get start yearmon for the following tk_ts()
training_data_ts <- training_data %>% tk_ts(start=start, freq = freq, silent = TRUE)

(Video) Forecasting at Scale: How and Why We Developed Prophet for Forecasting at Facebook

fit_arima <- auto.arima(training_data_ts)
# fit_arima <- Arima(training_data_ts, order=c(1,1,0), seasonal=list(order=c(1,0,0)), method="ML")
# return model and periods as one object
ret <- list(model = fit_arima, periods = periods, date_col = rhs_cols, value_col = lhs_cols)
class(ret) <- c("forecast_model")
ret
}

augment.forecast_model <- function(x, data = NULL, newdata = NULL, ...) {
fcast <- forecast::forecast(x$model, h = x$periods)

fcast_df <- as.data.frame(fcast) %>%
rownames_to_column("date") %>%
mutate(date = as.Date((as.numeric(date) - 1970) * 365.25, origin = "1970-01-01"), key="forecast") # back to Date class
colnames(fcast_df)[colnames(fcast_df) == "Point Forecast"] <- x$value_col
colnames(fcast_df)[colnames(fcast_df) == "date"] <- x$date_col
actual_df <- data %>% mutate(key = "actual")
ret <- bind_rows(actual_df, fcast_df)
ret
}

glance.forecast_model <- function(x, ...){
sw_glance(x$model)
}
tidy.forecast_model <- function(x, ...){
sw_tidy(x$model)
}

Copy the above script into the script editor, and click Save button.

Is Prophet Really Better than ARIMA for Forecasting Time Series Data? (8)

Now we are ready to perform time series forecasting with ARIMA inside Exploratory. Let’s try it!

Download the data

Download the following 2 csv files.

  • Number of flights in the US in 2005, 2006
  • Number of flights in the US in 2007

Import the downloaded CSV Data into Exploratory

Import 2005_2006_flights.csv into Exploratory.

Is Prophet Really Better than ARIMA for Forecasting Time Series Data? (9)

Let’s take a look at this data on line chart. It looks like below, showing seasonal trend like August having most flights each year.

Is Prophet Really Better than ARIMA for Forecasting Time Series Data? (10)

Import 2007_flights.csv too, so that we can compare our forecast with this actual 2007 data later.

Is Prophet Really Better than ARIMA for Forecasting Time Series Data? (11)

Now, let’s go back to 2005_2006_flights data frame to build model.

The custom model script for forecast we just saved uses auto.arima function, which automatically tries to pick the best meta-parameters for ARIMA.

In order to call the R function registered in the script we want to select ‘Custom Command’ from the plus button and type the function name.

Is Prophet Really Better than ARIMA for Forecasting Time Series Data? (12)

Then, type in the following custom command into the command form.

build_model(model_func=build_forecast_model, formula=count~year_month, periods=12)

Then, click the green Run button.

Is Prophet Really Better than ARIMA for Forecasting Time Series Data? (13)

That will create an ARIMA model like below.

Is Prophet Really Better than ARIMA for Forecasting Time Series Data? (14)

Let’s forecast with this model. Click “Predict on Training Data” menu.

Is Prophet Really Better than ARIMA for Forecasting Time Series Data? (15)
(Video) What Is Arima Model In Time Series | How Arima Model Works | Time Series Forecasting | Intellipaat

Click Run button in the dialog.

Is Prophet Really Better than ARIMA for Forecasting Time Series Data? (16)

That will append the forecasted data at the end of the original data.

Is Prophet Really Better than ARIMA for Forecasting Time Series Data? (17)

Let’s visualize it with Line chart like following.

Is Prophet Really Better than ARIMA for Forecasting Time Series Data? (18)

Also, let’s show prediction interval around the forecasted values.

Select “Range” from the Y Axis configuration menu.

Is Prophet Really Better than ARIMA for Forecasting Time Series Data? (19)

Then in the dialog, check Show Range checkbox and select Hi 80/Lo 80 columns for Upper/Lower Limit. (By the way, I have been seeing more of 80% prediction intervals in the area of time series forecasting literatures, than 95%, which is most popular in other area of statistics. Prophet’s default prediction interval is 80% too. I guess this is indicative of forecasting time series with high confidence being very difficult, or just impossible in some cases.)

Is Prophet Really Better than ARIMA for Forecasting Time Series Data? (20)

Hmm.. The forecasted 2007 part looks pretty flat, including the prediction interval. Is this really a good forecast? Anyway, let’s compare it with actual 2007 data.

If you look at the forecast result data, the rows with actual data up to the end of the 2006 has “key” column with the value “actual”. What we want is continuation of this kind of rows for 2007 with the actual data.

Is Prophet Really Better than ARIMA for Forecasting Time Series Data? (21)

So, let’s go to 2007_flights data frame and add “key” column with the value of “actual” so that we can use this data along with 2005_2006_flights data frame. We can do this by mutate command.

Is Prophet Really Better than ARIMA for Forecasting Time Series Data? (22)

The key column is added like the following.

Is Prophet Really Better than ARIMA for Forecasting Time Series Data? (23)

Now let’s go back to 2005_2006_flights data frame, and merge the 2007 data to this data frame.

Click Merge menu.

Is Prophet Really Better than ARIMA for Forecasting Time Series Data? (24)

Select 2007_flights data frame, and click Run button.

(Video) Time Series Forecasting with Facebook Prophet and Python in 20 Minutes

Is Prophet Really Better than ARIMA for Forecasting Time Series Data? (25)

2007 actual data is merged to the data frame we have been working on.

Is Prophet Really Better than ARIMA for Forecasting Time Series Data? (26)

After that, go back to the Line Chart, and set the “key” column for Color By. 2007 data is appended to the blue actual line, and forecast result by auto.arima is shown in orange line and band.

Is Prophet Really Better than ARIMA for Forecasting Time Series Data? (27)

Well, this is kind of uninteresting result. auto.arima is just returning flat line with average value, and basically giving up on forecasting with further details like smaller up and down.

Let’s see how Prophet does compared to this.

From the Source step, create a branch so that we can work on forecasting with Prophet there.

Is Prophet Really Better than ARIMA for Forecasting Time Series Data? (28)

In the branch, select “Run Time Series Forecast” menu.

Is Prophet Really Better than ARIMA for Forecasting Time Series Data? (29)

In the dialog, set columns like the screenshot below. Also, set Forecasting Time Period to 12, so that we forecast for one year. You can leave Parameters on the right-hand side with defaults. Click Run button.

Is Prophet Really Better than ARIMA for Forecasting Time Series Data? (30)

That will add columns with forecasted values like the following.

Is Prophet Really Better than ARIMA for Forecasting Time Series Data? (31)

Let’s rename the forecasted_value column to “count”, just like the main 2005_2006_flights data frame so that we can show them on the same line chart. (Before that I needed to remove the original “count” column.)

Then let’s add “key” column with the value “prophet”, just like we did for actual 2007 data, so that this can be shown as a line with a different color on the line chart.

Is Prophet Really Better than ARIMA for Forecasting Time Series Data? (32)

“key” column is added.

Is Prophet Really Better than ARIMA for Forecasting Time Series Data? (33)

Prophet returns not only the forecasted data but also the past data, in this case, that is 2005 and 2006 data. But we are interested in only the forecasted data for the comparison, we can filter out the 2005 and 2006 before going back to the main data frame and merge this result to the one with ARIMA.

We can go back to the main branch of 2005_2006_flights, and merge the forecasting result by Prophet to the main.

(Video) Time Series Forecasting using SARIMAX and compared with ARIMA

Is Prophet Really Better than ARIMA for Forecasting Time Series Data? (34)

Now we can compare the Prophet result on the same line chart. The prophet result is the green line.

Is Prophet Really Better than ARIMA for Forecasting Time Series Data? (35)

So, Prophet is showing a reasonable seasonal trend unlike auto.arima, even though the absolute values are kind of off from the actual 2007 data.

This showcases why Prophet was developed by a team at Facebook in the first place. For algorithm like ARIMA, oftentimes, further configuration is needed to make it produce reasonable results, which is out of reach of many people who are not not professionally trained experts.

But, is this the best we can do with ARIMA? auto.arima does not reproduce the seasonal pattern at all.

Let’s see what ARIMA can do with a little more hint.

Instead of auto.arima, let’s manually specify the parameters for ARIMA model, and also give ARIMA a clue that this data has yearly seasonality.

Open the Script we saved before, and comment out line 17, which was calling auto.arima function, and instead uncomment line 18 to enable the line with Arima function.

Click Save button.

Is Prophet Really Better than ARIMA for Forecasting Time Series Data? (36)

This is line 18, with Arima function instead of auto.arima function.

fit_arima <- Arima(training_data_ts, order=c(1,1,0), seasonal=list(order=c(1,0,0), period=12))

Numbers in order parameter are the ARIMA’s meta-parameters. I manually adjusted them so that this ARIMA model fits well with our data.

ARIMA, which stands for AutoRegressive Integrated Moving Average model, is a combination of 2 types of models, namely, AutoRegressive model and Moving Average model, and a processing called differencing. The first number (1) goes to its AutoRegressive model part, and the 2nd number (0) goes to differencing part, and the 3rd number (1) goes to the Moving Average model part. For more detail, take a look at this free great textbook.

seasonal parameter is about seasonality. period parameter in it tells that the data has seasonality with the period of 12 month. order parameter in it again consists of 3 numbers that controls seasonal part of AutoRegressive model, differencing, and Moving Average model. This section of the same textbook has more detail on it.

With this change, let’s rebuild the ARIMA model. Go back to Build Model step and click the round green Run button.

Is Prophet Really Better than ARIMA for Forecasting Time Series Data? (37)

Then go back to the last Bind Rows step, and open the line chart.

Here is the result with the new ARIMA model.

Is Prophet Really Better than ARIMA for Forecasting Time Series Data? (38)

It’s the orange line and band. Let me show the chart in full screen.

This looks like a pretty good forecasting result compared to the previous flat line. And it’s even better than the one with Prophet (Green line)!

With the newly configured ARIMA model, we can see that the prediction interval is much narrower, which means ARIMA is rather confident with this forecast, and that the actual values are fitting well within the prediction interval range.

With Model Extension Framework, you can quickly bring your favorite R package and build models inside Exploratory. This time, we brought in ‘forecast’ package’s ARIMA model and built forecasting models, one with Auto-ARIMA and another with ARIMA with fine tuned parameters. And we compared the results against the one from Prophet with no parameter setting, and found that the ARIMA with fine tuned parameters produced the best result for this data.

Now, is it always like that?

In the next post, I will try with other scenarios and compare the results to answer, so stay tuned!

(Video) Forecasting with the FB Prophet Model

If you don’t have Exploratory Desktop yet, you can sign up from here for a 30 days free trial. If you are currently a student or teacher, then it’s free!

Videos

1. Time Series Forecasting with Machine Learning
(CodeEmporium)
2. Forecasting Future Sales Using ARIMA and SARIMAX
(Krish Naik)
3. Univariate and Multivariate Time Series Forecasting With Facebook Prophet | Satyajit Pattnaik
(Satyajit Pattnaik)
4. What is Facebook Prophet? Time Series Forecasting
(Professor Ryan)
5. ARIMA Models for Stock Price Prediction ❌ How to Choose the p, d, q Terms to Build ARIMA Model (1/2)
(DecisionForest)
6. What is the Prophet Model
(Aric LaBarr)
Top Articles
Latest Posts
Article information

Author: Nathanael Baumbach

Last Updated: 01/30/2023

Views: 5789

Rating: 4.4 / 5 (75 voted)

Reviews: 82% of readers found this page helpful

Author information

Name: Nathanael Baumbach

Birthday: 1998-12-02

Address: Apt. 829 751 Glover View, West Orlando, IN 22436

Phone: +901025288581

Job: Internal IT Coordinator

Hobby: Gunsmithing, Motor sports, Flying, Skiing, Hooping, Lego building, Ice skating

Introduction: My name is Nathanael Baumbach, I am a fantastic, nice, victorious, brave, healthy, cute, glorious person who loves writing and wants to share my knowledge and understanding with you.