# Oil Prices in Ecuador

*The only thing that's predictable, it seems, is oil's unpredictability.*

I agree with the statement above: the only predictable thing about oil prices is their unpredictability. That makes the task ahead of me, predicting oil prices in Ecuador, a very difficult one.

Oil prices have a significant impact on Ecuador's economy, so it will be interesting to see what a model has to tell us. I am applying ARIMA because this is a time series data set. The oil prices range from 2013 to 2017. I will try to explain each step of this analysis as fully as I can.

It is interesting to see that within two years, oil prices drop from 100 dollars in 2013 to below 50 in 2015. During the first half of 2015 prices rise from 50 to 60, but they then fall continuously into 2016, hitting a low of 30 dollars. Since then the oil price has been on an upward trend, in the range of 45 to 55 dollars.

Let us start with our analysis. Some patterns can be noticed by looking at the plot. The time series has a modest seasonal pattern, with the oil price higher in the middle of each year, but this may be insignificant. There is also a clear overall decreasing trend in oil prices.

For time series analysis, the ARIMA model is one of the most commonly used forecasting methods. Its three parameters, p, d, and q, set the autoregressive order, the degree of differencing, and the moving-average order; together with their seasonal counterparts (P, D, Q) they account for the seasonality, trend, and noise in the series. I will next describe how to automate the process of identifying the optimal set of parameters for the ARIMA model.

To parametrize the model correctly, I will use Python code to select the optimal parameter values for our ARIMA(p,d,q)(P,D,Q)s model. I will use a "grid search" (a form of hyperparameter optimization) that iterates over different combinations of parameters, and I will keep the combination that has the lowest AIC value.
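
The grid search described above can be sketched roughly as follows. This is a minimal sketch, not the exact code behind this analysis: the function names are hypothetical, the search range of 0 to 1 for each parameter is an assumption, and I am assuming statsmodels' `SARIMAX` as the fitting library (it is imported lazily, so the search logic itself can be tried with any scoring function).

```python
import itertools
import math

def sarima_grid(max_order=1, s=12):
    """Yield every (order, seasonal_order) pair with p, d, q, P, D, Q in 0..max_order."""
    triples = list(itertools.product(range(max_order + 1), repeat=3))
    for order in triples:
        for seasonal in triples:
            yield order, seasonal + (s,)

def sarimax_aic(series, order, seasonal_order):
    """Fit one candidate with statsmodels (assumed library) and return its AIC."""
    from statsmodels.tsa.statespace.sarimax import SARIMAX
    model = SARIMAX(series, order=order, seasonal_order=seasonal_order,
                    enforce_stationarity=False, enforce_invertibility=False)
    return model.fit(disp=False).aic

def select_by_aic(series, candidates, fit_aic=sarimax_aic):
    """Return (aic, order, seasonal_order) for the candidate with the lowest AIC."""
    best = None
    for order, seasonal_order in candidates:
        try:
            aic = fit_aic(series, order, seasonal_order)
        except ValueError:
            continue  # some combinations cannot be estimated; skip them
        if math.isnan(aic):
            continue  # ignore fits that did not produce a usable AIC
        if best is None or aic < best[0]:
            best = (aic, order, seasonal_order)
    return best
```

With a monthly price series in a variable such as `oil` (a hypothetical name), the call would be `select_by_aic(oil, sarima_grid(max_order=1, s=12))`. With `max_order=1` that is 64 candidate models, so the search grows quickly if the range is widened.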

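The AIC (Akaike Information Criterion) measures how well a model fits the data while taking into account the overall complexity of the model. A model that fits well using few parameters receives a lower (better) AIC than one that needs many parameters to reach the same fit.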
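
To make the trade-off concrete, here is a small sketch of the standard AIC formula, 2k - 2 ln(L), where k is the number of estimated parameters and ln(L) is the maximized log-likelihood. The function name and the log-likelihood values are made up for illustration; they are not results from the oil price model.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln(L). Lower is better."""
    return 2 * n_params - 2 * log_likelihood

# Illustrative numbers only: the larger model fits slightly better
# (higher log-likelihood) but pays a bigger penalty for its parameters.
small_model = aic(log_likelihood=-120.0, n_params=3)
large_model = aic(log_likelihood=-119.5, n_params=6)
print(small_model, large_model)
```

Here the smaller model wins on AIC even though its raw fit is marginally worse, which is exactly the behavior we want from the grid search.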

Please note that I am using s = 12 because this dataset was converted to a monthly basis, so there are 12 observations for each year.
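
One way such a monthly conversion could look is sketched below. It assumes the raw data are (date, price) pairs and that each month is summarized by its mean; both the aggregation choice and the sample values are assumptions for illustration, not a record of what was done here.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

def monthly_mean(observations):
    """Average (date, price) pairs per calendar month, skipping missing prices."""
    buckets = defaultdict(list)
    for day, price in observations:
        if price is not None:
            buckets[(day.year, day.month)].append(price)
    return {month: mean(prices) for month, prices in sorted(buckets.items())}

# Made-up sample values, only to show the shape of the input and output.
daily = [
    (date(2013, 1, 2), 93.0),
    (date(2013, 1, 3), 95.0),
    (date(2013, 1, 4), None),  # missing quote, ignored
    (date(2013, 2, 1), 97.0),
]
print(monthly_mean(daily))
```

After a conversion like this, one seasonal cycle spans 12 consecutive observations, which is what s = 12 encodes.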

After fitting the ARIMA model, it is important to check that none of its assumptions have been violated.

The first thing we have to look at is the residuals of our model. They should be uncorrelated and normally distributed. If that is not the case, we should improve the model further.
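
The "uncorrelated" part of this check can also be done numerically. The sketch below computes the sample autocorrelation of the residuals and flags lags that fall outside the usual approximate 95% band of ±1.96/√n. The function names and the default of 12 lags are my own choices for illustration.

```python
def autocorrelation(residuals, lag):
    """Sample autocorrelation of a sequence at the given positive lag."""
    n = len(residuals)
    avg = sum(residuals) / n
    centered = [r - avg for r in residuals]
    denom = sum(c * c for c in centered)
    numer = sum(centered[t] * centered[t - lag] for t in range(lag, n))
    return numer / denom

def correlated_lags(residuals, max_lag=12):
    """Lags whose autocorrelation lies outside the approximate 95% band."""
    bound = 1.96 / len(residuals) ** 0.5
    return [k for k in range(1, max_lag + 1)
            if abs(autocorrelation(residuals, k)) > bound]
```

An empty list from `correlated_lags` is what we hope to see; any flagged lag suggests the model has left some structure in the residuals.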

In the graph on the top right, the red KDE line closely follows the yellow line, which is the normal distribution. This is an indication that the residuals may be normally distributed.

The graph on the bottom left shows the ordered distribution of residuals (blue dots). They follow the linear trend of samples taken from a standard normal distribution, N(0, 1). Again, this is an indication that the residuals are normally distributed.
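
The points behind a plot like the bottom-left one can be computed by hand. This is a minimal sketch using only the standard library; the function name is hypothetical, and the plotting positions (i + 0.5)/n are one common convention that may differ slightly from what a plotting library uses.

```python
from statistics import NormalDist, mean, stdev

def qq_points(residuals):
    """Pair each standardized, sorted residual with its N(0, 1) theoretical quantile."""
    n = len(residuals)
    mu, sigma = mean(residuals), stdev(residuals)
    ordered = sorted((r - mu) / sigma for r in residuals)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    theoretical = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, ordered))
```

If the residuals are close to normal, the returned pairs lie near the line y = x, which is the straight line the blue dots are compared against.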

Now it is time to use the model to predict oil prices.

We will compare the predicted oil prices with the real ones, which will help us understand the accuracy of our forecasts.

The get_prediction() and conf_int() methods allow us to obtain the values and associated confidence intervals for forecasts of our dataset.
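
A small helper along these lines shows how the two calls fit together. It is a sketch that assumes `results` is a fitted statsmodels SARIMAX results object; the helper name and the `start` argument value are hypothetical.

```python
def one_step_ahead(results, start):
    """Return (predicted_mean, conf_int) for one-step-ahead predictions from `start`.

    `results` is assumed to be a fitted statsmodels SARIMAX results object.
    dynamic=False means every prediction uses the full observed history
    up to the previous time step.
    """
    prediction = results.get_prediction(start=start, dynamic=False)
    return prediction.predicted_mean, prediction.conf_int()
```

The first returned value holds the predicted prices and the second holds the lower and upper bounds of the confidence interval (95% by default), which is what gets shaded around the forecast line in a plot.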

The forecasts align with the true values well, but don't show any obvious trend.

We should now quantify the accuracy of our forecasts. I will use the MSE (Mean Squared Error), which tells us the average squared error of our forecasts.

For each predicted value, we compute its distance to the true value and square the result. The MSE is the average of these squared distances.
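
Written out as code, the calculation looks roughly like this. The function name and the three price pairs are made up for illustration; they are not the model's actual forecasts.

```python
def mean_squared_error(predicted, actual):
    """Average of the squared differences between predictions and true values."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)

# Illustrative values: errors of 2, -1, and -3 dollars.
mse = mean_squared_error([50.0, 52.0, 47.0], [48.0, 53.0, 50.0])
rmse = mse ** 0.5  # back in dollars, easier to interpret
print(mse, rmse)
```

Because the errors are squared, a single large miss weighs much more than several small ones, and taking the square root brings the number back to the unit of the prices.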

The MSE of our one-step-ahead forecasts is 35.06, which is a little high: its square root, about 5.9 dollars, is the typical size of a forecast error.

A better representation of our predictive power can be obtained using dynamic forecasts, that is, setting dynamic=True. In this case we use information from the data only up to a certain point; after that, each forecast is generated from previously forecasted values.
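
The difference between the two modes is easiest to see on a toy model. The sketch below uses a deliberately simple AR(1) rule, x_t = phi * x_(t-1), in place of the full ARIMA model; the function name, the coefficient, and the series values are all made up for illustration.

```python
def ar1_forecasts(series, phi, start, dynamic):
    """Forecast series[start:] with the toy rule x_t = phi * x_(t-1).

    dynamic=False: each forecast uses the observed previous value.
    dynamic=True: from `start` on, each forecast uses the previous forecast.
    """
    forecasts = []
    previous = series[start - 1]
    for t in range(start, len(series)):
        forecast = phi * previous
        forecasts.append(forecast)
        previous = forecast if dynamic else series[t]
    return forecasts

observed = [50.0, 52.0, 49.0, 55.0, 53.0]
one_step = ar1_forecasts(observed, phi=0.5, start=2, dynamic=False)
dynamic = ar1_forecasts(observed, phi=0.5, start=2, dynamic=True)
print(one_step)
print(dynamic)
```

The first forecast is the same in both modes, but after that the dynamic forecasts never see the real data again, so any error compounds from one step to the next. That is why dynamic forecasts are a harder test of a model.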

Once again, we verify the performance of our forecasts by computing the MSE. The Mean Squared Error of our dynamic forecasts is 1270.95, which corresponds to a typical error of about 36 dollars. Here we are relying on less historical data than before, but even so, the MSE is high enough that at this point I should be thinking about applying a different model to predict oil prices.

Oil prices are tricky to predict, and their natural volatility makes it hard to get a reasonable result.

Finally, I will describe how to forecast future values using ARIMA. The get_forecast() method of our fitted model can compute forecasted values for a specified number of steps ahead, in my case, 20 steps.
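
In code, the out-of-sample forecast can be wrapped up roughly like this. As before, this is a sketch that assumes `results` is a fitted statsmodels SARIMAX results object, and the helper name is hypothetical.

```python
def forecast_ahead(results, steps=20):
    """Return (predicted_mean, conf_int) for `steps` periods beyond the data.

    `results` is assumed to be a fitted statsmodels SARIMAX results object.
    """
    forecast = results.get_forecast(steps=steps)
    return forecast.predicted_mean, forecast.conf_int()
```

With monthly data, 20 steps cover a little under two years. The confidence interval typically widens with each step, which is worth keeping in mind when reading the far end of the forecast.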

Our forecasts predict that oil prices will stay in the 50 dollar range over the next couple of years. Looking at the graph, we see that since the second quarter of 2016 oil prices have pretty much stabilized around this price range. We will see in the future whether our model was correct.