Prediction using neural networks

Lecture



Neural networks are a powerful and flexible prediction mechanism. The first step is to specify which variables are to be analyzed and predicted, and at what level of detail. The level of detail depends on many factors: the availability and accuracy of the data, the cost of the analysis, and the preferences of those who will use the forecast. When the best set of variables is unclear, you can try several alternatives and keep the one that gives the best results. This is the usual approach when developing prediction systems based on the analysis of historical data.

The second important step in building a neural network forecasting system is to define three parameters: the forecast period, the forecast horizon, and the forecast interval. The forecast period is the basic unit of time for which the forecast is made. The forecast horizon is the number of future periods covered by the forecast. For example, you may need a forecast for 10 days ahead, with a value for each day; in this case the period is one day and the horizon is 10 days. Finally, the forecast interval is the frequency with which a new forecast is made. The forecast interval often coincides with the forecast period.

The choice of the forecast period and horizon is usually dictated by the decision-making conditions in the area for which the forecast is made, and choosing these two parameters is among the most difficult parts of neural network forecasting. For forecasting to make sense, the forecast horizon must be no shorter than the time required to implement the decision made on the basis of the forecast. Forecasting therefore depends heavily on the nature of the decision being made. In some cases the time required to implement the decision is not defined, as with the supply of spare parts to replenish the stocks of repair companies. There are methods for working under such uncertainty, but they increase the variance of the prediction error. Since forecast accuracy usually decreases as the horizon grows, the decision-making process can often be improved by reducing the time required to implement the decision, which shortens the horizon and reduces the forecast error.
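The relationship between the three parameters can be illustrated with a short sketch. The function names and the start date below are hypothetical, chosen only for illustration; the example reproduces the case from the text (a period of one day and a horizon of 10 periods) and assumes an interval equal to the period.

```python
from datetime import date, timedelta

def forecast_dates(issued_on, period_days, horizon_periods):
    """Dates covered by a single forecast issued on `issued_on`."""
    step = timedelta(days=period_days)
    return [issued_on + step * k for k in range(1, horizon_periods + 1)]

def issue_dates(first_issue, interval_days, count):
    """Dates on which successive forecasts are made (the forecast interval)."""
    step = timedelta(days=interval_days)
    return [first_issue + step * k for k in range(count)]

# Hypothetical start date; period = 1 day, horizon = 10 periods.
covered = forecast_dates(date(2024, 1, 1), period_days=1, horizon_periods=10)

# Interval assumed equal to the period: a new forecast every day.
issued = issue_dates(date(2024, 1, 1), interval_days=1, count=3)
```

Each forecast issued on a date in `issued` covers its own list of 10 future dates, so consecutive forecasts overlap on nine of them.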

In some cases, predicting the specific values of the variable matters less than predicting significant changes in its behavior. Such a problem arises, for example, when predicting the moment at which the current market trend reverses.
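One way to frame such a task is to label the points of a historical series at which the direction of movement reverses, so that these labels rather than the raw values become the prediction target. The sketch below is a minimal illustration under that assumption; the function names and the sample series are hypothetical and are not taken from the lecture.

```python
def direction(delta):
    """Sign of a change: +1 for up, -1 for down, 0 for flat."""
    return (delta > 0) - (delta < 0)

def turning_points(series):
    """Indices of the last point before each reversal of direction."""
    points = []
    prev_dir = 0
    for i in range(1, len(series)):
        d = direction(series[i] - series[i - 1])
        # A reversal: the series was moving one way and now moves the other.
        if d != 0 and prev_dir != 0 and d != prev_dir:
            points.append(i - 1)
        # Flat steps do not change the remembered direction.
        if d != 0:
            prev_dir = d
    return points

prices = [10, 11, 13, 12, 11, 11, 14]  # hypothetical values
reversals = turning_points(prices)
```

Flat stretches are ignored here, which is one possible design choice; a real system would also need a threshold to separate significant reversals from noise.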

The forecast accuracy required for a specific problem has a strong influence on the design of the forecasting system. The training set has an equally strong influence on the forecast.

The first thing a user of any neural network package faces is the need to prepare data for the network. In practice, data preprocessing can be the most time-consuming part of neural network analysis. Moreover, knowledge of the basic principles and techniques of data preprocessing is no less important, and perhaps more important, than knowledge of the neural network algorithms themselves. The latter, as a rule, are already built into the various neural network emulators available on the market. The actual process of solving applied problems, including data preparation, falls entirely on the user's shoulders.

Consider the task of predicting the sales volume of an enterprise's goods. The environment is non-deterministic: conventional methods cannot say with complete confidence what will happen next, and identifying every factor that affects the predicted value is practically impossible (the set of factors can only be limited). The following set of financial indicators is available:

Enterprise activity:

  • sales history (quantity, amount);
  • warehouse status history;
  • advertising activity indicators.

External factors:

  • price lists of competitors;
  • market condition;
  • inflation;
  • exchange rates (dollar, euro, etc.);
  • stock indices (RTS, NASDAQ, Dow Jones and others).

Comprehensive studies identified secondary factors that affect the company's sales and must be taken into account. These indicators are presented in Table 1 (taken from the dissertation of A. Bychkov). As the table shows, the listed parameters differ in significance, their values are of different kinds, and they are obtained from different sources. A substantive analysis of the parameters showed that some of them cannot be included in the model because the data cannot be obtained, while others have little influence on the model dynamics and can therefore be excluded without significant loss of accuracy.

Table 1

Secondary parameters used for decision making

No.  Independent variable                      Significance
1    Concentration                             ++++
2    Warehouse status                          ++++
3    Economies of scale                        ++
4    Product differentiation                   +++
5    Advertising intensity                     +++
6    Assets-to-production ratio                ++
7    Growth                                    +++
8    Diversification                           ++
9    Geographical location                     +
10   Risk                                      ++++
11   Export                                    +
12   Import                                    +
13   Market share                              +++
14   Customer concentration                    +
15   Intensity of research and development     +
16   Strategic grouping                        +

Note that the enterprise's own sales history supplies roughly 60% of the information needed to train the neural network (an estimate based on the author's experience).

The task of forecasting an enterprise's sales volume has features that make neural network modeling, and in particular the “internal teacher” topology, appropriate:

a) the data table may be small [1];

b) the data table may contain missing values;

c) the data may be distorted ("noise");

d) the model must be adapted when new data arrive;

e) it is difficult to obtain a linear algebraic model;

f) the product range contains a large number of items.

General prediction algorithm using a neural network

The algorithm consists of the following steps:

  • obtaining a time series at the chosen time step;
  • filling gaps in the history;
  • smoothing the series with a moving average (or another method);
  • obtaining a series of relative changes in the predicted value;
  • forming a table of "windows" with an immersion depth of several time intervals;
  • adding supplementary data to the table (for example, the change in the value in previous years);
  • scaling;
  • defining the training and validation samples;
  • selecting the neural network parameters;
  • training the neural network;
  • testing the performance of the neural network in real conditions.
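The data preparation steps in this list can be sketched in plain Python. This is a minimal illustration, not the author's implementation: the function names, the linear interpolation used for gap filling, the trailing window of width 3, the target range [-1, 1] and the 80/20 split are all assumptions, and the sample sales figures are hypothetical. The window-table step is omitted here, so scaling and splitting are applied directly to the series of changes.

```python
def fill_gaps(series):
    """Replace None values by linear interpolation between known neighbours."""
    known = [i for i, v in enumerate(series) if v is not None]
    if not known:
        raise ValueError("series has no known values")
    filled = list(series)
    for i, v in enumerate(series):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:            # gap at the start: copy the first known value
            filled[i] = series[right]
        elif right is None:         # gap at the end: copy the last known value
            filled[i] = series[left]
        else:
            w = (i - left) / (right - left)
            filled[i] = series[left] + w * (series[right] - series[left])
    return filled

def moving_average(series, width=3):
    """Trailing moving average; early points use a shorter window."""
    out = []
    for i in range(len(series)):
        chunk = series[max(0, i - width + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

def relative_changes(series):
    """(x[t] - x[t-1]) / x[t-1]; assumes no value in the series is zero."""
    return [(b - a) / a for a, b in zip(series, series[1:])]

def min_max_scale(values, lo=-1.0, hi=1.0):
    """Linearly map values onto the range [lo, hi]."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        return [(lo + hi) / 2 for _ in values]
    return [lo + (v - vmin) * (hi - lo) / (vmax - vmin) for v in values]

def chronological_split(rows, train_fraction=0.8):
    """Earlier rows go to training, later rows to validation."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]

sales = [100, None, 120, 130, None, None, 160]  # hypothetical weekly sales
filled = fill_gaps(sales)
smoothed = moving_average(filled, width=3)
changes = relative_changes(smoothed)
scaled = min_max_scale(changes)
train, valid = chronological_split(scaled, train_fraction=0.8)
```

The split is chronological rather than random so that validation data always lie after the training data in time. In practice the scaling bounds are often computed from the training part only, so that no information from the validation period leaks into training.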

We now explain the term "window table". The data must be converted according to a special scheme. First, we transform the time series into a series of increments of the predicted value; that is, we predict the change in the value rather than the absolute values of the series. Then we choose the immersion depth, i.e. the number of past time intervals from which the next one is predicted. Take an immersion depth of 4: the value for the next iteration is predicted from the results of the four previous iterations. The data are then converted to the following form:

Table 4.1

The first version of the "window" of data

Hist1  Hist2  Hist3  Hist4  Hist0
D-1    D-2    D-3    D-4    D
D-2    D-3    D-4    D-5    D-1
D-3    D-4    D-5    D-6    D-2
...    ...    ...    ...    ...

The first four columns are the inputs of the neural network and the last one is the output; that is, the next change in the series is predicted from the previous changes. This gives the so-called "sliding window", which here holds data for five weeks. The window can be moved along the time axis and its width can be changed. To take previous years and possible seasonal dependencies into account, we add another column to the sample, showing the change in the value over the same period of the previous year.
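The construction of the window table can be sketched as follows. The function name `make_windows` and the sample values are hypothetical; the column order follows Table 4.1, with the most recent change (Hist1) first among the inputs and the predicted change (Hist0) as the target.

```python
def make_windows(changes, depth=4):
    """Build (inputs, target) rows from a chronologically ordered series.

    inputs = [D-1, D-2, ..., D-depth]  (columns Hist1..Hist4 for depth=4)
    target = D                          (column Hist0)
    """
    rows = []
    for t in range(depth, len(changes)):
        inputs = [changes[t - k] for k in range(1, depth + 1)]
        rows.append((inputs, changes[t]))
    return rows

changes = [1, 2, 3, 4, 5, 6, 7]  # hypothetical weekly changes, oldest first
rows = make_windows(changes, depth=4)
# The first row uses the four oldest changes to predict the fifth.
```

Rows are produced oldest first here, whereas the table lists the newest row first; the order of the rows does not affect their content. A series of n changes yields n - depth rows, so a larger immersion depth leaves fewer training examples.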

Table 4.2

The second version of the "window" of data

LastY  Hist1  Hist2  Hist3  Hist4  Hist0
L      D-1    D-2    D-3    D-4    D
L-1    D-2    D-3    D-4    D-5    D-1
L-2    D-3    D-4    D-5    D-6    D-2
...    ...    ...    ...    ...    ...

This is how the training sample is prepared, and it is in this form that the data are passed on for further analysis. The sample need not be limited to the previous year: data for several previous years can be supplied. Keep in mind, however, that the network grows in this case, which sometimes leads to poor results.
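The second version of the window, with the extra seasonal column, can be sketched in the same way. The function name and the `season` parameter are hypothetical; `season` is the number of time steps in one year (52 would be a natural choice for weekly data), and a short season of 6 is used below only to keep the example small.

```python
def make_windows_with_last_year(changes, season=52, depth=4):
    """Build rows [LastY, Hist1..Hist<depth>] -> Hist0.

    LastY is the change observed `season` steps before the target,
    i.e. in the same period of the previous year.
    """
    rows = []
    start = max(depth, season)  # need both the history and last year's value
    for t in range(start, len(changes)):
        last_year = changes[t - season]
        hist = [changes[t - k] for k in range(1, depth + 1)]
        rows.append(([last_year] + hist, changes[t]))
    return rows

changes = list(range(10))  # hypothetical changes, oldest first
rows = make_windows_with_last_year(changes, season=6, depth=4)
```

Each additional previous year would add one more input column and require one more season of history before the first usable row, which is one reason the sample shrinks as the network grows.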

The disadvantages of prediction using neural networks include long training times, the problem of overtraining (overfitting), and the difficulty of choosing the training sample and the meaningful inputs.


[1] For example, data on sales of a new product.


