General
Where can I get additional information about neural networks?
How can I improve things to get better forecasts?
When are neural networks a bad choice for my forecasting?

Data Analysis and Preprocessing
How much historical data do I need?
Why are some columns grayed out after Data Analysis and cannot be selected as targets?
What is a categorical column?
How can I see which records and columns were removed from analysis?
What is your algorithm for removing misplaced data?

Network Preparation
What is network training?
What is the best training algorithm for my problem?
Why is the absolute error disabled during the Network Preparation step?
What is “minimum improvement in error”?
How can I speed up network selection?
How many hidden layers and units do I need?
How much time is required for training?
Can I change the network parameters after training?

Forecasting and Reporting
How can I forecast several values at once without entering them manually?
How can I change the report format?

General

Where can I get additional information about neural networks?

There is a good introductory book written by Kevin Gurney, available online at: http://www.shef.ac.uk/psychology/gurney/notes/index.html

You can also try Dr. Leslie Smith’s brief online introduction to neural networks, packed with pictures and examples, at: http://www.cs.stir.ac.uk/~lss/NNIntro/InvSlides.html

A good introductory book for managers and business analysts is:

For engineers and technically minded people we recommend starting with: Fausett, L. (1994), Fundamentals of Neural Networks: Architectures, Algorithms, and Applications, Englewood Cliffs, NJ: Prentice Hall.

For financial specialists, bankers and traders we recommend starting with: Azoff, E. M. (1994), Neural Network Time Series Forecasting of Financial Markets, NY: John Wiley and Sons, Inc.

How can I improve things to get better forecasts?
What is a categorical column?

Each value of a categorical column represents a certain category. For example, a column that contains only “Male” or “Female” as its values is categorical. Typically, the number of distinct values in a categorical column is much smaller than the number of records. Categorical data must be encoded in a special way to be suitable for a neural network.

You can manually mark a column as categorical in Expert Mode (using the Details button at the Data Analysis Progress step). This can be beneficial in some cases. For example, suppose your data has a column “Model” with the values “1”, “2”, “3”. By default this column is treated as numeric, but it is better to encode it as categorical.

How can I see which records and columns were removed from analysis?

During the Data Analysis step, click the “Details” button and you will see your data with grayed columns and rows. All colored cells will be removed from further use. In the Details window you can also see the reason a record was removed. Cells containing missing data, misplaced data or outliers are painted in different colors. You can control this process in Expert Mode, where you can set your preferences for data analysis.

What is your algorithm for removing misplaced data?

If all values in one of your columns are numbers with the exception of several values, the Wizard identifies the column as numeric. Those several values are identified as misplaced, and the records containing them are removed. The same is true for other column types. The main question in this algorithm is how many of these “several” values there can be. If you suspect that your data may contain misplaced values, you need to give the Wizard a hint about how many misplaced values your noisiest column may contain. You can do this during the Data Analysis step in Expert Mode.

There is no misplaced data handling in Standard Mode. All columns are considered to be free of misplaced values, and if a numeric column contains at least one text value, it is treated as a text column.
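
The misplaced-data rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Alyuda Forecaster's actual implementation: the function names and the `max_misplaced` parameter (standing in for the Expert Mode hint) are hypothetical, and the sketch covers only the numeric-column case.

```python
def is_number(value):
    """Return True if the value can be parsed as a floating-point number."""
    try:
        float(value)
        return True
    except (TypeError, ValueError):
        return False


def classify_column(values, max_misplaced=0):
    """Classify a column as 'numeric' or 'text' and list misplaced rows.

    max_misplaced is a hypothetical stand-in for the Expert Mode hint about
    how many misplaced values a column may contain. With max_misplaced=0 the
    function mimics Standard Mode: one text value makes the column a text one.
    """
    non_numeric_rows = [i for i, v in enumerate(values) if not is_number(v)]
    if len(non_numeric_rows) <= max_misplaced:
        # Few enough exceptions: numeric column, exceptions are misplaced.
        return "numeric", non_numeric_rows
    # Too many exceptions: treat the whole column as text, remove nothing.
    return "text", []


def drop_misplaced(records, column, max_misplaced=0):
    """Return the column type and the records that survive the check."""
    values = [record[column] for record in records]
    kind, misplaced = classify_column(values, max_misplaced)
    misplaced = set(misplaced)
    kept = [r for i, r in enumerate(records) if i not in misplaced]
    return kind, kept


records = [{"price": "10.5"}, {"price": "11"}, {"price": "n/a"}, {"price": "12.25"}]

expert_kind, expert_kept = drop_misplaced(records, "price", max_misplaced=1)
standard_kind, standard_kept = drop_misplaced(records, "price")
print(expert_kind, len(expert_kept))      # numeric column, "n/a" record removed
print(standard_kind, len(standard_kept))  # text column, nothing removed
```

With a tolerance of one misplaced value the “n/a” record is dropped and the column stays numeric. With no tolerance, the same column is treated as text, which matches the Standard Mode behavior described above.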
Network Preparation

What is network training?

Network training means adjusting neural network weights. During training the network analyzes the data you have provided and changes the weights between network units to reflect the dependencies found in your data.

What is the best training algorithm for my problem?

If your data has up to 10 input columns, the best training algorithm is Levenberg-Marquardt. It is fast and quite reliable. If your data set has hundreds of thousands of records or more, we recommend trying Incremental Back Propagation first. In all other cases the choice depends on your type of problem and the dependencies inside your data. We recommend starting with Conjugate Gradient Descent, then trying Quick Propagation and, as the last step, Batch Back Propagation or Incremental Back Propagation.

Why is the absolute error disabled during the Network Preparation step?

When your target column is not numeric, it is hard to define unambiguously what the absolute error is. In such cases it is better to use only relative errors, which are enough to fully control the training process. In Expert Mode you can use CCR (Correct Classification Rate) instead of an error threshold.

What is “minimum improvement in error”?

Minimum improvement in error specifies the minimum change in error required during each iteration (or over the last several iterations). This parameter is useful for detecting situations where the network cannot improve its performance any further and training should be stopped to save time. Be careful with this parameter, however, because in certain cases the error can decrease after many “motionless” iterations, and such cases cannot be detected automatically. We recommend setting 10 iterations, which is enough for most problems. To be safe, you can set up to 100 iterations.

How much time is required for network selection?
The time required for network selection depends on the number of inputs, the amount of data, the complexity of the task and the capability of your computer. Network selection can take from several seconds to several hours.

How can I speed up network selection?

The first way is to select the “Rough search” method, which is the quickest but does not guarantee the best results. The second way is to specify the minimum and maximum number of hidden units your problem may require (Expert Mode only). This requires some experience with neural networks and at least an approximate estimate of the problem complexity.

How many hidden layers and units do I need?

In our experience, the majority of problems (ca. 80%) have a good solution with 1 hidden layer, another part (ca. 20%) has a good solution with 2 layers, and only 1-2% of problems need 3 layers or more. More than two hidden layers are typically beneficial only for special problems, such as ZIP code recognition.

If you have too few hidden units, you will get a large forecasting error, because the network does not have enough capacity to find and encode the dependencies in your data. If you have too many hidden units, the network tends to memorize your data rather than encode dependencies, which also leads to a large forecasting error. For the majority of problems there is only one way to find the best number of hidden units: train several networks with different numbers of hidden units and pick the best one by comparing forecasting errors on the testing subset.

Forecasting and Reporting

How can I forecast several values at once without entering them manually?

Alyuda Forecaster doesn’t have this feature.

How can I change the report format?

During the Reporting step, press the “Show Report” button to see a report preview. Click “Save As…” in the “File” menu and select the desired format in the “Save as type” dropdown list.
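
The advice in the Network Preparation section on choosing the number of hidden units (train several networks and compare their errors on the testing subset) can be sketched as follows. This is a minimal pure-Python illustration, not the network selection procedure Alyuda Forecaster uses: the tiny one-input network, the toy sine data, the learning rate and the function names are all assumptions made for the example.

```python
import math
import random


def train_network(train, hidden_units, epochs=300, lr=0.02, seed=0):
    """Train a 1-input, 1-output network with one tanh hidden layer using
    per-record gradient descent on squared error; return a predict function."""
    rng = random.Random(seed)
    w1 = [rng.uniform(-1, 1) for _ in range(hidden_units)]  # input -> hidden
    b1 = [rng.uniform(-1, 1) for _ in range(hidden_units)]  # hidden biases
    w2 = [rng.uniform(-1, 1) for _ in range(hidden_units)]  # hidden -> output
    b2 = 0.0                                                # output bias

    for _ in range(epochs):
        for x, y in train:
            h = [math.tanh(w1[j] * x + b1[j]) for j in range(hidden_units)]
            out = sum(w2[j] * h[j] for j in range(hidden_units)) + b2
            err = out - y
            for j in range(hidden_units):
                # Gradient through the tanh unit, computed before w2[j] changes.
                grad_h = err * w2[j] * (1.0 - h[j] ** 2)
                w2[j] -= lr * err * h[j]
                w1[j] -= lr * grad_h * x
                b1[j] -= lr * grad_h
            b2 -= lr * err

    def predict(x):
        return sum(w2[j] * math.tanh(w1[j] * x + b1[j])
                   for j in range(hidden_units)) + b2

    return predict


def mean_squared_error(predict, data):
    """Average squared forecasting error of a model on a data subset."""
    return sum((predict(x) - y) ** 2 for x, y in data) / len(data)


def select_hidden_units(train, test, candidates):
    """Train one network per candidate size; pick the lowest testing error."""
    errors = {}
    for n in candidates:
        model = train_network(train, n)
        errors[n] = mean_squared_error(model, test)
    best = min(errors, key=errors.get)
    return best, errors


# Toy data (an assumption for the demo): a noisy sine curve, split into
# training and testing subsets by alternating records.
noise = random.Random(1)
points = [(x, math.sin(x) + noise.gauss(0, 0.1))
          for x in (-3 + 6 * i / 39 for i in range(40))]
train, test = points[0::2], points[1::2]

best, errors = select_hidden_units(train, test, [1, 2, 4, 8])
for n, e in errors.items():
    print(f"{n} hidden units: testing MSE = {e:.4f}")
print("selected:", best)
```

The comparison uses the testing subset rather than the training data. A network with too many hidden units can fit the training records almost perfectly and still forecast poorly, so the training error alone would not reveal the problem.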
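
The “minimum improvement in error” stopping rule from the Network Preparation section can be illustrated with a short Python sketch. It assumes the rule compares the current error with the error a fixed number of iterations earlier. The names `should_stop`, `stopping_iteration` and `window` are hypothetical and do not come from Alyuda Forecaster, and the error curve is invented to show a plateau followed by a late improvement.

```python
def should_stop(error_history, min_improvement, window=10):
    """Return True when the error improved by less than min_improvement
    over the last `window` iterations."""
    if len(error_history) <= window:
        return False  # not enough history yet to judge
    improvement = error_history[-window - 1] - error_history[-1]
    return improvement < min_improvement


def stopping_iteration(errors, min_improvement, window):
    """Replay an error curve; return the iteration at which training would
    stop, or None if the rule never fires."""
    history = []
    for iteration, error in enumerate(errors):
        history.append(error)
        if should_stop(history, min_improvement, window):
            return iteration
    return None


# Invented error curve: steady decrease, a 30-iteration plateau ("motionless"
# iterations), then a late improvement.
errors = ([1.0 - 0.025 * i for i in range(20)]
          + [0.5] * 30
          + [0.5 - 0.02 * i for i in range(1, 21)])

short_window = stopping_iteration(errors, min_improvement=0.001, window=10)
long_window = stopping_iteration(errors, min_improvement=0.001, window=40)
print(short_window)  # stops during the plateau
print(long_window)   # never stops on this curve
```

On this curve a 10-iteration window stops training on the plateau and misses the later drop in error, while a longer window keeps training. This is the trade-off behind the recommendation to use 10 iterations for most problems and up to 100 when you want more certainty.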