A hyper-heuristic is a heuristic search method that seeks to automate, often by the incorporation of machine learning techniques, the process of selecting, combining, generating or adapting several simpler heuristics (or components of such heuristics) to efficiently solve computational search problems. One of the motivations for studying hyper-heuristics is to build systems which can handle classes of problems rather than solving just one problem.[1][2][3]
There may be several heuristics to choose from when solving a problem, each with its own strengths and weaknesses. The idea is to automatically devise algorithms by combining the strengths and compensating for the weaknesses of known heuristics.[4] A typical hyper-heuristic framework consists of a high-level methodology and a set of low-level heuristics (either constructive or perturbative). Given a problem instance, the high-level method selects which low-level heuristic to apply at any given time, depending on the current problem state (or search stage) as determined by features.[2][5][6]
Hyper-heuristics versus metaheuristics
The fundamental difference between metaheuristics and hyper-heuristics is that most implementations of metaheuristics search within a search space of problem solutions, whereas hyper-heuristics always search within a search space of heuristics. Thus, when using hyper-heuristics, we are attempting to find the right method or sequence of heuristics in a given situation rather than trying to solve a problem directly. Moreover, we are searching for a generally applicable methodology rather than solving a single problem instance.
Hyper-heuristics could be regarded as "off-the-peg" methods as opposed to "made-to-measure" metaheuristics. They aim to be generic methods, which should produce solutions of acceptable quality, based on a set of easy-to-implement low-level heuristics.
Motivation
Despite the significant progress in building search methodologies for a wide variety of application areas so far, such approaches still require specialists to integrate their expertise in a given problem domain. Many researchers from computer science, artificial intelligence and operational research have already acknowledged the need for developing automated systems to replace the role of a human expert in such situations. One of the main ideas for automating the design of heuristics requires the incorporation of machine learning mechanisms into algorithms to adaptively guide the search. Both learning and adaptation processes can be realised on-line or off-line, and be based on constructive or perturbative heuristics.
A hyper-heuristic usually aims at reducing the amount of domain knowledge in the search methodology. The resulting approach should be cheap and fast to implement, requiring less expertise in either the problem domain or heuristic methods, and (ideally) it would be robust enough to effectively handle a range of problem instances from a variety of domains. The goal is to raise the level of generality of decision support methodology, perhaps at the expense of reduced, but still acceptable, solution quality when compared to tailor-made metaheuristic approaches.[7] In order to reduce the gap between tailor-made schemes and hyper-heuristic-based strategies, parallel hyper-heuristics have been proposed.[8]
Origins
The term "hyperheuristics" was coined in a 2000 publication by Cowling and Soubeiga, who used it to describe the idea of "heuristics to choose heuristics".[9] They used a "choice function" machine learning approach which trades off exploitation and exploration in choosing the next heuristic to use.[10] Subsequently, Cowling, Soubeiga, Kendall, Han, Ross and other authors investigated and extended this idea in areas such as evolutionary algorithms and pathological low-level heuristics. The first journal article to use the term appeared in 2003.[11]
The origin of the idea (although not the term) can be traced back to the early 1960s[12][13] and was independently rediscovered and extended several times during the 1990s.[14][15][16] In the domain of job shop scheduling, the pioneering work by Fisher and Thompson[12][13] hypothesized and demonstrated experimentally, using probabilistic learning, that combining scheduling rules (also known as priority or dispatching rules) was superior to any of the rules taken separately. Although the term was not then in use, this was the first "hyper-heuristic" paper.
Another root inspiring the concept of hyper-heuristics comes from the field of artificial intelligence. More specifically, it comes from work on automated planning systems, and its eventual focus on the problem of learning control knowledge. The so-called COMPOSER system, developed by Gratch et al.,[17][18] was used for controlling satellite communication schedules involving a number of earth-orbiting satellites and three ground stations. The system can be characterized as a hill-climbing search in the space of possible control strategies.
Classification of approaches
Hyper-heuristic approaches so far can be classified into two main categories. In the first class, captured by the phrase heuristics to choose heuristics,[9][10] the hyper-heuristic framework is provided with a set of pre-existing, generally widely known heuristics for solving the target problem. The task is to discover a good sequence of applications of these heuristics (also known as low-level heuristics within the domain of hyper-heuristics) for efficiently solving the problem. At each decision stage, a heuristic is selected by a component called the selection mechanism and applied to an incumbent solution. The new solution produced by applying the selected heuristic is accepted or rejected by another component called the acceptance criterion. A rejected solution is simply discarded, while an accepted solution replaces the incumbent.
In the second class, heuristics to generate heuristics, the key idea is to "evolve new heuristics by making use of the components of known heuristics."[19] The process requires, as in the first class of hyper-heuristics, the selection of a suitable set of heuristics known to be useful in solving the target problem. However, instead of supplying these directly to the framework, the heuristics are first decomposed into their basic components.
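The first class described above (a selection mechanism followed by an acceptance criterion) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the cited works: the toy objective `cost`, the three perturbative low-level heuristics, the uniformly random selection mechanism, and the "accept if not worse" acceptance criterion.

```python
import random
from typing import Callable, List

Solution = List[int]
Heuristic = Callable[[Solution, random.Random], Solution]


def cost(sol: Solution) -> int:
    """Toy objective (assumed): number of adjacent pairs that are out of order."""
    return sum(1 for a, b in zip(sol, sol[1:]) if a > b)


# --- hypothetical low-level perturbative heuristics --------------------------
def swap_adjacent(sol: Solution, rng: random.Random) -> Solution:
    i = rng.randrange(len(sol) - 1)
    new = sol[:]
    new[i], new[i + 1] = new[i + 1], new[i]
    return new


def swap_random(sol: Solution, rng: random.Random) -> Solution:
    i, j = rng.sample(range(len(sol)), 2)
    new = sol[:]
    new[i], new[j] = new[j], new[i]
    return new


def reverse_segment(sol: Solution, rng: random.Random) -> Solution:
    i, j = sorted(rng.sample(range(len(sol)), 2))
    return sol[:i] + sol[i:j + 1][::-1] + sol[j + 1:]


# --- high-level components ---------------------------------------------------
def select_random(heuristics: List[Heuristic], rng: random.Random) -> Heuristic:
    """Selection mechanism: pick a low-level heuristic uniformly at random."""
    return rng.choice(heuristics)


def accept_if_not_worse(old_cost: int, new_cost: int) -> bool:
    """Acceptance criterion: keep the candidate unless it is strictly worse."""
    return new_cost <= old_cost


def selection_hyper_heuristic(initial: Solution,
                              heuristics: List[Heuristic],
                              steps: int,
                              seed: int = 0) -> Solution:
    rng = random.Random(seed)
    incumbent = initial[:]
    incumbent_cost = cost(incumbent)
    for _ in range(steps):
        heuristic = select_random(heuristics, rng)    # choose a heuristic
        candidate = heuristic(incumbent, rng)         # apply it to the incumbent
        candidate_cost = cost(candidate)
        if accept_if_not_worse(incumbent_cost, candidate_cost):
            incumbent, incumbent_cost = candidate, candidate_cost  # replace
        # otherwise the candidate is simply discarded
    return incumbent
```

Calling `selection_hyper_heuristic([5, 3, 8, 1, 9, 2, 7, 4, 6, 0], [swap_adjacent, swap_random, reverse_segment], steps=500)` returns a permutation of the input whose cost is no higher than that of the starting solution. The high-level loop only sees heuristics and cost values, so either component can be swapped out without touching the problem-specific code.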
These two broad types can be further categorised according to whether they are based on constructive or perturbative search. An additional, orthogonal classification of hyper-heuristics considers the source providing feedback during the learning process, which can be either one instance (on-line learning) or many instances of the underlying problem studied (off-line learning).
Methodologies to choose heuristics
Discover good combinations of fixed, human-designed, well-known low-level heuristics.
Based on constructive heuristics
Based on perturbative heuristics
Methodologies to generate heuristics
Generate new heuristic methods using basic components of previously existing heuristic methods.
Based on basic components of constructive heuristics
Based on basic components of perturbative heuristics
On-line learning hyper-heuristics
The learning takes place while the algorithm is solving an instance of a problem; task-dependent local properties can therefore be used by the high-level strategy to determine the appropriate low-level heuristic to apply. Examples of on-line learning approaches within hyper-heuristics are the use of reinforcement learning for heuristic selection and, more generally, the use of metaheuristics as high-level search strategies over a search space of heuristics.
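As a rough illustration of on-line learning for heuristic selection, the sketch below keeps a score per low-level heuristic, rewards a heuristic when it improves the incumbent and penalises it otherwise, and selects epsilon-greedily on those scores. This simple additive scheme, the bit-string toy problem, the heuristic names and the `epsilon` parameter are all assumptions for illustration; it is not the method of any cited work.

```python
import random
from typing import Callable, Dict, List, Tuple

Bits = List[int]
Heuristic = Callable[[Bits, random.Random], Bits]


def cost(bits: Bits) -> int:
    """Toy objective (assumed): number of set bits, to be minimised."""
    return sum(bits)


def flip_random_bit(bits: Bits, rng: random.Random) -> Bits:
    new = bits[:]
    new[rng.randrange(len(new))] ^= 1
    return new


def clear_a_set_bit(bits: Bits, rng: random.Random) -> Bits:
    ones = [i for i, b in enumerate(bits) if b]
    new = bits[:]
    if ones:
        new[rng.choice(ones)] = 0
    return new


def set_a_clear_bit(bits: Bits, rng: random.Random) -> Bits:
    # Deliberately unhelpful heuristic: the learner should come to avoid it.
    zeros = [i for i, b in enumerate(bits) if not b]
    new = bits[:]
    if zeros:
        new[rng.choice(zeros)] = 1
    return new


def online_score_based_selection(initial: Bits,
                                 heuristics: Dict[str, Heuristic],
                                 steps: int,
                                 epsilon: float = 0.1,
                                 seed: int = 0) -> Tuple[Bits, Dict[str, int]]:
    rng = random.Random(seed)
    scores = {name: 0 for name in heuristics}   # learned while solving
    incumbent, incumbent_cost = initial[:], cost(initial)
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try any heuristic.
            name = rng.choice(list(heuristics))
        else:
            # Exploit: pick among the currently best-scored heuristics.
            best = max(scores.values())
            name = rng.choice([n for n, s in scores.items() if s == best])
        candidate = heuristics[name](incumbent, rng)
        candidate_cost = cost(candidate)
        if candidate_cost < incumbent_cost:
            scores[name] += 1                   # reward an improving move
            incumbent, incumbent_cost = candidate, candidate_cost
        else:
            scores[name] -= 1                   # penalise a non-improving move
    return incumbent, scores
```

The feedback comes only from the single instance being solved, which is what distinguishes this from the off-line setting. Because only strictly improving candidates are accepted here, the returned solution is never worse than the initial one.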
Off-line learning hyper-heuristics
The idea is to gather knowledge in the form of rules or programs from a set of training instances, in the hope that it will generalise to solving unseen instances. Examples of off-line learning approaches within hyper-heuristics are learning classifier systems, case-based reasoning and genetic programming.
An extended classification of selection hyper-heuristics was proposed in 2020,[20] giving a more comprehensive categorisation of contemporary selection hyper-heuristic methods.
Applications
Hyper-heuristics have been applied across many different problems. Indeed, one of the motivations of hyper-heuristics is to be able to operate across different problem types. The following list is a non-exhaustive selection of some of the problems and fields in which hyper-heuristics have been explored:
Hyper-heuristics are not the only approach being investigated in the quest for more general and applicable search methodologies. Many researchers from computer science, artificial intelligence and operational research have already acknowledged the need for developing automated systems to replace the role of a human expert in the process of tuning and adapting search methodologies. The following list outlines some related areas of research:
adaptation and self-adaptation of algorithm parameters
P. Ross, Hyper-heuristics, Search Methodologies: Introductory Tutorials in Optimization and Decision Support Techniques (E. K. Burke and G. Kendall, eds.), Springer, 2005, pp. 529-556.
E. Ozcan, B. Bilgin, E. E. Korkmaz, Hill Climbers and Mutational Heuristics in Hyperheuristics, Lecture Notes in Computer Science, Springer-Verlag, The 9th International Conference on Parallel Problem Solving From Nature, 2006, pp. 202-211.
Amaya, I., Ortiz-Bayliss, J.C., Rosales-Perez, A., Gutierrez-Rodriguez, A.E., Conant-Pablos, S.E., Terashima-Marin, H. and Coello, C.A.C., 2018. Enhancing Selection Hyper-Heuristics via Feature Transformations. IEEE Computational Intelligence Magazine, 13(2), pp. 30-41. https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=8335843
Amaya, I., Ortiz-Bayliss, J.C., Gutiérrez-Rodríguez, A.E., Terashima-Marín, H. and Coello, C.A.C., 2017, June. Improving hyper-heuristic performance through feature transformation. In 2017 IEEE Congress on Evolutionary Computation (CEC) (pp. 2614-2621). IEEE. https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=7969623
C. Segura, G. Miranda and C. León, Parallel hyperheuristics for the frequency assignment problem, Memetic Computing, Special issue on nature inspired cooperative strategies for optimization, 2010 (doi:10.1007/s12293-010-0044-5).
Cowling P. and Soubeiga E., Neighborhood Structures for Personnel Scheduling: A Summit Meeting Scheduling Problem (abstract), in proceedings of the 3rd International Conference on the Practice and Theory of Automated Timetabling, Burke E.K. and Erben W. (eds), 16-18 Aug 2000, Constance, Germany.
Cowling P., Kendall G. and Soubeiga E., A Hyperheuristic Approach to Scheduling a Sales Summit, Lecture Notes in Computer Science 2079, Springer-Verlag, pp. 176–190, 2001, ISBN 3540424210, (doi:10.1007/3-540-44629-X
H. Fisher and G. L. Thompson, Probabilistic learning combinations of local job-shop scheduling rules, Factory Scheduling Conference (Carnegie Institute of Technology), 1961.
H. Fisher and G. L. Thompson, Probabilistic learning combinations of local job-shop scheduling rules, Industrial Scheduling (New Jersey) (J. F. Muth and G. L. Thompson, eds.), Prentice-Hall, Inc, 1963, pp. 225–251.
Drake J. H., Kheiri A., Ozcan E., Burke E. K., 2020. Recent Advances in Selection Hyper-heuristics. European Journal of Operational Research, 285(2), pp. 405-428. (doi:10.1016/j.ejor.2019.07.073)