Defining Metrics for Evaluating General Artificial Intelligence Across Diverse Problem Domains
Keywords:
General Artificial Intelligence, Evaluation Metrics, Adaptability, Benchmarking, Domain-Agnostic
Abstract
Defining robust and universal metrics for evaluating General Artificial Intelligence (GAI) is essential for its development and implementation across diverse problem domains. This paper explores the theoretical and practical aspects of GAI evaluation, focusing on frameworks that assess adaptability, problem-solving capability, and generalization. Existing benchmarks often emphasize narrow tasks, failing to capture the broader spectrum of intelligence characteristics. By integrating insights from psychology, neuroscience, and computational theory, this work proposes a multidimensional evaluation model incorporating metrics such as knowledge transferability, reasoning depth, and robustness against novel challenges. The study also discusses the importance of domain-agnostic evaluation standards to ensure fairness and comprehensiveness.