Evaluating Front-end & Back-end of Human Automation Interaction Applications \Delta-EVAL A Hypothetical Benchmark

Source: arXiv
Abstract

Human Factors, Cognitive Engineering, and Human-Automation Interaction (HAI) form a trifecta in which users and technological systems of ever-increasing autonomous control occupy a centre position. But with great autonomy comes great responsibility. It is in this context that we propose metrics and a benchmark framework based on known regimes in Artificial Intelligence (AI). A benchmark is a set of tests or tasks, together with the metrics or measurements conducted on them. We hypothesise about possible tasks designed to assess operator-system interactions and both the front-end and back-end components of HAI applications. Here, the front-end pertains to the user interface and the direct interactions the user has with a system, while the back-end is composed of the underlying processes and mechanisms that support the front-end experience. By evaluating HAI systems through the proposed metrics, which are based on Cognitive Engineering studies of judgment and prediction, we attempt to unify many known taxonomies and design guidelines for HAI systems in a benchmark. This is facilitated by a structured, formal approach to quantifying the efficacy and reliability of these systems, inspired by the recent fast developments in AI benchmarking techniques. We thus attempt to guide design principles towards a testable benchmark that yields reproducible results and is future-proof, general, and insightful about both the cognitive and technological stacks of any HAI application.

Gonçalo Hora de Carvalho

Subjects: Fundamental Theory of Automation; Computing Technology, Computer Technology

Gonçalo Hora de Carvalho. Evaluating Front-end & Back-end of Human Automation Interaction Applications \Delta-EVAL A Hypothetical Benchmark [EB/OL]. (2024-07-12) [2025-05-02]. https://arxiv.org/abs/2407.18953.
