Interface AnswerEvaluator

  • All Known Implementing Classes:
    DoubleAnswerEvaluator, ExactAnswerEvaluator, IntegerAnswerEvaluator, MCQEvaluator, WordAnswerEvaluator

    public interface AnswerEvaluator
    Core object of the answer evaluation pipeline. It encapsulates all three steps of the evaluation process, which determines how accurate a given answer is. Unlike score evaluation, which can vary with external factors such as time or round ranking, answer accuracy is absolute: regardless of context, if the same answer is submitted twice to the same evaluator, the two computed accuracies should be identical.

    The three steps of the pipeline are:
    1. Answer processing
      All answers should first be "reformulated" by the evaluator. For instance, if only two decimal places of precision are required, all double answers should be rounded before being verified; otherwise 4.291 would be considered "wrong" when the correct answer is 4.29. Another example is true-false style questions: for T or Yes to be interpreted as True, the answer must be processed. In that particular case it would be possible to skip processing and instead keep a list of all accepted answers, but the processing step allows more flexibility. By default, answers are simply 'cleaned' (trailing spaces and double spaces removed).
      Step represented by AnswerProcessor
    2. Validity check
      Once the answer has been processed, the evaluator needs to ensure that it is 'valid'. In this context, valid and correct are very different terms: an answer is valid if its format is correct. For instance, assuming 5.78 is the correct answer, 90.1 would be a valid (yet incorrect) answer, whereas hello would be invalid. Another example is an MCQ (multiple-choice question): assuming there are 4 possible answers (A, B, C, D), any answer other than these four would be invalid.
      The idea behind this step is that some rounds may behave differently depending on whether the player's answer is incorrect or invalid. Typically, so-called 'soft' rounds do not count an invalid guess as wrong, allowing the player to retry.
      Step represented by ValidityEvaluator
    3. Answer evaluation
      Now that the answer has been processed and is known to be valid, the last step of the pipeline takes care of evaluating the accuracy (between 0.0 and 1.0) of the answer. Several implementations are available, such as ExactAnswerEvaluator, WordAnswerEvaluator, or DoubleAnswerEvaluator. For double-style answer evaluation, there are multiple implementations of loss functions, such as BinaryLossFunction or LinearLossFunction.
      Step represented by CorrectnessEvaluator
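    To illustrate how a loss function can turn a numeric guess into an accuracy between 0.0 and 1.0, here is a minimal sketch. It is not the API's BinaryLossFunction or LinearLossFunction: the class LossSketch, its method names, and the tolerance and maxError parameters are all hypothetical, chosen only to show an all-or-nothing rule next to a linearly decaying one.

```java
// Hypothetical sketch, NOT the library's BinaryLossFunction / LinearLossFunction.
final class LossSketch {
    private LossSketch() {}

    /** All-or-nothing: 1.0 when the guess is within tolerance of the target, else 0.0. */
    static double binaryAccuracy(double guess, double target, double tolerance) {
        return Math.abs(guess - target) <= tolerance ? 1.0 : 0.0;
    }

    /** Linear decay: 1.0 at the target, falling to 0.0 once the error reaches maxError. */
    static double linearAccuracy(double guess, double target, double maxError) {
        if (maxError <= 0) {
            throw new IllegalArgumentException("maxError must be positive");
        }
        double error = Math.abs(guess - target);
        // Clamp at 0.0 so the accuracy never leaves the [0.0, 1.0] range.
        return Math.max(0.0, 1.0 - error / maxError);
    }
}
```

    Under these assumptions, a guess of 6.0 for a target of 5.0 with a maxError of 2.0 gets a linear accuracy of 0.5, while the binary rule with a tolerance of 0.5 gives 0.0.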
    Note that most evaluators provided by the API should be created from their respective factory, and not directly.
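    The three steps described above can be sketched end to end for a multiple-choice question. This is only an illustration under stated assumptions: the class PipelineSketch and its methods are hypothetical stand-ins and do not reproduce the signatures of AnswerProcessor, ValidityEvaluator, or CorrectnessEvaluator, which are not shown on this page. Upper-casing the answer and using an empty OptionalDouble to signal an invalid answer are also choices made for this sketch.

```java
import java.util.Locale;
import java.util.OptionalDouble;
import java.util.Set;

// Hypothetical stand-in for the process -> validate -> evaluate pipeline.
final class PipelineSketch {
    private PipelineSketch() {}

    /** Step 1: clean the raw answer (trim, collapse whitespace) and normalise case. */
    static String process(String raw) {
        return raw.trim().replaceAll("\\s+", " ").toUpperCase(Locale.ROOT);
    }

    /** Step 2: an MCQ answer is valid only if it names one of the offered choices. */
    static boolean isValid(String processed, Set<String> choices) {
        return choices.contains(processed);
    }

    /** Step 3: exact match gives full accuracy, anything else gives none. */
    static double accuracy(String processed, String correct) {
        return processed.equals(correct) ? 1.0 : 0.0;
    }

    /** Runs all three steps; an empty result means "invalid", not "wrong". */
    static OptionalDouble evaluate(String raw, Set<String> choices, String correct) {
        String processed = process(raw);
        if (!isValid(processed, choices)) {
            return OptionalDouble.empty();
        }
        return OptionalDouble.of(accuracy(processed, correct));
    }
}
```

    Keeping "invalid" distinct from an accuracy of 0.0 is what lets a 'soft' round offer a retry for hello while still counting A as a wrong guess when the correct answer is B.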
    See Also:
    AnswerProcessor, ValidityEvaluator, CorrectnessEvaluator
    • Method Detail

      • getCorrectnessEvaluator

        CorrectnessEvaluator getCorrectnessEvaluator()
        Retrieve the third step of the process, used to evaluate the accuracy of the processed and valid answer.
        Returns:
        the CorrectnessEvaluator object
      • getValidityEvaluator

        ValidityEvaluator getValidityEvaluator()
        Retrieve the second step of the process, used to verify that a processed answer is valid.
        Returns:
        the ValidityEvaluator object
      • getProcessor

        default AnswerProcessor getProcessor()
        Retrieve the first step of the process, used to process a raw answer.
        Returns:
        the AnswerProcessor object, a CleanStringProcessor by default
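        Because getProcessor() is a default method, an implementation only needs to override it when plain cleaning is not enough. The sketch below shows that pattern with hypothetical types: EvaluatorSketch, DEFAULT_CLEANER, and TrueFalseSketch are invented for illustration, a UnaryOperator<String> stands in for AnswerProcessor, and the cleaning rule is an assumption about what CleanStringProcessor does based on the description above.

```java
import java.util.Locale;
import java.util.function.UnaryOperator;

// Hypothetical interface mirroring the "default processor" pattern.
interface EvaluatorSketch {
    // Assumed cleaning rule: trim the ends, collapse runs of spaces into one.
    UnaryOperator<String> DEFAULT_CLEANER = raw -> raw.trim().replaceAll(" {2,}", " ");

    /** First pipeline step; implementations inherit plain cleaning unless they override. */
    default UnaryOperator<String> getProcessor() {
        return DEFAULT_CLEANER;
    }
}

// Hypothetical true/false evaluator that maps T / Yes / True onto one canonical form.
final class TrueFalseSketch implements EvaluatorSketch {
    @Override
    public UnaryOperator<String> getProcessor() {
        return raw -> {
            String cleaned = DEFAULT_CLEANER.apply(raw).toLowerCase(Locale.ROOT);
            switch (cleaned) {
                case "t": case "true": case "yes":
                    return "True";
                case "f": case "false": case "no":
                    return "False";
                default:
                    return cleaned; // left as-is for the validity check to reject
            }
        };
    }
}
```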