Each task in IHEval has three different settings:
- Aligned: All low-priority inputs align with the highest-priority instruction
- Conflict: Low-priority inputs contains some instructions that are incompatible with the highest-priority instruction.
- Reference: A model's response to hierarchical instructions is affected by both its original task performance and its ability to follow the instruction hierarchy (IH-following). To disentangle these two factors, we add a reference setting that tests the original task performance by merging all hierarchical instructions from the aligned setting into a single user message.