Developing value-aligned agents is a complex undertaking and an ongoing challenge in AI. In particular, designing Large Language Models (LLMs) that can balance multiple, possibly conflicting moral values according to context is a problem of paramount importance. In this paper, we propose a system that performs contextual value alignment through contextual aggregation of candidate responses. This aggregation is achieved by integrating the subset of possible LLM responses best suited to a user's input, taking into account features extracted about the user's moral preferences. The proposed system, trained on the Moral Integrity Corpus, displays better alignment with human values than state-of-the-art baselines.

Index Terms—value alignment, contextual alignment
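One way to picture the aggregation step described above is as preference-weighted selection over candidate responses. The sketch below is purely illustrative and not the paper's actual method: the value-feature axes, the cosine-similarity scoring, and the `contextual_aggregate` function are all assumptions introduced for this example.

```python
import math

def contextual_aggregate(candidates, pref_vector, top_k=2):
    """Rank candidate responses by how well their (hypothetical)
    moral-value feature vectors match the user's preference vector,
    using cosine similarity, and return the top_k responses."""
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb) if na and nb else 0.0

    scored = sorted(
        ((cosine(feats, pref_vector), text) for text, feats in candidates),
        key=lambda pair: pair[0],
        reverse=True,
    )
    return [text for _, text in scored[:top_k]]

# Toy example: candidate LLM responses with made-up value features
# along the (assumed) axes (care, fairness, loyalty).
candidates = [
    ("Prioritize honesty even if it hurts.", [0.2, 0.9, 0.1]),
    ("Protect the person's feelings first.", [0.9, 0.1, 0.3]),
    ("Defer to the group's consensus.",      [0.1, 0.2, 0.9]),
]
user_prefs = [0.8, 0.3, 0.2]  # a user who weights "care" highly
print(contextual_aggregate(candidates, user_prefs, top_k=1))
# → ['Protect the person's feelings first.']
```

In practice, the feature extraction and the integration of the selected subset would be learned from data such as the Moral Integrity Corpus rather than hand-specified as here.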