Histoires Morales:
A French Dataset for Assessing Moral Alignment

1 Laboratoire Hubert Curien, CNRS, Saint-Étienne, France

2 Université Lumière Lyon 2 · Université Claude Bernard Lyon 1 · ERIC

3 Télécom Paris, Institut Polytechnique de Paris


Lead authors with equal contributions

* Correspondence to Irina Proskurina and Thibaud Leteno

Overview

We introduce HistoiresMorales, the first corpus for situated social reasoning in French, adapted from the English MoralStories dataset. The dataset is designed to support the study of moral alignment of large language models in a multilingual setting.

Our main contributions are the following:

  • We introduce HistoiresMorales, a dataset of 12,000 short narratives describing moral norms, situations, intentions, moral and immoral actions, and their consequences in French. The dataset remains parallel to its English counterpart, enabling controlled cross-lingual comparisons.
  • We propose a multi-step translation pipeline based on error-explanation prompts, manual annotations, and human feedback, designed to ensure grammatical fluency and culturally appropriate translations.
  • We assess the quality and cultural alignment of the dataset through validation by native French speakers, showing that the norms and actions are generally aligned with moral values commonly shared in France.
  • We compare LLMs’ moral alignment with human norms using sentence likelihood and declarative classification of moral actions, in both French and English.
  • Finally, we investigate the robustness of multilingual moral alignment by shifting model preferences toward moral or immoral actions using Direct Preference Optimization.

Our results show that LLMs tend to align more strongly with moral norms in English than in French, and that this alignment exhibits low robustness under preference optimization, highlighting the need for further research on multilingual moral alignment.


The HistoiresMorales Dataset

HistoiresMorales consists of 12,000 stories in French. Each story comprises a moral norm, a situation, an intention, two actions (one that follows the norm and one that deviates from it), and the consequences of these actions.

HistoiresMorales is adapted to French from the widely used MoralStories dataset. We first translate MoralStories, then refine the output through a multi-step pipeline that combines error-explanation prompts, manual annotations, and human feedback.

HistoiresMorales and MoralStories consist of short narratives that describe moral and deviant behaviour in social situations centred around personal relationships, education, commerce, domestic affairs, and meals.

Because HistoiresMorales remains parallel to its English counterpart, the two datasets can be used together for controlled comparisons of moral alignment across languages, as well as for studies of robustness under preference optimization.

Dataset Structure

  • Moral norm: a guideline for expected social behaviour
  • Situation: a description of the social situation and its participants
  • Intention: the actor's intention
  • Moral action and its consequence
  • Immoral action and its consequence
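As a rough illustration of this structure, one story could be represented by a small typed record. This is only a sketch: the field names `norm`, `situation`, `intention`, `moral_action`, and `immoral_action` follow the column names used in the DPO example on this page, while `moral_consequence` and `immoral_consequence` are assumed names for the two consequence fields.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class MoralStory:
    """One story. The two *_consequence field names are assumptions."""
    norm: str
    situation: str
    intention: str
    moral_action: str
    moral_consequence: str
    immoral_action: str
    immoral_consequence: str

    def context(self) -> str:
        """Norm, situation, and intention joined into one context string."""
        return f"{self.norm} {self.situation} {self.intention}"


def story_from_row(row: dict) -> MoralStory:
    """Build a MoralStory from a dict-like row, ignoring any extra columns."""
    return MoralStory(**{f.name: row[f.name] for f in fields(MoralStory)})
```

A row loaded with 🤗 datasets behaves like a dict, so `story_from_row(fr[0])` should work as long as the column names match the assumptions above.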

Dataset Usage

Load HistoiresMorales (French) and its English counterpart MoralStories with 🤗 datasets:

from datasets import load_dataset

# Load datasets
fr = load_dataset("LabHC/histoires_morales", split="train")
en = load_dataset("LabHC/moral_stories", split="train")
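Since the two datasets are parallel, French and English stories can be matched one-to-one. The helper below is a minimal sketch that assumes both datasets expose a shared identifier column, here called `ID`. That column name is an assumption, so adjust `key` to whatever the loaded data actually provides.

```python
def pair_by_id(fr_rows, en_rows, key="ID"):
    """Return (fr_row, en_row) pairs for every identifier found in both inputs.

    `key` is an assumed name for the shared identifier column.
    """
    en_by_id = {row[key]: row for row in en_rows}
    return [(row, en_by_id[row[key]]) for row in fr_rows if row[key] in en_by_id]
```

With the datasets loaded as above, `pairs = pair_by_id(fr, en)` would give a list of aligned rows, which is the starting point for any cross-lingual comparison.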

Cultural Value Alignment with French Annotators

Figure: Alignment of moral norms and actions in HistoiresMorales as evaluated by native French speakers. Categories: norms, moral actions, immoral actions. Ratings: agreement, uncertainty, disagreement.

Experiments with HistoiresMorales

The HistoiresMorales dataset is designed to support controlled experiments on moral alignment in large language models. Because the dataset is parallel with its English counterpart, it enables direct and controlled comparisons of moral reasoning across French and English.

In our work, we demonstrate three main experimental uses of the dataset:

  • Likelihood-based evaluation
  • Action selection via prompting
  • Robustness analysis with preference optimization
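The first of these uses, likelihood-based evaluation, can be sketched as follows. The idea is to check, story by story, whether a model scores the moral action as more likely than the immoral one. The function is hypothetical and takes the scorer as a parameter, so it says nothing about how likelihoods are computed or normalised in our experiments. In practice `log_likelihood` would wrap a language model.

```python
from typing import Callable, Iterable, Mapping


def moral_preference_rate(
    stories: Iterable[Mapping[str, str]],
    log_likelihood: Callable[[str, str], float],
) -> float:
    """Fraction of stories where the moral action is scored as more likely.

    `log_likelihood(context, action)` is a user-supplied scorer (an assumption
    of this sketch), for example the log-probability a model assigns to
    `action` given `context`.
    """
    wins = total = 0
    for story in stories:
        context = f"{story['norm']} {story['situation']} {story['intention']}"
        moral = log_likelihood(context, story["moral_action"])
        immoral = log_likelihood(context, story["immoral_action"])
        wins += moral > immoral
        total += 1
    if total == 0:
        raise ValueError("no stories to evaluate")
    return wins / total
```

Running the same function on the French and English versions of the same stories gives two rates that can be compared directly.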

Below, we give a minimal example of Direct Preference Optimization (DPO), used to shift a model’s preference toward moral actions.

Example: Direct Preference Optimization (DPO)

We construct preference pairs such that:

  • chosen = moral action
  • rejected = immoral action
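The pairing can be written as a small helper. The sketch below is hypothetical and adds a `prefer` switch, so the same code can also build pairs that push a model toward immoral actions, which is the other direction studied in the robustness analysis. The prompt format (norm, situation, and intention joined by spaces) is one simple choice among many.

```python
def to_preference_pair(story, prefer="moral"):
    """Turn one story into a DPO-style record with prompt, chosen, rejected.

    prefer="moral" makes the moral action the chosen completion;
    prefer="immoral" swaps the two roles.
    """
    if prefer not in ("moral", "immoral"):
        raise ValueError("prefer must be 'moral' or 'immoral'")
    other = "immoral" if prefer == "moral" else "moral"
    return {
        "prompt": f"{story['norm']} {story['situation']} {story['intention']}",
        "chosen": story[f"{prefer}_action"],
        "rejected": story[f"{other}_action"],
    }
```

Applied to a loaded dataset, this could be used as `dataset.map(to_preference_pair)` for the moral direction, or `dataset.map(lambda x: to_preference_pair(x, prefer="immoral"))` for the opposite one.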

The following snippet shows a minimal DPO training setup on a 100-example subset of HistoiresMorales.

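# !pip install -U trl unsloth datasets transformers

# Import unsloth first so that its patches apply before trl is loaded.
from unsloth import FastLanguageModel

import random, torch
from datasets import load_dataset
from trl import DPOTrainer, DPOConfig

# Fix seeds for reproducibility.
random.seed(0)
torch.manual_seed(0)

dataset = load_dataset("LabHC/histoires_morales", split="train")

# Keep a 100-example subset, then build the preference pairs.
dataset = dataset.shuffle(seed=0).select(range(100))
dataset = dataset.map(
    lambda x: {
        "prompt": f"{x['norm']} {x['situation']} {x['intention']}",
        "chosen": x["moral_action"],
        "rejected": x["immoral_action"],
    },
    remove_columns=dataset.column_names,  # keep only prompt/chosen/rejected
)

model, tokenizer = FastLanguageModel.from_pretrained(
    "mistralai/Mistral-7B-v0.1",
    load_in_4bit=True,
    max_seq_length=2048,
)

# A 4-bit model cannot be fine-tuned directly, so attach LoRA adapters.
# The rank and target modules here are illustrative values.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)

trainer = DPOTrainer(
    model=model,
    ref_model=None,  # with adapters, the frozen base model serves as reference
    args=DPOConfig(output_dir="./dpo", beta=0.1),
    train_dataset=dataset,
    processing_class=tokenizer,  # older TRL releases call this argument `tokenizer`
)

trainer.train()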

In our experiments, fewer than 100 preference examples are enough to noticeably shift a model’s moral preferences, which points to the limited robustness of moral alignment under preference optimization.
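For readers unfamiliar with DPO, the quantity that `beta=0.1` scales in the training setup above can be written out for a single preference pair. The sketch below implements the standard DPO loss from sequence log-probabilities under the trained policy and the frozen reference model. It is a plain-Python illustration with hypothetical argument names, not the code that TRL runs internally.

```python
import math


def dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO loss for one preference pair, given sequence log-probabilities.

    loss = -log(sigmoid(beta * margin)), where margin measures how much more
    the policy favours the chosen completion than the reference model does.
    """
    margin = (policy_chosen - ref_chosen) - (policy_rejected - ref_rejected)
    z = beta * margin
    # -log(sigmoid(z)) equals softplus(-z), computed here in a stable form.
    return max(-z, 0.0) + math.log1p(math.exp(-abs(z)))
```

When the policy and the reference agree, the margin is zero and the loss equals log 2. The loss falls as the policy raises the chosen action relative to the rejected one, and a larger `beta` makes it more sensitive to a given margin.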

BibTeX

@inproceedings{leteno-etal-2025-histoiresmorales,
    title = "{HISTOIRESMORALES}: A {F}rench Dataset for Assessing Moral Alignment",
    author = "Leteno, Thibaud  and
      Proskurina, Irina  and
      Gourru, Antoine  and
      Velcin, Julien  and
      Laclau, Charlotte  and
      Metzler, Guillaume  and
      Gravier, Christophe",
    editor = "Chiruzzo, Luis  and
      Ritter, Alan  and
      Wang, Lu",
    booktitle = "Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)",
    month = apr,
    year = "2025",
    address = "Albuquerque, New Mexico",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.naacl-long.131/",
    doi = "10.18653/v1/2025.naacl-long.131",
    pages = "2590--2612",
    ISBN = "979-8-89176-189-6"}