Coding Agent Misalignment

Replication package for the paper "How Coding Agents Fail Their Users: A Large-Scale Analysis of Developer-Agent Misalignment in 20,574 Real-World Sessions". Read the full paper on arXiv.

Raw chat traces are not redistributed for copyright reasons. Only episodes from repositories whose licenses explicitly permit redistribution (e.g., MIT, Apache-2.0) are released; episodes from non-permissively licensed repositories are used only in aggregate analysis.

An interactive viewer for the identified misalignment cases and their annotation labels is available at the Coding Agent Misalignment Atlas. The viewer includes only cases from permissively licensed repositories, with all personally identifiable information removed.

Modules

session-formatting/: preprocessing step. Formats parsed sessions into LLM-ready text files for extraction.
batch-runner/: reusable OpenAI Batch toolkit (build, submit, check, download, retry, postprocess).
misalignment-extraction/: extracts candidate misalignment episodes from formatted sessions.
misalignment-validation/: validates extracted episodes and filters unsupported cases.
misalignment-annotation/: multi-axial annotation of validated episodes.
data-aggregation/: aggregates intermediate outputs into downstream analysis tables; see data specs in this folder.
distribution-analysis/: notebooks and utilities for paper figures/tables and analysis-ready outputs.
misalignment-viewer/: static viewer for browsing the misalignment corpus.
workspace/: expected data layout (not distributed here) for repository/session-level inputs.
misalignments.json: aggregated list of all identified misalignment episodes with metadata and annotations, filtered to include only those from permissively licensed repositories.
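
As a starting point for working with misalignments.json, the sketch below loads the file and counts which top-level keys appear across episodes. It assumes only what the description above states, namely that the file holds a list of episodes; the top-level JSON array shape is an assumption, the field names are not documented here and are discovered rather than hard-coded, and the helper names `load_episodes` and `key_frequencies` are hypothetical.

```python
import json
from collections import Counter
from pathlib import Path


def load_episodes(path):
    """Load the episode list; assumes the file is a top-level JSON array."""
    with Path(path).open(encoding="utf-8") as f:
        data = json.load(f)
    if not isinstance(data, list):
        raise ValueError(f"expected a JSON array, got {type(data).__name__}")
    return data


def key_frequencies(episodes):
    """Count how often each top-level key occurs, to inspect the schema empirically."""
    counts = Counter()
    for episode in episodes:
        if isinstance(episode, dict):
            counts.update(episode.keys())
    return counts


if __name__ == "__main__":
    episodes = load_episodes("misalignments.json")
    print(f"{len(episodes)} episodes")
    for key, n in key_frequencies(episodes).most_common():
        print(f"{key}: {n}")
```

Inspecting key frequencies first avoids guessing at field names before consulting the data specs in data-aggregation/.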

Minimal Pipeline Order

1. Session preprocessing: session-formatting/
2. Extraction: misalignment-extraction/
3. Validation: misalignment-validation/
4. Annotation: misalignment-annotation/
5. Aggregation: data-aggregation/
6. Distribution analysis: distribution-analysis/
7. Misalignment viewer: misalignment-viewer/
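
The order above can be written down as data and used to check a checkout before running anything. This is a minimal sketch, not part of the package: the stage names and folder names are copied from the list above, while `PIPELINE` and `missing_stage_dirs` are hypothetical names, and the sketch does not know how each stage is actually invoked.

```python
from pathlib import Path

# Stage label and folder, in the execution order listed in this README.
PIPELINE = [
    ("Session preprocessing", "session-formatting"),
    ("Extraction", "misalignment-extraction"),
    ("Validation", "misalignment-validation"),
    ("Annotation", "misalignment-annotation"),
    ("Aggregation", "data-aggregation"),
    ("Distribution analysis", "distribution-analysis"),
    ("Misalignment viewer", "misalignment-viewer"),
]


def missing_stage_dirs(repo_root):
    """Return the stage folders that are absent under repo_root, in pipeline order."""
    root = Path(repo_root)
    return [folder for _, folder in PIPELINE if not (root / folder).is_dir()]


if __name__ == "__main__":
    missing = missing_stage_dirs(".")
    if missing:
        print("Missing stage folders:", ", ".join(missing))
    else:
        for step, (label, folder) in enumerate(PIPELINE, start=1):
            print(f"{step}. {label}: {folder}/")
```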

Data Layout (Expected)

Typical structure under workspace/:

workspace/
└── {repo_id}/
    ├── session_parsed/    # Per-session parsed chat records
    │   ├── session_001.json
    │   └── ...
    ├── session_formatted/ # Per-session formatted chat records for LLM analysis
    │   ├── session_001.txt
    │   └── ...
    └── meta.json          # Repository metadata (e.g., name, language, session count)
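
To make the layout concrete, the following sketch walks a workspace/ directory shaped like the tree above and reports, per repository, which parsed sessions do not yet have a formatted counterpart. The directory and file names come from the tree; the function names `iter_repos` and `unformatted_sessions` are hypothetical, and the sketch assumes that a parsed and a formatted session share the same file stem (as `session_001.json` and `session_001.txt` do in the example).

```python
import json
from pathlib import Path


def iter_repos(workspace):
    """Yield (repo_dir, meta) for each repository folder that contains meta.json."""
    for repo_dir in sorted(p for p in Path(workspace).iterdir() if p.is_dir()):
        meta_path = repo_dir / "meta.json"
        if not meta_path.is_file():
            continue  # skip folders that do not follow the expected layout
        with meta_path.open(encoding="utf-8") as f:
            yield repo_dir, json.load(f)


def unformatted_sessions(repo_dir):
    """Return stems of parsed sessions with no matching formatted .txt file."""
    repo_dir = Path(repo_dir)
    parsed = {p.stem for p in (repo_dir / "session_parsed").glob("*.json")}
    formatted = {p.stem for p in (repo_dir / "session_formatted").glob("*.txt")}
    return sorted(parsed - formatted)


if __name__ == "__main__":
    for repo_dir, meta in iter_repos("workspace"):
        pending = unformatted_sessions(repo_dir)
        print(f"{repo_dir.name}: {len(pending)} session(s) still to format")
```

A check like this is useful between session preprocessing and extraction, since extraction reads the formatted text files rather than the parsed records.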

Citation

@article{tang2026coding,
  title={How Coding Agents Fail Their Users: A Large-Scale Analysis of Developer-Agent Misalignment in 20,574 Real-World Sessions},
  author={Tang, Ningzhi and Chen, Chaoran and Xu, Gelei and Shi, Yiyu and Huang, Yu and McMillan, Collin and Dong, Tao and Li, Toby Jia-Jun},
  journal={arXiv preprint arXiv:2605.29442},
  year={2026}
}
