Parse radiology free-text reports into structured data. No ML. No GPU. No dependencies.
Radiology reports come out as free-text PDFs. Downstream systems — EMRs, telehealth portals, billing platforms, research pipelines — need structured data. This library bridges that gap.
Three things it does well:
- Parse — splits any free-text report into labeled sections, extracts measurements, links findings to anatomy
- Detect — flags critical/urgent findings with negation awareness (no false alerts for "no pneumothorax")
- Export — outputs FHIR R4 DiagnosticReport resources ready for any EMR
pip install radreport-parser

Zero required dependencies. Works on Python 3.9+.
from radreport_parser import ReportParser, CriticalFindingsDetector, FHIRExporter
import json
report_text = """
INDICATION: Chest pain, rule out PE.
FINDINGS:
Lungs: Filling defect in the right main pulmonary artery consistent with
pulmonary embolism. No pneumothorax.
IMPRESSION:
Pulmonary embolism, right main pulmonary artery. Urgent correlation recommended.
"""
# 1. Parse
parser = ReportParser()
report = parser.parse(report_text, modality="CT")
print(report.impression)
# → "Pulmonary embolism, right main pulmonary artery. Urgent correlation recommended."
# 2. Detect critical findings
detector = CriticalFindingsDetector()
report = detector.detect(report)
for cf in report.critical_findings:
    if not cf.negated:
        print(f"[{cf.severity.upper()}] {cf.term} ({cf.category})")
        print(f" Context: {cf.context}")
# → [CRITICAL] pulmonary embolism (pulmonary)
# Context: Filling defect in the right main pulmonary artery consistent with pulmonary embolism.
# 3. Export to FHIR
exporter = FHIRExporter()
fhir = exporter.export(report, patient_id="pt-001")
print(json.dumps(fhir, indent=2))

After installation, the `radreport` command is available for single-file and batch processing:
# Parse a single report to JSON
radreport report.txt
# Parse with critical findings detection
radreport report.txt --critical
# Export as FHIR DiagnosticReport
radreport report.txt --fhir --patient-id pt-001 --modality CT
# Batch process multiple files → JSON array
radreport reports/*.txt --critical -o batch.json
# Specify modality for all files
radreport *.txt --modality MRI --fhir -o fhir_batch.json

Flags:

| Flag | Short | Description |
|---|---|---|
| `--modality MOD` | `-m` | CT, MRI, XR, US, NM, PET … |
| `--critical` | `-c` | Run critical findings detection |
| `--fhir` | `-f` | Export as FHIR R4 DiagnosticReport (implies `--critical`) |
| `--patient-id ID` | | FHIR Patient resource ID |
| `--output FILE` | `-o` | Write output to file instead of stdout |
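
For readers who want to mirror these flags in their own wrapper script, the table above maps naturally onto `argparse`. The sketch below is an assumption about how such a command line could be wired, not the library's actual source; `build_arg_parser` and `parse_cli` are hypothetical names.

```python
import argparse


def build_arg_parser() -> argparse.ArgumentParser:
    """Hypothetical reconstruction of the flag set listed in the table."""
    p = argparse.ArgumentParser(prog="radreport")
    p.add_argument("files", nargs="+", help="one or more report text files")
    p.add_argument("-m", "--modality", metavar="MOD", help="CT, MRI, XR, US, NM, PET ...")
    p.add_argument("-c", "--critical", action="store_true", help="run critical findings detection")
    p.add_argument("-f", "--fhir", action="store_true", help="export as FHIR R4 DiagnosticReport")
    p.add_argument("--patient-id", metavar="ID", help="FHIR Patient resource ID")
    p.add_argument("-o", "--output", metavar="FILE", help="write output to FILE instead of stdout")
    return p


def parse_cli(argv: list[str]) -> argparse.Namespace:
    args = build_arg_parser().parse_args(argv)
    # The table states that --fhir implies --critical.
    if args.fhir:
        args.critical = True
    return args


args = parse_cli(["report.txt", "--fhir", "--patient-id", "pt-001", "-m", "CT"])
print(args.critical, args.modality, args.patient_id)
# → True CT pt-001
```

Applying the `--fhir` implication after parsing keeps the rule in one visible place, which suits a tool that is meant to be auditable.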
The parser recognizes standard radiology report sections regardless of formatting style:
| Section key | Matched headers |
|---|---|
| `indication` | Indication, Clinical Indication, History, Reason for Exam |
| `technique` | Technique, Procedure, Protocol |
| `comparison` | Comparison, Prior Study, Previous |
| `findings` | Findings, Observations |
| `impression` | Impression, Conclusion, Assessment, Diagnosis |
| `recommendation` | Recommendation, Follow-up, Advised |
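
As a rough illustration of how header aliases like those in the table could be mapped to section keys, here is a minimal regex-based sketch. It is not the library's implementation: `SECTION_ALIASES`, `HEADER_RE`, and `split_sections` are hypothetical names, and the sketch assumes each header starts a line and ends with a colon, which is narrower than what the real parser claims to handle.

```python
import re

# Header aliases copied from the table; the matching strategy itself is an assumption.
SECTION_ALIASES = {
    "indication": ["Clinical Indication", "Indication", "History", "Reason for Exam"],
    "technique": ["Technique", "Procedure", "Protocol"],
    "comparison": ["Comparison", "Prior Study", "Previous"],
    "findings": ["Findings", "Observations"],
    "impression": ["Impression", "Conclusion", "Assessment", "Diagnosis"],
    "recommendation": ["Recommendation", "Follow-up", "Advised"],
}

ALIAS_TO_KEY = {
    alias.lower(): key for key, aliases in SECTION_ALIASES.items() for alias in aliases
}

# Longest aliases first so multi-word headers are tried before their shorter variants.
_alternatives = "|".join(
    re.escape(a) for a in sorted(ALIAS_TO_KEY, key=len, reverse=True)
)
HEADER_RE = re.compile(rf"^\s*({_alternatives})\s*:\s*", re.IGNORECASE | re.MULTILINE)


def split_sections(text: str) -> dict[str, str]:
    """Return {section_key: body} for every recognized header in the text."""
    matches = list(HEADER_RE.finditer(text))
    sections = {}
    for i, m in enumerate(matches):
        # A section body runs until the next recognized header or the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        sections[ALIAS_TO_KEY[m.group(1).lower()]] = text[m.end():end].strip()
    return sections


sample = """
INDICATION: Chest pain, rule out PE.
FINDINGS:
Lungs: Filling defect in the right main pulmonary artery consistent with
pulmonary embolism. No pneumothorax.
IMPRESSION:
Pulmonary embolism, right main pulmonary artery. Urgent correlation recommended.
"""
sections = split_sections(sample)
print(list(sections))
# → ['indication', 'findings', 'impression']
```

Note that `Lungs:` is left inside the findings body because it is not one of the listed aliases.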
report = parser.parse(text, modality="MRI")
findings = report.get_section("findings")
print(findings.raw_text)
impression = report.get_section("impression")
print(impression.raw_text)

All measurements are extracted and normalized to millimeters:

for m in report.all_measurements:
    print(f" Raw: {m.raw}")
    print(f" Normalized (mm): {m.dimensions_mm}")
    print(f" Largest dimension: {m.largest_dimension_mm} mm")
# Raw: 2.3 x 1.8 cm
# Normalized (mm): [23.0, 18.0]
# Largest dimension: 23.0 mm

Handles: `1.2 x 0.8 cm`, `12mm`, `1.2cm`, `12 x 8 x 5 mm`, `1.2 x 0.8 x 0.5 cm`
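
To make the normalization concrete, the following sketch shows one way the formats listed above could be matched and converted to millimeters. It is an illustrative assumption rather than the library's code: `MEASUREMENT_RE`, `Measurement`, and `extract_measurements` are hypothetical names, and only `mm` and `cm` units with up to three dimensions are covered.

```python
import re
from dataclasses import dataclass

UNIT_TO_MM = {"mm": 1.0, "cm": 10.0}

# One to three numbers separated by "x", followed by a single trailing unit.
MEASUREMENT_RE = re.compile(
    r"(\d+(?:\.\d+)?(?:\s*x\s*\d+(?:\.\d+)?){0,2})\s*(mm|cm)\b",
    re.IGNORECASE,
)


@dataclass
class Measurement:
    raw: str
    dimensions_mm: list[float]

    @property
    def largest_dimension_mm(self) -> float:
        return max(self.dimensions_mm)


def extract_measurements(text: str) -> list[Measurement]:
    results = []
    for m in MEASUREMENT_RE.finditer(text):
        factor = UNIT_TO_MM[m.group(2).lower()]
        values = [float(v) for v in re.split(r"\s*x\s*", m.group(1), flags=re.IGNORECASE)]
        # Round to hide floating-point noise from the unit conversion.
        results.append(Measurement(m.group(0), [round(v * factor, 3) for v in values]))
    return results


found = extract_measurements("A 2.3 x 1.8 cm nodule and a 12mm cyst; mass 1.2 x 0.8 x 0.5 cm.")
for item in found:
    print(item.raw, item.dimensions_mm, item.largest_dimension_mm)
# → 2.3 x 1.8 cm [23.0, 18.0] 23.0
# → 12mm [12.0] 12.0
# → 1.2 x 0.8 x 0.5 cm [12.0, 8.0, 5.0] 12.0
```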
findings_section = report.get_section("findings")
for finding in findings_section.findings:
    print(f"Anatomy: {finding.anatomy or 'unspecified'}")
    print(f"Text: {finding.text}")

reports = parser.parse_batch(list_of_texts, modality="CT")
# Returns list[ParsedReport | None]; None marks empty/unparseable inputs
active = [r for r in reports if r is not None]

report = parser.parse(text, modality="CT")
# As dict
d = report.to_dict()
# As JSON string (shorthand)
json_str = report.to_json()
json_str = report.to_json(indent=4)

Rule-based. Fully auditable. No black boxes.
Covers 45+ terms across 8 categories:
| Category | Examples |
|---|---|
| `vascular` | aortic dissection, DVT, aortic aneurysm |
| `pulmonary` | pulmonary embolism, PE, pneumothorax, hemothorax |
| `neuro` | subdural hematoma, midline shift, intracranial hemorrhage |
| `abdominal` | free air, bowel perforation, appendicitis |
| `cardiac` | cardiac tamponade, pericardial effusion |
| `spinal` | cord compression, cervical fracture |
| `oncologic` | malignancy, metastasis, carcinoma |
# "No pneumothorax identified" → negated=True, won't trigger alert
# "Pneumothorax present" → negated=False, triggers alert
active = [cf for cf in report.critical_findings if not cf.negated]

- `critical`: requires immediate action (PE, subdural hematoma, pneumothorax)
- `urgent`: requires same-day follow-up (DVT, bowel obstruction, appendicitis)
- `significant`: requires follow-up (malignancy, metastasis)
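
The negation handling and severity tiers described above can be pictured with a small rule-based sketch. Everything here is an assumption for illustration: the `TERMS` subset, the `NEGATION_CUES` list, and the `Hit` and `detect` names are hypothetical and far smaller than what a production detector needs. Negation scope is naively taken to be "a cue appears earlier in the same sentence".

```python
import re
from dataclasses import dataclass

# term -> (category, severity); a tiny illustrative subset of the terms named in this README.
TERMS = {
    "pulmonary embolism": ("pulmonary", "critical"),
    "pneumothorax": ("pulmonary", "critical"),
    "appendicitis": ("abdominal", "urgent"),
    "metastasis": ("oncologic", "significant"),
}

# Assumed negation cues; clinical negation rule sets are usually much larger.
NEGATION_CUES = ("no", "without", "negative for", "no evidence of", "free of")


@dataclass
class Hit:
    term: str
    category: str
    severity: str
    negated: bool
    context: str


def detect(text: str) -> list[Hit]:
    hits = []
    # Naive sentence split: a negation cue only affects terms in its own sentence.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        lowered = sentence.lower()
        for term, (category, severity) in TERMS.items():
            pos = lowered.find(term)
            if pos == -1:
                continue
            prefix = lowered[:pos]
            # Word boundaries keep "no" from matching inside words such as "nodule".
            negated = any(
                re.search(rf"\b{re.escape(cue)}\b", prefix) for cue in NEGATION_CUES
            )
            hits.append(Hit(term, category, severity, negated, sentence))
    return hits


hits = detect("Filling defect consistent with pulmonary embolism. No pneumothorax.")
for h in hits:
    print(h.term, h.severity, "negated" if h.negated else "active")
# → pulmonary embolism critical active
# → pneumothorax critical negated
```

A sketch this simple will mislabel sentences such as "No effusion; pneumothorax present", which is one reason every rule should stay inspectable and testable.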
from radreport_parser.critical_findings import CRITICAL_TERMS
CRITICAL_TERMS["tension pneumothorax"] = ("pulmonary", "critical")
CRITICAL_TERMS["septic emboli"] = ("vascular", "urgent")

Outputs a valid FHIR R4 DiagnosticReport resource.
from datetime import datetime, timezone
fhir = exporter.export(
    report,
    patient_id="pt-001",  # Optional: links to FHIR Patient resource
    report_id="rpt-20240315",  # Optional: custom resource ID
    issued_dt=datetime.now(timezone.utc),  # Optional: defaults to UTC now; pass a timezone-aware value
)

- `resourceType`: `DiagnosticReport`
- `status`: `final`
- `code`: LOINC code matched to modality (CT, MRI, US, etc.)
- `conclusion`: impression text
- `presentedForm`: full report text as base64 attachment
- `contained`: FHIR Observations for each active (non-negated) critical finding
- `extension`: structured sections for downstream parsing
- `subject`: patient reference (when `patient_id` provided)
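
To show roughly what such a resource looks like on the wire, here is a hand-built sketch using only the standard library. It is not the output of `FHIRExporter`: `build_diagnostic_report` is a hypothetical helper, the `contained` and `extension` parts are omitted for brevity, and the generic LOINC code `18748-4` (Diagnostic imaging study) is used as a stand-in for whatever modality-specific code the library selects.

```python
import base64
import json
from datetime import datetime, timezone
from typing import Optional


def build_diagnostic_report(
    report_text: str,
    impression: str,
    loinc_code: str,
    loinc_display: str,
    patient_id: Optional[str] = None,
    issued: Optional[datetime] = None,
) -> dict:
    """Assemble a minimal FHIR R4 DiagnosticReport as a plain dict."""
    issued = issued or datetime.now(timezone.utc)  # FHIR instants carry a timezone offset
    resource = {
        "resourceType": "DiagnosticReport",
        "status": "final",
        "code": {
            "coding": [
                {"system": "http://loinc.org", "code": loinc_code, "display": loinc_display}
            ]
        },
        "issued": issued.isoformat(),
        "conclusion": impression,
        # Attachment.data is base64-encoded binary content.
        "presentedForm": [
            {
                "contentType": "text/plain",
                "data": base64.b64encode(report_text.encode("utf-8")).decode("ascii"),
            }
        ],
    }
    if patient_id is not None:
        resource["subject"] = {"reference": f"Patient/{patient_id}"}
    return resource


example = build_diagnostic_report(
    "FINDINGS: ...",
    "No acute abnormality.",
    loinc_code="18748-4",  # assumed generic code, see the note above
    loinc_display="Diagnostic imaging study",
    patient_id="pt-001",
    issued=datetime(2024, 3, 15, tzinfo=timezone.utc),
)
print(json.dumps(example, indent=2))
```

Omitting `subject` when no patient ID is supplied matches the "when `patient_id` provided" behaviour in the list above.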
import json
from radreport_parser import ReportParser, CriticalFindingsDetector, FHIRExporter
parser = ReportParser()
detector = CriticalFindingsDetector()
exporter = FHIRExporter()
def process_report(text: str, modality: str, patient_id: str) -> dict:
    report = parser.parse(text, modality=modality)
    report = detector.detect(report)
    active_criticals = [cf for cf in report.critical_findings if not cf.negated]
    if active_criticals:
        print(f"WARNING: {len(active_criticals)} critical finding(s) detected")
    return exporter.export(report, patient_id=patient_id)

fhir_json = process_report(report_text, modality="CT", patient_id="pt-001")
print(json.dumps(fhir_json, indent=2))

See `full_pipeline.py` for a runnable end-to-end example.
No dependencies. The library installs with no third-party packages. This matters in hospital environments where every dependency goes through security review.
Rule-based, not ML-based. Every decision the library makes is traceable to a specific rule. No model weights, no GPU, no probabilistic outputs. Clinical teams can audit exactly why a finding was flagged.
Negation-aware. A library that can't distinguish "no pneumothorax" from "pneumothorax" is dangerous in clinical contexts. Negation detection is built into the core.
FHIR-first output. Most modern EMRs expose FHIR interfaces. The export format is designed to drop into existing integrations without transformation.
pip install radreport-parser[dev]
pytest tests/ -v

- CLI tool for single-file and batch processing (`radreport` command)
- `parse_batch()` API for processing lists of reports
- `to_json()` convenience method on `ParsedReport`
- Template matching for common report types (Chest XR, CT Abdomen, MRI Brain)
- Structured output for follow-up recommendations
- Additional FHIR resource types (ImagingStudy, Condition)
- CSV export mode for research/analytics workflows
This library is a developer tool for structuring report text. It is not a medical device and is not intended for direct clinical decision-making. Critical findings detection is designed to assist human review workflows, not replace radiologist judgment.
MIT