PyYAML
General Information
| Field | Value |
|---|---|
| Package Name | PyYAML |
| Manufacturer / Vendor | YAML/Python Community (Kirill Simonov, original author; maintained by YAML/Python communities) |
| Software Category | Library |
| Primary Documentation | Documentation, GitHub, PyPI |
| Programming Language(s) | Python, C (LibYAML bindings) |
| License | MIT License |
| Deployed Version(s) | >=6.0.2 (version-locked at 6.0.3 across services) |
| Most Recent Available Version | 6.0.3 |
| Last Review Date | 2026-01-27 |
Overview
PyYAML is a full-featured YAML processing framework for Python. It provides a complete YAML 1.1 parser and emitter, Unicode support (including UTF-8/UTF-16 input/output), a low-level event-based parser and emitter API (similar to SAX), and a high-level API for serializing and deserializing native Python objects. PyYAML also includes optional LibYAML-based C bindings for enhanced performance.
Within the medical device software, PyYAML serves as the configuration and data interchange parsing layer in the legithp-essentials library. It is used in the following capacities:
- YAML file parsing: The `YamlReader` class provides a standardized interface for parsing YAML-encoded configuration data from both local filesystem and cloud storage (S3). This reader is part of the unified file reading abstraction alongside `JsonReader`, enabling consistent data access patterns across the device
- Safe parsing enforcement: The implementation exclusively uses `yaml.safe_load()`, which only supports standard YAML tags and is safe to use with documents from any source. This prevents arbitrary code execution vulnerabilities associated with the full `yaml.load()` function
- Configuration management: The `read_yaml()` and `read_yaml_from_s3()` convenience functions enable loading of configuration files for microservice initialization and operational parameters
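
The reader pattern described above can be sketched as follows. The actual `YamlReader` implementation in legithp-essentials is not reproduced in this document, so the class name `SketchYamlReader`, the exception `YamlReadError`, and the method names are hypothetical, and S3 access is omitted. The only PyYAML calls used are `yaml.safe_load()` and `yaml.YAMLError`.

```python
from pathlib import Path
from typing import Any, Union

import yaml


class YamlReadError(Exception):
    """Hypothetical domain-specific error raised when YAML cannot be parsed."""


class SketchYamlReader:
    """Illustrative reader only; not the legithp-essentials implementation."""

    def read_text(self, text: str) -> Any:
        try:
            # safe_load constructs only standard YAML types (dict, list, str, ...).
            return yaml.safe_load(text)
        except yaml.YAMLError as exc:
            # Wrap the parser error so callers handle a single exception type.
            raise YamlReadError(f"Invalid YAML: {exc}") from exc

    def read_file(self, path: Union[str, Path]) -> Any:
        # Configuration files are assumed to be UTF-8 encoded.
        return self.read_text(Path(path).read_text(encoding="utf-8"))
```

A caller would use `SketchYamlReader().read_file("config.yaml")` and catch `YamlReadError` rather than PyYAML's own exception classes; the file name here is a placeholder.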
PyYAML was selected over alternatives (ruamel.yaml, strictyaml, oyaml) due to:
- De facto standard status as the most widely adopted YAML library in the Python ecosystem
- Complete YAML 1.1 specification compliance ensuring broad interoperability
- Optional C-based LibYAML bindings for performance-critical applications
- Mature, well-tested codebase with over a decade of production use
- Simple, intuitive API with clear separation between safe and unsafe loading modes
- MIT license permitting commercial use in medical device software
- Active maintenance by the YAML/Python communities with responsive security handling
Functional Requirements
The following functional capabilities of this SOUP are relied upon by the medical device software.
| Requirement ID | Description | Source / Reference |
|---|---|---|
| FR-001 | Safely parse YAML documents into Python data structures | yaml.safe_load() function |
| FR-002 | Support UTF-8 encoded YAML documents | Unicode support in parser |
| FR-003 | Parse YAML scalar types (strings, integers, floats, booleans) | YAML 1.1 scalar resolution |
| FR-004 | Parse YAML collections (sequences, mappings) | YAML 1.1 collection types |
| FR-005 | Handle empty YAML documents gracefully (return None) | safe_load() behavior |
| FR-006 | Raise descriptive errors for malformed YAML | yaml.YAMLError exception |
| FR-007 | Support YAML anchors and aliases for data reuse | YAML 1.1 anchor/alias specification |
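
The behaviours listed in FR-001 to FR-007 can be exercised with a short script such as the one below. It is a minimal sketch: the sample document and the helper `try_parse` are invented for illustration and do not come from the device's configuration.

```python
import yaml

# Illustrative document covering scalars, collections, and an anchor/alias pair.
DOCUMENT = """\
defaults: &defaults
  timeout: 30
service:
  name: análisis
  ratio: 0.5
  enabled: yes
  ports: [8080, 8081]
  settings: *defaults
"""

# FR-001, FR-002: byte input is decoded as UTF-8 and parsed into Python objects.
config = yaml.safe_load(DOCUMENT.encode("utf-8"))
service = config["service"]  # FR-003, FR-004, FR-007 are visible in this mapping


def try_parse(text):
    """Return (data, error_message); error_message is None on success."""
    try:
        return yaml.safe_load(text), None
    except yaml.YAMLError as exc:  # FR-006: malformed input raises YAMLError
        return None, str(exc)


empty_result = yaml.safe_load("")  # FR-005: an empty document yields None
```

Note that `enabled: yes` resolves to the boolean `True` under YAML 1.1 scalar resolution, which differs from YAML 1.2 parsers.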
Performance Requirements
The following performance expectations are relevant to the medical device software.
| Requirement ID | Description | Acceptance Criteria |
|---|---|---|
| PR-001 | YAML parsing shall complete within acceptable configuration latency | Parsing completes during service initialization without delay |
| PR-002 | Memory usage shall scale linearly with document size | No memory leaks during repeated parsing operations |
| PR-003 | Parser shall handle configuration files of expected sizes | Configuration files up to 1 MB parse successfully |
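
One possible way to gather evidence for these criteria is to time a parse and record peak allocation with the standard library. The sketch below is an assumption about how such a check could be written, not the project's verification procedure; `build_sample` and `measure_parse` are hypothetical helper names, and no pass/fail thresholds are implied.

```python
import time
import tracemalloc
from typing import Tuple

import yaml


def build_sample(target_bytes: int) -> str:
    """Build a flat key/value YAML document of at least target_bytes characters."""
    lines = []
    size = 0
    index = 0
    while size < target_bytes:
        line = f"key_{index}: value_{index}\n"
        lines.append(line)
        size += len(line)
        index += 1
    return "".join(lines)


def measure_parse(text: str) -> Tuple[float, int]:
    """Return (elapsed_seconds, peak_traced_bytes) for one safe_load call."""
    tracemalloc.start()
    started = time.perf_counter()
    yaml.safe_load(text)
    elapsed = time.perf_counter() - started
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return elapsed, peak
```

Calling `measure_parse(build_sample(1_000_000))` approximates the 1 MB upper bound in PR-003; repeating the call and comparing peak values gives a rough indication for PR-002.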
Hardware Requirements
The following hardware dependencies or constraints are imposed by this SOUP component.
| Requirement ID | Description | Notes / Limitations |
|---|---|---|
| HR-001 | x86-64 or ARM64 processor architecture | Pre-built wheels available for common platforms |
| HR-002 | Sufficient system memory for document parsing | Memory scales with document size and complexity |
Software Requirements
The following software dependencies and environmental assumptions are required by this SOUP component.
| Requirement ID | Description | Dependency / Version Constraints |
|---|---|---|
| SR-001 | Python runtime environment | Python >=3.8 (device uses Python >=3.12) |
| SR-002 | LibYAML C library (optional, for performance) | System library if C extensions enabled; not required |
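
Whether the optional LibYAML bindings from SR-002 are present can be checked at runtime. The sketch below is illustrative only: `SafeLoaderClass` and `load_config` are hypothetical names, and the device itself is documented as calling `yaml.safe_load()`, which always uses the pure-Python `SafeLoader`.

```python
import yaml

# CSafeLoader exists only when PyYAML was built with the LibYAML bindings;
# fall back to the pure-Python SafeLoader otherwise.
SafeLoaderClass = getattr(yaml, "CSafeLoader", yaml.SafeLoader)

# PyYAML sets this flag at import time to report whether the bindings loaded.
HAS_LIBYAML = bool(yaml.__with_libyaml__)


def load_config(text):
    """Parse text with a safe loader, using the C implementation if present."""
    return yaml.load(text, Loader=SafeLoaderClass)
```

Both loader classes restrict construction to the same standard YAML tags, so the choice affects speed rather than safety.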
Known Anomalies Assessment
This section evaluates publicly reported issues, defects, or security vulnerabilities associated with this SOUP component and their relevance to the medical device software.
| Anomaly Reference | Status | Applicable | Rationale | Reviewed At |
|---|---|---|---|---|
| CVE-2020-14343 (Arbitrary code execution via FullLoader) | Fixed | No | Affects PyYAML <5.4; the device uses version-locked 6.0.3 which includes the fix. Additionally, the device exclusively uses safe_load(), not the vulnerable full_load() or FullLoader | 2026-01-27 |
| CVE-2020-1747 (Arbitrary code execution via python/object/new constructor) | Fixed | No | Affects PyYAML <5.3.1; the device uses version-locked 6.0.3 which includes the fix. The device exclusively uses safe_load() which restricts constructors to safe standard types | 2026-01-27 |
| CVE-2019-20477 (Class deserialization vulnerability) | Fixed | No | Affects PyYAML 5.1 through 5.1.2; the device uses version-locked 6.0.3 which includes the fix. The device exclusively uses safe_load() which does not deserialize arbitrary Python objects | 2026-01-27 |
| CVE-2017-18342 (yaml.load() arbitrary code execution) | Fixed | No | Affects PyYAML <5.1; the device uses version-locked 6.0.3 which includes the fix. The device's YamlReader implementation exclusively uses safe_load(), never the deprecated load() function | 2026-01-27 |
PyYAML is actively maintained by the YAML and Python communities. The project maintains a security policy for coordinated vulnerability disclosure. According to public vulnerability databases, no new CVEs have been reported for PyYAML in 2024, 2025, or 2026. The historical vulnerabilities listed above were all addressed in versions released before 2021.
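
The version-based part of the applicability decision in the table above can be expressed as a small range check. This is a sketch that encodes only the affected ranges stated in the table; the helper names are hypothetical, and versions are assumed to be plain dotted integers.

```python
from typing import Optional, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Convert '5.4' or '6.0.3' to a tuple padded to three components."""
    parts = [int(part) for part in text.split(".")]
    parts += [0] * (3 - len(parts))
    return tuple(parts)


# CVE -> (lowest affected or None, upper bound, upper bound inclusive?)
AFFECTED_RANGES = {
    "CVE-2020-14343": (None, "5.4", False),    # affects < 5.4
    "CVE-2020-1747": (None, "5.3.1", False),   # affects < 5.3.1
    "CVE-2019-20477": ("5.1", "5.1.2", True),  # affects 5.1 through 5.1.2
    "CVE-2017-18342": (None, "5.1", False),    # affects < 5.1
}


def is_affected(version: str, low: Optional[str], high: str, inclusive: bool) -> bool:
    current = parse_version(version)
    if low is not None and current < parse_version(low):
        return False
    upper = parse_version(high)
    return current <= upper if inclusive else current < upper


def affected_cves(version: str):
    """Return the CVE identifiers whose affected range contains version."""
    return sorted(cve for cve, rng in AFFECTED_RANGES.items() if is_affected(version, *rng))
```

For the deployed version, `affected_cves("6.0.3")` returns an empty list, which matches the "Applicable: No" entries in the table.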
The device's usage pattern minimizes attack surface exposure:
- Safe loading only: The `YamlReader` class exclusively uses `yaml.safe_load()`, which restricts parsed content to standard YAML tags (strings, integers, floats, lists, dicts, None, booleans). This prevents arbitrary Python object instantiation that could lead to code execution
- Version locking: Requirements lock files pin PyYAML to version 6.0.3, which includes fixes for all known security vulnerabilities
- Controlled input sources: YAML files are only loaded from controlled sources (local filesystem configuration, S3 buckets with access controls), not from arbitrary external input
- Input validation: The `YamlReader` wraps parsing errors in domain-specific exceptions, ensuring malformed YAML does not propagate unhandled errors
- No untrusted deserialization: The device does not use YAML for deserializing user-provided data; YAML is used exclusively for internal configuration files
- Container isolation: Microservices run in isolated containers with restricted filesystem access, limiting potential impact of any exploitation
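
The effect of safe loading can be illustrated with a document that uses a Python-specific tag. In the sketch below, the payload string and the helper `is_rejected` are illustrative assumptions; `yaml.safe_load()` refuses to construct the tagged object and raises `yaml.constructor.ConstructorError` instead of executing anything.

```python
import yaml

# A payload of the kind exploited through unsafe loaders; never executed here.
MALICIOUS = "!!python/object/apply:os.system ['echo unsafe']"


def is_rejected(text: str) -> bool:
    """Return True if safe_load refuses to construct the document."""
    try:
        yaml.safe_load(text)
    except yaml.constructor.ConstructorError:
        # SafeLoader has no constructor registered for python/* tags.
        return True
    return False
```

`ConstructorError` is a subclass of `yaml.YAMLError`, so a reader that wraps `YAMLError` also catches this case.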
Risk Control Measures
The following risk control measures are implemented to mitigate potential security and operational risks associated with this SOUP component:
- Version locking via requirements_lock.txt ensures reproducible, auditable deployments with known-secure versions
- Exclusive use of `safe_load()` prevents arbitrary code execution vulnerabilities
- Configuration files loaded only from controlled, authenticated storage locations
- Error handling wraps YAML parsing exceptions in domain-specific errors for consistent processing
- Container isolation limits potential impact of any exploitation
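
As a complement to the lock file, a service could verify at startup that the installed PyYAML matches the pinned version. The following is a minimal sketch under that assumption; the function `pyyaml_matches_pin` is hypothetical and is not described as part of the device.

```python
from importlib.metadata import PackageNotFoundError, version

PINNED_PYYAML = "6.0.3"  # version stated in this document's lock-file description


def pyyaml_matches_pin(expected: str = PINNED_PYYAML) -> bool:
    """Return True only if the installed PyYAML distribution equals expected."""
    try:
        return version("PyYAML") == expected
    except PackageNotFoundError:
        # PyYAML is not installed in this environment.
        return False
```

A service might log a warning or refuse to start when this returns `False`; which of the two is appropriate depends on the deployment policy.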
Assessment Methodology
The following methodology was used to identify and assess known anomalies:
- Sources consulted:
  - National Vulnerability Database (NVD) search for "pyyaml"
  - GitHub Security Advisories for yaml/pyyaml
  - CVE Details for PyYAML
  - Snyk vulnerability database for PyYAML
  - PyPI package security reports
- Criteria for determining applicability:
  - Vulnerability must affect deployed versions (PyYAML 6.0.3)
  - Vulnerability must be exploitable through the device's operational context (configuration loading, safe parsing only)
  - Attack vector must be reachable through the device's interfaces
  - Use of `safe_load()` must not already mitigate the vulnerability
Signature meaning
The signatures for the approval of this document are recorded in the verified commits of the QMS repository. For reference, the team members expected to participate in this document and their roles in the approval process, as defined in Annex I (Responsibility Matrix) of GP-001, are:
- Author: Team members involved
- Reviewer: JD-003, JD-004
- Approver: JD-001