CI Test: Develop branch#7

Draft
jdsika wants to merge 8 commits into main from develop

jdsika commented Apr 2, 2026

DO NOT MERGE

jdsika self-assigned this Apr 2, 2026
jdsika added 4 commits April 25, 2026 14:02
Emit warnings for abstract class covering axiom edge cases:

- Zero children: warn that no covering axiom will be generated
- One child: warn that the covering axiom degenerates to an equivalence
  (Parent = Child), recommending --skip-abstract-class-as-unionof-subclasses

Both axioms are still emitted when applicable (semantically correct per
OWL 2), but warnings alert users who extend the ontology downstream.

Tests verify warnings are logged, flag suppression works, the
single-child covering axiom triple is correctly asserted, plus
negative tests for multi-child and concrete class cases, and the
mixin-only children edge case.
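
The two edge cases above can be sketched as a small decision function. This is only an illustration under stated assumptions: the function name, its parameters, and the warning wording are hypothetical and not taken from the generator's code; only the zero-child and one-child conditions and the flag name come from the commit message.

```python
import logging
from typing import List, Optional

logger = logging.getLogger("covering_axiom_sketch")


def covering_axiom_warning(
    parent: str,
    children: List[str],
    is_abstract: bool = True,
    skip_covering_axiom: bool = False,
) -> Optional[str]:
    """Return a warning for degenerate covering axioms, or None if there is nothing to warn about."""
    # Concrete classes and suppressed runs never produce a covering-axiom warning.
    if not is_abstract or skip_covering_axiom:
        return None
    if len(children) == 0:
        return f"Abstract class '{parent}' has no children; no covering axiom will be generated."
    if len(children) == 1:
        return (
            f"Abstract class '{parent}' has a single child '{children[0]}'; the covering axiom "
            f"degenerates to an equivalence ({parent} = {children[0]}). "
            "Consider --skip-abstract-class-as-unionof-subclasses."
        )
    # Two or more children: the covering axiom is a genuine union, no warning.
    return None


message = covering_axiom_warning("Vehicle", ["Car"])
if message:
    logger.warning(message)
```

Returning the message instead of logging inside the function keeps the sketch easy to test; the actual generator may log directly.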

Refs: linkml#3309, linkml#3219
Signed-off-by: jdsika <carlo.van-driesten@bmw.de>
… consistency

JSON-LD processors treat xsd:anyURI as an opaque string literal,
so range:uri/uriorcurie slots get xsd:anyURI coercion instead of
proper IRI node semantics (@type:@id, owl:ObjectProperty, sh:IRI).

Add an opt-in --xsd-anyuri-as-iri flag that promotes xsd:anyURI ranges
to IRI semantics across all three generators:

  - JSON-LD context: @type: xsd:anyURI → @type: @id
  - OWL: DatatypeProperty → ObjectProperty (no rdfs:range restriction)
  - SHACL: sh:datatype xsd:anyURI → sh:nodeKind sh:IRI

The flag only affects types whose XSD mapping is xsd:anyURI (uri and
uriorcurie). The curie type (xsd:string) is correctly excluded via
is_xsd_anyuri_range() to maintain cross-generator consistency.
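
A minimal sketch of the promotion rule described above, assuming a small type-to-XSD table. `is_anyuri_range_sketch` is a stand-in for the `is_xsd_anyuri_range()` helper named in the commit (its real signature is not shown here), and `generator_outputs` is a hypothetical function that only summarises what each generator would emit.

```python
# Assumed excerpt of the LinkML type to XSD mapping, following the commit text.
TYPE_TO_XSD = {
    "uri": "xsd:anyURI",
    "uriorcurie": "xsd:anyURI",
    "curie": "xsd:string",
}


def is_anyuri_range_sketch(range_name: str) -> bool:
    """True only for types whose XSD mapping is xsd:anyURI."""
    return TYPE_TO_XSD.get(range_name) == "xsd:anyURI"


def generator_outputs(range_name: str, xsd_anyuri_as_iri: bool) -> dict:
    """Summarise the per-generator result for one slot range (illustrative only)."""
    if xsd_anyuri_as_iri and is_anyuri_range_sketch(range_name):
        return {
            "jsonld": {"@type": "@id"},
            "owl": "owl:ObjectProperty",
            "shacl": ("sh:nodeKind", "sh:IRI"),
        }
    datatype = TYPE_TO_XSD[range_name]
    return {
        "jsonld": {"@type": datatype},
        "owl": "owl:DatatypeProperty",
        "shacl": ("sh:datatype", datatype),
    }


print(generator_outputs("uri", xsd_anyuri_as_iri=True))
print(generator_outputs("curie", xsd_anyuri_as_iri=True))
```

Routing all three generators through one predicate is what keeps `curie` out of the promotion consistently.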

Standards basis:
  - OWL 2 §5.3-5.4 (ObjectProperty vs DatatypeProperty)
  - SHACL §4.1.3 (sh:nodeKind sh:IRI)
  - JSON-LD 1.1 §4.2.3 (type coercion with @id)
  - RDF 1.1 §3.2-3.3 (IRIs as first-class nodes, not string literals)

Signed-off-by: jdsika <carlo.van-driesten@bmw.de>
…lib serialization

Add a --deterministic / --no-deterministic CLI flag (default off) to OWL,
SHACL, JSON-LD Context, and JSON-LD generators that produces byte-identical
output across invocations.

Three-phase hybrid pipeline for Turtle generators:
1. RDFC-1.0 canonicalization (W3C Recommendation) via pyoxigraph
2. Weisfeiler-Lehman structural hashing for diff-stable blank node IDs
3. Hybrid rdflib re-serialization for idiomatic Turtle (inline blank
   nodes, collection syntax, prefix filtering)
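
To illustrate the idea behind phase 2, here is a stdlib-only sketch of Weisfeiler-Lehman style hashing for blank nodes. It is not the PR's implementation (which runs after RDFC-1.0 canonicalization via pyoxigraph); triples are modelled as plain string tuples, blank nodes are assumed to start with `_:`, and the function name and round count are hypothetical.

```python
import hashlib
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


def is_bnode(term: str) -> bool:
    return term.startswith("_:")


def wl_blank_node_labels(triples: List[Triple], rounds: int = 3) -> Dict[str, str]:
    """Derive name-independent labels for blank nodes from their neighbourhood."""
    bnodes = {t for s, _, o in triples for t in (s, o) if is_bnode(t)}
    labels = {b: "" for b in bnodes}
    for _ in range(rounds):
        updated = {}
        for b in bnodes:
            signature = []
            for s, p, o in triples:
                # Neighbouring blank nodes contribute their current label, never their name.
                if s == b:
                    signature.append(("out", p, "B:" + labels[o] if is_bnode(o) else "T:" + o))
                if o == b:
                    signature.append(("in", p, "B:" + labels[s] if is_bnode(s) else "T:" + s))
            payload = labels[b] + "|" + repr(sorted(signature))
            updated[b] = hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]
        labels = updated
    return labels


graph = [("_:a", "ex:p", "ex:X"), ("_:a", "ex:q", "_:b"), ("_:b", "ex:r", '"1"')]
print(wl_blank_node_labels(graph))
```

Because the labels depend only on structure, renaming the blank nodes or reordering the triples yields the same hashes. Structurally identical blank nodes would still collide in this sketch, which is one reason a full canonicalization step is needed first.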

JSON generators use deterministic_json() with recursive deep-sort and
JSON-LD-aware key ordering that preserves conventional @context structure.
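
A rough sketch of what a recursive deep-sort with JSON-LD-aware key ordering could look like. The real `deterministic_json()` is not shown in this PR description, so the function names, the keyword order, and the decision to leave list order untouched are all assumptions.

```python
import json
from typing import Any

# Assumed conventional ordering: JSON-LD keywords first, then all other keys alphabetically.
_KEYWORD_ORDER = ["@context", "@id", "@type"]


def _key_rank(key: str):
    if key in _KEYWORD_ORDER:
        return (0, _KEYWORD_ORDER.index(key), key)
    return (1, 0, key)


def deep_sort(value: Any) -> Any:
    """Recursively rebuild dicts with a stable key order; lists keep their order."""
    if isinstance(value, dict):
        return {k: deep_sort(value[k]) for k in sorted(value, key=_key_rank)}
    if isinstance(value, list):
        return [deep_sort(item) for item in value]
    return value


def deterministic_json_sketch(obj: Any) -> str:
    return json.dumps(deep_sort(obj), indent=2, ensure_ascii=False) + "\n"


doc = {"name": "x", "@type": "Thing", "@context": {"b": "ex:b", "a": "ex:a"}}
print(deterministic_json_sketch(doc))
```

Two inputs that differ only in key insertion order serialize to the same bytes, which is the property the flag is after.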

Collection items (owl:oneOf, sh:in, sh:ignoredProperties) are sorted
when --deterministic is set to ensure reproducible RDF list order.

pyoxigraph >= 0.4.0 is imported lazily only when --deterministic is used.
Tests skip gracefully when pyoxigraph is unavailable.

Refs: linkml#1847
Signed-off-by: Carlo van Driesten <carlo.van-driesten@bmw.de>
Signed-off-by: jdsika <carlo.van-driesten@bmw.de>
… names

Add an opt-in --normalize-prefixes flag to OWL, SHACL, and JSON-LD
Context generators that normalises non-standard prefix aliases to
well-known names from a static prefix map (derived from rdflib 7.x
defaults, cross-checked against prefix.cc consensus).

Key design decisions:
- Static frozen map (MappingProxyType) instead of runtime
  Graph().namespaces() lookup eliminates rdflib version dependency
- Both http://schema.org/ and https://schema.org/ map to 'schema'
- Shared normalize_graph_prefixes() helper used by OWL and SHACL
- Two-phase graph normalisation: Phase 1 normalises schema-declared
  prefixes, Phase 2 cleans up runtime-injected bindings
- Collision detection: skip with warning when standard prefix name
  is already user-declared for a different namespace
- Phase 2 guard prevents overwriting HTTPS bindings with HTTP variants
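
The static map and the collision rule can be sketched as follows. This is an illustration only: the map holds three assumed entries rather than the PR's full table, and `normalize_prefixes_sketch` is a hypothetical function operating on a plain alias-to-namespace dict, not the `normalize_graph_prefixes()` helper itself.

```python
from types import MappingProxyType
from typing import Dict, List, Tuple

# Tiny assumed excerpt of a namespace -> well-known prefix map; read-only by construction.
WELL_KNOWN_PREFIXES = MappingProxyType({
    "http://schema.org/": "schema",
    "https://schema.org/": "schema",
    "http://purl.org/dc/elements/1.1/": "dc",
})


def normalize_prefixes_sketch(prefixes: Dict[str, str]) -> Tuple[Dict[str, str], List[str]]:
    """Rename non-standard aliases; skip with a warning when the standard name is taken."""
    result: Dict[str, str] = {}
    warnings: List[str] = []
    for alias, namespace in prefixes.items():
        standard = WELL_KNOWN_PREFIXES.get(namespace)
        if standard is None or standard == alias:
            result[alias] = namespace  # custom or already-standard prefix: keep as declared
            continue
        if standard in prefixes and prefixes[standard] != namespace:
            # Collision: the user already bound the standard name to a different namespace.
            warnings.append(f"Cannot rename '{alias}' to '{standard}': name already declared.")
            result[alias] = namespace
            continue
        result[standard] = namespace
    return result, warnings


print(normalize_prefixes_sketch({"sdo": "http://schema.org/", "ex": "http://example.org/"}))
```

Using `MappingProxyType` means an accidental write to the map raises `TypeError` instead of silently changing behaviour for later callers.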

The flag defaults to off, preserving existing behaviour.

Tests cover OWL, SHACL, and context generators with sdo->schema,
dce->dc, http/https edge case, custom prefix preservation, flag-off
backward compatibility, cross-generator consistency, prefix collision
detection, schema1 regression prevention, Phase 2 HTTPS guard, empty
schema edge case, and static map integrity.

Signed-off-by: jdsika <carlo.van-driesten@bmw.de>

jdsika force-pushed the develop branch 2 times, most recently from 40ca31d to ed8c177 on April 25, 2026 12:36

Emit @type: @vocab with a scoped @vocab namespace for enum-ranged
slots whose permissible values all have meaning IRIs that share a
single namespace and whose text matches the meaning local name.

This enables bare string enum values (e.g. "RoadTypeMotorway") to
expand to full IRIs via JSON-LD 1.1 type coercion (section 4.2.3)
combined with scoped contexts (section 4.1.8).

The combined context preserves backward compatibility with structured
{text, description, meaning} objects via the existing SKOS mappings.

Enums that don't meet the eligibility criteria (missing meanings,
mixed namespaces, or text/local-name mismatches) fall back to the
existing ENUM_CONTEXT behavior.
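
The eligibility criteria can be sketched as a single check that either returns a scoped term definition or signals a fallback. This is an assumption-laden illustration: the function names are hypothetical, meanings are assumed to be already expanded to full IRIs, and the namespace is taken to end at the last `#` or `/`.

```python
from typing import Dict, Optional, Tuple


def split_iri(iri: str) -> Tuple[str, str]:
    """Split an IRI into (namespace, local name) at the last '#' or '/'."""
    idx = max(iri.rfind("#"), iri.rfind("/"))
    return iri[: idx + 1], iri[idx + 1:]


def enum_vocab_term(slot_iri: str, permissible_values: Dict[str, Optional[str]]) -> Optional[dict]:
    """Return a term definition using @type: @vocab, or None to fall back."""
    if not permissible_values:
        return None
    namespaces = set()
    for text, meaning in permissible_values.items():
        if not meaning:
            return None  # missing meaning
        namespace, local = split_iri(meaning)
        if local != text:
            return None  # text does not match the meaning's local name
        namespaces.add(namespace)
    if len(namespaces) != 1:
        return None  # mixed namespaces
    return {"@id": slot_iri, "@type": "@vocab", "@context": {"@vocab": namespaces.pop()}}


print(enum_vocab_term(
    "https://example.org/roadType",
    {"RoadTypeMotorway": "https://example.org/road#RoadTypeMotorway"},
))
```

With such a term definition, a bare value like `"RoadTypeMotorway"` on that slot expands against the scoped `@vocab` rather than the document-wide one.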

Refs: linkml#2497
Signed-off-by: Carlo van Driesten <carlo.van-driesten@bmw.de>
jdsika added 3 commits April 27, 2026 14:22
…erals

Add a `--default-language` CLI option to both gen-owl and gen-shacl that
emits BCP 47 language-tagged string literals for human-readable annotations.

gen-owl changes:
- New `default_language` field on OwlSchemaGenerator
- `_LANGUAGE_TAGGABLE_RANGES` frozenset (string, ncname) guards tagging
- `_resolve_language()` checks element-level in_language first, then default
- `_literal()` helper creates properly tagged Literal objects
- `add_metadata()` tags string-range and fallback-range literals
- `add_enum()` PV labels respect language tags
- New `--default-language` Click option
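
The resolution order described in the list above might look roughly like this. The sketch uses `(lexical, language)` tuples as stand-ins for rdflib `Literal(value, lang=...)` objects so it runs without dependencies; the function names mirror the commit's helpers but their real signatures are not shown here and are assumed.

```python
from typing import Optional, Tuple

# Ranges that may carry a language tag, per the commit message.
LANGUAGE_TAGGABLE_RANGES = frozenset({"string", "ncname"})


def resolve_language(in_language: Optional[str], default_language: Optional[str]) -> Optional[str]:
    """Element-level in_language wins; otherwise fall back to the generator default."""
    return in_language or default_language


def make_literal(
    value: str,
    range_name: str,
    in_language: Optional[str] = None,
    default_language: Optional[str] = None,
) -> Tuple[str, Optional[str]]:
    """Return (lexical form, language tag); the tag is None for non-taggable ranges."""
    if range_name not in LANGUAGE_TAGGABLE_RANGES:
        return (value, None)
    return (value, resolve_language(in_language, default_language))


print(make_literal("Person", "string", default_language="en"))
print(make_literal("https://example.org/x", "uri", default_language="en"))
```

Without a default language and without `in_language`, the result stays a plain literal, which matches the backward-compatibility tests listed below.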

gen-shacl changes:
- New `default_language` field on ShaclGenerator
- NodeShape rdfs:label / rdfs:comment get language tags
- PropertyShape sh:name / sh:description get language tags via prop_pv_text()
- Numeric literals (sh:order, sh:minCount, etc.) are never tagged
- New `--default-language` Click option

Tests:
- 3 new OWL tests: tagged labels, backward-compat plain literals, URI ranges
- 4 new SHACL tests: NodeShape, PropertyShape, plain literals, numeric guard

Signed-off-by: Carlo van Driesten <carlo.van-driesten@bmw.de>
…apes

Add a --message-template CLI option to gen-shacl that generates
sh:message literals on every property shape from a user-supplied
template string.

Supported placeholders:
- {name}  - slot name (underscore-separated LinkML identifier)
- {title} - slot title (human-readable), falls back to name
- {class} - enclosing class name
- {path}  - property IRI

Example usage:
  gen-shacl --message-template 'Validation of {name} failed!' schema.yaml

This enables SHACL validators (e.g. pyshacl) to produce human-friendly
error messages identifying the specific constraint that was violated.

Changes:
- New message_template field on ShaclGenerator dataclass
- Template expansion after sh:name / sh:description emission
- Uses prop_pv_text for language tag support when combined with
  --default-language (sh:message gets language tag per SHACL 2.1.5)
- Invalid/positional placeholders and format specs raise ValueError
  with helpful message (catches KeyError, IndexError, ValueError)
- Empty and whitespace-only templates normalised to None (no emission)
- New --message-template Click CLI option
- 10 new tests: basic, title/class placeholders, backward-compat,
  invalid/positional/format-spec errors, empty/whitespace, combined
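
A sketch of the template expansion and its error handling, using only the standard library. The function name and keyword arguments are hypothetical. Because `str.format` accepts format specs such as `{name:>10}` on strings without complaint, this sketch inspects the template with `string.Formatter().parse` and rejects them explicitly; the PR may implement the check differently.

```python
from string import Formatter
from typing import Optional

ALLOWED_PLACEHOLDERS = frozenset({"name", "title", "class", "path"})


def expand_message_template(
    template: Optional[str], *, name: str, class_name: str, path: str, title: Optional[str] = None
) -> Optional[str]:
    """Expand a message template, or return None for empty/whitespace-only input."""
    if template is None or not template.strip():
        return None
    hint = "supported placeholders are {name}, {title}, {class}, {path}"
    try:
        fields = list(Formatter().parse(template))
    except ValueError as exc:  # e.g. unbalanced braces
        raise ValueError(f"Invalid message template {template!r}: {hint}") from exc
    for _literal, field, spec, conversion in fields:
        if field is None:
            continue  # plain text segment without a placeholder
        if field not in ALLOWED_PLACEHOLDERS or spec or conversion:
            raise ValueError(f"Invalid placeholder in {template!r}: {hint}")
    values = {"name": name, "title": title or name, "class": class_name, "path": path}
    return template.format(**values)


print(expand_message_template(
    "Validation of {name} failed!",
    name="road_type", class_name="Road", path="https://example.org/road_type",
))
```

Validating the parsed fields up front covers positional (`{}`, `{0}`), unknown, and attribute-style (`{name.upper}`) placeholders in one place.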

Signed-off-by: Carlo van Driesten <carlo.van-driesten@bmw.de>
Implement SHACL-SPARQL constraint generation for the boolean-guard
pattern commonly used in conditional validation rules. When a LinkML
class has rules: blocks with preconditions (value_presence: PRESENT)
and postconditions (equals_string: true), the generator now emits
sh:SPARQLConstraint nodes on the corresponding sh:NodeShape.

Features:
- New _add_rules() method translates recognised rule patterns to SPARQL
- Boolean-guard pattern: if value present then flag must be true
- Rule description mapped to sh:message on the constraint
- Language-tagged sh:message when --default-language is set
- Deactivated rules are skipped
- Warnings emitted for bidirectional/open_world rule flags
- New --emit-rules/--no-emit-rules CLI flag (default: enabled)
- Full URI references in SPARQL (no PREFIX declarations needed)

The generated SPARQL follows W3C SHACL Section 5 and uses the pre-bound
$this variable per Section 5.3.1. Constraints are validated by pyshacl
with advanced=True.
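
For illustration, the boolean-guard pattern could translate to a SELECT query along these lines. This is a sketch, not the output of `_add_rules()`: the function name and the example IRIs are hypothetical, and it assumes the flag is stored as an `xsd:boolean` literal so that SPARQL's bare `true` matches it.

```python
def boolean_guard_select(guarded_iri: str, flag_iri: str) -> str:
    """Build a SHACL-SPARQL SELECT that returns focus nodes violating the guard.

    A focus node violates the rule when it has any value for the guarded
    property but does not have the flag property set to true.
    """
    return (
        "SELECT $this\n"
        "WHERE {\n"
        f"  $this <{guarded_iri}> ?value .\n"
        "  FILTER NOT EXISTS {\n"
        f"    $this <{flag_iri}> true .\n"
        "  }\n"
        "}"
    )


print(boolean_guard_select(
    "https://example.org/licensePlate",
    "https://example.org/isRegistered",
))
```

The resulting string would be attached to the node shape as the `sh:select` value of an `sh:SPARQLConstraint`, alongside the `sh:message` derived from the rule description. Full IRIs in angle brackets are why no PREFIX declarations are needed.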

Refs: linkml#2464
Signed-off-by: Carlo van Driesten <carlo.van-driesten@bmw.de>