HotFix bug schema version#1927
Conversation
…tractions from core server.
#1861) * fix: prevent defaults being set to undefined, and interpret numbers and enums as strings. * chore: Auto-format JavaScript files with Prettier
…nance, metricThread and task_worker (#1885) Co-authored-by: Joshua S Brown <joshbro42867@yahoo.com> Co-authored-by: Joshua S Brown <brownjs@ornl.gov>
…, to hit feature parity with web ser… (#1859)
…mat, and metadata format (#1894)
Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
…ce with API spec file. (#1903)
Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
…e and default to different database for tests (#1917)
… cmake (#1918) Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
Reviewer's GuideAdds first-class support for both JSON Schema and LinkML schema types across the web UI, Python client, core server, and integration tests, including type/format propagation, external LinkML schema API wiring, metadata validation, and version bumping. Sequence diagram for LinkML schema creation and storagesequenceDiagram
actor User
participant WebUI as Web_UI_dlg_schema
participant WebAPI as Web_API_datafed_ws
participant Core as Core_Server_SchemaHandler
participant DB as DatabaseAPI
participant ExtStore as ExternalSchemaStorage
participant ExtAPI as External_Schema_API
User->>WebUI: Open schema dialog (mode_new)
WebUI->>WebUI: curType = "linkml"
WebUI->>WebUI: jsoned.setMode(ace/mode/yaml)
User->>WebUI: Enter id, description, type=linkml
User->>WebUI: Submit
WebUI->>WebAPI: POST /api/sch/create
activate WebAPI
WebAPI->>WebAPI: if id has no colon: id = id+":0"
WebAPI->>Core: SchemaCreateRequest(id, def, type=linkml, format=yaml,...)
deactivate WebAPI
activate Core
Core->>DB: schemaCreate(arango_req with def="{}")
activate DB
DB->>DB: Persist stub schema (type=linkml, format=yaml, def="{}")
DB-->>Core: SchemaDataReply
deactivate DB
Core->>ExtStore: storeContent(schema_id, format=yaml, def)
activate ExtStore
ExtStore->>ExtAPI: POST /schemas (LinkML definition)
ExtAPI-->>ExtStore: 201 Created
ExtStore-->>Core: true
deactivate ExtStore
Core-->>WebAPI: SchemaDataReply
WebAPI-->>WebUI: JSON reply
WebUI-->>User: Schema created (type=LinkML)
Sequence diagram for LinkML metadata validationsequenceDiagram
actor Client
participant CLI as Python_CLI
participant CoreClient as Python_CommandLib
participant Core as Core_Server_ClientWorker
participant Handler as SchemaHandler
participant DB as DatabaseAPI
participant Validator as ExternalSchemaValidator
participant ExtAPI as External_Schema_API
Client->>CLI: data create/update with metadata and sch_id
CLI->>CoreClient: _capi.recordCreate/recordUpdate(...)
CoreClient-->>Core: RecordCreate/RecordUpdateRequest
activate Core
Core->>Handler: validateMetadataContent(a_uid, sch_id, metadata)
activate Handler
Handler->>DB: setClient(a_uid)
Handler->>DB: schemaGet(sch_id)
DB-->>Handler: SchemaData(type=linkml, format=yaml or json)
Handler->>Handler: schema_type = linkml
Handler->>Handler: schema_format from record (yaml/json)
Handler->>Handler: metadata_format = json
Handler->>Validator: validateMetadata(sch_id, metadata_format, metadata)
activate Validator
Validator->>ExtAPI: POST /validate?engine=linkml
ExtAPI-->>Validator: 200 OK or 422 with errors
Validator-->>Handler: ValidationResult(valid, errors)
deactivate Validator
Handler-->>Core: errors string
deactivate Handler
alt errors not empty
Core-->>CoreClient: Error reply (validation failed)
CoreClient-->>CLI: Raise exception / print errors
CLI-->>Client: Show validation errors
else
Core-->>CoreClient: Success reply
CoreClient-->>CLI: Success
CLI-->>Client: Operation completed
end
deactivate Core
Sequence diagram for schema creation from Python CLI with type and formatsequenceDiagram
actor User
participant CLI as datafed_CLI
participant Cmd as CommandLib
participant Core as Core_Server
User->>CLI: datafed schema create SID --definition-file sch.yaml -t linkml --format yaml
CLI->>CLI: Validate options (definition xor definition_file)
CLI->>Cmd: schemaCreate(schema_id, definition_file, description, public, system, schema_type=linkml, schema_format=yaml)
activate Cmd
Cmd->>Cmd: _load_schema_file(sch.yaml) -> definition
Cmd->>Cmd: if schema_type=="json-schema" then _validate_json(definition)
Cmd->>Core: SchemaCreateRequest(id="SID:0", def=definition, type=linkml, format=yaml,...)
deactivate Cmd
Core-->>Cmd: SchemaDataReply
Cmd-->>CLI: Reply
CLI-->>User: Print created schema info (type=linkml, format=yaml)
Class diagram for extended schema handling (JSON Schema and LinkML)classDiagram
class Config {
+unordered_map~string,SchemaAPIConfig~ schemas
+static Config& getInstance()
}
class SchemaAPIConfig {
+string base_url
+string bearer_token
+bool isConfigured()
}
class SchemaHandler {
-DatabaseAPI& m_db_client
-SchemaFactory m_schema_factory
+SchemaHandler(DatabaseAPI a_db_client)
+handleCreate(string a_uid, SchemaCreateRequest a_request, AckReply a_reply, LogContext log_context)
+handleRevise(string a_uid, SchemaReviseRequest a_request, AckReply a_reply, LogContext log_context)
+handleUpdate(string a_uid, SchemaUpdateRequest a_request, AckReply a_reply, LogContext log_context)
+validateMetadataContent(string a_uid, string a_sch_id, string a_metadata, LogContext log_context) string
}
class SchemaFactory {
+registerStorage(string type, ISchemaStorage storage)
+registerValidator(string type, ISchemaValidator validator)
+setDefaultSchemaType(string type)
+getStorage(string type) ISchemaStorage
+getValidator(string type) ISchemaValidator
}
class ISchemaStorage {
<<interface>>
+storeContent(string id, string format, string content, LogContext log_context) bool
+updateContent(string id, string format, string content, LogContext log_context) bool
}
class ArangoSchemaStorage {
+storeContent(string id, string format, string content, LogContext log_context) bool
+updateContent(string id, string format, string content, LogContext log_context) bool
}
class ExternalSchemaStorage {
-unique_ptr~SchemaAPIClient~ m_client
+ExternalSchemaStorage(unique_ptr~SchemaAPIClient~ a_client)
+storeContent(string id, string format, string content, LogContext log_context) bool
+updateContent(string id, string format, string content, LogContext log_context) bool
}
class ISchemaValidator {
<<interface>>
+validateMetadata(string schema_id, string schema_format, string metadata, LogContext log_context) ValidationResult
+hasValidationCapability() bool
+cacheSchema(string schema_id, string schema_format, string schema_def, LogContext log_context) bool
}
class JsonSchemaValidator {
+JsonSchemaValidator(shared_ptr~ISchemaStorage~ storage)
+validateMetadata(string schema_id, string schema_format, string metadata, LogContext log_context) ValidationResult
+hasValidationCapability() bool
+cacheSchema(string schema_id, string schema_format, string schema_def, LogContext log_context) bool
}
class ExternalSchemaValidator {
-unique_ptr~SchemaAPIClient~ m_client
-string m_engine
+ExternalSchemaValidator(unique_ptr~SchemaAPIClient~ a_client, string engine)
+validateMetadata(string schema_id, string schema_format, string metadata, LogContext log_context) ValidationResult
+hasValidationCapability() bool
+cacheSchema(string schema_id, string schema_format, string schema_def, LogContext log_context) bool
}
class SchemaAPIClient {
-string m_base_url
-string m_bearer_token
+SchemaAPIClient(SchemaAPIConfig config)
+validateMetadata(string schema_id, string engine, string schema_format, string metadata, string errors, string warnings, LogContext log_context) bool
}
class DatabaseAPI {
+schemaCreate(SchemaCreateRequest a_request, SchemaDataReply a_reply, LogContext log_context)
+schemaRevise(SchemaReviseRequest a_request, SchemaDataReply a_reply, LogContext log_context)
+schemaUpdate(SchemaUpdateRequest a_request, LogContext log_context)
+setSchemaData(SchemaData a_schema, Value a_obj)
+setClient(string uid)
}
class SchemaCreateRequest {
+string id
+string def
+string type
+string format
+bool pub
+bool sys
}
class SchemaData {
+string id
+uint32 ver
+string def
+string type
+string format
+uint64 cnt
+string own_id
+string own_nm
+string desc
+bool pub
+bool depr
+bool ref
}
Config "1" o-- "*" SchemaAPIConfig : schemas
SchemaHandler --> DatabaseAPI
SchemaHandler --> SchemaFactory
SchemaHandler --> ISchemaStorage
SchemaHandler --> ISchemaValidator
SchemaFactory --> ISchemaStorage
SchemaFactory --> ISchemaValidator
ArangoSchemaStorage ..|> ISchemaStorage
ExternalSchemaStorage ..|> ISchemaStorage
JsonSchemaValidator ..|> ISchemaValidator
ExternalSchemaValidator ..|> ISchemaValidator
ExternalSchemaValidator --> SchemaAPIClient
ExternalSchemaStorage --> SchemaAPIClient
SchemaAPIClient --> SchemaAPIConfig
DatabaseAPI --> SchemaData
DatabaseAPI --> SchemaCreateRequest
File-Level Changes
Possibly linked issues
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Hey - I've found 3 issues, and left some high level feedback:
- In the Python CLI,
schema_typedefaults tojson-schemawhileschema_formatdefaults tojsoneven when--type linkmlis selected, which can lead to LinkML schemas being created with aformatofjson; consider either coupling the defaults (e.g., defaultschema_format=yamlwhenschema_type=linkml) or validating/enforcing thatlinkmlis only used withyaml. - For
schemaReviseandschemaUpdateinCommandLib, the newschema_typeparameter is only used to gate JSON validation and is not propagated to the request message, while the CLI exposes--type; if the schema engine type is immutable, it may be clearer to omit the flag or explicitly reject attempts to change type rather than silently ignoring it.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In the Python CLI, `schema_type` defaults to `json-schema` while `schema_format` defaults to `json` even when `--type linkml` is selected, which can lead to LinkML schemas being created with a `format` of `json`; consider either coupling the defaults (e.g., default `schema_format=yaml` when `schema_type=linkml`) or validating/enforcing that `linkml` is only used with `yaml`.
- For `schemaRevise` and `schemaUpdate` in `CommandLib`, the new `schema_type` parameter is only used to gate JSON validation and is not propagated to the request message, while the CLI exposes `--type`; if the schema engine type is immutable, it may be clearer to omit the flag or explicitly reject attempts to change type rather than silently ignoring it.
## Individual Comments
### Comment 1
<location path="python/datafed_pkg/datafed/CommandLib.py" line_range="412-421" />
<code_context>
+ schema_type=None):
</code_context>
<issue_to_address>
**question:** Schema type for revise/update only affects local validation and is not sent to the server, which can be confusing.
In `schemaRevise` and `schemaUpdate`, `schema_type` only controls whether `_validate_json` runs and is not included in the outgoing request (no `msg.type` set). With `-t/--type` now exposed in the CLI, users may expect this to change the stored schema type, not just validation behavior. If the engine type should be immutable (as in the web UI), either rename this parameter to reflect its validation-only role or propagate it to the request so behavior matches user expectations.
</issue_to_address>
### Comment 2
<location path="core/server/client_handlers/SchemaHandler.cpp" line_range="79-82" />
<code_context>
// Persist to Arango — exception propagates naturally on failure
- m_db_client.schemaCreate(a_request, a_reply, log_context);
+ // For external schema types, store a stub in Arango —
+ // the real content lives in external storage.
+ SchemaCreateRequest arango_req(a_request);
+ if (!a_request.type().empty() && a_request.type() != "json-schema") {
+ arango_req.set_def("{}");
+ }
</code_context>
<issue_to_address>
**issue (bug_risk):** Storing `{}` as the schema definition for non-JSON engines may break consumers that rely on the DB’s `def` field.
Non-`json-schema` types now persist a stub (`"{}"`) in `def`, with the actual content stored elsewhere. This preserves validation but changes the meaning of `def` for any code or tooling that reads `SchemaData.def` expecting the full schema. If such consumers exist, they will now see `{}` instead of the real definition. Consider either not setting `def` at all for external types (so clients can detect this) or defining a clear mechanism for clients to retrieve the full definition via the storage/validator layer rather than directly from the DB field.
</issue_to_address>
### Comment 3
<location path="python/datafed_pkg/datafed/CLI.py" line_range="2065" />
<code_context>
def _schemaCreate(schema_id, definition, definition_file, description, public, system, schema_type, schema_format):
</code_context>
<issue_to_address>
**issue:** CLI help text still claims a JSON schema definition is required despite new LinkML/yaml support.
The `_schemaCreate` help text still states “A JSON schema definition is required” and only mentions JSON, even though we now support `schema_type` (e.g. `linkml`) and `schema_format` (e.g. `yaml`). Please update the wording to reflect all supported schema types and formats so users aren’t misled into thinking only JSON is allowed.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| schema_type=None): | ||
| """ | ||
| Create a new revision of an existing schema | ||
|
|
||
| Creates a new version of the specified schema. Any fields not provided | ||
| are carried forward from the current revision. The definition may be | ||
| provided directly as a string or read from a local file. | ||
|
|
||
| Parameters | ||
| ---------- |
There was a problem hiding this comment.
question: Schema type for revise/update only affects local validation and is not sent to the server, which can be confusing.
In schemaRevise and schemaUpdate, schema_type only controls whether _validate_json runs and is not included in the outgoing request (no msg.type set). With -t/--type now exposed in the CLI, users may expect this to change the stored schema type, not just validation behavior. If the engine type should be immutable (as in the web UI), either rename this parameter to reflect its validation-only role or propagate it to the request so behavior matches user expectations.
| // For external schema types, store a stub in Arango — | ||
| // the real content lives in external storage. | ||
| SchemaCreateRequest arango_req(a_request); | ||
| if (!a_request.type().empty() && a_request.type() != "json-schema") { |
There was a problem hiding this comment.
issue (bug_risk): Storing {} as the schema definition for non-JSON engines may break consumers that rely on the DB’s def field.
Non-json-schema types now persist a stub ("{}") in def, with the actual content stored elsewhere. This preserves validation but changes the meaning of def for any code or tooling that reads SchemaData.def expecting the full schema. If such consumers exist, they will now see {} instead of the real definition. Consider either not setting def at all for external types (so clients can detect this) or defining a clear mechanism for clients to retrieve the full definition via the storage/validator layer rather than directly from the DB field.
| def _schemaCreate(schema_id, definition, definition_file, description, | ||
| public, system, schema_type, schema_format): | ||
| """ | ||
| Create a new metadata schema. A JSON schema definition is required and |
There was a problem hiding this comment.
issue: CLI help text still claims a JSON schema definition is required despite new LinkML/yaml support.
The _schemaCreate help text still states “A JSON schema definition is required” and only mentions JSON, even though we now support schema_type (e.g. linkml) and schema_format (e.g. yaml). Please update the wording to reflect all supported schema types and formats so users aren’t misled into thinking only JSON is allowed.
Summary by Sourcery
Add LinkML schema engine support across web UI, core server, CLI, and clients, including external schema API integration, while tightening schema ID/version handling and bumping component versions.
New Features:
Bug Fixes:
Enhancements:
Build:
CI:
Deployment:
Documentation:
Tests: