[RFC] PPL rest command #5597

Description

@noCharger

Problem Statement

PPL can read documents from indices, but it has no way to bring a cluster's
operational and topology state into a query pipeline. Information such as
cluster health, node resource usage, shard placement, cluster settings,
installed plugins, index resolution, and the caller's identity lives behind
dedicated management and _cat REST endpoints. To inspect that state today, a
user must leave PPL, call the endpoint directly, and post-process the JSON by
hand. There is no way to filter, sort, aggregate, or project management-endpoint
data using the language's own commands (where, stats, sort, fields).

Current State

  • On the Calcite path, PPL row sources are limited to index scans
    (visitRelation), index subqueries, and literal values.
  • describe <index> and show datasources already expose metadata as tabular
    rows through a reserved-name system source that resolves via visitRelation.
    They cover only an index's field metadata and the datasource connection
    catalog; neither reaches cluster or _cat operational endpoints.
  • The Calcite table-function seam (visitTableFunction) is unsupported on the
    primary path: it throws CalciteUnsupportedException. A query of the form
    source=fn(...) therefore cannot introduce a new row source today.
  • Net result: operational endpoints are reachable only outside PPL, and their
    responses cannot be composed with downstream pipeline operators or inspected
    with EXPLAIN.

Long-Term Goals

Ideal outcome. A first-class, leading PPL command that turns a curated set
of read-only management endpoints into fixed-schema rows that compose naturally
with the rest of the language.

Primary objectives.

  1. Read-only, safe access to operational and topology endpoints from within a
    PPL pipeline.
  2. A fixed, plan-time-known schema per endpoint, so downstream where, stats,
    sort, and fields work and EXPLAIN is meaningful.
  3. A default-deny allow-list, so only vetted read-only endpoints are reachable.
  4. Caller-context authorization and secret-field redaction.

Sustainability and scalability. Endpoints are expressed as data in a
registry, so adding one is a reviewed registry entry rather than new operators
or grammar. The command rides the existing system-row-source seam, so it
inherits the optimizer and execution machinery already used by describe and
the system-index family.

Does it address the root problem? Yes. It closes the gap directly: operational
data becomes queryable in PPL, without standing up a new engine or execution
model.

Confidence. High. It reuses a shipped, proven seam (the reserved-name
system source behind describe/show) and a fixed-schema scan; both the
mechanism and the execution path are already established in the codebase.

Proposal

Add a leading command:

| rest <endpoint-path> [count=<int>] [<arg>=<value> ...]

It resolves an allow-listed, read-only management endpoint into a fixed-schema
table on the Calcite path, modeled as a system row source. It supports
row-count capping, per-endpoint server-side filter arguments with explicit
value validation, and deterministic plan-time validation that produces
client-side (400-class) errors. Endpoint responses are normalized into flat,
typed rows.

Example:

| rest '/_cat/indices' | where health = 'yellow' | sort index | fields index, health, pri
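
To make the intended semantics of the example concrete, here is a minimal Python sketch of what the three downstream operators compute once the endpoint has been turned into rows. The sample rows, their values, and the extra rep column are made up for illustration; only index, health, and pri come from the query above.

```python
from typing import Any

# Illustrative rows only: values are invented, and "rep" is an assumed extra column.
rows: list[dict[str, Any]] = [
    {"index": "logs-2", "health": "yellow", "pri": 1, "rep": 1},
    {"index": "metrics", "health": "green", "pri": 3, "rep": 1},
    {"index": "logs-1", "health": "yellow", "pri": 2, "rep": 1},
]


def run_pipeline(source_rows: list[dict[str, Any]]) -> list[dict[str, Any]]:
    # where health = 'yellow'
    filtered = [r for r in source_rows if r["health"] == "yellow"]
    # sort index
    ordered = sorted(filtered, key=lambda r: r["index"])
    # fields index, health, pri
    return [{k: r[k] for k in ("index", "health", "pri")} for r in ordered]


result = run_pipeline(rows)
for row in result:
    print(row)
```

The point of the sketch is only that, once the source yields uniform rows, the rest of the pipeline needs no endpoint-specific logic.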

Approach

  • Grammar and AST. A new REST lexer token and restCommand parser rule;
    a RestRelation AST node. The AST builder validates the endpoint spec and
    encodes it into a single reserved table name.
  • Resolution. The storage engine decodes the reserved name into a
    RestSourceTable, exactly as describe resolves to a system index. This
    bridges through visitRelation and never reaches the unsupported
    table-function seam.
  • Execution. RestSourceTable -> CalciteLogicalRestScan ->
    CalciteEnumerableRestScan. A central registry maps each endpoint to its
    read-only transport action, its fixed output schema, the query arguments it
    accepts (with allowed value domains), and a secret-field filter. Dispatch runs
    under the caller's security context through the node client (with a standalone
    REST-client path for the datasource mode).
  • Initial endpoint set. /_cluster/health, /_cluster/state,
    /_cluster/settings, /_cat/indices, /_cat/nodes, /_cat/cluster_manager,
    /_cat/plugins, /_cat/shards, /_resolve/index,
    /_plugins/_security/authinfo.
  • Arguments. count caps emitted rows. Server-side filter arguments are
    applied per endpoint with explicit value validation: local on
    /_cluster/health, health on /_cat/indices, expand_wildcards on
    /_resolve/index. A timeout token is reserved in the grammar but rejected
    with a 400 in the initial release, since a single timeout cannot map uniformly
    across the endpoints.
  • Output shaping. Each endpoint normalizes its response into the fixed
    schema: numeric type normalization, identifier-to-name resolution (for example
    the cluster-manager node id rendered as its node name), role-name expansion,
    structural flattening of nested responses into uniform rows, secret-field
    filtering, and graceful null-valued rows when an optional plugin is absent.
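
The output-shaping steps in the last bullet can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the proposed implementation: the column set, the api_key secret field, the role-letter mapping, and the shape of the input response are all hypothetical.

```python
from typing import Any, Optional

# Hypothetical fixed schema; not the RFC's actual columns.
COLUMNS = ("name", "heap_percent", "roles", "cluster_manager")
SECRET_FIELDS = {"api_key"}  # hypothetical secret-bearing field
ROLE_NAMES = {"d": "data", "i": "ingest", "m": "cluster_manager"}  # illustrative


def shape_rows(response: Optional[dict[str, Any]]) -> list[dict[str, Any]]:
    # Optional backend absent: emit one null-valued row instead of failing.
    if response is None:
        return [{c: None for c in COLUMNS}]
    nodes = response["nodes"]
    # Identifier-to-name resolution: manager node id -> node name.
    manager = nodes.get(response.get("cluster_manager_id"), {}).get("name")
    rows = []
    for node in nodes.values():  # flatten the id-keyed map into uniform rows
        clean = {k: v for k, v in node.items() if k not in SECRET_FIELDS}
        heap = clean.get("heap_percent")
        rows.append({
            "name": clean.get("name"),
            # Numeric normalization: accept "42" or 42, always emit an int.
            "heap_percent": int(heap) if heap is not None else None,
            # Role-name expansion: one letter per role -> readable names.
            "roles": ",".join(ROLE_NAMES.get(ch, ch) for ch in clean.get("roles", "")),
            "cluster_manager": manager,
        })
    return rows


sample = {
    "cluster_manager_id": "n1",
    "nodes": {
        "n1": {"name": "node-a", "heap_percent": "42", "roles": "dm", "api_key": "x"},
        "n2": {"name": "node-b", "heap_percent": 17, "roles": "i"},
    },
}
shaped = shape_rows(sample)
```

Every returned row has exactly the declared columns, which is what allows the scan to publish its row type before any call is made.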

Alternatives

  • Table-function source (source=rest('<endpoint>')). Cleaner if the endpoint
    set later becomes open-ended and value-parameterized, and it would unify with
    other parameterized sources under one seam. It requires first building a
    generic Calcite table-function-source capability, because visitTableFunction
    is unsupported on the primary path. Deferred; the leading-command form delivers
    the same result today without that prerequisite.
  • Dynamic, response-driven schema (_MAP / schema-on-read). Needed only for
    endpoints whose response shape is keyed by data values. Deferred to a future
    release that depends on the schema-on-read work; the initial endpoint set is
    fully representable with fixed schemas.
  • Raw JSON pass-through. Returning the endpoint's JSON unchanged is rejected:
    a non-fixed shape cannot publish a plan-time row type, which breaks downstream
    operators and EXPLAIN.
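
The plan-time argument behind the last two alternatives can be illustrated with a small Python sketch. It assumes a hypothetical declared schema and a hypothetical PlanError type; the idea is only that a declared schema lets a projection be checked, and its row type derived, before any endpoint is called, which a raw JSON shape cannot offer.

```python
# Hypothetical declared schema for one endpoint; names and types are illustrative.
INDICES_SCHEMA = {"index": "string", "health": "string", "pri": "integer"}


class PlanError(ValueError):
    """Raised at plan time, before any endpoint call is made."""


def project_row_type(schema: dict[str, str], fields: list[str]) -> dict[str, str]:
    # Reject references to columns the endpoint never declared.
    unknown = [f for f in fields if f not in schema]
    if unknown:
        raise PlanError(f"unknown field(s): {', '.join(unknown)}")
    # The projected row type is known without executing anything.
    return {f: schema[f] for f in fields}


row_type = project_row_type(INDICES_SCHEMA, ["index", "health"])
```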

Implementation Discussion

  • Capability gate: table-function source. visitTableFunction throws on the
    Calcite path, so introducing a row source through a function call is not
    available today. The leading-command plus reserved-name approach sidesteps this
    by reusing the system-index scan seam, which is already supported.
  • Capability gate: plan-time schema. A Calcite scan must publish its row type
    before execution. This is why each endpoint declares a fixed schema in the
    registry, and why dynamic, response-driven endpoints are out of the initial
    scope.
  • Security. Default-deny allow-list; only read-only endpoints are
    registered; dispatch runs under the caller's security context; secret-bearing
    fields are filtered during row shaping; and user-supplied argument values are
    validated against per-argument domains rather than passed unchecked into the
    underlying request.
  • No pushdown. Rows originate from a management transport action, not a
    Lucene index, so count caps rows after the call and downstream where,
    sort, and stats run in Calcite. EXPLAIN shows the scan with downstream
    operators composed above it.
  • Testing. Unit tests cover the registry (declared schema, allow-list,
    argument key and value validation, type coercion). Integration tests cover each
    endpoint plus negative cases that assert a 400 for a non-allow-listed endpoint,
    an empty path, a disallowed argument, an out-of-domain argument value, and a
    negative count.
  • Deferred items. Dynamic, response-driven endpoints (_MAP); the
    include_defaults argument on /_cluster/settings and the level argument on
    /_cluster/health, which need additional plumbing or a different output shape;
    and a generic table-function source. Each is a follow-on, not a blocker for the
    initial command.
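
The Security and Testing bullets above can be sketched as a default-deny registry check. This is a minimal Python illustration, not the proposed code: EndpointSpec, BadRequest, and validate are hypothetical names, only three of the initial endpoints are listed, and the allowed value domains are assumptions, since the RFC does not enumerate them.

```python
from dataclasses import dataclass, field
from typing import Optional


class BadRequest(ValueError):
    """Stands in for a client-side (400-class) error."""


@dataclass(frozen=True)
class EndpointSpec:
    # Argument name -> allowed values (domains below are assumed, not from the RFC).
    arg_domains: dict[str, frozenset] = field(default_factory=dict)


REGISTRY = {
    "/_cluster/health": EndpointSpec({"local": frozenset({"true", "false"})}),
    "/_cat/indices": EndpointSpec({"health": frozenset({"green", "yellow", "red"})}),
    "/_resolve/index": EndpointSpec(
        {"expand_wildcards": frozenset({"all", "open", "closed", "hidden", "none"})}
    ),
}


def validate(path: str, args: dict[str, str], count: Optional[int] = None) -> EndpointSpec:
    if not path:
        raise BadRequest("empty endpoint path")
    spec = REGISTRY.get(path)
    if spec is None:  # default-deny: anything unregistered is rejected
        raise BadRequest(f"endpoint not allow-listed: {path}")
    if count is not None and count < 0:
        raise BadRequest("count must not be negative")
    for key, value in args.items():
        if key == "timeout":  # reserved in the grammar, rejected for now
            raise BadRequest("timeout is not supported")
        if key not in spec.arg_domains:
            raise BadRequest(f"argument not allowed for {path}: {key}")
        if value not in spec.arg_domains[key]:
            raise BadRequest(f"value out of domain for {key}: {value}")
    return spec


def is_rejected(path: str, args: dict[str, str], count: Optional[int] = None) -> bool:
    try:
        validate(path, args, count)
        return False
    except BadRequest:
        return True
```

Because validation depends only on the command text and the registry, every failure here is deterministic and can be raised before dispatch, matching the negative cases listed under Testing.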

Metadata

Labels

calcite, calcite migration releated

Projects

Status
In progress
