We are reporting a potential privacy / IP leakage issue in REST-style datastore-backed speculative decoding.
Although the implementation caps the number of datastore tokens accepted in a single verification step, consecutive datastore-backed chunks can still accumulate into longer recovered fragments within one response. Across our 1000-prompt evaluation, the per-deployment median stitched recovery averaged 155 words for text-model deployments and 87 words for code-model deployments.
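The accumulation effect can be illustrated with a toy sketch. This is not the REST implementation and deliberately contains no reproduction details; the cap value and function names are illustrative only:

```python
# Toy illustration: a per-step acceptance cap does not bound cumulative
# recovery, because each verification step can accept a fresh
# datastore-backed chunk. Numbers and names are hypothetical.

PER_STEP_CAP = 10  # max datastore tokens accepted in one verification step


def stitched_recovery(accepted_chunk_lengths):
    """Total datastore-derived tokens recovered across one response."""
    total = 0
    for chunk_len in accepted_chunk_lengths:
        total += min(chunk_len, PER_STEP_CAP)  # the cap applies per step only
    return total


# 20 consecutive steps, each accepting a chunk at the cap:
print(stitched_recovery([10] * 20))  # 200 tokens, far above the per-step cap
```

The point is simply that the per-step bound composes additively over steps, so nothing in the cap itself limits what one generation can recover in total.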
Representative examples below are lightly normalized for readability.
Text example
- model: lmsys/vicuna-13b-v1.5
- datastore: datastore_chat_large.idx
```
Thank you for your patience and understanding during this challenging time.
Best regards,
[Your Name]
[Your Title]
[Your Company]
```
Code example
- model: codellama/CodeLlama-13b-hf
- datastore: datastore_stack_large.idx
```python
def verify_reset_password_token(token):
    try:
        id = jwt.decode(token, current_app.config['SECRET_KEY'],
                        algorithms=['HS256'])['reset_password']
    except:
        return
    return User.query.get(id)
```
We understand that this is at least partly a deployment issue rather than a core correctness bug. The risk is most serious when REST is used with streaming output and when the datastore contains private, user-derived, or otherwise sensitive content.
For that reason, it would be useful to add explicit guidance in the documentation, for example:
- avoid streaming partial outputs in privacy-sensitive REST deployments
- do not place private or sensitive content in the datastore unless the leakage risk is acceptable
- document that per-step acceptance limits do not fully bound cumulative recovery within a single generation
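As one possible direction for the last point, a cumulative per-generation budget could complement the per-step cap. The sketch below is hypothetical: REST does not currently expose such a knob, and all names and values are our own:

```python
# Hypothetical mitigation sketch: enforce a cumulative per-generation
# budget on datastore-derived tokens in addition to the per-step cap.
# Parameter names and defaults are illustrative, not REST's API.

class DatastoreBudget:
    def __init__(self, per_step_cap=10, per_generation_cap=64):
        self.per_step_cap = per_step_cap
        self.remaining = per_generation_cap  # tokens left for this generation

    def allow(self, chunk_len):
        """Return how many datastore tokens may be accepted this step."""
        allowed = min(chunk_len, self.per_step_cap, self.remaining)
        self.remaining -= allowed
        return allowed


budget = DatastoreBudget(per_step_cap=10, per_generation_cap=25)
print([budget.allow(10) for _ in range(4)])  # [10, 10, 5, 0]
```

Once the budget is exhausted, verification would fall back to plain decoding, bounding total recovery per response instead of per step.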
We are intentionally not including low-level reproduction details in this public report. We are preparing an academic disclosure and wanted to notify the maintainers before publication. We would be happy to share a private technical write-up and affected configurations directly with the maintainers.
Thanks.