Potential privacy / IP leakage from datastore-backed speculative decoding #36

@fastdecoding

We are reporting a potential privacy / IP leakage issue in REST-style datastore-backed speculative decoding.

In our evaluation, we found that although the implementation limits the number of datastore tokens accepted in a single verification step, consecutive datastore-backed chunks can still accumulate into longer recovered fragments within one response. In our 1000-prompt evaluation, the per-deployment median stitched recovery, averaged across deployments, was 155 words for text-model deployments and 87 words for code-model deployments.
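
As a rough illustration of why a per-step cap does not bound the stitched length, here is a minimal Python sketch. It is not taken from the REST codebase and is not our measurement code. The function name `longest_stitched_run`, the per-step acceptance log, and the rule that a run ends at any step that accepts zero datastore tokens are all assumptions made for this example.

```python
from typing import List


def longest_stitched_run(accepted_per_step: List[int], per_step_cap: int) -> int:
    """Return the longest total of datastore tokens accepted over consecutive steps.

    accepted_per_step[i] is the (hypothetical) number of datastore draft tokens
    accepted at verification step i. A step that accepts zero ends the run.
    """
    longest = 0
    current = 0
    for accepted in accepted_per_step:
        # The per-step limit is respected at every individual step.
        if not 0 <= accepted <= per_step_cap:
            raise ValueError(f"step accepted {accepted}, cap is {per_step_cap}")
        if accepted == 0:
            current = 0  # no datastore tokens accepted: the run is broken
        else:
            current += accepted  # consecutive chunks stitch together
            longest = max(longest, current)
    return longest


# Illustrative acceptance log (made-up numbers, cap of 4 tokens per step).
steps = [4, 4, 3, 0, 2, 4, 4, 4, 1]
print(longest_stitched_run(steps, per_step_cap=4))  # prints 15
```

No step in this toy log exceeds the cap of 4, yet the longest stitched run is 15 tokens. Under these assumptions, the only bound the cap gives is the cap multiplied by the number of consecutive accepting steps.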

Representative examples below are lightly normalized for readability.

Text example

  • model: lmsys/vicuna-13b-v1.5
  • datastore: datastore_chat_large.idx

Thank you for your patience and understanding during this challenging time.

Best regards,

[Your Name]
[Your Title]
[Your Company]

Code example

  • model: codellama/CodeLlama-13b-hf
  • datastore: datastore_stack_large.idx

def verify_reset_password_token(token):
    try:
        id = jwt.decode(token, current_app.config['SECRET_KEY'],
                       algorithms=['HS256'])['reset_password']
    except:
        return
    return User.query.get(id)

We understand that this is at least partly a deployment issue and not necessarily a core correctness bug. The risk is most serious when REST is used with streaming output and when the datastore contains private, user-derived, or otherwise sensitive content.

For that reason, it would be useful to add explicit guidance in the documentation, for example:

  • avoid streaming partial outputs in privacy-sensitive REST deployments
  • do not place private or sensitive content in the datastore unless the leakage risk is acceptable
  • document that per-step acceptance limits do not fully bound cumulative recovery within a single generation

We are intentionally not including low-level reproduction details in this public report. We are preparing an academic disclosure and wanted to notify the maintainers before publication. We would be happy to share a private technical write-up and affected configurations directly with the maintainers.

Thanks.
