FEAT: Deprecate use_entra_auth and add auto-detect auth for Azure Speech converters #1634

Open
varunj-msft wants to merge 2 commits into microsoft:main from
varunj-msft:varunj-msftvarunj-msft/7679-AzureSpeechTextToAudioConverter-remove-use_entra_auth

Conversation

@varunj-msft
Contributor

Description

Deprecates the use_entra_auth parameter across both Azure Speech converters and the scorer pass-through chain, replacing it with auto-detection logic that matches the OpenAI target pattern.

Auth is now resolved automatically:

  • String azure_speech_key (or AZURE_SPEECH_KEY env var) -> API key auth
  • Callable azure_speech_key (sync or async token provider) -> Entra token auth via aad#{resource_id}#{token} format
  • Neither provided -> automatic Entra ID auth via DefaultAzureCredential + azure_speech_resource_id
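
A minimal sketch of the resolution order listed above. `resolve_auth_mode` and the returned labels are hypothetical names used only for illustration, not code from this PR, and the precedence of an explicit string over the env var is an assumption:

```python
import os
from collections.abc import Awaitable, Callable
from typing import Union

# Type of the azure_speech_key argument as described in the PR text.
KeyOrProvider = Union[str, Callable[[], Union[str, Awaitable[str]]], None]


def resolve_auth_mode(azure_speech_key: KeyOrProvider = None) -> str:
    """Return a label for the auth path the three rules would select (hypothetical helper)."""
    # A callable is treated as a token provider and implies Entra token auth.
    if callable(azure_speech_key):
        return "entra_token_provider"
    # Assumption: an explicit string wins over the AZURE_SPEECH_KEY env var.
    key = azure_speech_key or os.environ.get("AZURE_SPEECH_KEY")
    if key:
        return "api_key"
    # Neither a key nor a provider: fall back to DefaultAzureCredential.
    return "entra_default_credential"
```

Under these assumptions, a caller who sets `AZURE_SPEECH_KEY` and passes no arguments stays on API key auth; the Entra fallback applies only when no key is found anywhere.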

The deprecated use_entra_auth param is kept with a DeprecationWarning (removal target: v0.15.0) for backward compatibility.
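
A sketch of the deprecation pattern, assuming the parameter default becomes `None` so that an explicit `True` or `False` can be told apart from "not passed". The function name and message wording are illustrative, not the PR's actual code:

```python
import warnings
from typing import Optional


def warn_if_use_entra_auth_passed(use_entra_auth: Optional[bool] = None) -> None:
    """Emit a DeprecationWarning only when the caller explicitly passes the flag."""
    if use_entra_auth is not None:
        warnings.warn(
            "use_entra_auth is deprecated and will be removed in v0.15.0; "
            "authentication is now detected from azure_speech_key.",
            DeprecationWarning,
            stacklevel=2,  # attribute the warning to the caller's line
        )
```

Using `None` as the sentinel means callers who never set the flag see no warning, while both `use_entra_auth=True` and `use_entra_auth=False` are flagged.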

Note: Both converters now enforce keyword-only args (*) per style guide. All existing callers already use keyword args, but this is technically a breaking change for positional usage. Also, azure_speech_key is no longer visible in the backend converter catalog since its type changed from Optional[str] to Union[str, Callable, None] - users configure the key via the AZURE_SPEECH_KEY env var in the backend.
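
To illustrate the keyword-only change, here is a small stand-in class (hypothetical, not the real converter) showing how a bare `*` in the signature rejects positional arguments:

```python
class ExampleConverter:
    """Hypothetical stand-in; not the real Azure Speech converter."""

    def __init__(self, *, azure_speech_region=None, azure_speech_key=None):
        # Everything after the bare * must be passed by keyword.
        self.azure_speech_region = azure_speech_region
        self.azure_speech_key = azure_speech_key


def accepts_positional(cls, *args) -> bool:
    """Return True if cls can be constructed with the given positional args."""
    try:
        cls(*args)
        return True
    except TypeError:
        return False
```

Here `ExampleConverter("eastus")` raises `TypeError`, while `ExampleConverter(azure_speech_region="eastus")` works, which is the breaking case the note refers to.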

Files changed:

  • azure_speech_text_to_audio_converter.py - primary target
  • azure_speech_audio_to_text_converter.py - identical auth logic
  • audio_transcript_scorer.py, audio_true_false_scorer.py, audio_float_scale_scorer.py - deprecate pass-through param
  • .env_example - updated comment

Tests and Documentation

  • 26/26 converter unit tests pass (was 11 before; added tests for auto-detect, callable providers, async providers, deprecation warnings, env var fallback, missing credential errors)
  • 963 passed, 8 skipped on broader unit suite (0 regressions)
  • Ruff check clean, pre-commit hooks pass (ruff format, mypy strict)
  • Doc notebooks (doc/code/converters/2_audio_converters.py/.ipynb) don't use use_entra_auth - no JupyText changes needed

Comment thread pyrit/prompt_converter/azure_speech_audio_to_text_converter.py Outdated
Comment thread pyrit/prompt_converter/azure_speech_audio_to_text_converter.py Outdated
}
)

async def _get_speech_config_async(self) -> "speechsdk.SpeechConfig":

wonder if this whole method could belong in azure_auth as a helper?

especially since we have an identical method in the other converter

wonder what you thought about making this into a shared helper method?

Comment thread pyrit/prompt_converter/azure_speech_text_to_audio_converter.py Outdated
Comment thread pyrit/prompt_converter/azure_speech_audio_to_text_converter.py
Comment thread pyrit/prompt_converter/azure_speech_audio_to_text_converter.py Outdated
Comment thread pyrit/prompt_converter/azure_speech_audio_to_text_converter.py Outdated
Comment thread .env_example Outdated
  *,
  text_capable_scorer: Scorer,
- use_entra_auth: Optional[bool] = None,
+ use_entra_auth: bool | None = None,

this should probably stay as Optional[bool] type to match other classes right?

  azure_speech_key: Optional[str | Callable[[], str | Awaitable[str]]] = None,
  azure_speech_resource_id: Optional[str] = None,
- use_entra_auth: bool = False,
+ use_entra_auth: Optional[bool] = None,

Does this mean the default behavior (with env vars set, no args provided) changes from not using Entra to using Entra now?


  try:
-     converter = AzureSpeechAudioToTextConverter(use_entra_auth=self._use_entra_auth)
+     converter = AzureSpeechAudioToTextConverter()

Doesn't it require the resource ID? At least that's how I read the docstring.

It could get picked up from environment variables! I wonder whether we want the resource ID and API key to be passed into the inits of scorers that use converters, or if that's overcomplicating things.

True. Saw that afterwards. What do you mean by scorers? Audio scorers?

- converter = AzureSpeechAudioToTextConverter(
-     azure_speech_region=region, azure_speech_resource_id=resource_id, use_entra_auth=True
- )
+ converter = AzureSpeechAudioToTextConverter(azure_speech_region=region, azure_speech_resource_id=resource_id)

Default behavior changed here. Perhaps we should use an API key if it is provided as env var?

Also, no changes required for the API-key-based integration test?

Comment on lines +65 to +74
azure_speech_region (str | None): The name of the Azure region.
azure_speech_key (str | Callable[[], str | Awaitable[str]] | None): The API key for accessing
the service, or a sync/async callable that returns a token string.
If a string key is provided (or the ``AZURE_SPEECH_KEY`` env var is set), key auth is used.
If a callable token provider is provided, it is resolved at conversion time and used with
Entra ID auth (``azure_speech_resource_id`` must also be set).
If omitted, Entra ID auth via ``DefaultAzureCredential`` is used automatically.
azure_speech_resource_id (str | None): The resource ID for accessing the service when using
Entra ID auth. Required when using a callable token provider or when no API key is available.
use_entra_auth (bool | None): **Deprecated.** Will be removed in v0.15.0.

NIT: these types in the docstring should also say Optional[...]

self._azure_speech_key = key_value
else:
logger.info(
"No azure_speech_key provided. Falling back to Entra ID authentication via DefaultAzureCredential."

NIT: Would it be more accurate to say Entra ID authentication will be attempted via DefaultAzureCredential since auth doesn't actually occur now but rather at convert time?
