Conversation
There was a problem hiding this comment.
Pull request overview
This PR focuses on increasing unit test coverage across the typeagent Python package and improving local coverage workflows (notably on Windows) so coverage reporting targets the actual library code under src/typeagent.
Changes:
- Added multiple new unit test modules covering previously-untested (or low-coverage) areas (e.g., search language compilation, serialization, transcripts, memory indexes).
- Updated Windows
make.batto add acoveragetarget and runcoverageviauv. - Updated
.coveragercto measure coverage for thetypeagentpackage instead of tests.
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 11 comments.
Show a summary per file
| File | Description |
|---|---|
tests/test_utils.py |
Adds additional unit tests for utility helpers and client creation paths. |
tests/test_transcripts.py |
Extends transcript tests (speaker parsing, serialize/deserialize, file round-trips, alias building). |
tests/test_textlocindex.py |
New tests for TextToTextLocationIndex (empty/clear/serialize/deserialize/get). |
tests/test_serialization.py |
Adds more serialization/deserialization edge-case tests and knowledge deserialization tests. |
tests/test_searchlang_compile.py |
New unit tests for searchlang.py compile paths that don’t require a live LLM. |
tests/test_search.py |
New tests for SearchOptions, ConversationSearchResult, and non-searchable paths. |
tests/test_messageutils.py |
New tests for chunk batching and text range helper logic. |
tests/test_memory_semrefindex.py |
New tests for memory semantic ref index helper functions. |
tests/test_mcp_server.py |
Adds more unit coverage for MCP server model response formats, repr, and argument validation. |
tests/test_email_message.py |
New tests for email message metadata parsing, knowledge extraction, and serialization. |
tests/test_convutils.py |
New tests for conversation time range helpers and prompt section generation. |
tests/test_convthreads.py |
New tests for in-memory conversation thread storage/indexing behavior. |
make.bat |
Adds coverage command for Windows developer workflow. |
.coveragerc |
Adjusts coverage source to typeagent so reports reflect library coverage. |
|
Can you turn the changes to .coveragerc and make.bat into a separate PR? Those are in theory uncontroversial, although for me, the source should be 'src', not 'typeagent'. |
KRRT7
left a comment
There was a problem hiding this comment.
Thanks for picking this up — good to see tests for modules that previously had zero coverage. The .coveragerc fix (source = test → source = typeagent) is correct, I had the same note in #242.
A few things to address before this can merge:
Merge conflict
PR #244 (merged earlier today) also modified tests/test_utils.py — added TestResolveAzureModelName and some formatting changes. You'll need to rebase onto current main.
AGENTS.md violations
The Copilot-generated tests tripped a few repo conventions (AGENTS.md):
-
Imports inside test functions — Several new tests in
test_serialization.py,test_utils.py, and others have local imports (from typeagent.knowpro.serialization import ...etc). AGENTS.md requires top-level imports unless it's__main__/main(), pydantic/logfire, or circular import avoidance. These should be hoisted to the module import section. -
MagicMockusage (test_mcp_server.py) — AGENTS.md says "don't mock; use the regular implementation (maybe introduce a fixture to create it)". For something like a__repr__test, simple sentinel objects (object()orSimpleNamespace) work fine without mock-spec concerns. -
ConversationSettings()without test embedding model (test_mcp_server.py) — The default constructor callscreate_embedding_model()which can require real API keys. Tests that need aConversationSettingsshould passcreate_test_embedding_model()so they stay offline/CI-safe. -
Vacuous assertion (
test_searchlang_compile.py) —assert len(group.terms) >= 0is always true for any list. This should assert the actual expected outcome for the both-wildcards facet case.
Context on scope
Heads up — Guido closed #242 and indicated he'd rather test coverage not become an explicit project focus right now. That said, the .coveragerc fix and tests for previously-untested modules (convthreads, convutils, email_message, messageutils, textlocindex) stand on their own as gap-filling. Might be worth reframing the PR description to emphasize those targeted additions rather than coverage numbers.
KRRT7
left a comment
There was a problem hiding this comment.
Ran the full test suite locally — 21 failures and 1 collection error. Most are because the fork is out of sync with main (PRs #235, #241, #244 landed after your branch diverged). A rebase will fix those. But there are also some genuine test bugs and AGENTS.md issues worth fixing in the same pass.
After rebase: 15 of 21 failures will resolve themselves
These are stale-fork issues, not test bugs:
test_mcp_server.py— importsresolve_azure_model_namewhich was added in #244test_utils.pypretty_print/format_code tests — #235 replacedblackwithpprint, output is now double-quotedtest_utils.pyparse_azure_endpoint tests — #241 changed endpoint parsing behaviortest_utils.py— callsparse_azure_endpoint_partsandresolve_azure_model_name, both added in #244
Genuine test bugs (won't fix on rebase)
test_memory_semrefindex.py — 5 TypeError failures. Every add_entity/add_topic/add_action call is missing the required chunk_ordinal positional argument. These have never worked. See inline comment.
test_transcripts.py — test_transcript_knowledge_extraction_slow hits a real LLM API and fails without keys.
AGENTS.md issues across the new files
- Imports from re-export barrel — 7 files import from
typeagent.knowpro.interfacesinstead of the defining module (interfaces_core,interfaces_search,interfaces_serialization). AGENTS.md: "Never import from a module that just re-exports it." - Local imports inside test functions —
test_convthreads.py(line 87),test_search.py(line 84). AGENTS.md: "Don't put imports inside functions." MagicMockusage intest_mcp_server.py— AGENTS.md: "don't mock; use the regular implementation."ConversationSettings()without test embedding model intest_mcp_server.py— default constructor callscreate_embedding_model()which needs real API config.
Weak assertions worth tightening
Several passing tests have assertions that can never fail:
test_searchlang_compile.py:assert len(group.terms) >= 0— always true for any listtest_convthreads.py:test_lookup_thread_returns_matchesonly assertsisinstance(results, list)— never checks the results are non-empty or correcttest_search.py:test_search_options_defaultsandtest_conversation_search_result_basicjust reconstruct constructor argstest_memory_semrefindex.py:test_add_term_empty_string_is_storedonly checkssize() == 1, never verifies the stored termtest_utils.py:test_uppercase_identity_not_plainname/comment claim"IDENTITY"is NOT routed to the identity provider, but the code does.lower()so it IS — the test actually passes"APIKEY123"and tests the same path astest_plain_key_returned_as_is
|
Odd about the .coverage changes. With source=typeagent, on Linux (WSL), I got "no reports". Setting it to src gave the expected report. Would your please try with src? |
moved to #249 |
for existing modules: - updated import statements - added additional test functions
- added new test modules for modules with missing tests cases
Co-authored-by: Copilot <copilot@github.com>
- added new test modules for modules with missing tests cases
Co-authored-by: Copilot <copilot@github.com>
3a0009a to
58a40f8
Compare
Removed duplicate import statements for AsyncAzureOpenAI and AsyncOpenAI.
|
@KRRT7 can you please check again ? should have addressed all you comments |
KRRT7
left a comment
There was a problem hiding this comment.
All review feedback addressed. Thanks for the quick turnaround.
I generated some tests with GH Copilot to improve the testcoverage.
This PR:
Overall we improved coverage from 72% to 80%. (report is below)
@KRRT7 can you please review the changes