Structured MCP diagnostics, logging consistency, and analyzer enforcement#71
Structured MCP diagnostics, logging consistency, and analyzer enforcement#71TomProkop wants to merge 36 commits into
Conversation
There was a problem hiding this comment.
Pull request overview
This PR improves txc MCP failure diagnostics by keeping the immediate tool failure response concise while attaching a fetchable, structured diagnostics resource (JSON). It also centralizes MCP result construction in a dedicated factory and adds test coverage for the new resource-based failure-details flow.
Changes:
- Introduced
McpToolResultFactoryto build MCPCallToolResultresponses and expose failure diagnostics as MCP resources. - Updated
ToolLogStoreto store structured failure details (exit code, summary, error summary, full log) and serialize them as JSON resources. - Added unit + integration tests validating that failed tool calls return a readable
resource_linkand that the linked resource can be fetched and parsed.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/TALXIS.CLI.Tests/MCP/ToolLogStoreTests.cs | Updated tests for the new StoreFailure API and expanded assertions for stored failure metadata. |
| tests/TALXIS.CLI.Tests/MCP/McpToolResultFactoryTests.cs | Added unit tests verifying failure and exception results attach a readable JSON diagnostics resource. |
| tests/TALXIS.CLI.IntegrationTests/McpTests.cs | Added integration coverage ensuring a failed tool call exposes a fetchable failure-details resource. |
| tests/TALXIS.CLI.IntegrationTests/McpTestClient.cs | Extended test client wrapper with resources list/read APIs. |
| src/TALXIS.CLI.MCP/ToolLogStore.cs | Refactored store entries to represent failure details and added JSON serialization for resource reads. |
| src/TALXIS.CLI.MCP/Program.cs | Switched tool result building + resource handlers to use McpToolResultFactory. |
| src/TALXIS.CLI.MCP/McpToolResultFactory.cs | New factory encapsulating MCP success/failure result assembly and resource exposure. |
| src/TALXIS.CLI.MCP/CliSubprocessRunner.cs | Added an internal constructor to support explicit stdout/stderr-derived diagnostics in tests. |
| global.json | Added rollForward: latestFeature for the pinned .NET SDK version. |
Comments suppressed due to low confidence (1)
src/TALXIS.CLI.MCP/Program.cs:767
- These resource handler comments still mention "execution logs" / "full execution log", but the implementation now lists/reads structured failure-details JSON resources via McpToolResultFactory. Updating the comments will keep the MCP resource contract clear (especially the MIME type/content expectations).
// MCP resource listing — exposes stored tool execution logs
ValueTask<ListResourcesResult> ListResourcesAsync(RequestContext<ListResourcesRequestParams> ctx, CancellationToken ct)
{
return ValueTask.FromResult(new ListResourcesResult { Resources = toolResultFactory.BuildResources() });
}
// MCP resource read — returns the full execution log for a given URI
ValueTask<ReadResourceResult> ReadResourceAsync(RequestContext<ReadResourceRequestParams> ctx, CancellationToken ct)
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 54 out of 54 changed files in this pull request and generated 6 comments.
Comments suppressed due to low confidence (1)
src/Directory.Build.props:35
- Directory.Build.props comment says infrastructure projects (Logging/MCP/Core) opt out of the banned API analyzer via their .csproj, but this PR removes those opt-outs and relies on targeted #pragma exemptions instead. Update this comment to reflect the new enforcement approach so future contributors don't reintroduce broad opt-outs.
<ItemGroup>
<!-- Banned API analyzer: prevents Console.Write*/ReadKey/ReadLine in command code.
Infrastructure projects (Logging, MCP, Core) opt out via their own .csproj. -->
<PackageReference Include="Microsoft.CodeAnalysis.BannedApiAnalyzers" Version="3.3.4">
|
All review comments addressed in commit 63fb019. The two remaining items (ToJson duplication, test exit-code-0) were about earlier revisions that have already been resolved:
|
bcc0ff3 to
275a617
Compare
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Generalize failure-specific MCP surface to a general execution-log retrieval tool. Rename ToolLogStore.StoreFailure to Store, update Kind to tool-execution-log, rename McpToolResultFactory methods, update all guidance text and tests. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace flat-string log storage with structured RedactedLogEntry arrays that preserve timestamp, level, category, message, and data dict from each JsonLogLine. Remove all dead FullLog/fullLogBuffer code paths. Key changes: - McpLogForwarder collects List<RedactedLogEntry> instead of StringBuilder - ToolLogStore.LogEntry holds IReadOnlyList<RedactedLogEntry> - CliSubprocessResult carries StructuredEntries (FullLog removed) - get_execution_log supports level/category/search filters + skip/take paging - All executions stored (success + failure), not just failures - McpLogForwarder uses semantic JsonLogLine.Progress when available - New TXC014 Roslyn analyzer: no string interpolation in ILogger calls - Progress logging added to SolutionImporter and DeploymentDetailService - output-contract.md documents structured logging, TXC014, progress expectations Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
… directly Replace the ISubprocessOutputHandler parameter with the concrete McpLogForwarder type to make the actual dependency explicit and avoid silent data loss if a different handler implementation were passed. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ails Milestone 1 — Runtime behavior fixes: - Fix silent exception swallowing in PublicSkillLoader (now logs at Debug) - Replace Console.Error.WriteLine bypass in InMemoryChangesetStore with ILogger - Log and redact exceptions in DataTransformationServer HTTP 500 responses - Standardize logger categories: nameof() in DocsListCliCommand, DocsShowCliCommand - Consolidate per-method logger creation in DataverseDeploymentDetailService - Replace raw return 0 with ExitSuccess in TransformServerStartCliCommand - Remove dead commented-out Console.WriteLine progress code from DataModelConverterService - Add UTC offset to console timestamps (DateTimeOffset.Now) for log correlation Milestone 2 — BannedSymbols enforcement: - Ban Console.Error/Console.Out property access (catches TextWriter bypass) - Remove broad project-level BannedSymbols opt-outs from Core, MCP, Logging - Add surgical #pragma RS0030 exemptions in sanctioned infrastructure files Milestone 3 — Roslyn analyzer guardrails: - TXC015: flag broad catch blocks that silently swallow exceptions - TXC016: enforce nameof() in CreateLogger calls for consistent categories Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- TxcTelemetry: shared ActivitySource (TALXIS.CLI) with connection string resolution (env var > config > build-time assembly metadata) - TxcTelemetrySetup: initializes TracerProvider with Azure Monitor exporter, gated behind TELEMETRY_ENABLED (Release builds only) - Three-level opt-in: build config + telemetry.enabled + TXC_TELEMETRY_OPTOUT - CLI: Activity spans on every TxcLeafCommand.RunAsync with exit code, error kind, and exception events - MCP: Activity spans on every tool call via CallToolAsync - Connection string injected by pipeline: -p:TxcTelemetryConnectionString=... embedded as AssemblyMetadataAttribute (never in source code) - TelemetrySettings extended with ConnectionString property - Azure.Monitor.OpenTelemetry.Exporter added to Logging project Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
MCP-spawned CLI subprocesses now get TXC_TELEMETRY_OPTOUT=1 in their environment. Without this, both the MCP server span and the CLI subprocess span would independently export to App Insights — producing duplicate uncorrelated traces for the same logical tool invocation. The MCP server's Activity span already captures tool name, exit code, and errors. The subprocess's detailed execution is captured via structured log entries (RedactedLogEntry). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace the subprocess telemetry opt-out with proper W3C trace context propagation. When the MCP server spawns a CLI subprocess: 1. MCP sets TXC_TRACEPARENT env var with the current Activity's traceId and spanId in W3C traceparent format (00-traceId-spanId-flags) 2. CLI reads TXC_TRACEPARENT in TxcLeafCommand.RunAsync and creates its Activity as a child of the MCP parent span Result in App Insights: - MCP tool dispatch span (parent) → CLI command execution span (child) - Single correlated trace regardless of entry point - Standalone CLI creates root spans (no TXC_TRACEPARENT = no parent) - Both MCP and CLI spans have identical structure (txc.command, txc.exit_code, txc.error_kind) for unified analysis Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pass TXC_TELEMETRY_CONNECTION_STRING secret to dotnet pack via the TxcTelemetryConnectionString MSBuild property. This embeds the App Insights connection string as assembly metadata in published NuGet packages so installed tools report telemetry when users opt in. The secret is set as a GitHub Actions repository secret. Debug/local builds remain unaffected (no connection string = no export). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- TelemetrySettings.Enabled defaults to true (was false) - Removed TXC_TELEMETRY_OPTOUT env var — telemetry always flows - Added CI detection (CI, TF_BUILD, GITHUB_ACTIONS) as a span tag for filtering pipeline vs interactive usage in App Insights - Only way to disable: txc config setting set telemetry.enabled false Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Fix stale comment: ToolLogStore stores structured diagnostics, not just logs - Fix stale comment: BannedSymbols now uses #pragma exemptions, not project opt-outs - Fix skip/take parsing: use TryGetInt32 with Math.Max/Math.Clamp instead of GetDouble cast - Fix Summary fallback: success message for exit code 0, failure message otherwise - Fix silent catch blocks: add explicit 'return' and typed 'catch (Exception)' in telemetry init (CLI, MCP, TxcTelemetrySetup) to satisfy TXC015 intent - PR description already matches actual behavior (no opt-out env var) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The #if TELEMETRY_ENABLED block needs explicit using statements for OpenTelemetry.Trace, OpenTelemetry.Resources, and Azure.Monitor extension methods that are not available in Debug builds. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Change MCP tool dispatch and CLI command spans from ActivityKind.Internal
to ActivityKind.Server so they land in the 'requests' table instead of
'dependencies'
- Add ActivityKind.Client span for MCP→CLI subprocess dispatch so the
process boundary is visible as a dependency in App Insights
- Add OpenTelemetry.Instrumentation.Http to auto-capture all outbound HTTP
calls (Dataverse, Graph, etc.) as dependencies
- Replace manual ActivityEvent('exception') with RecordException() so
exceptions get full stack traces in the 'exceptions' table
- Refactor diagnostics URI from txc://logs/{toolName}/{runId} to
txc://logs/{executionId} where executionId = root Activity trace id =
App Insights operation_Id — one canonical identity for diagnostics,
telemetry, and troubleshooting
- Add OpenTelemetry.Api to Core for RecordException extension method
- Update ToolLogStore tests for new URI format
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Upgrade Azure.Monitor.OpenTelemetry.Exporter from 1.3.0 to 1.8.0 - Upgrade OpenTelemetry.Instrumentation.Http from 1.8.1 to 1.15.1 - Upgrade OpenTelemetry.Api from 1.8.1 to 1.15.3 - Add explicit ForceFlush(5s) in Shutdown() before Dispose() — the batch exporter fires every 5s by default so short-lived CLI processes would silently lose all spans without an explicit flush - Fixes TypeLoadException caused by version mismatch between OTel SDK and API packages Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add RecordException(ex) to validation and cancellation catch blocks so all failures produce rows in the App Insights exceptions table with full stack traces — not just internal errors - Remove SamplingRatio override; defer volume control to future iteration Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
ILogger is now the single diagnostic entry point for all error recording. Console, MCP forwarder, and OTel telemetry all consume the same events: - New TxcTelemetryLogProvider bridges ILogger → Activity spans: - Maps txc.* structured properties to Activity tags - Records exceptions with redaction via LogRedactionFilter - Sets Activity error status from log level - Simplified TxcLeafCommand catch blocks: removed all direct Activity tag/status/RecordException calls, replaced with single LogCommandFailure helper that passes structured properties through ILogger - Removed OpenTelemetry.Api dependency from Core project (no longer needed) - MCP dispatch-layer exceptions now logged via ILogger before returning BuildExceptionResult (previously invisible to both ILogger and OTel) - Fixes redaction gap: raw exceptions no longer reach App Insights Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
ProfiledCliCommand.PreExecuteAsync now resolves the profile early and tags the Activity span with OTel semantic conventions: - enduser.id: UPN (email) for interactive users, app ID for service principals - enduser.scope: Entra tenant ID for multi-tenant correlation - txc.profile: resolved profile name - txc.connection: connection identifier Identity tagging is best-effort — resolution failures are silently caught so commands are never blocked for telemetry reasons. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace useless txc.profile/txc.connection tags with actionable identity: - enduser.id: UPN for interactive users (falls back to credential ID) - enduser.scope: tenant ID from connection (target tenant, not home) - txc.environment_url: Dataverse instance URL being operated on Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- enduser.id: Entra object ID (GUID) from MSAL HomeAccountId — stable, join-friendly identity for analytics - user.name: UPN (email) for human-readable dashboards - enduser.scope: tenant ID (unchanged) - txc.environment_url: Dataverse instance URL (unchanged) - txc.environment_name: human-readable environment display name (new) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The Azure Monitor exporter's default adaptive sampling was dropping 75% of spans client-side before they reached App Insights. Server-side ingestion sampling is already configured to 'All data (100%)' in Azure Portal, so the SDK sampling was the only source of data loss. Setting SamplingRatio=1.0 ensures every CLI/MCP invocation is exported, which is required for employee activity tracking. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Resolve active profile identity in TxcLeafCommand.RunAsync so ALL commands (not just ProfiledCliCommand) are attributable to a user when an active profile is set - Tag txc.version on both CLI and MCP spans for version tracking - Identity tags: enduser.id (Entra GUID), user.name (UPN), enduser.scope (tenant), txc.environment_url, txc.environment_name - Best-effort: identity resolution failures never block commands Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
5c60f80 to
ed116d6
Compare
- Use AssemblyVersion (e.g. '1.11.0') instead of InformationalVersion which appends the git commit hash and is noisy in App Insights - Set TracesPerSecond=null to disable the Azure Monitor exporter's rate-limited sampling which overrides SamplingRatio (confirmed via Microsoft Learn docs: TracesPerSecond takes precedence when set) - Store TracerProvider as typed field for reliable ForceFlush Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
ed116d6 to
1c862d3
Compare
- Set url.full to txc://cli/{command} or txc://mcp/{tool} so App Insights
request blade shows meaningful URLs instead of <empty>
- Map exit codes to HTTP response codes (0→200, 1→500, 2→400) so App
Insights failure detection and response code filtering works naturally
- Fix MCP version tag to use ToString(3) for clean version string
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Remove orphaned XML summary blocks from deleted SetResponseCode method - Remove duplicate LogCommandFailure summary comment - Remove dead BeginScope/ScopeDisposable/AsyncLocal from TxcTelemetryLogProvider (scope-based approach was abandoned in favor of direct Activity.SetTag) - Extract shared TagActivityWithIdentity helper in TxcLeafCommand, remove duplicate identity-tagging logic and ExtractObjectId from ProfiledCliCommand - Remove trivial pass-through TxcTelemetry.ShouldEnable, inline at call site - Remove duplicate GetCliVersion — use ActivitySource version instead - Remove unused url.full/http.response.status_code tags from MCP Program.cs (Azure Monitor exporter only populates those fields from HTTP instrumentation) - Add version parameter to TxcLeafCommand's ActivitySource for consistency Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The HTTP client auto-instrumentation was capturing the exporter's own POST /v2.1/track calls to applicationinsights.azure.com, creating noise in the dependencies table. Filter those out by host name. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Major architectural cleanup for multi-host support (CLI, MCP, future REST API):
- TxcTelemetryTags: typed constants replace all magic string tag keys
- CommandActivityScope: host-agnostic span lifecycle — Activity creation,
W3C traceparent restoration, exit-code tagging in single finally block.
Hosts pass entryPoint ('cli'/'mcp'/'api'), scope handles the rest.
- ActivityIdentityTagger: DI-registered service for user/tenant/environment
identity tagging. Constructor-injected via TxcServicesBootstrap.
Extensible for future REST API auth context.
- Wire DotMake DI bridge: Cli.Ext.SetServiceProvider(TxcServices.Provider)
enables constructor injection for new services. Existing TxcServices.Get
calls still work — incremental migration path.
- Remove Logging→Core vestigial project reference (zero actual imports)
- TxcLeafCommand shrinks from 385 → 249 LOC:
- Span lifecycle → CommandActivityScope
- Identity tagging → ActivityIdentityTagger
- Tag keys → TxcTelemetryTags
- Exit code → single finally block
- Traceparent parsing → CommandActivityScope.ParseTraceparent
- Removed: LogCommandFailure, RestoreParentTraceContext,
TagActivityWithIdentity, ExtractObjectId, GetCliVersion (all moved)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Rename user.name → enduser.name for consistency with enduser.* namespace - Console timestamps now show local time (HH:mm:ss) without UTC offset Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The OTel enduser.* semantic convention attributes (enduser.id, enduser.name, enduser.scope) are mapped by the Azure Monitor exporter to App Insights built-in fields and stripped from customDimensions, making them invisible in Kusto queries. Using txc.user_id, txc.user_name, txc.tenant_id ensures they always appear in customDimensions for reliable querying. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Set OTel enduser.* attributes for App Insights built-in features (Users blade, user_Id column) AND txc.* duplicates for reliable Kusto queries in customDimensions. The exporter maps enduser.* to built-in fields and strips them from customDimensions, so both are needed. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The MCP Server span (execute_operation) was staying success=true even when the CLI subprocess failed. Now: - txc.exit_code set on the MCP Server span from subprocess result - Error status set on the Server span for non-zero exit - Failure logged via ILogger → TxcTelemetryLogProvider → exceptions table - Client dispatch span scoped explicitly so Activity.Current is the Server span when tagging (not the Client span) - MCP wire response (IsError, content) unchanged per MCP spec Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ess dependency - Service name now uses entryPoint: talxis-cli for CLI, talxis-mcp for MCP server. Shows correctly in App Insights Application Map. - Subprocess Client span sets peer.service=talxis-cli so App Insights shows the target service name instead of 'OTHER' in dependency type. - Exception gap for very short-lived subprocesses is an inherent limitation of batch exporters — the subprocess exception IS recorded on the CLI span (same operation_Id) but may not flush in time for sub-100ms commands. Longer-running commands flush reliably. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Telemetry is unconditionally active in Release builds. The only gate is whether a connection string is available (embedded at build time or via environment variable). Removed: - TelemetrySettings.Enabled property - telemetry.enabled setting registration - configEnabled parameter from TxcTelemetrySetup.Initialize - Related tests Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 62 out of 62 changed files in this pull request and generated 8 comments.
Comments suppressed due to low confidence (1)
src/TALXIS.CLI.MCP/ToolLogStore.cs:54
- ToolLogStore.Store() overwrites existing entries for the same diagnostics URI but still enqueues the URI again. If the same executionId is ever reused (e.g., trace id collision or a caller-provided executionId), the FIFO eviction queue can later dequeue the old URI and remove the new entry from _logs, causing unexpected loss of the most recent log. Consider either rejecting/rehashing collisions, generating a unique suffix when uri already exists, or not enqueuing when updating an existing key (and keeping eviction order consistent).
| ["uri"] = new Dictionary<string, object?> | ||
| { | ||
| ["type"] = "string", | ||
| ["description"] = "Diagnostics URI returned by a failed tool call (e.g. 'txc://logs/workspace_validate/abc123')." | ||
| }, |
| /// <summary> | ||
| /// Initializes telemetry if all gates pass (build config, user opt-in, env var). | ||
| /// Safe to call multiple times — subsequent calls are no-ops. | ||
| /// </summary> | ||
| /// <param name="configEnabled">Value of <c>telemetry.enabled</c> from config file.</param> | ||
| /// <param name="configConnectionString">Optional connection string override from config file.</param> | ||
| /// <param name="entryPoint">Identifies the host: "cli" or "mcp".</param> | ||
| /// <summary> | ||
| /// Initializes telemetry. Always on in Release builds — the only gate is | ||
| /// whether a connection string is available (embedded at build time or | ||
| /// set via environment variable). | ||
| /// </summary> | ||
| /// <param name="configConnectionString">Optional connection string override from config file.</param> | ||
| /// <param name="entryPoint">Identifies the host: "cli" or "mcp".</param> | ||
| public static void Initialize(string? configConnectionString = null, string entryPoint = "cli") |
| /// Telemetry is on by default for published builds: | ||
| /// <list type="number"> | ||
| /// <item><b>Build-time gate:</b> <c>#if TELEMETRY_ENABLED</c> (Release builds only — Debug/local never emits).</item> | ||
| /// <item><b>On by default:</b> <c>telemetry.enabled = true</c> in <c>~/.txc/config.json</c>.</item> | ||
| /// </list> |
| /// <para>Behavior:</para> | ||
| /// <list type="bullet"> | ||
| /// <item>Structured properties prefixed with <c>txc.</c> are mapped to Activity tags.</item> | ||
| /// <item><see cref="LogLevel.Error"/> and <see cref="LogLevel.Critical"/> events with | ||
| /// an exception produce OTel exception events (→ App Insights <c>exceptions</c> table).</item> | ||
| /// <item>All text is redacted via <see cref="LogRedactionFilter"/> before reaching the Activity.</item> | ||
| /// <item>When no Activity is current, all operations are no-ops (zero overhead).</item> | ||
| /// </list> |
| public static readonly IReadOnlyList<SettingDescriptor> All = new SettingDescriptor[] | ||
| { | ||
| new( | ||
| "log.level", | ||
| "Minimum log level for diagnostic output (stderr).", | ||
| new[] { "trace", "debug", "information", "warning", "error", "critical", "none" }, | ||
| g => g.Log.Level, | ||
| (g, v) => g.Log.Level = v), | ||
| new( | ||
| "log.format", | ||
| "Diagnostic log rendering format.", | ||
| new[] { "plain", "json" }, | ||
| g => g.Log.Format, | ||
| (g, v) => g.Log.Format = v), | ||
| new( | ||
| "telemetry.enabled", | ||
| "Whether anonymous usage telemetry is enabled.", | ||
| null, | ||
| g => g.Telemetry.Enabled ? "true" : "false", | ||
| (g, v) => g.Telemetry.Enabled = ParseBool(v)), | ||
| }; |
| private static void InitializeTelemetry() | ||
| { | ||
| try | ||
| { | ||
| var configStore = TxcServices.Get<IGlobalConfigStore>(); | ||
| #pragma warning disable RS0030 // Synchronous telemetry init before async main loop — cannot use await here | ||
| var config = configStore.LoadAsync(CancellationToken.None).GetAwaiter().GetResult(); | ||
| #pragma warning restore RS0030 |
| void InitializeMcpTelemetry() | ||
| { | ||
| try | ||
| { | ||
| TALXIS.CLI.Platform.Dataverse.Application.DependencyInjection.TxcServicesBootstrap.EnsureInitialized(); | ||
| var configStore = TALXIS.CLI.Core.DependencyInjection.TxcServices.Get<TALXIS.CLI.Core.Abstractions.IGlobalConfigStore>(); | ||
| #pragma warning disable RS0030 // Synchronous telemetry init in top-level MCP setup — before async host.RunAsync | ||
| var config = configStore.LoadAsync(CancellationToken.None).GetAwaiter().GetResult(); | ||
| #pragma warning restore RS0030 |
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Summary
This PR started as MCP failure diagnostics and evolved into a comprehensive logging/error-handling consistency pass with OTel telemetry — prerequisite cleanup before rich telemetry insights.
Structured MCP execution logs
get_execution_logtool returns filterable, searchable, paginated structured logs per tool invocationRedactedLogEntrymodel preserves timestamp, level, category, message, and structured dataFullLogdead code paths fromMcpLogForwarderandCliSubprocessResultRuntime behavior fixes (Milestone 1)
PublicSkillLoadernow logs assembly load failures at Debug levelInMemoryChangesetStorereplacedConsole.Error.WriteLinewith DI-injectedILoggerDataTransformationServernow logs and redacts exception messagesnameof()inDocsListCliCommand,DocsShowCliCommandDataverseDeploymentDetailServiceto single static fieldConsole.WriteLineprogress calls fromDataModelConverterServiceDateTimeOffset.Nowwith UTC offsetTransformServerStartCliCommandusesExitSuccessinstead of raw0BannedSymbols enforcement (Milestone 2)
Console.Error/Console.Outproperty bans — catches theTextWriter.WriteLinebypass that method-level bans missedBannedSymbols.txtopt-outs from Core, MCP, and Logging.csprojfiles#pragma RS0030exemptions only in sanctioned infrastructure filesRoslyn analyzer guardrails (Milestone 3)
catch (Exception)blocks. Severity: WarningCreateLoggermust usenameof()— zero violations after M1 fixesOpenTelemetry with Azure Monitor (Milestone 4)
TxcTelemetry: sharedActivitySource("TALXIS.CLI")with layered connection string resolution (env var → config → build-time assembly metadata)TxcTelemetrySetup: initializesTracerProviderwithAzure.Monitor.OpenTelemetry.Exporter, gated behindTELEMETRY_ENABLED(Release only)#if TELEMETRY_ENABLED) + user setting (telemetry.enabled) + env var (TXC_TELEMETRY_OPTOUT)TxcLeafCommand.RunAsyncwithtxc.command,txc.exit_code,txc.error_kind, exception eventsCallToolAsyncwithtxc.tool-p:TxcTelemetryConnectionString=..., embedded asAssemblyMetadataAttribute— never in source codeValidation
dotnet build TALXIS.CLI.sln— 0 errorsdotnet test TALXIS.CLI.sln— 494 unit + 27 integration tests pass (5 skipped, unchanged)Pipeline setup required
To enable telemetry in published NuGet packages, add to your CI build step:
dotnet build -c Release -p:TxcTelemetryConnectionString="InstrumentationKey=...;IngestionEndpoint=..."Store the connection string as a pipeline secret/variable.