Skip to content

🎯 OpenTelemetry Framework Integration Analysis¢

Executive SummaryΒΆ

This document analyzes the OpenTelemetry implementation and identifies components that should be added to the Neuroglia Framework as reusable, generic features versus application-specific components that should remain in Mario's Pizzeria.

πŸ—οΈ Framework Components (neuroglia.observability)ΒΆ

βœ… SHOULD BE IN FRAMEWORK - These are generic and reusableΒΆ

1. Configuration Module (neuroglia.observability.config)ΒΆ

Status: βœ… Already created Justification:

  • Generic: Initialization of OTEL SDK is identical across all applications
  • Reusable: Every application needs TracerProvider, MeterProvider setup
  • Configurable: Uses environment variables and dataclass configuration
  • Value: Eliminates boilerplate in every application

What it provides:

  • OpenTelemetryConfig dataclass with environment variable defaults
  • configure_opentelemetry() - one-line initialization
  • shutdown_opentelemetry() - graceful cleanup
  • Automatic instrumentation setup (FastAPI, HTTPX, Logging, System Metrics)
  • Resource detection and configuration

2. Tracing Module (neuroglia.observability.tracing)ΒΆ

Status: βœ… Already created Justification:

  • Generic: Tracer retrieval and span management is framework-level
  • Developer Experience: Decorators (@trace_async, @trace_sync) massively simplify instrumentation
  • Reusable: Every application needs manual instrumentation capabilities
  • Best Practices: Encapsulates OpenTelemetry best practices

What it provides:

  • get_tracer() - cached tracer retrieval
  • @trace_async() / @trace_sync() - automatic span creation decorators
  • add_span_attributes() - helper for adding span data
  • add_span_event() - event recording
  • record_exception() - exception tracking
  • Context propagation utilities

3. Metrics Module (neuroglia.observability.metrics)ΒΆ

Status: βœ… Already created Justification:

  • Generic: Meter and instrument creation is framework-level
  • Convenience: Helper functions eliminate repetitive code
  • Reusable: Every application needs counters, histograms, gauges
  • Pattern: Provides standard patterns for metric naming and usage

What it provides:

  • get_meter() - cached meter retrieval
  • create_counter() / create_histogram() / etc. - instrument creation
  • record_metric() - convenience function for one-off metrics
  • Pre-defined framework metrics (command.duration, query.duration, etc.)

Note: Application-specific metrics like MarioMetrics should be in application code, NOT framework.

4. Logging Module (neuroglia.observability.logging)ΒΆ

Status: βœ… Already created Justification:

  • Generic: Trace context injection is framework-level concern
  • Reusable: Structured logging with trace correlation benefits all apps
  • Integration: Works with OTEL logging instrumentation
  • Developer Experience: Simplifies log correlation with traces

What it provides:

  • TraceContextFilter - automatic trace_id/span_id injection
  • StructuredFormatter - JSON structured logging
  • configure_logging() - one-line logging setup with trace context
  • log_with_trace() - manual trace correlation
  • LoggingContext - contextual logging scope

πŸ”§ Framework Middleware (neuroglia.mediation / neuroglia.mvc)ΒΆ

βœ… SHOULD BE IN FRAMEWORK - Automatic instrumentation for CQRS patternΒΆ

5. CQRS Tracing Middleware (NEW - needs creation)ΒΆ

Location: neuroglia.mediation.tracing_middleware.py Justification:

  • Generic: All applications using Neuroglia use CQRS
  • Automatic: Zero-code instrumentation for commands/queries
  • Consistent: Standardizes trace naming across all apps
  • Performance: Automatic duration metrics

What it should provide:

class TracingPipelineBehavior(PipelineBehavior[TRequest, TResult]):
    """Automatically creates spans for commands and queries"""
    async def handle_async(self, request, next_handler):
        tracer = get_tracer(__name__)
        request_type = type(request).__name__
        span_name = f"CQRS.{request_type}"

        with tracer.start_as_current_span(span_name) as span:
            add_span_attributes({
                "cqrs.type": "command" if isinstance(request, Command) else "query",
                "cqrs.name": request_type,
            })

            start_time = time.time()
            try:
                result = await next_handler()
                duration_ms = (time.time() - start_time) * 1000

                # Record metrics
                metric_name = "neuroglia.command.duration" if isinstance(request, Command) else "neuroglia.query.duration"
                record_metric("histogram", metric_name, duration_ms, {"type": request_type})

                span.set_status(StatusCode.OK)
                return result
            except Exception as ex:
                record_exception(ex)
                raise

Usage (in application):

services.add_pipeline_behavior(TracingPipelineBehavior)  # One line!

6. Event Handler Tracing Middleware (NEW - needs creation)ΒΆ

Location: neuroglia.eventing.tracing_middleware.py Justification:

  • Generic: All event handlers benefit from automatic tracing
  • Async Event Chains: Traces show complete event propagation
  • Performance: Tracks event processing time

What it should provide:

class EventHandlerTracingWrapper:
    """Wraps event handlers to automatically create spans"""
    def __init__(self, handler: EventHandler):
        self.handler = handler

    async def handle_async(self, event: DomainEvent):
        tracer = get_tracer(__name__)
        event_type = type(event).__name__

        with tracer.start_as_current_span(f"Event.{event_type}") as span:
            add_span_attributes({
                "event.type": event_type,
                "event.id": getattr(event, 'id', 'unknown'),
            })

            start_time = time.time()
            try:
                await self.handler.handle_async(event)
                duration_ms = (time.time() - start_time) * 1000
                record_metric("histogram", "neuroglia.event.processing.duration",
                             duration_ms, {"event.type": event_type})
            except Exception as ex:
                record_exception(ex)
                raise

7. Repository Tracing Mixin (NEW - needs creation)ΒΆ

Location: neuroglia.data.tracing_mixin.py Justification:

  • Generic: All repositories benefit from automatic tracing
  • Database Performance: Tracks database operation duration
  • Debugging: Shows which queries are slow

What it should provide:

class TracedRepositoryMixin:
    """Mixin to add automatic tracing to repository operations"""

    async def get_async(self, id: str):
        tracer = get_tracer(__name__)
        with tracer.start_as_current_span(f"Repository.get") as span:
            add_span_attributes({
                "repository.operation": "get",
                "repository.type": type(self).__name__,
                "entity.id": id,
            })
            return await super().get_async(id)

    async def add_async(self, entity):
        tracer = get_tracer(__name__)
        with tracer.start_as_current_span(f"Repository.add") as span:
            add_span_attributes({
                "repository.operation": "add",
                "repository.type": type(self).__name__,
                "entity.type": type(entity).__name__,
            })
            return await super().add_async(entity)

    # Similar for update_async, delete_async, etc.

Usage (in application):

class UserRepository(TracedRepositoryMixin, MongoRepository[User]):
    pass  # Automatic tracing!

πŸ“¦ Application-Specific ComponentsΒΆ

❌ SHOULD NOT BE IN FRAMEWORK - These are Mario's Pizzeria specific¢

8. Mario's Pizzeria Business MetricsΒΆ

Current Location: neuroglia.observability.metrics.MarioMetrics Should Move To: samples/mario-pizzeria/observability/metrics.py

Justification:

  • Domain-Specific: Metrics like "orders.created", "pizzas.ordered" are business logic
  • Not Reusable: Other applications have different business metrics
  • Application Concern: Business KPIs belong in application layer

Recommendation:

  • Remove MarioMetrics class from framework
  • Create application-specific metrics module:
# samples/mario-pizzeria/observability/metrics.py
from neuroglia.observability import create_counter, create_histogram

# Business metrics
orders_created_counter = create_counter("mario.orders.created", unit="orders")
orders_completed_counter = create_counter("mario.orders.completed", unit="orders")
order_value_histogram = create_histogram("mario.orders.value", unit="USD")
pizzas_ordered_counter = create_counter("mario.pizzas.ordered", unit="pizzas")

9. Custom Span Attributes for Business LogicΒΆ

Justification:

  • Domain-Specific: Attributes like "pizza.type", "order.status" are application concepts
  • Not Reusable: Every application has different entities and attributes

Recommendation: Applications should add business-specific attributes in their handlers:

# In application handler
@trace_async()
async def handle_async(self, command: PlaceOrderCommand):
    # Framework handles basic span
    # Application adds business context
    add_span_attributes({
        "order.id": order.id,
        "customer.id": command.customer_id,
        "order.item_count": len(command.items),
        "order.total": order.total_amount,
    })
    # ... business logic

🎨 Grafana Dashboards¢

🀝 HYBRID APPROACH - Generic templates + application customization¢

10. Framework Dashboard Templates (NEW - needs creation)ΒΆ

Location: deployment/grafana/templates/ Justification:

  • Generic Patterns: All Neuroglia apps have commands, queries, events
  • Starting Point: Provides template dashboards for common patterns
  • Customizable: Applications can clone and modify

What it should provide:

  • neuroglia-cqrs-overview.json - Commands/queries overview template
  • neuroglia-event-processing.json - Event handler metrics template
  • neuroglia-repository-performance.json - Database operation metrics template

11. Application-Specific DashboardsΒΆ

Location: deployment/grafana/dashboards/json/ Justification:

  • Business Metrics: Unique to each application
  • Custom Visualizations: Domain-specific charts

Examples:

  • mario-business-metrics.json - Orders, pizzas, revenue
  • mario-order-pipeline.json - Order status flow visualization
  • mario-customer-analytics.json - Customer behavior metrics

πŸ“ DocumentationΒΆ

βœ… SHOULD BE IN FRAMEWORKΒΆ

12. Generic OTEL Integration GuideΒΆ

Location: docs/features/observability.md Content:

  • How to configure OpenTelemetry in any Neuroglia application
  • Using decorators and helpers
  • Best practices for instrumentation
  • Performance considerations

βœ… SHOULD BE IN APPLICATIONΒΆ

13. Mario's Pizzeria OTEL SetupΒΆ

Location: docs/samples/mario-pizzeria-observability.md Content:

  • Mario-specific dashboard explanations
  • Business metrics definitions
  • Custom instrumentation examples

🎯 Implementation Priority¢

Phase 1: Core Framework (COMPLETED βœ…)ΒΆ

  1. βœ… neuroglia.observability.config
  2. βœ… neuroglia.observability.tracing
  3. βœ… neuroglia.observability.metrics
  4. βœ… neuroglia.observability.logging

Phase 2: Middleware Integration (NEXT)ΒΆ

  1. ⏳ neuroglia.mediation.tracing_middleware - CQRS tracing
  2. ⏳ neuroglia.eventing.tracing_middleware - Event handler tracing
  3. ⏳ neuroglia.data.tracing_mixin - Repository tracing

Phase 3: Application IntegrationΒΆ

  1. ⏳ Initialize OTEL in samples/mario-pizzeria/main.py
  2. ⏳ Create Mario-specific metrics module
  3. ⏳ Add custom instrumentation to handlers

Phase 4: Visualization & DocumentationΒΆ

  1. ⏳ Create generic dashboard templates
  2. ⏳ Create Mario-specific dashboards
  3. ⏳ Write framework documentation
  4. ⏳ Write application-specific guide

πŸŽ“ Recommendation SummaryΒΆ

Add to Framework βœ…ΒΆ

  • Complete neuroglia.observability module (done)
  • CQRS tracing middleware (TracingPipelineBehavior)
  • Event handler tracing wrapper
  • Repository tracing mixin
  • Generic dashboard templates
  • Framework-level OTEL documentation

Keep in Application ❌¢

  • Business-specific metrics (MarioMetrics)
  • Domain-specific span attributes
  • Application dashboards
  • Business metric initialization

Benefit AnalysisΒΆ

Developer Experience Improvement:

  • Before: 100+ lines of OTEL boilerplate per application
  • After: 5-10 lines of configuration

Example - Application Startup:

# WITHOUT framework support (hypothetical)
# ~100+ lines of OTEL setup code

# WITH framework support
from neuroglia.observability import configure_opentelemetry
from neuroglia.mediation import TracingPipelineBehavior

configure_opentelemetry(
    service_name="my-service",
    otlp_endpoint="http://otel-collector:4317"
)

services.add_pipeline_behavior(TracingPipelineBehavior)  # Automatic CQRS tracing!

Performance Monitoring:

  • βœ… Automatic metrics for ALL commands, queries, events
  • βœ… Automatic tracing for ALL CQRS operations
  • βœ… Automatic database operation timing
  • βœ… Zero-code instrumentation

Observability Coverage:

  • 🎯 100% trace coverage of CQRS operations
  • 🎯 100% metric coverage of framework patterns
  • 🎯 Automatic log-trace correlation
  • 🎯 Consistent naming conventions

πŸ”— Next StepsΒΆ

  1. Install Dependencies: Run poetry install to get OTEL packages
  2. Create Middleware: Implement tracing middleware for CQRS, events, repositories
  3. Integrate with Mario's Pizzeria: Add OTEL initialization to main.py
  4. Test End-to-End: Verify traces/metrics/logs flow through the stack
  5. Create Dashboards: Build Grafana dashboards for visualization
  6. Document Patterns: Write comprehensive documentation

Conclusion: The neuroglia.observability module provides a solid foundation for OpenTelemetry integration that is generic, reusable, and eliminates boilerplate. The next step is creating middleware components that automatically instrument the Neuroglia framework patterns (CQRS, events, repositories), providing zero-code observability for all Neuroglia applications.