From: aidotengineer
OpenTelemetry is an open-source project maintained by the Cloud Native Computing Foundation (CNCF) that standardizes observability in cloud environments [00:00:31]. It is supported by major observability platforms such as Splunk, Datadog, Dynatrace, Grafana, and Honeycomb [00:00:50].
Core Concepts of OpenTelemetry
OpenTelemetry primarily defines a protocol to standardize logging, metrics, and traces in cloud applications [00:01:05].
- Logging: Arbitrary events that can be sent anytime in an application’s lifecycle and viewed later, possibly with metadata [00:01:29].
- Metrics: Data points viewed at an aggregate level, showing behavior across time or across users [00:01:44]. In traditional cloud applications, these include CPU usage, memory usage, and latency [00:01:58]. For GenAI applications, metrics might include token usage, latency, and error rate [00:02:08].
- Tracing: Tracking a multi-step process, such as requests spanning across multiple microservices in a cloud environment [00:02:31]. For GenAI, tracing is common for multi-step processes like chains, workflows, or agents that interact with tools [00:02:55].
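To make the tracing idea concrete, the following is a minimal, stdlib-only sketch of how the steps of an agent run could be recorded as nested spans that share one trace ID. The `Span` and `Tracer` classes, the span names, and the token count are illustrative assumptions; this is not the OpenTelemetry SDK, only the parent/child shape it standardizes.

```python
import time
import uuid
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    """One step of a multi-step process (illustrative, not the OTel Span class)."""
    name: str
    trace_id: str
    span_id: str
    parent_id: Optional[str]
    start: float = 0.0
    end: float = 0.0
    attributes: dict = field(default_factory=dict)


class Tracer:
    """Collects finished spans and tracks which span is currently active."""

    def __init__(self):
        self.finished = []
        self._stack = []

    @contextmanager
    def span(self, name, **attributes):
        parent = self._stack[-1] if self._stack else None
        span = Span(
            name=name,
            # Child spans inherit the trace ID; a root span starts a new trace.
            trace_id=parent.trace_id if parent else uuid.uuid4().hex,
            span_id=uuid.uuid4().hex[:16],
            parent_id=parent.span_id if parent else None,
            start=time.perf_counter(),
            attributes=dict(attributes),
        )
        self._stack.append(span)
        try:
            yield span
        finally:
            span.end = time.perf_counter()
            self._stack.pop()
            self.finished.append(span)


tracer = Tracer()
with tracer.span("agent.run") as root:
    with tracer.span("llm.call", model="example-model") as llm:
        llm.attributes["total_tokens"] = 42  # made-up value for illustration
    with tracer.span("tool.call", tool="search"):
        pass

for s in tracer.finished:
    print(s.name, "parent:", s.parent_id)
```

Spans finish innermost first, so the root span is recorded last. A metric such as total token usage could then be derived by aggregating the `total_tokens` attribute across many traces.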
Beyond the protocol, OpenTelemetry is an ecosystem that includes SDKs, instrumentations, and collectors [00:03:23]:
- SDKs (Software Development Kits): Allow manual sending of logs, metrics, and traces from applications, supporting 11 different languages [00:03:32].
- Instrumentations: Enable automatic collection of observability data by monkey-patching client libraries within an application [00:03:55]. These are designed to have negligible latency impact [00:05:13].
- Collectors: Self-deployable components that provide pre-processing capabilities for observability data, such as filtering, obscuring PII, or sending data to multiple providers, before it’s sent to an observability platform [00:05:28].
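The collector's role (filter, obscure PII, fan out to several backends) can be sketched in plain Python. This is only a conceptual model under assumed names: `drop_debug`, `redact_emails`, and `run_pipeline` are hypothetical helpers, and the record layout is made up. The real OpenTelemetry Collector is a separate service configured declaratively, not a Python function.

```python
import re

# Deliberately simple pattern; real PII detection needs more care.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def drop_debug(record):
    """Filter step: discard low-value records by returning None."""
    return None if record.get("severity") == "DEBUG" else record


def redact_emails(record):
    """PII step: mask anything that looks like an email address."""
    cleaned = dict(record)
    cleaned["body"] = EMAIL.sub("[REDACTED]", record["body"])
    return cleaned


def run_pipeline(records, processors, exporters):
    """Run each record through every processor, then send it to every exporter."""
    for record in records:
        for process in processors:
            record = process(record)
            if record is None:  # a processor dropped the record
                break
        else:
            for export in exporters:
                export(record)


# Two lists stand in for two observability platforms.
backend_a, backend_b = [], []
records = [
    {"severity": "DEBUG", "body": "cache warm"},
    {"severity": "INFO", "body": "prompt from alice@example.com"},
]
run_pipeline(records, [drop_debug, redact_emails], [backend_a.append, backend_b.append])
print(backend_a)
```

Keeping this logic in a collector rather than in the application means filtering and redaction rules can change without redeploying application code.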
OpenLLMetry: Extending OpenTelemetry for GenAI
Traceloop developed OpenLLMetry (referred to as “open elemetry” in the transcript) by extending the existing OpenTelemetry project to support GenAI frameworks, foundation models, and vector databases [00:06:16]. Because the extension builds on the existing OpenTelemetry infrastructure, users can send observability data to any compatible platform (e.g., Datadog, Sentry, Grafana Tempo, Dynatrace) simply by configuring OpenTelemetry [00:06:42].
Supported Integrations
OpenLLMetry provides instrumentations for over 40 different providers, developed through community collaboration [00:07:09]. These include:
- Foundation Models: OpenAI, Anthropic, Cohere, Gemini, Bedrock [00:07:16].
- Vector Databases: Pinecone, Chroma [00:07:23].
- Frameworks: LangChain, LlamaIndex, CrewAI, Haystack [00:07:28].
The instrumentations automatically emit logs, metrics, and traces, which can then be connected to any desired observability platform [00:07:36].
Example: Pinecone Instrumentation
An instrumentation for Pinecone, a vector database, would provide visibility into:
- Queries sent to Pinecone [00:08:01].
- Indexing processes within Pinecone [00:08:07].
- The returned vectors, including their data, distances, and scores [00:08:10].
- Latencies [00:08:20].
All this information is available in the standard OpenTelemetry format [00:08:23].
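An instrumentation of this kind typically works by wrapping a client method at runtime, the monkey-patching approach described earlier. The sketch below shows the pattern with a hypothetical `FakeVectorIndex` class; it is not the Pinecone client API, and the recorded field names (`top_k`, `scores`, `duration_s`) are assumptions rather than OpenTelemetry attribute names.

```python
import functools
import time


class FakeVectorIndex:
    """Hypothetical stand-in for a vector database client (not the Pinecone API)."""

    def query(self, vector, top_k=2):
        matches = [
            {"id": "doc-1", "score": 0.92},
            {"id": "doc-2", "score": 0.81},
            {"id": "doc-3", "score": 0.40},
        ]
        return matches[:top_k]


recorded_spans = []


def instrument(cls, method_name):
    """Replace cls.method_name with a wrapper that records a span-like dict."""
    original = getattr(cls, method_name)

    @functools.wraps(original)
    def wrapper(self, *args, **kwargs):
        start = time.perf_counter()
        result = original(self, *args, **kwargs)
        # Only successful calls are recorded in this sketch.
        recorded_spans.append({
            "name": f"{cls.__name__}.{method_name}",
            "top_k": kwargs.get("top_k"),
            "scores": [match["score"] for match in result],
            "duration_s": time.perf_counter() - start,
        })
        return result

    setattr(cls, method_name, wrapper)


# Application code stays unchanged; only the class is patched once at startup.
instrument(FakeVectorIndex, "query")
results = FakeVectorIndex().query([0.1, 0.2, 0.3], top_k=2)
print(recorded_spans[0]["name"], recorded_spans[0]["scores"])
```

The caller still receives the original return value, which is why this style of instrumentation needs no changes to application code.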
Benefits
A key advantage of using OpenTelemetry for integrating GenAI applications with observability platforms is its standardization [00:08:38]. This ensures users are not tied to a specific platform and can easily switch between providers with a simple configuration change, as all supporting platforms adhere to the exact same format [00:08:41].
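As a rough illustration of switching providers by configuration, the sketch below reads the standard `OTEL_EXPORTER_OTLP_ENDPOINT` and `OTEL_EXPORTER_OTLP_HEADERS` environment variables and shows that only their values differ between two backends. The `exporter_settings` helper, the vendor URLs, and the header names are hypothetical; real SDKs read these variables themselves, and the fallback endpoint shown assumes OTLP over HTTP.

```python
def exporter_settings(env):
    """Build exporter settings from OTLP environment variables.

    `env` is any mapping, for example os.environ.
    """
    # Assumed fallback: the local OTLP/HTTP endpoint.
    endpoint = env.get("OTEL_EXPORTER_OTLP_ENDPOINT", "http://localhost:4318")
    headers = {}
    # Headers are given as comma-separated key=value pairs.
    for pair in filter(None, env.get("OTEL_EXPORTER_OTLP_HEADERS", "").split(",")):
        key, _, value = pair.partition("=")
        headers[key.strip()] = value.strip()
    return {"endpoint": endpoint, "headers": headers}


# Hypothetical vendors: the application code is identical for both.
vendor_a = exporter_settings({
    "OTEL_EXPORTER_OTLP_ENDPOINT": "https://otlp.vendor-a.example",
    "OTEL_EXPORTER_OTLP_HEADERS": "api-key=abc",
})
vendor_b = exporter_settings({
    "OTEL_EXPORTER_OTLP_ENDPOINT": "https://ingest.vendor-b.example",
    "OTEL_EXPORTER_OTLP_HEADERS": "x-team=ml, x-token=xyz",
})
print(vendor_a)
print(vendor_b)
```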