From: aidotengineer

OpenLLMetry is an open-source project that extends OpenTelemetry to support Generative AI (GenAI) frameworks, foundation models, and vector databases [06:20:00]. It allows developers to gain observability into their LLM-based applications by leveraging the existing capabilities of OpenTelemetry [06:42:00].

Understanding OpenTelemetry

OpenTelemetry is a major open-source project maintained by the CNCF (Cloud Native Computing Foundation) [00:31:00]. It standardizes observability in cloud environments [00:40:00] and is supported by every major observability platform, including Splunk, Datadog, Dynatrace, New Relic, Grafana, and Honeycomb [00:50:00].

Core Observability Pillars

At its core, OpenTelemetry defines a protocol for standardizing logging, metrics, and traces within cloud applications [01:05:00]:

  • Logging: Refers to arbitrary events that can be sent at any time during an application’s lifecycle, which can then be viewed with associated metadata [01:27:00].
  • Metrics: Used for aggregate-level data, showing behavior across time or users [01:44:00]. In traditional cloud contexts, metrics include CPU/memory usage or latency [01:56:00]. For GenAI applications, relevant metrics might include token usage, latency, and error rates [02:08:00].
  • Tracing: Involves tracking a multi-step process, especially useful in microservices architectures to see how a request spans across multiple services [02:31:00]. For GenAI, tracing is common because many processes involve multi-step chains, workflows, or agents interacting with tools [02:55:00].
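The tracing model described above can be sketched in plain Python. This is a toy stand-in for an OpenTelemetry tracer, not the real SDK API; the class, field names, and span names are all illustrative. It shows the essential structure: each step in a multi-step chain becomes a span, child spans share the root's trace ID, and each span records its parent.

```python
import time
import uuid
from contextlib import contextmanager

class MiniTracer:
    """Toy tracer: records spans with trace/parent IDs,
    mimicking how a multi-step GenAI workflow is traced."""

    def __init__(self):
        self.spans = []    # finished spans, in completion order
        self._stack = []   # currently open spans (innermost last)

    @contextmanager
    def span(self, name):
        s = {
            "name": name,
            # Children inherit the root span's trace_id; a root mints a new one.
            "trace_id": self._stack[0]["trace_id"] if self._stack else uuid.uuid4().hex,
            "parent": self._stack[-1]["name"] if self._stack else None,
            "start": time.time(),
        }
        self._stack.append(s)
        try:
            yield s
        finally:
            s["end"] = time.time()
            self._stack.pop()
            self.spans.append(s)

tracer = MiniTracer()
with tracer.span("agent_run"):           # top-level workflow
    with tracer.span("retrieve_docs"):   # e.g., a vector-DB query
        pass
    with tracer.span("llm_call"):        # e.g., a chat completion
        pass
```

After this runs, `tracer.spans` holds three spans that can be reassembled into a tree: `retrieve_docs` and `llm_call` both point at `agent_run` as their parent and share its trace ID, which is exactly how an observability platform reconstructs a request's path through a chain or agent.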

OpenTelemetry Ecosystem Components

Beyond the protocol, OpenTelemetry provides an ecosystem of tools [03:21:00]:

  • SDKs: Enable manually sending logs, metrics, and traces from applications [03:32:00]. OpenTelemetry provides SDKs for 11 different languages, including Python, TypeScript, Go, and C++ [03:37:00].
  • Instrumentations: Automatically collect observability data from parts of an application [03:55:00]. They work by “monkey patching” client libraries (e.g., a Postgres client) so that relevant data is emitted automatically, with negligible latency impact [04:40:00].
  • Collectors: Self-deployable components in a cloud environment (e.g., Kubernetes) that allow pre-processing of observability data before it’s sent to a platform [05:25:00]. They can filter data, obscure PII or sensitive information, and even send data to multiple observability providers [05:40:00].
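The “monkey patching” approach behind instrumentations can be illustrated with a toy example. The classes and event fields below are stand-ins, not a real client library or the actual OpenTelemetry instrumentation API: the instrumentation replaces a client method with a wrapper that records an event and then delegates to the original, so application code does not change at all.

```python
import time

class FakeDBClient:
    """Stand-in for a third-party client library (e.g., a Postgres driver)."""
    def query(self, sql):
        return ["row1", "row2"]

captured = []  # where the "instrumentation" collects its events

def instrument(cls):
    """Monkey-patch cls.query so every call emits an observability event."""
    original = cls.query

    def wrapper(self, sql):
        start = time.time()
        result = original(self, sql)          # delegate to the real method
        captured.append({
            "name": "db.query",
            "statement": sql,                 # what was asked
            "rows": len(result),              # what came back
            "duration_s": time.time() - start # how long it took
        })
        return result

    cls.query = wrapper

instrument(FakeDBClient)
rows = FakeDBClient().query("SELECT * FROM users")
# The call behaves exactly as before, but `captured` now holds an event.
```

Because the wrapper preserves the original method's behavior and return value, the application is unaware it is being observed, which is why instrumentations can be dropped in with negligible impact.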

OpenLLMetry: Extending Observability to GenAI

OpenLLMetry builds on this established OpenTelemetry framework, extending its observability support to GenAI applications [06:20:00].

Key Extensions and Supported Technologies

OpenLLMetry has developed instrumentations to support a wide range of GenAI technologies [06:36:00]:

  • Foundation Models: Includes providers like OpenAI, Anthropic, Cohere, Gemini, and Bedrock [07:16:00].
  • Vector Databases: Such as Pinecone and Chroma [07:23:00].
  • Frameworks: Supports LangChain, LlamaIndex, CrewAI, and Haystack [07:28:00].

With over 40 different providers supported, OpenLLMetry’s instrumentations automatically emit logs, metrics, and traces [07:10:00]. For example, a Pinecone instrumentation can capture queries, indexing operations, and details about returned vectors, including distances and scores, all in the standard OpenTelemetry format [07:54:00].
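As a sketch, a captured vector-DB query span might carry data along these lines. The attribute keys, span name, and values here are illustrative only; the actual keys are defined by OpenLLMetry’s semantic conventions, not this example.

```python
# Hypothetical payload for a vector-DB query span; key names are
# illustrative, not the exact OpenLLMetry semantic conventions.
pinecone_query_span = {
    "name": "pinecone.query",
    "attributes": {
        "db.system": "pinecone",      # which vector DB handled the call
        "db.query.top_k": 5,          # how many neighbors were requested
        "db.vector.dimension": 1536,  # embedding dimensionality
    },
    "events": [
        # One event per returned vector, with its similarity score.
        {"name": "match", "score": 0.92},
        {"name": "match", "score": 0.87},
    ],
}

# An observability platform can then aggregate over such spans,
# e.g., to find the best match score for a given query:
best_score = max(e["score"] for e in pinecone_query_span["events"])
```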

Benefits of OpenLLMetry

  • Automatic Observability: By leveraging instrumentations, OpenLLMetry provides automatic logs, metrics, and traces for GenAI components, reducing manual effort [07:01:00].
  • Platform Agnostic: Because it relies on the standard OpenTelemetry protocol, OpenLLMetry lets users send observability data to any supported platform (e.g., Datadog, Sentry, Grafana Tempo, Dynatrace) without being tied to a specific vendor [06:44:00]. Switching between platforms becomes a simple configuration change [08:43:00].
  • Comprehensive View: Provides a holistic view of everything happening within an LLM-based system, including interactions with various models, databases, and frameworks [07:34:00].
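Concretely, retargeting the data to a different backend is typically just a change to the standard OTLP exporter settings. `OTEL_EXPORTER_OTLP_ENDPOINT` and `OTEL_EXPORTER_OTLP_HEADERS` are standard OpenTelemetry environment variables; the URLs and key below are placeholders, not real provider values.

```shell
# Send traces to provider A...
export OTEL_EXPORTER_OTLP_ENDPOINT="https://otlp.provider-a.example"
export OTEL_EXPORTER_OTLP_HEADERS="x-api-key=<your-key>"

# ...or switch to provider B by changing only the exporter config:
export OTEL_EXPORTER_OTLP_ENDPOINT="https://otlp.provider-b.example"
```

No application code changes: the same instrumentations keep emitting the same data, and only the destination moves.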