Observability at the Edge: New OTel features in Envoy and Istio
In the dynamic world of cloud-native and distributed applications, managing microservices effectively is critical. Kubernetes has become the de facto standard for container orchestration, enabling seamless deployment, scaling, and management of containerized applications.
The distributed nature of such systems, however, adds a layer of complexity in the form of networking for in-cluster communication. Two well-known projects, Envoy and Istio, have emerged as the foundation for the smooth management and operation of such complex environments.
Together, these technologies empower organizations to build scalable, resilient, and secure distributed systems.
Istio is a service mesh, that orchestrates communication between microservices, providing features such as traffic management, security and, of course observability. Istio uses the Envoy proxy as its data plane. Envoy is a high-performance proxy, designed for single applications/services as well as a communication bus and “universal data plane” for service meshes.
Envoy and Istio projects are open source and part of the Cloud Native Computing Foundation.
Observability in Envoy and Istio
The Envoy proxy deployed by the Istio service mesh is the perfect candidate to ensure incoming and outgoing requests are properly traced. This approach provides distributed traces of the entire service mesh, giving an overview on the communication between services — even when the applications themselves are not instrumented.
Note: At minimum, applications must be configured to propagate the
traceparent
header.
Envoy offers several HTTP tracers for tracing requests, including the OpenTelemetry tracer. Tracers can be configured either directly within Envoy (when using it as a standalone component) or for all Envoy instances by using Istio.
Here is an example of how Istio and Envoy work together to trace requests:
New OTel tracing features in Envoy and Istio
Although Envoy already had support for exporting OpenTelemetry traces using gRPC, it lacked support for exporting using HTTP. OpenTelemetry supports both protocols as first-class citizens. In addition, other areas such as providing resource attributes and configurable sampling decisions were lagging behind the stable portions of the OpenTelemetry specification.
Starting from Envoy 1.29 and Istio 1.22, users have access to the new features described below.
OTLP HTTP exporter
The OpenTelemetry tracer in Envoy can now be configured to export OTLP traces using HTTP. This allows it to send telemetry to observability sinks using OTLP/HTTP, directly from Envoy proxies.
Resource detectors
Envoy now ships with the Environment Resource Detector. This resource detector follows the OTel specification and allows users to further enrich the spans produced by Envoy proxies.
The resource detector feature not only added the environment detector, but also made it possible for any other resource detector to be easily added with Envoy’s built-in extensions feature.
Custom samplers
Another exciting feature added to Envoy is the possibility of implementing and configuring custom samplers. Envoy follows the OTel Sampler interface, which makes it easy for anyone to contribute their own samplers.
Envoy ships with the Always On Sampler which simply forwards all spans. This base implementation can be used as a reference implementation for smarter samplers.
Demo
It’s time to see the new features in action! For this, we use the Istio Bookinfo application, and illustrate how to:
- Deploy in Kubernetes, with Istio as service mesh
- Export traces to Jaeger using HTTP
Install Jaeger
First, start by installing the Jaeger operator:
kubectl create namespace observability
kubectl create -f https://github.com/jaegertracing/jaeger-operator/releases/download/v1.57.0/jaeger-operator.yaml -n observability
Then deploy Jaeger all-in-one
:
kubectl apply -f - <<EOF
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
name: simplest
EOF
Install and configure Istio
Next, install Istio using
istioctl
:
cat <<EOF | istioctl install -y -f -
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
meshConfig:
enableTracing: true
extensionProviders:
- name: otel-tracing
opentelemetry:
port: 4318
service: simplest-collector.default.svc.cluster.local
http:
path: "/v1/traces"
timeout: 5s
resource_detectors:
environment: {}
EOF
This installs Istio and configures the OpenTelemetry tracing provider to use the
http
exporter over OTLP/HTTP, with the Jaeger collector as endpoint. This
configuration also enables the environment resource detector in
resource_detectors
.
Next, we need to enable the tracer using Istio’s Telemetry API:
kubectl apply -f - <<EOF
apiVersion: telemetry.istio.io/v1alpha1
kind: Telemetry
metadata:
name: otel-demo
spec:
tracing:
- providers:
- name: otel-tracing
randomSamplingPercentage: 100
EOF
And finally, we configure the OTEL_RESOURCE_ATTRIBUTES
environment variable
for the Envoy proxies:
cat <<EOF | kubectl apply -f -
apiVersion: networking.istio.io/v1beta1
kind: ProxyConfig
metadata:
name: my-proxyconfig
namespace: istio-system
spec:
concurrency: 0
environmentVariables:
OTEL_RESOURCE_ATTRIBUTES: "host.name=abc-123"
EOF
Deploy the application
The final step is to deploy the bookinfo application (bookinfo.yaml):
kubectl label namespace default istio-injection=enabled
kubectl apply -f bookinfo.yaml
Test it out
To test your setup, make some requests to one of the services, for example:
kubectl exec "$(kubectl get pod -l app=ratings -o jsonpath='{.items[0].metadata.name}')" -c ratings -- curl -sS productpage:9080/productpage | grep -o "<title>.*</title>"
Then you can check it out on the Jaeger UI – you should see some traces!
From the spans produced by Envoy you can see (in order):
- Outgoing (egress) call from the
ratings
service to theproductpage
service. - Incoming (ingress) call in the
productpage
service. host-name
resource attribute we applied using theOTEL_RESOURCE_ATTRIBUTES
. This attribute was picked up by the environment resource detector and added to all spans Envoy created.
You can also see all the other downstream calls made, as all services have the Envoy sidecar injected by Istio. You have full observability of the calls between services, just by enabling the OTel tracer in Envoy!
Next steps and closing
With the new features described in this post, users gain more flexibility in exporting their traces. They can enrich their data with resource attributes and establish the groundwork for more intelligent sampling techniques to be added in the future.
The new features also unlock interesting use cases for other parties in the observability space, including cloud providers and observability vendors. With the resource detector and sampler APIs now available in Envoy, anyone can build support for custom samplers and detectors, enhancing the usefulness of the telemetry data generated by Envoy.
Another exciting next step for Envoy and OpenTelemetry is the adoption of the now-stable HTTP semantic conventions in Envoy. This will align Envoy with all OTel SDKs that are also producing the spans following the stable HTTP semantic conventions.
Collaborating with the Envoy and Istio community to bring more OTel features to these projects has been a great experience. The eagerness to adopt and the strong collaboration between OpenTelemetry and relevant CNCF projects, such as Istio and Envoy, helps solidify OpenTelemetry’s position as the de facto standard for observability.