Semantic Conventions for Database Client Calls
Status: Experimental
Warning
Existing database instrumentations that are using v1.24.0 of this document (or prior):
- SHOULD NOT change the version of the database conventions that they emit until the database semantic conventions are marked stable. Conventions include, but are not limited to, attributes, metric and span names, and unit of measure.
- SHOULD introduce an environment variable OTEL_SEMCONV_STABILITY_OPT_IN in the existing major version which is a comma-separated list of values. If the list of values includes:
  - database - emit the new, stable database conventions, and stop emitting the old experimental database conventions that the instrumentation emitted previously.
  - database/dup - emit both the old and the stable database conventions, allowing for a seamless transition.
  - The default behavior (in the absence of one of these values) is to continue emitting whatever version of the old experimental database conventions the instrumentation was emitting previously.
  - Note: database/dup has higher precedence than database in case both values are present.
- SHOULD maintain (security patching at a minimum) the existing major version for at least six months after it starts emitting both sets of conventions.
- SHOULD drop the environment variable in the next major version.
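The opt-in rules above can be sketched in a few lines. This is a minimal illustration, not part of the conventions: the names DbSemconvMode and db_semconv_mode are hypothetical, and the sketch assumes that whitespace around list items should be ignored and that unrelated values in the list are skipped.

```python
import os
from enum import Enum


class DbSemconvMode(Enum):
    """Hypothetical names for the three behaviors described above."""
    OLD = "old"        # keep emitting the previous experimental conventions
    STABLE = "stable"  # emit only the new, stable conventions
    DUP = "dup"        # emit both the old and the stable conventions


def db_semconv_mode(raw=None):
    """Resolve the emission mode from OTEL_SEMCONV_STABILITY_OPT_IN.

    `raw` overrides the environment lookup, which keeps the function testable.
    """
    if raw is None:
        raw = os.environ.get("OTEL_SEMCONV_STABILITY_OPT_IN", "")
    # Assumption: surrounding whitespace and empty items are ignored.
    values = {item.strip() for item in raw.split(",") if item.strip()}
    # database/dup has higher precedence than database when both are present.
    if "database/dup" in values:
        return DbSemconvMode.DUP
    if "database" in values:
        return DbSemconvMode.STABLE
    # Default: continue emitting the old experimental conventions.
    return DbSemconvMode.OLD
```

An instrumentation could call db_semconv_mode() once at startup and branch on the result when building spans and metrics.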
Span kind: MUST always be CLIENT.
A span that describes a database call SHOULD cover the duration of the corresponding call as if it was observed by the caller (such as a client application). For example, if a transient issue happened and was retried within this database call, the corresponding span should cover the duration of the logical operation with all retries.
Name
Database spans MUST follow the overall guidelines for span names.
The span name SHOULD be {db.operation.name} {target} if there is a (low-cardinality) {db.operation.name} available (see below for the exact definition of the {target} placeholder).
If there is no (low-cardinality) db.operation.name available, database span names SHOULD be {target}.
If neither {db.operation.name} nor {target} are available, span name SHOULD be {db.system}.
Semantic conventions for individual database systems MAY specify a different span name format.
The {target} SHOULD describe the entity that the operation is performed against and SHOULD adhere to one of the following values, provided they are accessible:
- db.collection.name SHOULD be used for data manipulation operations or operations on database collections.
- db.namespace SHOULD be used for operations on a specific database namespace.
- server.address:server.port SHOULD be used for other operations not targeting any specific database(s) or collection(s).
If a corresponding {target} value is not available for a specific operation, the instrumentation SHOULD omit the {target}.
For example, for an operation describing a SQL query on an anonymous table like SELECT * FROM (SELECT * FROM table) t, the span name should be SELECT.
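The naming rules above can be sketched as two small helpers. Everything here is illustrative: pick_target, db_span_name, and the scope hint ("collection", "namespace", or "server") are hypothetical names, and the sketch assumes the instrumentation already knows which kind of entity an operation targets. Emitting the bare address when no port is known is also an assumption.

```python
def pick_target(scope, collection=None, namespace=None, address=None, port=None):
    """Return the {target} for an operation, or None when it is unavailable.

    `scope` is a hypothetical hint from the instrumentation describing what
    the operation is performed against.
    """
    if scope == "collection":
        return collection or None
    if scope == "namespace":
        return namespace or None
    if scope == "server" and address:
        # Assumption: fall back to the bare address when the port is unknown.
        return f"{address}:{port}" if port is not None else address
    return None


def db_span_name(db_system, operation_name=None, target=None):
    """Compose the span name from whichever parts are available."""
    parts = [part for part in (operation_name, target) if part]
    # With neither an operation name nor a target, fall back to db.system.
    return " ".join(parts) if parts else db_system


# Usage: a query on a named table, then a query on an anonymous table.
named = db_span_name("postgresql", "SELECT",
                     pick_target("collection", collection="public.users"))
anonymous = db_span_name("postgresql", "SELECT", pick_target("collection"))
```

Keeping target selection separate from name composition means the anonymous-table case falls out naturally: no collection name is available, so the target is omitted rather than replaced by a broader one.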
Common attributes
These attributes will usually be the same for all operations performed over the same database connection.
Attribute | Type | Description | Examples | Requirement Level | Stability |
---|---|---|---|---|---|
db.system | string | The database management system (DBMS) product as identified by the client instrumentation. [1] | other_sql ; adabas ; cache | Required | |
db.collection.name | string | The name of a collection (table, container) within the database. [2] | public.users ; customers | Conditionally Required [3] | |
db.namespace | string | The name of the database, fully qualified within the server address and port. [4] | customers ; test.users | Conditionally Required If available. | |
db.operation.name | string | The name of the operation or command being executed. [5] | findAndModify ; HMSET ; SELECT | Conditionally Required [6] | |
error.type | string | Describes a class of error the operation ended with. [7] | timeout ; java.net.UnknownHostException ; server_certificate_invalid ; 500 | Conditionally Required If and only if the operation failed. | |
server.port | int | Server port number. [8] | 80 ; 8080 ; 443 | Conditionally Required [9] | |
db.query.text | string | The database query being executed. [10] | SELECT * FROM wuser_table where username = ? ; SET mykey "WuValue" | Recommended [11] | |
network.peer.address | string | Peer address of the database node where the operation was performed. [12] | 10.1.2.80 ; /tmp/my.sock | Recommended If applicable for this database system. | |
network.peer.port | int | Peer port number of the network connection. | 65123 | Recommended if and only if network.peer.address is set. | |
server.address | string | Name of the database host. [13] | example.com ; 10.1.2.80 ; /tmp/my.sock | Recommended | |
db.query.parameter.<key> | string | A query parameter used in db.query.text , with <key> being the parameter name, and the attribute value being a string representation of the parameter value. [14] | someval ; 55 | Opt-In | |
[1]: The actual DBMS may differ from the one identified by the client. For example, when using PostgreSQL client libraries to connect to a CockroachDB, the db.system is set to postgresql based on the instrumentation’s best knowledge.
[2]: It is RECOMMENDED to capture the value as provided by the application without attempting to do any case normalization.
If the collection name is parsed from the query text, it SHOULD be the first collection name found in the query and it SHOULD match the value provided in the query text including any schema and database name prefix.
For batch operations, if the individual operations are known to have the same collection name then that collection name SHOULD be used, otherwise db.collection.name SHOULD NOT be captured.
[3]: If readily available. The collection name MAY be parsed from the query text, in which case it SHOULD be the first collection name found in the query.
[4]: If a database system has multiple namespace components, they SHOULD be concatenated (potentially using database system specific conventions) from most general to most specific namespace component, and more specific namespaces SHOULD NOT be captured without the more general namespaces, to ensure that “startswith” queries for the more general namespaces will be valid.
Semantic conventions for individual database systems SHOULD document what db.namespace means in the context of that system.
It is RECOMMENDED to capture the value as provided by the application without attempting to do any case normalization.
[5]: It is RECOMMENDED to capture the value as provided by the application without attempting to do any case normalization.
If the operation name is parsed from the query text, it SHOULD be the first operation name found in the query.
For batch operations, if the individual operations are known to have the same operation name then that operation name SHOULD be used prepended by BATCH, otherwise db.operation.name SHOULD be BATCH or some other database system specific term if more applicable.
[6]: If readily available. The operation name MAY be parsed from the query text, in which case it SHOULD be the first operation name found in the query.
[7]: The error.type SHOULD match the error code returned by the database or the client library, the canonical name of the exception that occurred, or another low-cardinality error identifier. Instrumentations SHOULD document the list of errors they report.
[8]: When observed from the client side, and when communicating through an intermediary, server.port SHOULD represent the server port behind any intermediaries, for example proxies, if it’s available.
[9]: If using a port other than the default port for this DBMS and if server.address is set.
[10]: For sanitization see Sanitization of db.query.text.
For batch operations, if the individual operations are known to have the same query text then that query text SHOULD be used, otherwise all of the individual query texts SHOULD be concatenated with separator ; or some other database system specific separator if more applicable.
Even though parameterized query text can potentially have sensitive data, by using a parameterized query the user is giving a strong signal that any sensitive data will be passed as parameter values, and the benefit to observability of capturing the static part of the query text by default outweighs the risk.
[11]: SHOULD be collected by default only if there is sanitization that excludes sensitive information. See Sanitization of db.query.text.
[12]: Semantic conventions for individual database systems SHOULD document whether network.peer.* attributes are applicable. Network peer address and port are useful when the application interacts with individual database nodes directly.
If a database operation involved multiple network calls (for example retries), the address of the last contacted node SHOULD be used.
[13]: When observed from the client side, and when communicating through an intermediary, server.address SHOULD represent the server address behind any intermediaries, for example proxies, if it’s available.
[14]: Query parameters should only be captured when db.query.text is parameterized with placeholders.
If a parameter has no name and instead is referenced only by index, then <key> SHOULD be the 0-based index.
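Notes [2], [5], and [10] each describe how a batch of operations collapses into a single attribute value. The sketch below combines the three rules. It is only an illustration: batch_attributes is a hypothetical helper, the input is assumed to be a list of per-operation attribute dictionaries, and the space between BATCH and the shared operation name is an assumption about how the prefix is joined.

```python
def batch_attributes(operations, separator=";"):
    """Derive span attributes for a batch from per-operation attribute dicts."""
    attrs = {}

    # db.operation.name: "BATCH <name>" when every operation shares one name,
    # otherwise plain "BATCH". The joining space is an assumption.
    names = [op.get("db.operation.name") for op in operations]
    if names and all(names) and len(set(names)) == 1:
        attrs["db.operation.name"] = f"BATCH {names[0]}"
    else:
        attrs["db.operation.name"] = "BATCH"

    # db.collection.name: captured only when it is the same for all operations.
    collections = [op.get("db.collection.name") for op in operations]
    if collections and all(collections) and len(set(collections)) == 1:
        attrs["db.collection.name"] = collections[0]

    # db.query.text: the shared text, or all texts joined with the separator.
    texts = [op.get("db.query.text") for op in operations]
    if texts and all(texts):
        if len(set(texts)) == 1:
            attrs["db.query.text"] = texts[0]
        else:
            attrs["db.query.text"] = separator.join(texts)

    return attrs


# Usage: a homogeneous batch and a mixed batch.
insert_op = {
    "db.operation.name": "INSERT",
    "db.collection.name": "users",
    "db.query.text": "INSERT INTO users VALUES (?)",
}
delete_op = {
    "db.operation.name": "DELETE",
    "db.collection.name": "orders",
    "db.query.text": "DELETE FROM orders WHERE id = ?",
}
same = batch_attributes([insert_op, insert_op])
mixed = batch_attributes([insert_op, delete_op])
```

The separator is a parameter because the note allows a database system specific separator where one is more applicable.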
The following attributes can be important for making sampling decisions and SHOULD be provided at span creation time (if provided at all):
db.system has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
Value | Description | Stability |
---|---|---|
adabas | Adabas (Adaptable Database System) | |
cassandra | Apache Cassandra | |
clickhouse | ClickHouse | |
cockroachdb | CockroachDB | |
cosmosdb | Microsoft Azure Cosmos DB | |
couchbase | Couchbase | |
couchdb | CouchDB | |
db2 | IBM Db2 | |
derby | Apache Derby | |
dynamodb | Amazon DynamoDB | |
edb | EnterpriseDB | |
elasticsearch | Elasticsearch | |
filemaker | FileMaker | |
firebird | Firebird | |
geode | Apache Geode | |
h2 | H2 | |
hanadb | SAP HANA | |
hbase | Apache HBase | |
hive | Apache Hive | |
hsqldb | HyperSQL DataBase | |
influxdb | InfluxDB | |
informix | Informix | |
ingres | Ingres | |
instantdb | InstantDB | |
interbase | InterBase | |
intersystems_cache | InterSystems Caché | |
mariadb | MariaDB | |
maxdb | SAP MaxDB | |
memcached | Memcached | |
mongodb | MongoDB | |
mssql | Microsoft SQL Server | |
mysql | MySQL | |
neo4j | Neo4j | |
netezza | Netezza | |
opensearch | OpenSearch | |
oracle | Oracle Database | |
other_sql | Some other SQL database. Fallback only. See notes. | |
pervasive | Pervasive PSQL | |
pointbase | PointBase | |
postgresql | PostgreSQL | |
progress | Progress Database | |
redis | Redis | |
redshift | Amazon Redshift | |
spanner | Cloud Spanner | |
sqlite | SQLite | |
sybase | Sybase | |
teradata | Teradata | |
trino | Trino | |
vertica | Vertica | |
error.type has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
Value | Description | Stability |
---|---|---|
_OTHER | A fallback error value to be used when the instrumentation doesn’t define a custom value. | |
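One way an instrumentation might choose an error.type value for a failed operation is sketched below. The helper name error_type and its precedence (database error code first, then exception name, then _OTHER) are assumptions made for illustration; the conventions list the acceptable sources but do not prescribe an order.

```python
def error_type(exc=None, db_error_code=None):
    """Pick a low-cardinality error.type value for a failed operation.

    Call this only when the operation failed; error.type is not set otherwise.
    """
    # Assumption: prefer the error code reported by the database or client.
    if db_error_code is not None:
        return str(db_error_code)
    if exc is not None:
        cls = type(exc)
        # Use the qualified class name, prefixed by its module unless builtin.
        if cls.__module__ == "builtins":
            return cls.__qualname__
        return f"{cls.__module__}.{cls.__qualname__}"
    # Fallback when no more specific value is defined.
    return "_OTHER"
```

For example, error_type(TimeoutError()) yields the exception's class name, while calling it with no arguments yields the _OTHER fallback.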
Notes and well-known identifiers for db.system
The list above is a non-exhaustive list of well-known identifiers to be specified for db.system.
If a value defined in this list applies to the DBMS to which the request is sent, this value MUST be used. If no value defined in this list is suitable, a custom value MUST be provided. This custom value MUST be the name of the DBMS in lowercase and without a version number to stay consistent with existing identifiers.
It is encouraged to open a PR towards this specification to add missing values to the list, especially when instrumentations for those missing databases are written. This allows multiple instrumentations for the same database to be aligned and eases analyzing for backends.
The value other_sql is intended as a fallback and MUST only be used if the DBMS is known to be SQL-compliant but the concrete product is not known to the instrumentation.
If the concrete DBMS is known to the instrumentation, its specific identifier MUST be used.
Back ends could, for example, use the provided identifier to determine the appropriate SQL dialect for parsing the db.query.text.
When additional attributes are added that only apply to a specific DBMS, its identifier SHOULD be used as a namespace in the attribute key as for the attributes in the sections below.
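The rules for choosing a db.system value can be sketched as follows. This is an illustration under stated assumptions: db_system_value and the PRODUCT_ALIASES map are hypothetical, the version-stripping pattern only handles a trailing, whitespace-separated version such as "15.3" or "v2", and raising an error for an unknown non-SQL product is a choice made here because the text above does not name a fallback for that case.

```python
import re

# Hypothetical alias map from product names to well-known identifiers;
# a real instrumentation would maintain its own.
PRODUCT_ALIASES = {
    "postgres": "postgresql",
    "microsoft sql server": "mssql",
    "sql server": "mssql",
}

# Assumption: versions appear as a trailing, whitespace-separated token.
# Requiring whitespace keeps names such as "h2" or "db2" intact.
_TRAILING_VERSION = re.compile(r"\s+v?\d+(\.\d+)*$")


def db_system_value(product_name=None, is_sql=False):
    """Return a db.system value for the product the client believes it uses."""
    if not product_name:
        if is_sql:
            # SQL-compliant, but the concrete product is not known.
            return "other_sql"
        raise ValueError("cannot derive db.system without a product name")
    # Custom values are the DBMS name in lowercase without a version number.
    name = _TRAILING_VERSION.sub("", product_name.strip().lower())
    return PRODUCT_ALIASES.get(name, name)
```

For instance, db_system_value("PostgreSQL 15.3") resolves to postgresql, and db_system_value(None, is_sql=True) resolves to other_sql.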
Sanitization of db.query.text
The db.query.text SHOULD be collected by default only if there is sanitization that excludes sensitive information.
Sanitization SHOULD replace all literals with a placeholder value.
Such literals include, but are not limited to, String, Numeric, Date and Time, Boolean, Interval, Binary, and Hexadecimal literals.
The placeholder value SHOULD be ?, unless it already has a defined meaning in the given database system, in which case the instrumentation MAY choose a different placeholder.
Placeholders in a parameterized query SHOULD NOT be sanitized. E.g. where id = $1 can be captured as is.
IN-clauses MAY be collapsed during sanitization, e.g. from IN (?, ?, ?, ?) to IN (?), as this can help with extremely long IN-clauses, and can help control cardinality for users who choose to (optionally) add db.query.text to their metric attributes.
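A rough, regex-based sketch of this sanitization is shown below. It is deliberately simplified and should not be read as a complete sanitizer: sanitize_query_text is a hypothetical helper, only single-quoted strings, hexadecimal, numeric, and boolean literals are recognized, and double-quoted identifiers, comments, and dialect-specific literal syntax are not handled. A production implementation would more likely rely on a tokenizer for the specific SQL dialect.

```python
import re

_LITERAL = re.compile(
    r"""
      '(?:[^']|'')*'            # single-quoted string, '' is an escaped quote
    | \b0[xX][0-9a-fA-F]+\b     # hexadecimal literal
    | (?<![\w$.])\d+(?:\.\d+)?(?:[eE][+-]?\d+)?(?![\w.])  # numeric literal
    | \b(?:TRUE|FALSE)\b        # boolean literal
    """,
    re.VERBOSE | re.IGNORECASE,
)

# A run of placeholders inside an IN (...) list.
_IN_LIST = re.compile(r"\b(IN)\s*\(\s*\?(?:\s*,\s*\?)*\s*\)", re.IGNORECASE)


def sanitize_query_text(text, collapse_in=True):
    """Replace literals with ? and optionally collapse IN-clauses."""
    # The lookbehind on numeric literals keeps placeholders such as $1 and
    # identifiers such as t1 untouched.
    sanitized = _LITERAL.sub("?", text)
    if collapse_in:
        sanitized = _IN_LIST.sub(r"\1 (?)", sanitized)
    return sanitized


# Usage
example = sanitize_query_text("SELECT * FROM t1 WHERE id IN (1, 2, 3)")
```

Collapsing is behind a flag because the text above makes it optional (MAY), while literal replacement is always applied.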
Semantic Conventions for specific database technologies
More specific Semantic Conventions are defined for the following database technologies:
- AWS DynamoDB: Semantic Conventions for AWS DynamoDB.
- Cassandra: Semantic Conventions for Cassandra.
- Cosmos DB: Semantic Conventions for Microsoft Cosmos DB.
- CouchDB: Semantic Conventions for CouchDB.
- Elasticsearch: Semantic Conventions for Elasticsearch.
- HBase: Semantic Conventions for HBase.
- MongoDB: Semantic Conventions for MongoDB.
- MSSQL: Semantic Conventions for MSSQL.
- Redis: Semantic Conventions for Redis.
- SQL: Semantic Conventions for SQL databases.