Map vs JSON type for ClickStack
ClickStack's default schema stores resource, scope, log, and span attributes as Map(LowCardinality(String), String) columns. ClickHouse also supports a strongly typed JSON type, and ClickStack has beta support for using it in place of Map.
For typical observability workloads we recommend keeping the default Map-based schema. The JSON type is available for users who want to evaluate it on workloads with a small, stable set of attribute keys, but it is not the recommended schema for general use.
Why Map is the recommended default
Observability data is dominated by attributes: resource, scope, span, and log attributes. These sets are typically large, high-cardinality, and ingested at high throughput, so the schema you pick for them largely determines ingest cost and storage layout.
Map(LowCardinality(String), String) stores keys and values as a single structure. The historical disadvantage of Map was that reading a single key required reading the entire map column. That's no longer true: ClickHouse now supports bucketed map serialization, which splits the map into buckets so queries only read the buckets they need. Combined with text indexes on map keys and values, which is how ClickStack's default schema is configured, this makes Map selective and fast at read time without paying any ingest penalty for new keys.
In practice this means:
- Stable ingest cost as keys grow. Adding a new attribute key doesn't change the on-disk column layout or create new column files. Ingest cost is bounded by the data volume, not the key cardinality.
- No metadata explosion. The number of column files on disk doesn't track the number of unique attribute keys.
- Selective lookups via indexes. Text indexes on map keys and values give point lookups without scanning every row.
- Predictable behaviour at high throughput. Map handles bursty, schemaless attribute sets, common in tracing and logs, without per-key overhead.
Why not JSON by default
The JSON type takes a different approach: at insert time, ClickHouse dynamically creates a dedicated, strongly typed subcolumn for each path it sees. At read time this is attractive, since only the requested subcolumns are read, types are preserved, and no query-time casting is needed.
The tradeoff lands at ingest time. Creating and managing many dynamic subcolumns introduces write-time overhead and metadata complexity. On observability workloads, which routinely have very large or highly dynamic attribute sets and high ingest throughput, that overhead is significant. The max_dynamic_paths limit can cap the damage by spilling extra paths into a shared column, but accessing the shared column is slower than dedicated subcolumns, which erodes the read-time advantage that motivated using JSON in the first place.
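To make the max_dynamic_paths cap concrete, here is a minimal sketch of how the limit is declared on a JSON column. The table name, column set, and the value 256 are hypothetical and chosen only for illustration; ClickStack's collector creates its own schemas, so you would not normally write this DDL yourself.

```shell
# Write a hypothetical table definition to a file; nothing is executed here.
cat > json_attrs_sketch.sql <<'EOF'
-- Hypothetical table: at most 256 paths get dedicated subcolumns;
-- any further paths are stored together in the shared column.
CREATE TABLE json_attrs_sketch
(
    Timestamp DateTime64(9),
    LogAttributes JSON(max_dynamic_paths=256)
)
ENGINE = MergeTree
ORDER BY Timestamp;
EOF
```

A lower cap keeps the number of column files bounded, at the price of more reads going through the slower shared column.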
With bucketed map serialization removing most of the historical read-time overhead of Map, the read-time advantage of JSON no longer outweighs its ingest-time cost for typical observability workloads.
When you might still consider JSON
The JSON type can be a reasonable fit when all of the following hold:
- Your attribute key set is small and stable: you are not seeing thousands of unique keys, and new keys appear rarely.
- Ingest throughput is modest relative to the attribute cardinality.
- You want strongly typed access to attributes without query-time casts (numbers stay numbers, booleans stay booleans).
- You are willing to operate a beta feature in ClickStack and accept that the integration may change.
If those conditions don't all hold, stay on the default Map-based schema.
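The difference in typed access can be sketched with two queries saved to a file. This assumes the otel_logs table and LogAttributes column names, an attribute called http.status_code, and that the JSON schema stores it as an Int64 at that path; none of these are confirmed by this page, so adjust them to your own data.

```shell
# Save two illustrative queries; run each against the matching schema only.
cat > attribute_access_sketch.sql <<'EOF'
-- Map schema: values are strings, so a numeric filter needs a cast.
SELECT count()
FROM otel_logs
WHERE toUInt16OrZero(LogAttributes['http.status_code']) >= 500;

-- JSON schema: read the path as a typed subcolumn, no string cast.
-- Assumes the value is stored as Int64 under this path.
SELECT count()
FROM otel_logs
WHERE LogAttributes.http.status_code.:Int64 >= 500;
EOF
```

If casts like the first query are rare in your workload, the typed access in the second is unlikely to justify the ingest-time cost described above.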
Beta status
JSON type support in ClickStack is a beta feature. While the JSON type itself is production-ready in ClickHouse 25.3+, its integration within ClickStack is still under active development and may have limitations, change in the future, or contain bugs.
ClickStack has beta support for the JSON type from version 2.0.4.
Enabling JSON support
To use JSON-typed schemas instead of the default Map-based schemas, set the following environment variables.
| Variable | Set on | Purpose |
|---|---|---|
| OTEL_AGENT_FEATURE_GATE_ARG='--feature-gates=clickhouse.json' | OTel collector | Creates schemas in ClickHouse using the JSON type. |
| BETA_CH_OTEL_JSON_SCHEMA_ENABLED=true | HyperDX (ClickStack UI) | Enables the application layer to query JSON-typed schemas. ClickStack Open Source only. |
Managed ClickStack
To enable JSON support in Managed ClickStack, contact support@clickhouse.com prior to configuring the collector. The feature must also be enabled in the ClickStack UI (HyperDX) in ClickHouse Cloud.
Set OTEL_AGENT_FEATURE_GATE_ARG='--feature-gates=clickhouse.json' on the collector. For example:
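A minimal sketch of setting the variable, assuming the collector is launched from a shell you control. How the variable reaches the collector process depends on your deployment tooling, which this page does not specify.

```shell
# Export the feature gate in the shell (or service definition) that starts the collector.
export OTEL_AGENT_FEATURE_GATE_ARG='--feature-gates=clickhouse.json'

# Print the value to confirm it is set as expected before deploying.
printenv OTEL_AGENT_FEATURE_GATE_ARG
```

With Docker, passing `-e OTEL_AGENT_FEATURE_GATE_ARG` (no value) to `docker run` forwards the exported value into the container; substitute the collector image you actually deploy.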
Open Source ClickStack
Set OTEL_AGENT_FEATURE_GATE_ARG='--feature-gates=clickhouse.json' on any deployment that includes the collector, and BETA_CH_OTEL_JSON_SCHEMA_ENABLED=true on the HyperDX application layer so it can query the JSON-typed schemas.
For example:
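One way to keep both settings together is an env file, sketched below. The file name is arbitrary, and the sketch assumes a single deployment that bundles the collector and HyperDX; if they run separately, each variable goes only on the component named in the table above.

```shell
# Env file for a deployment that bundles the collector and HyperDX.
# Docker env files take values literally, so the value is not quoted here.
cat > clickstack-json.env <<'EOF'
OTEL_AGENT_FEATURE_GATE_ARG=--feature-gates=clickhouse.json
BETA_CH_OTEL_JSON_SCHEMA_ENABLED=true
EOF
```

The file can then be passed with `docker run --env-file clickstack-json.env` followed by your usual ports and image.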
Migrating from a Map-based schema to JSON
The JSON type is not backwards compatible with existing Map-based schemas. Enabling this feature creates new tables that use the JSON type, and existing data must be migrated manually.
To migrate from the default Map-based schemas, follow these steps:
Stop the OTel collector
Rename existing tables and update sources
Rename existing tables and update data sources in HyperDX.
For example:
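A minimal sketch of the rename step, assuming the tables are named otel_logs and otel_traces and using a hypothetical _map suffix for the old copies. Check the actual table names in your database and add a statement for each table you want to keep.

```shell
# Save the rename statements to a file so they can be reviewed before running.
cat > rename_map_tables.sql <<'EOF'
-- Move the existing Map-based tables out of the way (the _map suffix is arbitrary).
RENAME TABLE otel_logs TO otel_logs_map;
RENAME TABLE otel_traces TO otel_traces_map;
EOF
```

The file can be executed with `clickhouse-client --queries-file rename_map_tables.sql`, after which the HyperDX data sources should point at the renamed tables.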
Deploy the collector
Deploy the collector with OTEL_AGENT_FEATURE_GATE_ARG set.
Restart the HyperDX container with JSON schema support
Create new data sources
Create new data sources in HyperDX pointing to the JSON tables.
Migrating existing data (optional)
To move old data into the new JSON tables:
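A minimal sketch of that backfill, assuming the old tables were renamed with a hypothetical _map suffix, that the old and new tables have the same columns in the same order, and that ClickHouse converts the Map columns to JSON on insert. If any of these does not hold in your setup, list the columns explicitly and cast them.

```shell
# Save the backfill statements to a file so they can be reviewed before running.
cat > backfill_json_tables.sql <<'EOF'
-- Copy rows from the renamed Map tables into the new JSON tables.
INSERT INTO otel_logs SELECT * FROM otel_logs_map;
INSERT INTO otel_traces SELECT * FROM otel_traces_map;
EOF
```

The file can be executed with `clickhouse-client --queries-file backfill_json_tables.sql`; for larger tables, consider copying one time range at a time so a failed insert can be retried in isolation.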
This approach is recommended only for datasets smaller than ~10 billion rows. The Map type stored every value as a string, so the original types of migrated data are lost: this data appears as strings in the new schema until it ages out and requires some casting on the frontend. New data ingested into the JSON tables keeps its types.