# Trace sampling

> Configure sample-weighted aggregations for sampled trace data in ClickStack.

export const Image = ({img, alt, size}) => {
  return <Frame>
      <img src={img} alt={alt} />
    </Frame>;
};

High-throughput services can produce millions of spans per second. Storing every span is expensive, so teams commonly run the OpenTelemetry Collector's [tail-sampling processor](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/tailsamplingprocessor) to keep only 1-in-N spans. Each kept span carries a `SampleRate` attribute recording N.

Once data is sampled, naive aggregations are wrong: `count()` returns roughly 1/N of the events that actually occurred, `sum()` is understated by the same factor, `avg()` is biased whenever spans are sampled at different rates, and percentiles shift. Dashboards then show misleadingly low request counts and throughput, and skewed error rates.

ClickStack solves this with a sampling-aware query engine. When you configure a sample rate expression on a trace source, the query builder rewrites SQL aggregations to weight each span by its sample rate. The rewrite applies across dashboards, alerts, and ad-hoc searches.

<h2 id="how-it-works">
  How it works
</h2>

When a trace source has a `sampleRateExpression` configured, ClickStack wraps that expression (shown as `expr` below) to produce a per-span weight:

```sql theme={null}
greatest(toUInt64OrZero(toString(expr)), 1)
```

A missing, empty, or non-numeric value parses to 0 and is then clamped to 1 by `greatest`. Spans without a `SampleRate` attribute therefore get weight 1, so unsampled data produces the same results as the original queries.
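
To see how the wrapper treats different inputs, you can run it over a few literal values in any ClickHouse client. This is a minimal sketch: the input strings are made up for illustration, and `raw` stands in for whatever your configured expression returns.

```sql theme={null}
-- Apply the weight wrapper to a handful of illustrative raw values.
SELECT
    arrayJoin(['10', '1', '', 'abc', '0']) AS raw,
    greatest(toUInt64OrZero(toString(raw)), 1) AS weight
-- raw = '10'  -> weight 10
-- raw = '1'   -> weight 1
-- raw = ''    -> weight 1 (empty string parses to 0, then clamps to 1)
-- raw = 'abc' -> weight 1 (non-numeric parses to 0, then clamps to 1)
-- raw = '0'   -> weight 1 (0 clamps to 1)
```

A missing key in a `Map(String, String)` column returns an empty string, so it follows the same path as the `''` row.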

The query builder then rewrites aggregations:

| Aggregation       | Before             | After (sample-corrected)                  |
| ----------------- | ------------------ | ----------------------------------------- |
| count             | `count()`          | `sum(weight)`                             |
| count + condition | `countIf(cond)`    | `sumIf(weight, cond)`                     |
| avg               | `avg(col)`         | `sum(col * weight) / sum(weight)`         |
| sum               | `sum(col)`         | `sum(col * weight)`                       |
| quantile(p)       | `quantile(p)(col)` | `quantileTDigestWeighted(p)(col, weight)` |
| min / max         | unchanged          | unchanged                                 |
| count\_distinct   | unchanged          | unchanged                                 |

<Note>
  Percentiles under sampling use `quantileTDigestWeighted`, which is built on an approximate t-digest sketch. Results are close to the true percentiles but not exact.
</Note>
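
The rewrites in the table above can be compared side by side on a tiny inline dataset. The query below is an illustrative sketch, not the SQL that ClickStack generates: the column names `duration_ms`, `is_error`, and `sample_rate` and the four rows are invented for the example.

```sql theme={null}
WITH spans AS
(
    SELECT
        duration_ms,
        is_error,
        -- Same wrapper ClickStack applies to the configured expression
        greatest(toUInt64OrZero(toString(sample_rate)), 1) AS weight
    FROM values(
        'duration_ms Float64, is_error UInt8, sample_rate String',
        (100, 0, '10'),   -- kept 1-in-10
        (200, 0, '10'),   -- kept 1-in-10
        (900, 1, '1'),    -- always kept
        (50,  0, ''))     -- no sample rate recorded, weight 1
)
SELECT
    count()                                       AS naive_count,      -- 4
    sum(weight)                                   AS weighted_count,   -- 10 + 10 + 1 + 1 = 22
    countIf(is_error = 1)                         AS naive_errors,     -- 1
    sumIf(weight, is_error = 1)                   AS weighted_errors,  -- 1
    avg(duration_ms)                              AS naive_avg,        -- 1250 / 4 = 312.5
    sum(duration_ms * weight) / sum(weight)       AS weighted_avg,     -- 3950 / 22 ≈ 179.5
    quantileTDigestWeighted(0.5)(duration_ms, weight) AS weighted_p50  -- approximate
FROM spans
```

In this made-up data the slow error span is kept at full rate while the fast spans are heavily sampled, so the naive average overstates latency and the naive error share (1 of 4) overstates the weighted share (1 of 22).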

<h2 id="configuring">
  Configuring the sample rate expression
</h2>

Open your trace source in **Source Settings** and enter the ClickHouse expression that evaluates to the per-span sample rate in the **Sample Rate Expression** field.

For example, if your OpenTelemetry tail-sampling processor writes the rate into `SpanAttributes['SampleRate']`, enter that expression in the field:

<Image img="https://mintcdn.com/private-7c7dfe99-mintlify-8c05c8a2/_TDydWLKO6Z3njo9/images/clickstack/trace-sampling-source-settings.png?fit=max&auto=format&n=_TDydWLKO6Z3njo9&q=85&s=638d1b58e8018ebc2551905ddcbe145f" alt="Sample Rate Expression field in ClickStack Source Settings" size="lg" width="2300" height="690" data-path="images/clickstack/trace-sampling-source-settings.png" />

Once configured, all charts, dashboards, alerts, and service dashboard panels automatically apply sample-weighted aggregations. No changes to individual queries are needed.
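
One way to sanity-check an expression before relying on it is to compare stored and estimated span counts directly in ClickHouse. The query below is a sketch under assumptions: the table name `otel_traces` and the columns `ServiceName`, `Timestamp`, and `SpanAttributes` are assumed here and may differ in your deployment.

```sql theme={null}
-- Assumed names: otel_traces, ServiceName, Timestamp, SpanAttributes.
-- Adjust them to match your own trace source.
SELECT
    ServiceName,
    count() AS stored_spans,
    sum(greatest(toUInt64OrZero(toString(SpanAttributes['SampleRate'])), 1)) AS estimated_spans
FROM otel_traces
WHERE Timestamp >= now() - INTERVAL 1 HOUR
GROUP BY ServiceName
ORDER BY estimated_spans DESC
```

If `estimated_spans` equals `stored_spans` for a service you know is sampled, the expression is probably not resolving to the attribute your collector writes.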
