> ## Documentation Index
> Fetch the complete documentation index at: https://private-7c7dfe99-mintlify-8c05c8a2.mintlify.site/llms.txt
> Use this file to discover all available pages before exploring further.

# Testing your ClickHouse integration

> Entry-level validation matrix for integrations on ClickHouse Cloud and self-hosted open source.

Validate your integration against both ClickHouse deployment modes and datasets that exercise ClickHouse's type system at a meaningful scale before you submit it for review. This page defines what "tested" means at the entry level. Formal validation is a separate process for partners progressing to higher partnership tiers.

See [Building integrations](/resources/develop-contribute/integrations/building-integrations) for ingestion and consumption paths, and [Documenting your integration](/resources/develop-contribute/integrations/documenting-your-integration) for how to publish your results.

<h2 id="test-matrix">
  Test matrix
</h2>

Cover both deployment modes. Most customers run one or the other, and behavior differs in places (auth, networking, available features).

* **ClickHouse Cloud:** sign up for a [free trial](https://clickhouse.com/cloud). No credit card is required for the development tier
* **Self-hosted (open source):** use the latest stable release from [GitHub releases](https://github.com/ClickHouse/ClickHouse/releases). The [install guide](/get-started/setup/install) is the fastest path to a local instance with Docker

Test against both, and document any feature gaps in your integration page.

<h2 id="what-to-test">
  What to test
</h2>

**Functional correctness.** Exercise every code path your integration exposes: ingestion, querying, schema discovery, error handling, and reconnection. If your product surfaces SQL to end users, confirm that the queries your UI generates round-trip cleanly.

**Type-system coverage.** ClickHouse supports arrays, tuples, maps, JSON, nested, LowCardinality, Decimal, Date and DateTime variants, UUID, IPv4 and IPv6, enums, and aggregate-function types. Integrations often hit issues with nested arrays, deeply nested tuples, and JSON columns. Your client library and UI should handle these gracefully, and at a minimum, fail with a readable error instead of silently truncating or misrendering.

**Scale.** Test at result-set sizes and row counts your customers will run. For user-facing BI, that often means tables with hundreds of millions to billions of rows, and result sets from single aggregates to tens of thousands of rows. Unbounded reads (`SELECT *`) should fail predictably or paginate, not hang.

**Authentication.** Validate at least one TLS-enabled connection. If you expose auth configuration, test every mode you document (username and password over TLS, mTLS, SSL client certificate).

**Connection lifecycle.** Confirm sane behavior on dropped connections, server restarts, and slow queries. Many escalations trace back to connection handling rather than query semantics.

<h2 id="recommended-example-datasets">
  Recommended example datasets
</h2>

The full set can be found in the [**Example datasets**](/get-started/sample-datasets/index) section. The following four datasets cover most integration testing needs:

* **[GitHub events](/get-started/sample-datasets/github-events):** 3.1B rows with nested event payloads. Best for arrays, tuples, and nested types
* **[NYC taxi data](/get-started/sample-datasets/nyc-taxi):** billions of rows with a well-known schema. Good for throughput and read-path testing
* **[Stack Overflow](/get-started/sample-datasets/stackoverflow):** multi-table relational data for JOIN-heavy BI scenarios
* **[Hacker News](/get-started/sample-datasets/hacker-news):** 28M rows, fast to load, useful for iteration

For extreme-scale validation, use **[WikiStat](/get-started/sample-datasets/wikistat)** (\~0.5 trillion records).

<h2 id="what-to-capture-from-your-testing">
  What to capture from your testing
</h2>

When you submit your integration for review, share:

* ClickHouse versions tested (Cloud and open source)
* Datasets and approximate scale (rows, on-disk size)
* Types your integration handles and types it does not (this becomes the **Known limits** section of your docs)
* Performance characteristics worth flagging, such as result-set thresholds where behavior changes

A short test report saves review cycles. A paragraph plus a table is enough.
