> ## Documentation Index
> Fetch the complete documentation index at: https://private-7c7dfe99-mintlify-8c05c8a2.mintlify.site/llms.txt
> Use this file to discover all available pages before exploring further.

> A table engine which provides a table-like interface to SELECT from and INSERT into files, similar to the s3 table function. Use `file` when working with local files, and `s3` when working with buckets in object storage such as S3, GCS, or MinIO.

# file

export const CloudNotSupportedBadge = () => {
  return <div className="cloudNotSupportedBadge">
            <div className="cloudNotSupportedIcon">
            <svg width="16" height="16" viewBox="0 0 16 16" fill="none" xmlns="http://www.w3.org/2000/svg">
                <path strokeWidth="1.5" d="M6.33366 12.6666L12.3739 12.6667C13.6593 12.6667 14.7073 11.6187 14.7073 10.3334C14.7073 9.04804 13.6593 8.00003 12.3739 8.00003C12.3739 8.00003 12.3337 7.66659 12.0003 7.33325M10.667 5.33322C8.00033 2.33325 4.45395 4.78537 4.14195 6.68203C2.55728 6.7627 1.29395 8.06203 1.29395 9.6667C1.29395 11.3234 2.66699 12.6666 4.00033 12.6666" stroke="currentColor" strokeLinecap="round" strokeLinejoin="round" />
                <path strokeWidth="1.5" d="M2.66699 14L12.0003 4.66663" stroke="currentColor" strokeLinecap="round" strokeLinejoin="round" />
            </svg>

        </div>
            Not supported in ClickHouse Cloud
        </div>;
};

export const ExperimentalBadge = () => {
  return <div className="experimentalBadge">
            <div className="experimentalIcon">
            <svg width="16" height="16" viewBox="0 0 16 16" fill="none" xmlns="http://www.w3.org/2000/svg">
                <path strokeWidth="1.25" d="M5.5 2H10.5" stroke="currentColor" strokeLinecap="round" strokeLinejoin="round" />
                <path strokeWidth="1.25" d="M9.50015 2V6.19625L13.4283 12.7425C13.4738 12.8183 13.4985 12.9049 13.4996 12.9934C13.5008 13.0818 13.4785 13.169 13.435 13.246C13.3914 13.323 13.3283 13.3871 13.2519 13.4317C13.1755 13.4764 13.0886 13.4999 13.0002 13.5H3.00015C2.91164 13.5 2.8247 13.4766 2.74822 13.432C2.67174 13.3874 2.60847 13.3233 2.56487 13.2463C2.52126 13.1693 2.49889 13.082 2.50004 12.9935C2.50119 12.905 2.52582 12.8184 2.5714 12.7425L6.50015 6.19625V2" stroke="currentColor" strokeLinecap="round" strokeLinejoin="round" />
                <path strokeWidth="1.25" d="M4.47656 9.56754C5.30344 9.41254 6.47656 9.47942 7.99969 10.25C10.0153 11.2707 11.4216 11.0569 12.2184 10.7282" stroke="currentColor" strokeLinecap="round" strokeLinejoin="round" />
            </svg>
        </div>
            Experimental feature. <u><a href="/docs/beta-and-experimental-features#experimental-features">Learn more.</a></u>
        </div>;
};

# file Table Function

A table engine which provides a table-like interface to SELECT from and INSERT into files, similar to the [s3](/reference/functions/table-functions/s3) table function. Use `file` when working with local files, and `s3` when working with buckets in object storage such as S3, GCS, or MinIO.

The `file` function can be used in `SELECT` and `INSERT` queries to read from or write to files.

<h2 id="syntax">
  Syntax
</h2>

```sql theme={null}
file([path_to_archive ::] path [,format] [,structure] [,compression])
```

For `SELECT` queries, `path` can also be an expression that returns an `Array(String)`:

```sql theme={null}
file(['file1.csv', 'file2.csv'], 'CSV', 'column1 UInt32, column2 UInt32')
```

<h2 id="arguments">
  Arguments
</h2>

| Parameter         | Description                                                                                                                                                                                                                                                                                                                                               |
| ----------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `path`            | The relative path to the file from [user\_files\_path](/reference/settings/server-settings/settings#user_files_path), or an `Array(String)` of paths in `SELECT` queries. Supports in read-only mode the following [globs](#globs-in-path): `*`, `?`, `{abc,def}` (with `'abc'` and `'def'` being strings) and `{N..M}` (with `N` and `M` being numbers). |
| `path_to_archive` | The relative path to a zip/tar/7z archive. Supports the same globs as `path`.                                                                                                                                                                                                                                                                             |
| `format`          | The [format](/reference/formats/index) of the file.                                                                                                                                                                                                                                                                                                       |
| `structure`       | Structure of the table. Format: `'column1_name column1_type, column2_name column2_type, ...'`.                                                                                                                                                                                                                                                            |
| `compression`     | The existing compression type when used in a `SELECT` query, or the desired compression type when used in an `INSERT` query. Supported compression types are `gz`, `br`, `xz`, `zst`, `lz4`, and `bz2`.                                                                                                                                                   |

<Tip>
  When the `structure` argument is omitted, ClickHouse infers the schema from the format itself.
  Different formats produce different default column names and types.
  To see the schema for a specific format, use [`DESC`](/reference/statements/describe-table) with the [`format`](/reference/functions/table-functions/format) table function.

  For example:

  ```sql theme={null}
  DESC format(LineAsString, 'Hello\nWorld')
  ```

  ```response theme={null}
  ┌─name─┬─type───┬─default_type─┬─default_expression─┬─comment─┬─codec_expression─┬─ttl_expression─┐
  │ line │ String │              │                    │         │                  │                │
  └──────┴────────┴──────────────┴────────────────────┴─────────┴──────────────────┴────────────────┘
  ```
</Tip>

<h2 id="returned_value">
  Returned value
</h2>

A table for reading or writing data in a file.

<h2 id="examples-for-writing-to-a-file">
  Examples for Writing to a File
</h2>

<h3 id="write-to-a-tsv-file">
  Write to a TSV file
</h3>

```sql theme={null}
INSERT INTO TABLE FUNCTION
file('test.tsv', 'TSV', 'column1 UInt32, column2 UInt32, column3 UInt32')
VALUES (1, 2, 3), (3, 2, 1), (1, 3, 2)
```

As a result, the data is written into the file `test.tsv`:

```bash theme={null}
# cat /var/lib/clickhouse/user_files/test.tsv
1    2    3
3    2    1
1    3    2
```

<h3 id="partitioned-write-to-multiple-tsv-files">
  Partitioned write to multiple TSV files
</h3>

If you specify a `PARTITION BY` expression when inserting data into a table function of type `file`, then a separate file is created for each partition. Splitting the data into separate files helps to improve performance of read operations.

```sql theme={null}
INSERT INTO TABLE FUNCTION
file('test_{_partition_id}.tsv', 'TSV', 'column1 UInt32, column2 UInt32, column3 UInt32')
PARTITION BY column3
VALUES (1, 2, 3), (3, 2, 1), (1, 3, 2)
```

As a result, the data is written into three files: `test_1.tsv`, `test_2.tsv`, and `test_3.tsv`.

```bash theme={null}
# cat /var/lib/clickhouse/user_files/test_1.tsv
3    2    1

# cat /var/lib/clickhouse/user_files/test_2.tsv
1    3    2

# cat /var/lib/clickhouse/user_files/test_3.tsv
1    2    3
```

<h2 id="examples-for-reading-from-a-file">
  Examples for Reading from a File
</h2>

<h3 id="select-from-a-csv-file">
  SELECT from a CSV file
</h3>

First, set `user_files_path` in the server configuration and prepare a file `test.csv`:

```bash theme={null}
$ grep user_files_path /etc/clickhouse-server/config.xml
    <user_files_path>/var/lib/clickhouse/user_files/</user_files_path>

$ cat /var/lib/clickhouse/user_files/test.csv
    1,2,3
    3,2,1
    78,43,45
```

Then, read data from `test.csv` into a table and select its first two rows:

```sql theme={null}
SELECT * FROM
file('test.csv', 'CSV', 'column1 UInt32, column2 UInt32, column3 UInt32')
LIMIT 2;
```

```text theme={null}
┌─column1─┬─column2─┬─column3─┐
│       1 │       2 │       3 │
│       3 │       2 │       1 │
└─────────┴─────────┴─────────┘
```

<h3 id="inserting-data-from-a-file-into-a-table">
  Inserting data from a file into a table
</h3>

```sql theme={null}
INSERT INTO FUNCTION
file('test.csv', 'CSV', 'column1 UInt32, column2 UInt32, column3 UInt32')
VALUES (1, 2, 3), (3, 2, 1);
```

```sql theme={null}
SELECT * FROM
file('test.csv', 'CSV', 'column1 UInt32, column2 UInt32, column3 UInt32');
```

```text theme={null}
┌─column1─┬─column2─┬─column3─┐
│       1 │       2 │       3 │
│       3 │       2 │       1 │
└─────────┴─────────┴─────────┘
```

Reading data from `table.csv`, located in `archive1.zip` or/and `archive2.zip`:

```sql theme={null}
SELECT * FROM file('user_files/archives/archive{1..2}.zip :: table.csv');
```

<h2 id="globs-in-path">
  Globs in path
</h2>

Paths may use globbing. Files must match the whole path pattern, not only the suffix or prefix. There is one exception that if the path refers to an existing
directory and does not use globs, a `*` will be implicitly added to the path so
all the files in the directory are selected.

* `*` — Represents arbitrarily many characters except `/` but including the empty string.
* `?` — Represents an arbitrary single character.
* `{some_string,another_string,yet_another_one}` — Substitutes any of strings `'some_string', 'another_string', 'yet_another_one'`. The strings can contain the `/` symbol.
* `{N..M}` — Represents any number `>= N` and `<= M`.
* `**` - Represents all files inside a folder recursively.

Constructions with `{}` are similar to the [remote](/reference/functions/table-functions/remote) and [hdfs](/reference/functions/table-functions/hdfs) table functions.

<h2 id="examples">
  Examples
</h2>

**Example**

Suppose there are these files with the following relative paths:

* `some_dir/some_file_1`
* `some_dir/some_file_2`
* `some_dir/some_file_3`
* `another_dir/some_file_1`
* `another_dir/some_file_2`
* `another_dir/some_file_3`

Query the total number of rows in all files:

```sql theme={null}
SELECT count(*) FROM file('{some,another}_dir/some_file_{1..3}', 'TSV', 'name String, value UInt32');
```

An alternative path expression which achieves the same:

```sql theme={null}
SELECT count(*) FROM file('{some,another}_dir/*', 'TSV', 'name String, value UInt32');
```

Query the total number of rows in `some_dir` using the implicit `*`:

```sql theme={null}
SELECT count(*) FROM file('some_dir', 'TSV', 'name String, value UInt32');
```

<Note>
  If your listing of files contains number ranges with leading zeros, use the construction with braces for each digit separately or use `?`.
</Note>

**Example**

Query the total number of rows in files named `file000`, `file001`, ... , `file999`:

```sql theme={null}
SELECT count(*) FROM file('big_dir/file{0..9}{0..9}{0..9}', 'CSV', 'name String, value UInt32');
```

**Example**

Query the total number of rows from all files inside directory `big_dir/` recursively:

```sql theme={null}
SELECT count(*) FROM file('big_dir/**', 'CSV', 'name String, value UInt32');
```

**Example**

Query the total number of rows from all files `file002` inside any folder in directory `big_dir/` recursively:

```sql theme={null}
SELECT count(*) FROM file('big_dir/**/file002', 'CSV', 'name String, value UInt32');
```

<h2 id="virtual-columns">
  Virtual Columns
</h2>

* `_path` — Path to the file. Type: `LowCardinality(String)`.
* `_file` — Name of the file. Type: `LowCardinality(String)`.
* `_size` — Size of the file in bytes. Type: `Nullable(UInt64)`. If the file size is unknown, the value is `NULL`.
* `_time` — Last modified time of the file. Type: `Nullable(DateTime)`. If the time is unknown, the value is `NULL`.

<h2 id="hive-style-partitioning">
  use\_hive\_partitioning setting
</h2>

When setting `use_hive_partitioning` is set to 1, ClickHouse will detect Hive-style partitioning in the path (`/name=value/`) and will allow to use partition columns as virtual columns in the query. These virtual columns will have the same names as in the partitioned path.

**Example**

Use virtual column, created with Hive-style partitioning

```sql theme={null}
SELECT * FROM file('data/path/date=*/country=*/code=*/*.parquet') WHERE date > '2020-01-01' AND country = 'Netherlands' AND code = 42;
```

<h2 id="settings">
  Settings
</h2>

| Setting                                                                                                                      | Description                                                                                                                                                                 |
| ---------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| [engine\_file\_empty\_if\_not\_exists](/reference/settings/session-settings#engine_file_empty_if_not_exists)                 | allows to select empty data from a file that doesn't exist. Disabled by default.                                                                                            |
| [engine\_file\_truncate\_on\_insert](/reference/settings/session-settings#engine_file_truncate_on_insert)                    | allows to truncate file before insert into it. Disabled by default.                                                                                                         |
| [engine\_file\_allow\_create\_multiple\_files](/reference/settings/session-settings#engine_file_allow_create_multiple_files) | allows to create a new file on each insert if format has suffix. Disabled by default.                                                                                       |
| [engine\_file\_skip\_empty\_files](/reference/settings/session-settings#engine_file_skip_empty_files)                        | allows to skip empty files while reading. Disabled by default.                                                                                                              |
| [storage\_file\_read\_method](/reference/settings/session-settings#engine_file_empty_if_not_exists)                          | method of reading data from storage file, one of: read, pread, mmap (only for clickhouse-local). Default value: `pread` for clickhouse-server, `mmap` for clickhouse-local. |

<h2 id="related">
  Related
</h2>

* [Virtual columns](/reference/engines/table-engines/index#table_engines-virtual_columns)
* [Rename files after processing](/reference/settings/session-settings#rename_files_after_processing)
