format

The format block enables you to define source formats for tables and sources. Formats describe the layout of the source data so that it can be collected into a table.

Tip

Use backticks (`) to delimit the layout. Tailpipe treats anything in backticks as a non-interpolated string, so you don't have to escape quotes, backslashes, etc.

Formats

You can define a format with the format block:
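As a minimal sketch, a format definition might look like the following. The format name `web_request` and the layout itself are hypothetical and chosen only for illustration; `IP`, `WORD`, and `URIPATHPARAM` are standard Grok patterns.

```hcl
# First label: the format type. Second label: the name of this format instance.
format "grok" "web_request" {
  # Backticks delimit the layout, so nothing inside needs escaping.
  layout = `%{IP:client_ip} %{WORD:method} %{URIPATHPARAM:path}`
}
```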

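Format blocks have two labels: the format type (for example, grok or regex), followed by a name for the format instance that you are defining.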

Plugins may also export preset formats, which may be referenced by name. For example, the Nginx plugin provides the nginx_access_log.combined format, which defines the Nginx default combined log format:
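A preset format is referenced in the form format.TYPE.NAME. The sketch below assumes that a file source accepts a `format` argument and that the Nginx plugin is installed; the partition name `my_logs`, the path, and the file layout are hypothetical.

```hcl
partition "nginx_access_log" "my_logs" {
  source "file" {
    # Reference the plugin-provided preset by name instead of defining a format.
    format      = format.nginx_access_log.combined
    paths       = ["/var/log/nginx"]   # hypothetical log directory
    file_layout = `access.log`         # hypothetical file name pattern
  }
}
```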

You can list and view details of both your custom formats and the plugin preset formats with the tailpipe format list and tailpipe format show introspection commands.

Format Types

The format type defines the parsing mechanism which should be used. The properties of the format are specific to the format type.

Format types are implemented by plugins. A number of "generic" format types are provided by the core plugin, which is included in every Tailpipe installation. These core format types provide a mechanism for describing file layouts using general-purpose syntax such as regular expressions, Grok, and JSONL.

Any plugin may include a format type to simplify describing the layout of log files specific to the plugin using its "native" syntax. For example, the Nginx plugin provides the nginx_access_log format type. When using the nginx_access_log format, you can specify the layout using the same Nginx log_format as you use in your Nginx configuration files:
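For instance, a custom Nginx layout could be sketched as below. The format name `with_request_time` is hypothetical; the layout is the standard combined log format with the `$request_time` variable appended, written exactly as it would appear in an Nginx `log_format` directive.

```hcl
format "nginx_access_log" "with_request_time" {
  # Same variables you would use in log_format in nginx.conf.
  layout = `$remote_addr - $remote_user [$time_local] "$request" $status $body_bytes_sent "$http_referer" "$http_user_agent" $request_time`
}
```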

You can discover the installed format types with the tailpipe format list introspection command.

Core Plugin Formats

Grok Format

The grok format parses log lines using Grok patterns: named, reusable regular expressions that extract fields from each line into structured data.
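A minimal sketch of a grok format follows. The format name `app_log`, the log layout, and the custom `REQUEST_ID` pattern are all hypothetical; `TIMESTAMP_ISO8601`, `LOGLEVEL`, and `GREEDYDATA` are standard Grok patterns.

```hcl
format "grok" "app_log" {
  description = "Hypothetical application log: timestamp, level, request id, message"

  # Matches lines such as:
  #   2024-05-01T12:00:00Z [INFO] req-1a2b3c4d user logged in
  layout = `%{TIMESTAMP_ISO8601:timestamp} \[%{LOGLEVEL:level}\] %{REQUEST_ID:request_id} %{GREEDYDATA:message}`

  # Custom patterns referenced from the layout (assumed shape: name = regular expression).
  patterns = {
    REQUEST_ID = "req-[0-9a-f]{8}"
  }
}
```

The `patterns` map is only needed when the standard patterns do not cover a field; the layout can otherwise stand alone.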

| Argument | Type | Optional? | Description |
|----------|------|-----------|-------------|
| layout | String | Required | The Grok pattern that defines how to parse the log line |
| patterns | Map | Optional | A map of custom Grok patterns that can be referenced in the layout. This is optional, and the standard patterns are available out-of-the-box. |
| description | String | Optional | A description of the format |

Tip

Use the Grok Debugger to help create and test your grok expressions.

Regex Format

The regex format is used to parse log lines using regular expressions with named capture groups.
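A minimal sketch of a regex format follows. The format name `app_log` and the log layout are hypothetical, and the `(?P<name>...)` named-group syntax is an assumption based on RE2-style regular expressions.

```hcl
format "regex" "app_log" {
  description = "Hypothetical application log: timestamp, level, message"

  # Each named capture group corresponds to one extracted field.
  # Matches lines such as:
  #   2024-05-01 12:00:00 INFO user logged in
  layout = `^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (?P<level>[A-Z]+) (?P<message>.*)$`
}
```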

| Argument | Type | Optional? | Description |
|----------|------|-----------|-------------|
| layout | String | Required | The regular expression pattern with named capture groups |
| description | String | Optional | A description of the format |

Tip

Use RegEx 101 to help create and test your regular expressions.

Delimited Format

The delimited format is used for parsing CSV, TSV, and other delimited file formats. The properties are passed directly to DuckDB, which implements the delimited data parsing.

| Argument | Type | Optional? | Description |
|----------|------|-----------|-------------|
| delimiter | String | Optional | The character that separates columns |
| header | Boolean | Optional | Whether the file contains a header row |
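
Using the two arguments above, a comma-separated file with a header row could be described as in this sketch. The format name `csv_with_header` is hypothetical.

```hcl
format "delimited" "csv_with_header" {
  delimiter = ","    # columns are separated by commas
  header    = true   # the first row holds column names
}
```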

JSONL Format

The jsonl (JSON Lines) format is used to parse JSON data where each line is a valid JSON object.

Example:
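
A minimal sketch of a jsonl format definition; the format name `app_events` and the description text are hypothetical.

```hcl
format "jsonl" "app_events" {
  # description is the only argument this format type accepts.
  description = "Application events, one JSON object per line"
}
```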

| Argument | Type | Optional? | Description |
|----------|------|-----------|-------------|
| description | String | Optional | A description of the format |

Tip

Since the jsonl format type has no arguments other than the description, you may want to use the default format (format.jsonl.default) instead of defining your own.