/ˌsiː-ɛs-ˈviː/

n. “Plain text pretending to be a spreadsheet.”

CSV, or Comma-Separated Values, is a simple text-based file format used to store tabular data. Each line represents a row, and each value within that row is separated by a delimiter — most commonly a comma. Despite its minimalism, CSV is one of the most widely used data interchange formats in computing.

A typical CSV file might represent a table of users, products, or logs. The first line often contains column headers, followed by data rows. Because the format is plain text, it can be created, viewed, and edited with anything from a text editor to a spreadsheet application to a command-line tool.

One reason CSV persists is its universality. Nearly every programming language, database, analytics tool, and spreadsheet application understands CSV. Systems that cannot easily share native formats can almost always agree on CSV as a lowest common denominator.

That simplicity, however, comes with trade-offs. CSV has no built-in data types, schemas, or encoding guarantees. Everything is text. Numbers, dates, booleans, and null values must be interpreted by the consuming system. This flexibility is powerful, but it can also lead to ambiguity and subtle bugs.

Delimiters are another subtle detail. While commas are traditional, some regions and tools use semicolons or tabs to avoid conflicts with decimal separators. Quoting rules allow values to contain commas, line breaks, or quotation marks, but these rules are often implemented inconsistently across software.

In modern data pipelines, CSV is commonly used as an interchange format in ETL workflows. Data may be exported from a database, transformed by scripts, and loaded into analytics platforms such as BigQuery or stored in Cloud Storage. Its lightweight nature makes it ideal for quick transfers and human inspection.

CSV is also favored for audits, reporting, and backups where transparency matters. You can open the file and see the data directly, without specialized tools. This visibility makes it valuable for debugging and verification, even in highly automated systems.

It is important to recognize what CSV is not. It is not self-describing, strongly typed, or optimized for very large datasets. Formats like Parquet or Avro outperform it in scale and structure. Yet CSV endures because it is simple, durable, and unpretentious.

In essence, CSV is data stripped to its bones. No metadata, no ceremony — just rows, columns, and agreement. And in a world full of complex formats, that blunt honesty is often exactly what makes it useful.