Migration

ParticleDB provides multiple paths for importing data: direct PostgreSQL migration, SQL dump import, the COPY protocol, and CSV/Parquet file loading.

The particledb migrate command connects directly to a running PostgreSQL instance and streams data into ParticleDB.

# Migrate all tables
particledb migrate --source postgres://user:pass@pghost:5432/mydb
# Migrate specific tables with a custom batch size
particledb migrate \
  --source postgres://user:pass@pghost:5432/mydb \
  --target 127.0.0.1:5432 \
  --tables users,orders,products \
  --batch-size 50000

Flag          Default          Description
--source      (required)       PostgreSQL connection URL
--target      127.0.0.1:5432   ParticleDB server address
--tables      (all)            Comma-separated list of tables
--batch-size  10000            Rows per batch

The migrate command:

  1. Reads the source schema (CREATE TABLE statements).
  2. Creates matching tables in ParticleDB.
  3. Streams data in batches via the PG wire protocol.
  4. Reports row counts and elapsed time per table.
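
The batched streaming in step 3 can be sketched in Python; `batch_rows` is an illustrative helper, not part of the ParticleDB CLI:

```python
from itertools import islice

def batch_rows(rows, batch_size=10_000):
    """Yield successive lists of at most batch_size rows from any iterable."""
    it = iter(rows)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch

# Stream 25,000 rows in chunks of 10,000.
batches = [len(b) for b in batch_rows(range(25_000), 10_000)]
# → [10000, 10000, 5000]
```

Each batch is sent as a single insert, which bounds memory use on both ends regardless of table size.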

Create a logical backup of a running ParticleDB instance:

# Dump all tables
particledb dump --output backup.sql
# Dump specific tables
particledb dump --tables users,orders --output partial.sql

Restore a SQL dump into a running ParticleDB instance:

particledb load --input backup.sql

ParticleDB accepts standard pg_dump output in plain SQL format:

# Export from PostgreSQL
pg_dump --format=plain --no-owner --no-acl mydb > dump.sql
# Import into ParticleDB
particledb load --input dump.sql

Use --no-owner and --no-acl to strip PostgreSQL-specific ownership and access control statements. ParticleDB ignores unsupported DDL extensions (sequences, custom types, etc.) and logs warnings for skipped statements.

For large databases, dump and load individual tables:

# Dump one table at a time
pg_dump --format=plain --no-owner --no-acl --table=users mydb > users.sql
pg_dump --format=plain --no-owner --no-acl --table=orders mydb > orders.sql
# Load sequentially
particledb load --input users.sql
particledb load --input orders.sql
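
For scripted migrations of many tables, the dump/load command pairs above can be generated programmatically; `per_table_commands` is a hypothetical helper shown only to illustrate the pattern:

```python
def per_table_commands(dbname, tables):
    """Build (dump, load) shell command pairs, one per table."""
    pairs = []
    for t in tables:
        dump = (f"pg_dump --format=plain --no-owner --no-acl "
                f"--table={t} {dbname} > {t}.sql")
        load = f"particledb load --input {t}.sql"
        pairs.append((dump, load))
    return pairs
```

Running the pairs sequentially keeps peak disk usage to one table's dump at a time.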

ParticleDB supports the PostgreSQL COPY protocol for high-throughput bulk loading. This is the fastest way to import large datasets.

# CSV with a header row
psql -h localhost -p 5432 -c "\copy users FROM 'users.csv' WITH (FORMAT csv, HEADER true)"
# Custom delimiter
psql -h localhost -p 5432 -c "\copy events FROM 'events.tsv' WITH (DELIMITER E'\t')"

Any PostgreSQL client library that supports the COPY protocol works with ParticleDB. Example with Python (psycopg2):

import psycopg2

# Connect over the PostgreSQL wire protocol
conn = psycopg2.connect(host="localhost", port=5432, dbname="particledb")
cur = conn.cursor()
# Stream the file through the server-side COPY command
with open("data.csv", "r") as f:
    cur.copy_expert("COPY users FROM STDIN WITH (FORMAT csv, HEADER true)", f)
conn.commit()
conn.close()

Option     Default                    Description
FORMAT     text                       text, csv, or binary
DELIMITER  tab (text) / comma (csv)   Column separator character
HEADER     false                      First line is a header row (CSV only)
NULL       \N (text) / empty (csv)    String representing NULL values
QUOTE      "                          Quoting character (CSV only)
ESCAPE     "                          Escape character (CSV only)
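
When generating COPY input by hand, the text-format defaults above (tab delimiter, `\N` for NULL) must be escaped consistently; this encoder is an illustrative sketch, not part of any client library:

```python
def encode_copy_text_row(values):
    """Encode one row for COPY text format: tab-delimited, \\N for NULL."""
    out = []
    for v in values:
        if v is None:
            out.append(r"\N")
            continue
        s = str(v)
        # Escape backslash first, then the delimiter and newline characters.
        s = (s.replace("\\", "\\\\")
              .replace("\t", "\\t")
              .replace("\n", "\\n"))
        out.append(s)
    return "\t".join(out)
```

Escaping the backslash before the other characters avoids double-escaping the sequences it just produced.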

For analytical workloads, load Parquet files directly:

-- Load a Parquet file from local disk
COPY hits FROM '/path/to/hits.parquet' WITH (FORMAT parquet);
-- Load from an S3-mounted path
COPY events FROM '/mnt/s3/events.parquet' WITH (FORMAT parquet);

ParticleDB supports a broad set of types. Key mappings from PostgreSQL:

PostgreSQL               ParticleDB       Notes
SERIAL / BIGSERIAL       INT / BIGINT     Use application-generated IDs or UUIDs
VARCHAR(n)               VARCHAR / TEXT   Length limits are accepted but advisory
TIMESTAMP / TIMESTAMPTZ  TIMESTAMP        Stored as microseconds since epoch
JSONB / JSON             JSON             Full JSON query support
BOOLEAN                  BOOLEAN
FLOAT4 / FLOAT8          FLOAT / DOUBLE
NUMERIC / DECIMAL        DECIMAL
UUID                     TEXT             Store as text, index normally
ARRAY                    JSON             Use JSON arrays as an alternative
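
When rewriting DDL by hand, the table above reduces to a simple lookup; `TYPE_MAP` and `map_type` are illustrative names covering only the types that change:

```python
# Illustrative PostgreSQL -> ParticleDB lookup for the types that change.
TYPE_MAP = {
    "SERIAL": "INT",
    "BIGSERIAL": "BIGINT",
    "TIMESTAMPTZ": "TIMESTAMP",
    "JSONB": "JSON",
    "UUID": "TEXT",
}

def map_type(pg_type):
    """Map a PostgreSQL type name; already-supported types pass through."""
    return TYPE_MAP.get(pg_type.upper(), pg_type)
```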

ParticleDB does not support SERIAL columns or SEQUENCE objects. Replace auto-increment patterns with:

  • UUID primary keys: Generate UUIDs in the application.
  • Application-managed counters: Use a counter table with transactions.
  • Timestamp-based IDs: Combine timestamp with random suffix.
-- Instead of SERIAL
CREATE TABLE users (
  id TEXT PRIMARY KEY,  -- UUID generated by application
  name TEXT NOT NULL,
  email TEXT NOT NULL
);
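
The UUID and timestamp-based alternatives can be sketched with the Python standard library; `time_ordered_id` is a hypothetical scheme, not a ParticleDB function:

```python
import os
import time
import uuid

def uuid_id():
    """UUID primary key, generated in the application."""
    return str(uuid.uuid4())

def time_ordered_id():
    """Millisecond timestamp in hex plus a random suffix; sorts by creation time."""
    return f"{int(time.time() * 1000):012x}-{os.urandom(6).hex()}"
```

Time-ordered IDs keep inserts roughly clustered by creation time, which plays well with zone-map pruning on the primary key.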

ParticleDB uses zone maps (min/max metadata per data zone) and optional secondary indexes:

-- Create secondary indexes for frequently filtered columns
CREATE INDEX idx_users_email ON users (email);
CREATE INDEX idx_orders_date ON orders (created_at);
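
The effect of zone maps can be illustrated with a small pruning sketch: any zone whose min/max range cannot contain the predicate value is skipped without reading its rows. This is a conceptual model, not ParticleDB internals:

```python
def prunable_zones(zone_minmax, value):
    """Return indexes of zones that cannot contain `col = value`,
    given per-zone (min, max) metadata, and can therefore be skipped."""
    return [i for i, (lo, hi) in enumerate(zone_minmax)
            if value < lo or value > hi]

# Three zones of a column; only the middle zone can contain 15.
zones = [(0, 9), (10, 19), (20, 29)]
# prunable_zones(zones, 15) → [0, 2]
```

Zone maps are most effective when values correlate with insertion order (timestamps, time-ordered IDs); secondary indexes cover columns where they do not.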

After migrating, verify data integrity:

-- Compare row counts
SELECT COUNT(*) FROM users;
SELECT COUNT(*) FROM orders;
-- Spot-check specific records
SELECT * FROM users WHERE id = 'known-id';
-- Verify aggregates match
SELECT SUM(total) FROM orders;
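
Beyond counts and aggregates, an order-independent checksum lets you compare tables whose rows come back in different scan orders; `table_checksum` is an illustrative client-side helper, not a built-in:

```python
import hashlib

def table_checksum(rows):
    """Order-independent checksum: XOR of per-row SHA-256 digests."""
    acc = 0
    for row in rows:
        digest = hashlib.sha256(repr(row).encode()).digest()
        acc ^= int.from_bytes(digest, "big")
    return acc
```

Run the same helper over the source and target result sets; equal checksums over the same columns strongly suggest the data matches, regardless of row order.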

Run the built-in integrity checker for a thorough check:

particledb verify --data-dir /var/lib/particledb --consistency true

Create a compressed archive of the data directory:

# Create backup
particledb backup create --data-dir /var/lib/particledb --output /backups/pdb_$(date +%Y%m%d).tar.gz
# Validate without restoring
particledb backup validate --input /backups/pdb_20260413.tar.gz
# Restore
particledb backup restore --input /backups/pdb_20260413.tar.gz --data-dir /var/lib/particledb --force
# List available backups
particledb backup list --dir /backups

Snapshots provide faster, incremental backups with WAL-based point-in-time recovery:

# Enable automated snapshots at startup
particledb start \
  --snapshot-enabled \
  --snapshot-interval 3600 \
  --snapshot-max 10 \
  --snapshot-dir /var/lib/particledb/snapshots
# Manual snapshot (database must be stopped, or use --force)
particledb snapshot create --data-dir /var/lib/particledb
# Archive WAL for PITR
particledb snapshot archive-wal --wal-dir /var/lib/particledb/wal
# Restore to a specific point in time
particledb snapshot restore \
  --target-dir /var/lib/particledb-restored \
  --pitr '2026-04-13T10:30:00Z'
# Prune old snapshots
particledb snapshot prune --max-snapshots 5 --retention-days 30
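
The prune policy (cap the snapshot count, discard anything past the retention window) can be modeled as follows; this sketch assumes ages given in days, oldest first, and is not the CLI's actual implementation:

```python
def snapshots_to_delete(ages_days, max_snapshots, retention_days):
    """Return indexes of snapshots to delete, given ages in days (oldest first):
    anything past the retention window, plus the oldest beyond the count cap."""
    drop = {i for i, age in enumerate(ages_days) if age > retention_days}
    keep = [i for i in range(len(ages_days)) if i not in drop]
    # Drop the oldest surviving snapshots until at most max_snapshots remain.
    excess = len(keep) - max_snapshots
    drop.update(keep[:max(excess, 0)])
    return sorted(drop)
```

Applying the retention window before the count cap means an old backlog never crowds out recent snapshots.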

For cross-version or cross-platform portability:

# Export
particledb dump --output full_backup.sql
# Import into a fresh instance
particledb load --input full_backup.sql

  1. Audit source schema: Identify unsupported types, sequences, and extensions.
  2. Map data types: Adjust columns as needed (see type mapping table above).
  3. Remove PostgreSQL-specific DDL: Strip SERIAL, SEQUENCE, custom types, and ownership.
  4. Choose a migration method: particledb migrate for direct transfer, or pg_dump + particledb load for staged migration.
  5. Run the migration: Start with a small table to validate, then migrate the full dataset.
  6. Verify row counts and aggregates: Compare source and target.
  7. Run integrity checks: particledb verify --consistency true.
  8. Set up backups: Enable automated snapshots or schedule periodic backup create jobs.