Migration

ParticleDB provides multiple paths for importing data: direct PostgreSQL migration, SQL dump import, the COPY protocol, and CSV/Parquet file loading.

The particledb migrate command connects directly to a running PostgreSQL instance and streams data into ParticleDB.

# Migrate all tables
particledb migrate --source postgres://user:pass@pghost:5432/mydb
# Migrate specific tables with a custom batch size
particledb migrate \
  --source postgres://user:pass@pghost:5432/mydb \
  --target 127.0.0.1:5432 \
  --tables users,orders,products \
  --batch-size 50000

Flag          Default          Description
--source      (required)       PostgreSQL connection URL
--target      127.0.0.1:5432   ParticleDB server address
--tables      (all)            Comma-separated list of tables
--batch-size  10000            Rows per batch

The migrate command:

  1. Reads the source schema (CREATE TABLE statements).
  2. Creates matching tables in ParticleDB.
  3. Streams data in batches via the PG wire protocol.
  4. Reports row counts and elapsed time per table.
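
The batched streaming in step 3 can be sketched in Python; `batch_rows` is an illustrative helper, not part of the ParticleDB CLI:

```python
from itertools import islice

def batch_rows(rows, batch_size=10_000):
    """Yield successive lists of at most batch_size rows from any iterable."""
    it = iter(rows)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch

# Stream 25,000 rows in chunks of 10,000.
batches = [len(b) for b in batch_rows(range(25_000), 10_000)]
# → [10000, 10000, 5000]
```

Each batch is sent as a single insert, which bounds memory use on both ends regardless of table size.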

Create a logical backup of a running ParticleDB instance:

# Dump all tables
particledb dump --output backup.sql
# Dump specific tables
particledb dump --tables users,orders --output partial.sql

Restore a SQL dump into a running ParticleDB instance:

particledb load --input backup.sql

ParticleDB accepts standard pg_dump output in plain SQL format:

# Export from PostgreSQL
pg_dump --format=plain --no-owner --no-acl mydb > dump.sql
# Import into ParticleDB
particledb load --input dump.sql

Use --no-owner and --no-acl to strip PostgreSQL-specific ownership and access control statements. ParticleDB ignores unsupported DDL extensions (sequences, custom types, etc.) and logs warnings for skipped statements.

For large databases, dump and load individual tables:

# Dump one table at a time
pg_dump --format=plain --no-owner --no-acl --table=users mydb > users.sql
pg_dump --format=plain --no-owner --no-acl --table=orders mydb > orders.sql
# Load sequentially
particledb load --input users.sql
particledb load --input orders.sql
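
For scripted migrations of many tables, the dump/load command pairs above can be generated programmatically; `per_table_commands` is a hypothetical helper shown only to illustrate the pattern:

```python
def per_table_commands(dbname, tables):
    """Build (dump, load) shell command pairs, one per table."""
    pairs = []
    for t in tables:
        dump = (f"pg_dump --format=plain --no-owner --no-acl "
                f"--table={t} {dbname} > {t}.sql")
        load = f"particledb load --input {t}.sql"
        pairs.append((dump, load))
    return pairs
```

Running the pairs sequentially keeps peak disk usage to one table's dump at a time.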

ParticleDB supports the PostgreSQL COPY protocol for high-throughput bulk loading. This is the fastest way to import large datasets.

# CSV with a header row
psql -h localhost -p 5432 -c "\copy users FROM 'users.csv' WITH (FORMAT csv, HEADER true)"
# Custom delimiter
psql -h localhost -p 5432 -c "\copy events FROM 'events.tsv' WITH (DELIMITER E'\t')"

Any PostgreSQL client library that supports the COPY protocol works with ParticleDB. Example with Python (psycopg2):

import psycopg2

# Connect over the PostgreSQL wire protocol
conn = psycopg2.connect(host="localhost", port=5432, dbname="particledb")
cur = conn.cursor()
# Stream the file through the server-side COPY command
with open("data.csv", "r") as f:
    cur.copy_expert("COPY users FROM STDIN WITH (FORMAT csv, HEADER true)", f)
conn.commit()
conn.close()

Option     Default                    Description
FORMAT     text                       text, csv, or binary
DELIMITER  tab (text) / comma (csv)   Column separator character
HEADER     false                      First line is a header row (CSV only)
NULL       \N (text) / empty (csv)    String representing NULL values
QUOTE      "                          Quoting character (CSV only)
ESCAPE     "                          Escape character (CSV only)
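
When generating COPY input by hand, the text-format defaults above (tab delimiter, `\N` for NULL) must be escaped consistently; this encoder is an illustrative sketch, not part of any client library:

```python
def encode_copy_text_row(values):
    """Encode one row for COPY text format: tab-delimited, \\N for NULL."""
    out = []
    for v in values:
        if v is None:
            out.append(r"\N")
            continue
        s = str(v)
        # Escape backslash first, then the delimiter and newline characters.
        s = (s.replace("\\", "\\\\")
              .replace("\t", "\\t")
              .replace("\n", "\\n"))
        out.append(s)
    return "\t".join(out)
```

Escaping the backslash before the other characters avoids double-escaping the sequences it just produced.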

For analytical workloads, load Parquet files directly:

-- Load a Parquet file from local disk
COPY hits FROM '/path/to/hits.parquet' WITH (FORMAT parquet);
-- Load from an S3-mounted path
COPY events FROM '/mnt/s3/events.parquet' WITH (FORMAT parquet);

ParticleDB supports a broad set of types. Key mappings from PostgreSQL:

PostgreSQL               ParticleDB       Notes
SERIAL / BIGSERIAL       INT / BIGINT     Use application-generated IDs or UUIDs
VARCHAR(n)               VARCHAR / TEXT   Length limits are accepted but advisory
TIMESTAMP / TIMESTAMPTZ  TIMESTAMP        Stored as microseconds since epoch
JSONB / JSON             JSON             Full JSON query support
BOOLEAN                  BOOLEAN
FLOAT4 / FLOAT8          FLOAT / DOUBLE
NUMERIC / DECIMAL        DECIMAL
UUID                     TEXT             Store as text, index normally
ARRAY                    JSON             Use JSON arrays as an alternative
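
When rewriting DDL by hand, the table above reduces to a simple lookup; `TYPE_MAP` and `map_type` are illustrative names covering only the types that change:

```python
# Illustrative PostgreSQL -> ParticleDB lookup for the types that change.
TYPE_MAP = {
    "SERIAL": "INT",
    "BIGSERIAL": "BIGINT",
    "TIMESTAMPTZ": "TIMESTAMP",
    "JSONB": "JSON",
    "UUID": "TEXT",
}

def map_type(pg_type):
    """Map a PostgreSQL type name; already-supported types pass through."""
    return TYPE_MAP.get(pg_type.upper(), pg_type)
```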

ParticleDB does not support SERIAL columns or SEQUENCE objects. Replace auto-increment patterns with:

  • UUID primary keys: Generate UUIDs in the application.
  • Application-managed counters: Use a counter table with transactions.
  • Timestamp-based IDs: Combine timestamp with random suffix.
-- Instead of SERIAL
CREATE TABLE users (
  id TEXT PRIMARY KEY,  -- UUID generated by application
  name TEXT NOT NULL,
  email TEXT NOT NULL
);
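
The UUID and timestamp-based alternatives can be sketched with the Python standard library; `time_ordered_id` is a hypothetical scheme, not a ParticleDB function:

```python
import os
import time
import uuid

def uuid_id():
    """UUID primary key, generated in the application."""
    return str(uuid.uuid4())

def time_ordered_id():
    """Millisecond timestamp in hex plus a random suffix; sorts by creation time."""
    return f"{int(time.time() * 1000):012x}-{os.urandom(6).hex()}"
```

Time-ordered IDs keep inserts roughly clustered by creation time, which plays well with zone-map pruning on the primary key.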

ParticleDB uses zone maps (min/max metadata per data zone) and optional secondary indexes:

-- Create secondary indexes for frequently filtered columns
CREATE INDEX idx_users_email ON users (email);
CREATE INDEX idx_orders_date ON orders (created_at);
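
The effect of zone maps can be illustrated with a small pruning sketch: any zone whose min/max range cannot contain the predicate value is skipped without reading its rows. This is a conceptual model, not ParticleDB internals:

```python
def prunable_zones(zone_minmax, value):
    """Return indexes of zones that cannot contain `col = value`,
    given per-zone (min, max) metadata, and can therefore be skipped."""
    return [i for i, (lo, hi) in enumerate(zone_minmax)
            if value < lo or value > hi]

# Three zones of a column; only the middle zone can contain 15.
zones = [(0, 9), (10, 19), (20, 29)]
# prunable_zones(zones, 15) → [0, 2]
```

Zone maps are most effective when values correlate with insertion order (timestamps, time-ordered IDs); secondary indexes cover columns where they do not.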

After migrating, verify data integrity:

-- Compare row counts
SELECT COUNT(*) FROM users;
SELECT COUNT(*) FROM orders;
-- Spot-check specific records
SELECT * FROM users WHERE id = 'known-id';
-- Verify aggregates match
SELECT SUM(total) FROM orders;
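
Beyond counts and aggregates, an order-independent checksum lets you compare tables whose rows come back in different scan orders; `table_checksum` is an illustrative client-side helper, not a built-in:

```python
import hashlib

def table_checksum(rows):
    """Order-independent checksum: XOR of per-row SHA-256 digests."""
    acc = 0
    for row in rows:
        digest = hashlib.sha256(repr(row).encode()).digest()
        acc ^= int.from_bytes(digest, "big")
    return acc
```

Run the same helper over the source and target result sets; equal checksums over the same columns strongly suggest the data matches, regardless of row order.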

Run the built-in integrity checker for a thorough check:

particledb verify --data-dir /var/lib/particledb --consistency true

Create a compressed archive of the data directory:

# Create backup
particledb backup create --data-dir /var/lib/particledb --output /backups/pdb_$(date +%Y%m%d).tar.gz
# Validate without restoring
particledb backup validate --input /backups/pdb_20260413.tar.gz
# Restore
particledb backup restore --input /backups/pdb_20260413.tar.gz --data-dir /var/lib/particledb --force
# List available backups
particledb backup list --dir /backups

Snapshots provide faster, incremental backups with WAL-based point-in-time recovery:

# Enable automated snapshots at startup
particledb start \
  --snapshot-enabled \
  --snapshot-interval 3600 \
  --snapshot-max 10 \
  --snapshot-dir /var/lib/particledb/snapshots
# Manual snapshot (database must be stopped, or use --force)
particledb snapshot create --data-dir /var/lib/particledb
# Archive WAL for PITR
particledb snapshot archive-wal --wal-dir /var/lib/particledb/wal
# Restore to a specific point in time
particledb snapshot restore \
  --target-dir /var/lib/particledb-restored \
  --pitr '2026-04-13T10:30:00Z'
# Prune old snapshots
particledb snapshot prune --max-snapshots 5 --retention-days 30
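
The prune policy (cap the snapshot count, discard anything past the retention window) can be modeled as follows; this sketch assumes ages given in days, oldest first, and is not the CLI's actual implementation:

```python
def snapshots_to_delete(ages_days, max_snapshots, retention_days):
    """Return indexes of snapshots to delete, given ages in days (oldest first):
    anything past the retention window, plus the oldest beyond the count cap."""
    drop = {i for i, age in enumerate(ages_days) if age > retention_days}
    keep = [i for i in range(len(ages_days)) if i not in drop]
    # Drop the oldest surviving snapshots until at most max_snapshots remain.
    excess = len(keep) - max_snapshots
    drop.update(keep[:max(excess, 0)])
    return sorted(drop)
```

Applying the retention window before the count cap means an old backlog never crowds out recent snapshots.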

For cross-version or cross-platform portability:

# Export
particledb dump --output full_backup.sql
# Import into a fresh instance
particledb load --input full_backup.sql

  1. Audit source schema: Identify unsupported types, sequences, and extensions.
  2. Map data types: Adjust columns as needed (see type mapping table above).
  3. Remove PostgreSQL-specific DDL: Strip SERIAL, SEQUENCE, custom types, and ownership.
  4. Choose a migration method: particledb migrate for direct transfer, or pg_dump + particledb load for staged migration.
  5. Run the migration: Start with a small table to validate, then migrate the full dataset.
  6. Verify row counts and aggregates: Compare source and target.
  7. Run integrity checks: particledb verify --consistency true.
  8. Set up backups: Enable automated snapshots or schedule periodic backup create jobs.