Optimizing PostgreSQL Foreign Data Wrappers for Multi-Database Integration
Use PostgreSQL FDWs to query remote databases safely, tune pushdown, and avoid common federation performance traps.
Optimizing PostgreSQL Foreign Data Wrappers for Multi-Database Integration
PostgreSQL Foreign Data Wrappers help when your data lives in more than one place and you need to query it without copying everything first. FDWs can make remote tables look local, but performance and transaction behavior depend heavily on the wrapper, network, and query shape.
What are Foreign Data Wrappers?
FDW implements the SQL/MED (Management of External Data) standard, providing:
- Transparent access: Query remote data using standard SQL
- Virtual tables: No data duplication required
- Live remote reads: Queries read from the remote source instead of a local copy
- Join capability: Combine local and remote data
- Write operations: Some wrappers support
INSERT,UPDATE, andDELETE
Common Use Cases
- Database Migration: Access legacy systems during migration
- Multi-tenant Architecture: Query across tenant databases
- Data Aggregation: Combine data from multiple sources
- Hybrid Cloud: Connect on-premise and cloud databases
- Reporting: Create centralized reporting database
- Microservices: Query across service boundaries
Available Foreign Data Wrappers
postgresql_fdw (Built-in)
- Connect to other PostgreSQL databases
- Full feature support
- Best performance
mysql_fdw
- Connect to MySQL/MariaDB
- Read and write support
oracle_fdw
- Connect to Oracle databases
- Commonly used through the open-source
oracle_fdwextension, with Oracle client libraries installed separately
mongodb_fdw
- Access MongoDB collections
- JSON data integration
file_fdw (Built-in)
- Read CSV and text files
- Log file analysis
Others
- redis_fdw, sqlite_fdw, couchdb_fdw, etc.
Setting Up postgres_fdw
Step 1: Install Extension
-- On the local database
CREATE EXTENSION postgres_fdw;
Step 2: Create Foreign Server
CREATE SERVER remote_db
FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (
host 'remote-db.example.com',
port '5432',
dbname 'production_db',
fetch_size '10000'
);
Step 3: Create User Mapping
-- Map local users to remote credentials
CREATE USER MAPPING FOR postgres
SERVER remote_db
OPTIONS (
user 'remote_user',
password 'secure_password'
);
-- Map all local users
CREATE USER MAPPING FOR PUBLIC
SERVER remote_db
OPTIONS (
user 'readonly_user',
password 'readonly_pass'
);
Step 4: Create Foreign Tables
Manual table definition:
CREATE FOREIGN TABLE remote_users (
id INTEGER,
username VARCHAR(50),
email VARCHAR(100),
created_at TIMESTAMP
)
SERVER remote_db
OPTIONS (schema_name 'public', table_name 'users');
Import entire schema:
IMPORT FOREIGN SCHEMA public
FROM SERVER remote_db
INTO local_schema;
Import specific tables:
IMPORT FOREIGN SCHEMA public
LIMIT TO (users, orders, products)
FROM SERVER remote_db
INTO public;
Step 5: Query Foreign Data
-- Simple query
SELECT * FROM remote_users WHERE created_at > NOW() - INTERVAL '7 days';
-- Join local and remote data
SELECT
l.order_id,
l.amount,
r.username,
r.email
FROM local_orders l
JOIN remote_users r ON l.user_id = r.id
WHERE l.status = 'pending';
Advanced Configuration
Performance Tuning Options
CREATE SERVER fast_remote
FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (
host 'db.example.com',
port '5432',
dbname 'mydb',
-- Fetch size: rows retrieved per round trip
fetch_size '50000',
-- Enable query pushdown optimizations
use_remote_estimate 'true',
-- Reuse foreign server connections inside PostgreSQL sessions
keep_connections 'on',
-- Parallel query execution
parallel_commit 'on',
parallel_abort 'on'
);
Table-Level Options
CREATE FOREIGN TABLE optimized_table (
id INTEGER,
data TEXT
)
SERVER remote_db
OPTIONS (
schema_name 'public',
table_name 'large_table',
-- Override fetch size for this table
fetch_size '100000',
-- Use remote estimates for better query planning
use_remote_estimate 'true'
);
Column-Level Options
CREATE FOREIGN TABLE mapped_columns (
local_id INTEGER OPTIONS (column_name 'remote_user_id'),
local_name TEXT OPTIONS (column_name 'remote_username')
)
SERVER remote_db
OPTIONS (table_name 'users');
Write Operations
Enable Write Operations
-- Grant permissions
GRANT INSERT, UPDATE, DELETE ON remote_users TO app_user;
-- Insert data
INSERT INTO remote_users (username, email)
VALUES ('newuser', '[email protected]');
-- Update data
UPDATE remote_users
SET email = '[email protected]'
WHERE username = 'newuser';
-- Delete data
DELETE FROM remote_users WHERE username = 'newuser';
Transactions Across Databases
BEGIN;
-- Insert into local table
INSERT INTO local_orders (user_id, amount) VALUES (123, 99.99);
-- Update remote table
UPDATE remote_users SET last_order = NOW() WHERE id = 123;
-- Both operations committed together
COMMIT;
PostgreSQL coordinates remote work through the FDW, but you should not assume every wrapper gives you full distributed transaction guarantees. For postgres_fdw, review the two_phase_commit server option if you need two-phase commit behavior, and test crash recovery before relying on it for critical writes.
Multi-Database Federation
Connecting Multiple Databases
-- Connect to production database
CREATE SERVER prod_db
FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (host 'prod.example.com', dbname 'production');
CREATE USER MAPPING FOR PUBLIC SERVER prod_db
OPTIONS (user 'readonly', password 'pass1');
-- Connect to analytics database
CREATE SERVER analytics_db
FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (host 'analytics.example.com', dbname 'warehouse');
CREATE USER MAPPING FOR PUBLIC SERVER analytics_db
OPTIONS (user 'readonly', password 'pass2');
-- Connect to archive database
CREATE SERVER archive_db
FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (host 'archive.example.com', dbname 'historical');
CREATE USER MAPPING FOR PUBLIC SERVER archive_db
OPTIONS (user 'readonly', password 'pass3');
Unified Views Across Databases
-- Import tables
IMPORT FOREIGN SCHEMA public
LIMIT TO (current_orders)
FROM SERVER prod_db INTO public;
IMPORT FOREIGN SCHEMA public
LIMIT TO (archived_orders)
FROM SERVER archive_db INTO public;
-- Create unified view
CREATE VIEW all_orders AS
SELECT *, 'current' AS source FROM current_orders
UNION ALL
SELECT *, 'archived' AS source FROM archived_orders;
-- Query across all databases
SELECT
source,
COUNT(*) AS order_count,
SUM(amount) AS total_amount
FROM all_orders
WHERE created_at > '2024-01-01'
GROUP BY source;
Connecting to MySQL
Install mysql_fdw
# Package names vary by PostgreSQL version and repository.
# On Debian/Ubuntu, search first:
apt search mysql-fdw
# Then install the package that matches your PostgreSQL major version.
sudo apt install postgresql-16-mysql-fdw
Configure MySQL Connection
CREATE EXTENSION mysql_fdw;
CREATE SERVER mysql_server
FOREIGN DATA WRAPPER mysql_fdw
OPTIONS (
host 'mysql-db.example.com',
port '3306'
);
CREATE USER MAPPING FOR PUBLIC
SERVER mysql_server
OPTIONS (
username 'mysql_user',
password 'mysql_pass'
);
CREATE FOREIGN TABLE mysql_products (
id INTEGER,
name VARCHAR(100),
price DECIMAL(10,2)
)
SERVER mysql_server
OPTIONS (dbname 'ecommerce', table_name 'products');
-- Query MySQL data from PostgreSQL
SELECT * FROM mysql_products WHERE price > 100;
Performance Optimization
1. Use WHERE Clause Pushdown
PostgreSQL pushes filters to remote server:
-- Good: Filter applied remotely
SELECT * FROM remote_users WHERE country = 'US';
-- Check query plan
EXPLAIN (VERBOSE) SELECT * FROM remote_users WHERE country = 'US';
-- Look for: "Remote SQL: SELECT ... WHERE country = 'US'"
2. Limit Data Transfer
-- Bad: Fetches all columns
SELECT * FROM remote_large_table;
-- Good: Only needed columns
SELECT id, username FROM remote_large_table;
-- Good: Use LIMIT
SELECT * FROM remote_large_table LIMIT 1000;
3. Optimize Fetch Size
-- For large result sets
ALTER SERVER remote_db OPTIONS (SET fetch_size '50000');
-- For small, frequent queries
ALTER SERVER remote_db OPTIONS (SET fetch_size '1000');
4. Use Remote Estimates
ALTER SERVER remote_db OPTIONS (ADD use_remote_estimate 'true');
-- PostgreSQL will query remote EXPLAIN for accurate cost estimates
5. Index on Remote Tables
-- Ensure remote tables have appropriate indexes
-- Connect directly to remote database:
CREATE INDEX idx_users_country ON users(country);
CREATE INDEX idx_orders_created_at ON orders(created_at);
6. Materialized Views for Frequently Accessed Data
-- Cache remote data locally
CREATE MATERIALIZED VIEW cached_remote_data AS
SELECT * FROM remote_users WHERE active = true;
-- Refresh periodically
REFRESH MATERIALIZED VIEW cached_remote_data;
-- Auto-refresh with pg_cron if that extension is installed
CREATE EXTENSION pg_cron;
SELECT cron.schedule('refresh-cache', '0 */6 * * *',
'REFRESH MATERIALIZED VIEW cached_remote_data');
Monitoring and Troubleshooting
View Foreign Servers
SELECT
srvname AS server_name,
srvoptions AS options
FROM pg_foreign_server;
View Foreign Tables
SELECT
foreign_table_schema,
foreign_table_name,
foreign_server_name
FROM information_schema.foreign_tables;
Check User Mappings
SELECT
um.umuser::regrole AS local_user,
fs.srvname AS foreign_server,
um.umoptions AS options
FROM pg_user_mapping um
JOIN pg_foreign_server fs ON um.umserver = fs.oid;
Analyze Query Performance
EXPLAIN (ANALYZE, VERBOSE, BUFFERS)
SELECT * FROM remote_users WHERE created_at > NOW() - INTERVAL '1 day';
Connection Check
Use EXPLAIN (ANALYZE, VERBOSE) to confirm remote SQL and timing. For PostgreSQL-to-PostgreSQL links, also check the remote server's pg_stat_activity while a long FDW query is running.
Common Issues and Solutions
Issue 1: Connection Timeout
Error: could not connect to server
Solutions:
-- Add connection timeout
ALTER SERVER remote_db OPTIONS (ADD connect_timeout '10');
-- Check network connectivity
-- From shell:
psql -h remote-db.example.com -U user -d dbname
Issue 2: Slow Queries
Solutions:
-- Enable remote estimates
ALTER SERVER remote_db OPTIONS (ADD use_remote_estimate 'true');
-- Increase fetch size
ALTER SERVER remote_db OPTIONS (SET fetch_size '100000');
-- Check if WHERE clause is pushed down
EXPLAIN VERBOSE SELECT * FROM remote_table WHERE id > 1000;
Issue 3: Authentication Failure
Error: password authentication failed
Solutions:
-- Update user mapping
ALTER USER MAPPING FOR postgres
SERVER remote_db
OPTIONS (SET password 'new_password');
-- Check pg_hba.conf on remote server
-- Ensure it allows connections from local server
Issue 4: Column Type Mismatch
Error: column type mismatch
Solution:
-- Explicitly cast columns
CREATE FOREIGN TABLE remote_data (
id INTEGER,
data TEXT,
created_at TIMESTAMP
)
SERVER remote_db
OPTIONS (table_name 'source_table');
-- Or import with correct types
DROP FOREIGN TABLE remote_data;
IMPORT FOREIGN SCHEMA public
LIMIT TO (source_table)
FROM SERVER remote_db INTO public;
Security Considerations
1. Use Read-Only Accounts
-- On remote database, create limited user
CREATE ROLE fdw_readonly;
GRANT CONNECT ON DATABASE production_db TO fdw_readonly;
GRANT USAGE ON SCHEMA public TO fdw_readonly;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO fdw_readonly;
-- Use in user mapping
CREATE USER MAPPING FOR PUBLIC
SERVER remote_db
OPTIONS (user 'fdw_readonly', password 'safe_pass');
2. Secure Password Storage
-- You can use a .pgpass file for libpq connections instead of storing
-- the password in the user mapping. Confirm file ownership and mode.
-- ~/.pgpass format:
-- hostname:port:database:username:password
CREATE USER MAPPING FOR postgres
SERVER remote_db
OPTIONS (user 'remote_user');
-- Password read from .pgpass
3. Network Security
# Use SSL connections
# In postgresql.conf on remote:
ssl = on
ssl_cert_file = 'server.crt'
ssl_key_file = 'server.key'
-- Require SSL in user mapping
CREATE SERVER secure_remote
FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (
host 'remote.example.com',
dbname 'mydb',
sslmode 'require'
);
Best Practices
- Minimize data transfer: Select only needed columns
- Use indexes: Ensure remote tables are properly indexed
- Cache frequently accessed data: Use materialized views
- Monitor performance: Regular EXPLAIN ANALYZE
- Limit write operations: Prefer read-only for most FDW usage
- Use connection pooling: Enable
keep_connections - Set appropriate fetch_size: Tune based on query patterns
- Test failover scenarios: Plan for remote database unavailability
- Document foreign dependencies: Maintain clear documentation
- Regular security audits: Review user mappings and permissions
Takeaway
PostgreSQL Foreign Data Wrappers are useful for database federation, especially when you need:
- Seamless integration: Query remote data with standard SQL
- Real-time access: No data duplication or sync delays
- Multi-database support: PostgreSQL, MySQL, Oracle, and more
- Performance: Query pushdown when the wrapper and query support it
- Flexibility: Read access first, with writes only after you verify the wrapper's behavior
Start with read-only queries to remote databases, optimize performance with proper indexing and fetch sizes, then expand to more complex integrations as needed.