Optimizing PostgreSQL Foreign Data Wrappers for Multi-Database Integration
Introduction
PostgreSQL Foreign Data Wrappers (FDW) enable seamless integration with external data sources, allowing you to query remote databases, files, and even web APIs as if they were local tables. This powerful feature enables database federation, cross-database joins, and centralized data access without complex ETL pipelines.
What are Foreign Data Wrappers?
FDW implements the SQL/MED (Management of External Data) standard, providing:
- Transparent access: Query remote data using standard SQL
- Virtual tables: No data duplication required
- Real-time data: Always current, no sync delays
- Join capability: Combine local and remote data
- Write operations: Some FDWs support INSERT/UPDATE/DELETE
Common Use Cases
- Database Migration: Access legacy systems during migration
- Multi-tenant Architecture: Query across tenant databases
- Data Aggregation: Combine data from multiple sources
- Hybrid Cloud: Connect on-premise and cloud databases
- Reporting: Create centralized reporting database
- Microservices: Query across service boundaries
Available Foreign Data Wrappers
postgresql_fdw (Built-in)
- Connect to other PostgreSQL databases
- Full feature support
- Best performance
mysql_fdw
- Connect to MySQL/MariaDB
- Read and write support
oracle_fdw
- Connect to Oracle databases
- Commercial and open-source versions
mongodb_fdw
- Access MongoDB collections
- JSON data integration
file_fdw (Built-in)
- Read CSV and text files
- Log file analysis
Others
- redis_fdw, sqlite_fdw, couchdb_fdw, etc.
Setting Up postgres_fdw
Step 1: Install Extension
-- On the local database
CREATE EXTENSION postgres_fdw;
Step 2: Create Foreign Server
CREATE SERVER remote_db
FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (
host 'remote-db.example.com',
port '5432',
dbname 'production_db',
fetch_size '10000'
);
Step 3: Create User Mapping
-- Map local users to remote credentials
CREATE USER MAPPING FOR postgres
SERVER remote_db
OPTIONS (
user 'remote_user',
password 'secure_password'
);
-- Map all local users
CREATE USER MAPPING FOR PUBLIC
SERVER remote_db
OPTIONS (
user 'readonly_user',
password 'readonly_pass'
);
Step 4: Create Foreign Tables
Manual table definition:
CREATE FOREIGN TABLE remote_users (
id INTEGER,
username VARCHAR(50),
email VARCHAR(100),
created_at TIMESTAMP
)
SERVER remote_db
OPTIONS (schema_name 'public', table_name 'users');
Import entire schema:
IMPORT FOREIGN SCHEMA public
FROM SERVER remote_db
INTO local_schema;
Import specific tables:
IMPORT FOREIGN SCHEMA public
LIMIT TO (users, orders, products)
FROM SERVER remote_db
INTO public;
Step 5: Query Foreign Data
-- Simple query
SELECT * FROM remote_users WHERE created_at > NOW() - INTERVAL '7 days';
-- Join local and remote data
SELECT
l.order_id,
l.amount,
r.username,
r.email
FROM local_orders l
JOIN remote_users r ON l.user_id = r.id
WHERE l.status = 'pending';
Advanced Configuration
Performance Tuning Options
CREATE SERVER fast_remote
FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (
host 'db.example.com',
port '5432',
dbname 'mydb',
-- Fetch size: rows retrieved per round trip
fetch_size '50000',
-- Enable query pushdown optimizations
use_remote_estimate 'true',
-- Connection pooling
keep_connections 'on',
-- Parallel query execution
parallel_commit 'on',
parallel_abort 'on'
);
Table-Level Options
CREATE FOREIGN TABLE optimized_table (
id INTEGER,
data TEXT
)
SERVER remote_db
OPTIONS (
schema_name 'public',
table_name 'large_table',
-- Override fetch size for this table
fetch_size '100000',
-- Use remote estimates for better query planning
use_remote_estimate 'true'
);
Column-Level Options
CREATE FOREIGN TABLE mapped_columns (
local_id INTEGER OPTIONS (column_name 'remote_user_id'),
local_name TEXT OPTIONS (column_name 'remote_username')
)
SERVER remote_db
OPTIONS (table_name 'users');
Write Operations
Enable Write Operations
-- Grant permissions
GRANT INSERT, UPDATE, DELETE ON remote_users TO app_user;
-- Insert data
INSERT INTO remote_users (username, email)
VALUES ('newuser', '[email protected]');
-- Update data
UPDATE remote_users
SET email = '[email protected]'
WHERE username = 'newuser';
-- Delete data
DELETE FROM remote_users WHERE username = 'newuser';
Transactions Across Databases
BEGIN;
-- Insert into local table
INSERT INTO local_orders (user_id, amount) VALUES (123, 99.99);
-- Update remote table
UPDATE remote_users SET last_order = NOW() WHERE id = 123;
-- Both operations committed together
COMMIT;
Note: Two-phase commit ensures atomicity across databases.
Multi-Database Federation
Connecting Multiple Databases
-- Connect to production database
CREATE SERVER prod_db
FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (host 'prod.example.com', dbname 'production');
CREATE USER MAPPING FOR PUBLIC SERVER prod_db
OPTIONS (user 'readonly', password 'pass1');
-- Connect to analytics database
CREATE SERVER analytics_db
FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (host 'analytics.example.com', dbname 'warehouse');
CREATE USER MAPPING FOR PUBLIC SERVER analytics_db
OPTIONS (user 'readonly', password 'pass2');
-- Connect to archive database
CREATE SERVER archive_db
FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (host 'archive.example.com', dbname 'historical');
CREATE USER MAPPING FOR PUBLIC SERVER archive_db
OPTIONS (user 'readonly', password 'pass3');
Unified Views Across Databases
-- Import tables
IMPORT FOREIGN SCHEMA public
LIMIT TO (current_orders)
FROM SERVER prod_db INTO public;
IMPORT FOREIGN SCHEMA public
LIMIT TO (archived_orders)
FROM SERVER archive_db INTO public;
-- Create unified view
CREATE VIEW all_orders AS
SELECT *, 'current' AS source FROM current_orders
UNION ALL
SELECT *, 'archived' AS source FROM archived_orders;
-- Query across all databases
SELECT
source,
COUNT(*) AS order_count,
SUM(amount) AS total_amount
FROM all_orders
WHERE created_at > '2024-01-01'
GROUP BY source;
Connecting to MySQL
Install mysql_fdw
# Ubuntu/Debian
sudo apt install postgresql-15-mysql-fdw
# CentOS/RHEL
sudo yum install mysql_fdw_15
Configure MySQL Connection
CREATE EXTENSION mysql_fdw;
CREATE SERVER mysql_server
FOREIGN DATA WRAPPER mysql_fdw
OPTIONS (
host 'mysql-db.example.com',
port '3306'
);
CREATE USER MAPPING FOR PUBLIC
SERVER mysql_server
OPTIONS (
username 'mysql_user',
password 'mysql_pass'
);
CREATE FOREIGN TABLE mysql_products (
id INTEGER,
name VARCHAR(100),
price DECIMAL(10,2)
)
SERVER mysql_server
OPTIONS (dbname 'ecommerce', table_name 'products');
-- Query MySQL data from PostgreSQL
SELECT * FROM mysql_products WHERE price > 100;
Performance Optimization
1. Use WHERE Clause Pushdown
PostgreSQL pushes filters to remote server:
-- Good: Filter applied remotely
SELECT * FROM remote_users WHERE country = 'US';
-- Check query plan
EXPLAIN (VERBOSE) SELECT * FROM remote_users WHERE country = 'US';
-- Look for: "Remote SQL: SELECT ... WHERE country = 'US'"
2. Limit Data Transfer
-- Bad: Fetches all columns
SELECT * FROM remote_large_table;
-- Good: Only needed columns
SELECT id, username FROM remote_large_table;
-- Good: Use LIMIT
SELECT * FROM remote_large_table LIMIT 1000;
3. Optimize Fetch Size
-- For large result sets
ALTER SERVER remote_db OPTIONS (SET fetch_size '50000');
-- For small, frequent queries
ALTER SERVER remote_db OPTIONS (SET fetch_size '1000');
4. Use Remote Estimates
ALTER SERVER remote_db OPTIONS (ADD use_remote_estimate 'true');
-- PostgreSQL will query remote EXPLAIN for accurate cost estimates
5. Index on Remote Tables
-- Ensure remote tables have appropriate indexes
-- Connect directly to remote database:
CREATE INDEX idx_users_country ON users(country);
CREATE INDEX idx_orders_created_at ON orders(created_at);
6. Materialized Views for Frequently Accessed Data
-- Cache remote data locally
CREATE MATERIALIZED VIEW cached_remote_data AS
SELECT * FROM remote_users WHERE active = true;
-- Refresh periodically
REFRESH MATERIALIZED VIEW cached_remote_data;
-- Auto-refresh with cron job or trigger
CREATE EXTENSION pg_cron;
SELECT cron.schedule('refresh-cache', '0 */6 * * *',
'REFRESH MATERIALIZED VIEW cached_remote_data');
Monitoring and Troubleshooting
View Foreign Servers
SELECT
srvname AS server_name,
srvoptions AS options
FROM pg_foreign_server;
View Foreign Tables
SELECT
foreign_table_schema,
foreign_table_name,
foreign_server_name
FROM information_schema.foreign_tables;
Check User Mappings
SELECT
um.umuser::regrole AS local_user,
fs.srvname AS foreign_server,
um.umoptions AS options
FROM pg_user_mapping um
JOIN pg_foreign_server fs ON um.umserver = fs.oid;
Analyze Query Performance
EXPLAIN (ANALYZE, VERBOSE, BUFFERS)
SELECT * FROM remote_users WHERE created_at > NOW() - INTERVAL '1 day';
Connection Pooling Check
-- Check active foreign connections
SELECT
datname,
usename,
application_name,
client_addr,
state
FROM pg_stat_activity
WHERE application_name LIKE '%fdw%';
Common Issues and Solutions
Issue 1: Connection Timeout
Error: could not connect to server
Solutions:
-- Add connection timeout
ALTER SERVER remote_db OPTIONS (ADD connect_timeout '10');
-- Check network connectivity
-- From shell:
psql -h remote-db.example.com -U user -d dbname
Issue 2: Slow Queries
Solutions:
-- Enable remote estimates
ALTER SERVER remote_db OPTIONS (ADD use_remote_estimate 'true');
-- Increase fetch size
ALTER SERVER remote_db OPTIONS (SET fetch_size '100000');
-- Check if WHERE clause is pushed down
EXPLAIN VERBOSE SELECT * FROM remote_table WHERE id > 1000;
Issue 3: Authentication Failure
Error: password authentication failed
Solutions:
-- Update user mapping
ALTER USER MAPPING FOR postgres
SERVER remote_db
OPTIONS (SET password 'new_password');
-- Check pg_hba.conf on remote server
-- Ensure it allows connections from local server
Issue 4: Column Type Mismatch
Error: column type mismatch
Solution:
-- Explicitly cast columns
CREATE FOREIGN TABLE remote_data (
id INTEGER,
data TEXT,
created_at TIMESTAMP
)
SERVER remote_db
OPTIONS (table_name 'source_table');
-- Or import with correct types
DROP FOREIGN TABLE remote_data;
IMPORT FOREIGN SCHEMA public
LIMIT TO (source_table)
FROM SERVER remote_db INTO public;
Security Considerations
1. Use Read-Only Accounts
-- On remote database, create limited user
CREATE ROLE fdw_readonly;
GRANT CONNECT ON DATABASE production_db TO fdw_readonly;
GRANT USAGE ON SCHEMA public TO fdw_readonly;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO fdw_readonly;
-- Use in user mapping
CREATE USER MAPPING FOR PUBLIC
SERVER remote_db
OPTIONS (user 'fdw_readonly', password 'safe_pass');
2. Secure Password Storage
-- Use .pgpass file instead of hardcoded passwords
-- ~/.pgpass format:
-- hostname:port:database:username:password
CREATE USER MAPPING FOR postgres
SERVER remote_db
OPTIONS (user 'remote_user');
-- Password read from .pgpass
3. Network Security
# Use SSL connections
# In postgresql.conf on remote:
ssl = on
ssl_cert_file = 'server.crt'
ssl_key_file = 'server.key'
-- Require SSL in user mapping
CREATE SERVER secure_remote
FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (
host 'remote.example.com',
dbname 'mydb',
sslmode 'require'
);
Best Practices
- Minimize data transfer: Select only needed columns
- Use indexes: Ensure remote tables are properly indexed
- Cache frequently accessed data: Use materialized views
- Monitor performance: Regular EXPLAIN ANALYZE
- Limit write operations: Prefer read-only for most FDW usage
- Use connection pooling: Enable
keep_connections - Set appropriate fetch_size: Tune based on query patterns
- Test failover scenarios: Plan for remote database unavailability
- Document foreign dependencies: Maintain clear documentation
- Regular security audits: Review user mappings and permissions
Conclusion
PostgreSQL Foreign Data Wrappers provide powerful database federation capabilities:
- Seamless integration: Query remote data with standard SQL
- Real-time access: No data duplication or sync delays
- Multi-database support: PostgreSQL, MySQL, Oracle, and more
- Performance: Query pushdown and optimization
- Flexibility: Read and write operations
Start with read-only queries to remote databases, optimize performance with proper indexing and fetch sizes, then expand to more complex integrations as needed.