There are many ways to import data from external sources into a database. MariaDB and PostgreSQL offer native solutions: the MariaDB CONNECT storage engine and PostgreSQL Foreign Data Wrappers. Unfortunately, these options are often overlooked, in favour of more expensive, more fragile and slower solutions. Let’s see what CONNECT and FDWs are, and why they are often the best choice for importing data.
MariaDB CONNECT
MariaDB knows nothing about how to read or write data, manage indexes and caches, or run transactions. The MariaDB server sees these as abstract operations and delegates them to a special type of plugin called a storage engine. Storage engines have absolute freedom in how they read and write data, as long as they return data in the form that MariaDB expects.
CONNECT is a storage engine that works with heterogeneous data sources. Here we’ll ignore data files in various formats and other special data source types, and we’ll focus on remote databases. As long as a DBMS supports ODBC, JDBC, the native MySQL protocol or the MongoDB protocol, CONNECT should be able to work with it. From a user perspective, it is exactly like working with a local table, except that connecting to remote sources is slower. But this is not a problem for many use cases.
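CONNECT ships with MariaDB, but it is usually not enabled by default. A minimal sketch of enabling it at runtime (the plugin library name below is the standard one on Linux builds):

```sql
-- Install the CONNECT storage engine at runtime
INSTALL SONAME 'ha_connect';

-- Verify that CONNECT now appears among the available engines
SHOW ENGINES;
```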
Copilot with GPT-4.1 created the following diagram of MariaDB CONNECT architecture:
          +---------------------+
          |   MariaDB Server    |
          +---------------------+
                     |
                     v
        +---------------------------+
        |  CONNECT Storage Engine   |
        +---------------------------+
           |          |          |
           v          v          v
      +--------+ +--------+ +--------+
      |  FILE  | |  ODBC  | |  JDBC  |
      | Table  | | Table  | | Table  |
      +--------+ +--------+ +--------+
           |          |          |
           v          v          v
      External    Remote DBs  Remote DBs
       Files        (ODBC)      (JDBC)
PostgreSQL Foreign Data Wrappers
PostgreSQL has a brilliant extension system. There are extension types for a number of different goals. One of them is the foreign data wrapper interface, which allows users to develop Foreign Data Wrappers (FDWs). An FDW is a program that allows PostgreSQL to run SQL queries on remote data sources. A huge number of FDWs have been developed; see the list on the PostgreSQL Wiki.
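As a quick illustration, an FDW is installed like any other extension; postgres_fdw, the wrapper for remote PostgreSQL servers, ships with PostgreSQL itself:

```sql
-- Install the built-in wrapper for remote PostgreSQL servers
CREATE EXTENSION postgres_fdw;

-- List the wrappers installed in the current database
SELECT fdwname FROM pg_foreign_data_wrapper;
```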
Originally, all PostgreSQL FDWs were written in C. Later, frameworks emerged that allow us to use other languages, such as Multicorn for Python.
Copilot with GPT-4.1 created this diagram for us:
     +---------------------------+
     |     PostgreSQL Client     |
     |        (SQL Query)        |
     +---------------------------+
                   |
                   v
  +-------------------------------------+
  |      PostgreSQL Database Server     |
  | (FDW Extension Installed & Config'd)|
  +-------------------------------------+
                   |
                   v
     +-----------------------------+
     | Foreign Data Wrapper (FDW)  |
     +-----------------------------+
          |          |          |
          v          v          v
     +----------+ +----------+ +-------------+
     |  Remote  | |  Remote  | |   Remote    |
     | Postgres | |  MySQL   | |  File/CSV   |
     | Database | | Database | |   or API    |
     +----------+ +----------+ +-------------+
A CONNECT Table Example
There are many ways to create a CONNECT table, depending on which remote technology we are connecting to, whether we want to map all the columns, whether we need to make some transformation, and so on.
Here is a trivial example:
CREATE OR REPLACE TABLE world.country
ENGINE = CONNECT
TABLE_TYPE = MYSQL
CONNECTION = 'mysql://connect_se:secret@mariadb-source/world/country'
;
In this example, we map the local country table to a remote country table, located on a MariaDB or MySQL server. We are mapping all the columns, so we don’t need to specify them. If the remote technology were, for example, SQL Server, we might have needed to specify the columns anyway to define the type mapping.
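For instance, here is a sketch of what a CONNECT table over ODBC might look like, with the columns declared explicitly. The DSN name, credentials and remote table are assumptions for illustration:

```sql
-- Hypothetical ODBC mapping with explicit columns,
-- so that we control the type mapping ourselves
CREATE OR REPLACE TABLE world.city (
    id INT,
    name VARCHAR(100),
    population BIGINT
)
ENGINE = CONNECT
TABLE_TYPE = ODBC
CONNECTION = 'DSN=sqlserver_dsn;UID=connect_se;PWD=secret'
TABNAME = 'city'
;
```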
A Foreign Data Wrapper Table Example
In this simple example, we’ll link a local PostgreSQL table to a remote PostgreSQL table.
First, we declare the remote server, using the postgres_fdw wrapper (the postgres_fdw extension must be installed beforehand):
CREATE SERVER pg2
FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (host 'pg2.vettabase.com', dbname 'db', port '5432')
;
Then we create a user mapping, which tells PostgreSQL which remote credentials to use for a local user:
CREATE USER MAPPING FOR importer
SERVER pg2
OPTIONS (user 'app', password 'Secr3t')
;
Finally, we create the table itself:
CREATE FOREIGN TABLE employee (
id INT,
first_name TEXT,
last_name TEXT
)
SERVER pg2
OPTIONS (schema_name 'db', table_name 'employee')
;
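As an alternative to declaring the columns by hand, postgres_fdw can also copy table definitions from the remote server with IMPORT FOREIGN SCHEMA (available since PostgreSQL 9.5):

```sql
-- Create the 'employee' foreign table automatically,
-- copying its definition from the remote 'db' schema
IMPORT FOREIGN SCHEMA db
    LIMIT TO (employee)
    FROM SERVER pg2
    INTO public;
```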
Common Features and Drawbacks
Both MariaDB CONNECT and PostgreSQL FDWs allow us to run SQL queries over heterogeneous data sources. We can create a table that is linked to a remote database, a REST API, a local CSV file, and more. We can both read and write data (though some CONNECT table types and some FDWs might be read-only). We can even use JOINs, subqueries, or other SQL constructs that read from multiple tables, combining multiple remote sources and local tables.
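For example, a foreign table can be joined with a local one in a single query; the local salary table below is a hypothetical example:

```sql
-- Join the foreign 'employee' table with a local 'salary' table.
-- PostgreSQL fetches the remote rows and performs the join locally
-- (or pushes work down to the remote server when it can).
SELECT e.first_name, e.last_name, s.amount
FROM employee e
JOIN salary s ON s.employee_id = e.id
WHERE s.amount > 50000;
```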
This approach is great, but it has some drawbacks. Let’s discuss them briefly.
Security implications
The server that runs MariaDB or PostgreSQL needs access to the remote data. The database needs to store any necessary credentials.
This might look like a serious risk. However, note that many databases already contain equally sensitive and valuable data, such as user secrets, personal data, and financial data.
Skill mismatch
Developers know how to retrieve data from an API. But they’re usually not familiar with the CONNECT engine or FDWs. In many cases, building a proper table for importing data is trivial. In complex cases, they might need help from a DBA or a database engineer. This might be a problem for small teams, where these skills are absent.
Slow queries
Queries that need remote data, or join remote and local data, are inevitably slower than regular queries. However, this is unlikely to be a problem for scheduled data import processes, or for one-off queries.
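With postgres_fdw, EXPLAIN VERBOSE is a quick way to check how much work is pushed down to the remote server rather than done locally:

```sql
-- The plan includes the 'Remote SQL' actually sent to the remote server,
-- so we can verify that the WHERE clause is evaluated remotely
EXPLAIN VERBOSE
SELECT first_name, last_name FROM employee WHERE id = 42;
```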
Comparing CONNECT or FDWs to the alternatives
Let’s discuss the other methods to import data into MariaDB or PostgreSQL, and why CONNECT and FDWs are usually better options.
REST APIs
For applications maintained by your organisation, one option is to put an API in front of the data and implement calls for exporting it. Let’s see why this is usually not the best idea when you can use CONNECT or FDWs:
- Developing this feature is expensive, especially if a suitable API doesn’t exist at all.
- The API also needs to be maintained, just like any script that calls it. Data structures change, libraries need to be upgraded, and so on. And this happens on both sides.
- Moving data can take time. If the export is run by a script located on a third host, the data has to make two trips.
- For the same reason, the procedure is less reliable and security risks are higher.
Generated Data Files
It’s possible to have a job that generates data files from the source and sends them to the target. This might be done using CSV or JSON. Or it’s possible to generate a logical backup (a dump).
This is the simplest way to handle data transfers, provided that only one or a few tables are copied. However, if the source or target table structures change over time, the process will need to be fixed. And if the source and target technologies are different, this can turn into a nightmare.
Import Queries
A script can simply run queries against the source, obtain data, and write it to the target server. This allows us to handle differences between the source and the target in more flexible ways, import data more selectively, apply transformations, and so on. However, at the end of the day, you’ll simply have a script that simulates the job typically done by CONNECT or FDWs.
On top of that, the script will typically run on a third host, making the process slower, less reliable, and less secure.
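By contrast, with a CONNECT or foreign table in place, the same import becomes a single statement executed by the database server itself; local_country below is a hypothetical local table with a compatible structure:

```sql
-- Copy all rows from the linked remote table into a local table
INSERT INTO local_country
SELECT * FROM world.country;
```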
Conclusions
We discussed two great ways to import data into MariaDB and PostgreSQL: CONNECT and Foreign Data Wrappers. We also highlighted why, while these solutions aren’t very common, they are usually faster, more reliable, and more secure than the alternatives.
Take a look at the mariadb-connect tag to see our articles on MariaDB CONNECT.
Federico Razzoli




