Introduction
Big Oracle databases (several TB) are still tough to migrate to another version on a new server. For most of them you’ll probably use an RMAN restore or Data Guard, but datapump is always a cleaner way to migrate: you can easily move to a new filesystem (ASM for example), rethink your tablespace organization, reorganize all the segments, exclude unneeded components, and so on, all in one operation. But a datapump export can take hours and hours to complete. This blog post describes a method I used on several projects: it helped me a lot to optimize migration time.
Why does datapump export take so much time?
First of all, exporting data with datapump means extracting all the objects from the database, so it’s easy to understand why it’s much slower than copying datafiles. Datapump speed mainly depends on the speed of the disks where the datafiles reside, and on the parallelism level. Increasing parallelism does not always speed up the export: on mechanical disks, reading multiple objects at the same time can be slower than reading them serially. So there is some kind of limit, and for big databases the export can last hours. Another problem is that a long-lasting export needs more undo data: if your export lasts 10 hours, you’ll need 10 hours of undo_retention (at least if you want a consistent dump, typically when testing the migration while the application is still running). You’re also exposed to DDL changes on the database during that window, and undo_retention cannot do anything about that. Be careful: an incomplete dump is perfectly usable for import, but several objects will be missing, which is probably not the goal.
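For reference, a consistent export taken from a live source is usually pinned to a single point in time with flashback_time or flashback_scn. Here is a minimal sketch; the directory object, file names and parallel degree are placeholders to adapt:
# mig is a placeholder directory object, adjust names and parallelism to your environment
expdp '/ as sysdba' full=y directory=mig dumpfile=expfull_%U.dmp parallel=8 flashback_time=systimestamp logfile=expfull.log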
So the goal is to reduce the time needed for the datapump export and avoid such problems.
SSD is the solution
SSD is probably the best choice for today’s databases: no more I/O bottleneck, which is exactly what we were waiting for. But your source database, an old 11gR2 or 12cR1, probably doesn’t run on SSD, especially if it’s a big one: SSDs were quite small and expensive a few years ago. And you probably didn’t plan an SSD migration on the source server, as it will be decommissioned as soon as the migration is finished.
The solution is to use a temporary server fitted with fast SSDs. You don’t need a real production-grade server with a fully redundant configuration. You don’t even need RAID to protect your data, because this server is for a single use: JBOD is fine.
How to configure this server?
This server will have:
- exactly the same OS as the source server, or something really similar
- the exact same Oracle version
- the same configuration of the filesystems
- enough free space to restore the source database
- SSD-only storage for datafiles without redundancy
- enough cores to maximise the parallelism level
- a shared folder to put the dump; this folder will also be mounted on the target server (see the example mount entries after this list)
- a shared folder to pick up the latest backups from the source database
- enough bandwidth for the shared folders: a 1Gbps network is only about 100MB/s, so don’t expect very high speed with that kind of link
- you don’t need a listener
- you’ll never use this database for your application
- if you’re reusing a server, make sure it will be dedicated for this purpose (no other running processes)
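As an illustration, the two shared folders can simply be NFS mounts. The hostnames and exports below are made up for the example; only the mount points match the paths used later in this post:
# hypothetical NFS servers and exports in /etc/fstab on the temporary server
nas01:/export/dumps      /backup/dumps          nfs  rw,hard,rsize=1048576,wsize=1048576  0 0
backupsrv:/export/BP3    /backup/db42-prod/BP3  nfs  ro,hard,rsize=1048576                0 0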
And regarding the license?
As you may know, this server needs a license. But you also know that during the migration project you’ll temporarily have twice the licenses in use for several weeks: the old servers are still running, and the new servers already host the migrated databases. To avoid any problem, you can reuse a server that previously ran Oracle databases and has already been decommissioned: fit it with SSDs and it will do the job. And please make sure to be fully compliant with the Oracle license on your target environment.
How to proceed?
We won’t use this server as a one-shot path for the migration: we first need to check that the method is good enough and find the best settings for datapump.
To proceed, the steps are:
- declare the database in /etc/oratab
- create a pfile on source server and copy it to $ORACLE_HOME/dbs on the temporary server
- edit the parameters to remove references to the source environment, for example local_listener, remote_listener and the Data Guard settings. The goal is to make sure that starting this database will have no impact on production
- startup the instance on this pfile
- restore the controlfile from the very latest controlfile autobackup
- restore the database
- recover the database and check the SCN
- take a new archivelog backup on the source database (to simulate the real scenario)
- catalog the backup folder on the temporary database with RMAN
- do another recover database on temporary database, it should apply the archivelogs of the day, then check again the SCN
- open the database in resetlogs mode
- create the target directory for datapump on the database
- do the datapump export with the maximum parallelism level (2 times the number of cores available on your server: it will be too much at the beginning, but not enough at the end). No need for flashback_scn here.
You can try various parallelism levels to adjust to the best value. Once you’ve found the best value, you can schedule the real migration.
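To see whether the workers are actually kept busy during these trial runs, you can attach to the running job and display its status. The job name below is the default one for a full export and may differ on your system:
# SYS_EXPORT_FULL_01 is the default job name for a full export, adjust if needed
expdp '/ as sysdba' attach=SYS_EXPORT_FULL_01
Export> status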
Production migration
Now that you’ve mastered the method, let’s imagine the production migration is planned for tonight, with the application downtime starting at 18:00.
09:00 – have a cup of coffee first, you’ll need it!
09:15 – remove all the datafiles on the temporary server, also remove the redo logs and controlfiles, and empty the FRA. Only keep the pfile.
09:30 – startup force your temporary database; it should stop in nomount mode as there is no controlfile yet
09:45 – restore the latest controlfile autobackup on the temporary database. Make sure no datafile will be added today on production (a quick check query is shown after this timeline)
10:00 – restore the database on the temporary server. During the restore, production is still available on the source server. At the end of the restore, do a first recover but DON’T open the database with resetlogs yet
18:00 – your restore should be finished by now; disconnect everyone from the source database and take the very latest archivelog backup on it. From now on, your application should be down.
18:20 – on your temporary database, catalog the backup folder with RMAN. It will discover the latest archivelog backups.
18:30 – do another recover of your temporary database. It should apply the latest archivelogs (those generated during the day). If you want to make sure everything is OK, check the current_scn on the source database: it should be nearly the same as on your temporary database
18:45 – open the temporary database with RESETLOGS
19:00 – do the datapump export with your optimal settings
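Regarding the check mentioned at 09:45, a quick query on the source database is enough to confirm that no datafile was created after the controlfile autobackup you restored. The 1-day window is just an example:
sqlplus / as sysdba
-- no row should show a creation_time later than the restored controlfile autobackup
select file#, name, creation_time from v$datafile where creation_time > sysdate - 1;
exit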
Once done, you still have to run the datapump import on your target database. The parallelism level will depend on the cores available on the target server, and on the resources you want to preserve for the other databases already running there.
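As a sketch, and assuming a directory object named migration pointing to the shared dump folder also exists on the target database, the import could look like this; the dump file names, parallel degree and tablespace remapping are placeholders:
# adjust dumpfile to the names produced by the export, and the remap to your new tablespace layout
impdp '/ as sysdba' full=y directory=migration dumpfile=expfull_BP3_%U.dmp parallel=8 remap_tablespace=DATA_OLD:DATA_NEW logfile=impfull_BP3.log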
Benefits and drawbacks
The obvious benefit is that applying the archivelogs of the day on the temporary database takes probably less than 30 minutes, and the total duration of the export can be cut by several hours.
The first drawback is that you’ll need a server of this kind, or you’ll need to build one. The second drawback concerns Standard Edition: don’t expect to save that many hours, as it has no parallelism at all. Big databases are not very well served by Standard Edition, as you may know.
Real world example
This is a recent case. The source database is 12.1, about 2TB on mechanical disks. The datapump export was not working correctly: it lasted more than 19 hours and produced lots of errors. One of the big problems of this database is a 1.8TB bigfile tablespace. Who did this kind of configuration?
The temporary server is a DEV server already decommissioned, running the same version of Oracle and the same Linux kernel. It is fitted with enough terabytes of SSD; the mount paths were changed to match the source database filesystems.
On source server:
su - oracle
. oraenv <<< BP3
sqlplus / as sysdba
create pfile='/tmp/initBP3.ora' from spfile;
exit
scp /tmp/initBP3.ora oracle@db32-test:/tmp
On temporary server:
su - oracle
cp /tmp/initBP3.ora /opt/orasapq/oracle/product/12.1.0.2/dbs/
echo "BP3:/opt/orasapq/oracle/product/12.1.0.2:N" >> /etc/oratab
. oraenv <<< BP3
vi $ORACLE_HOME/dbs/initBP3.ora
remove db_unique_name, dg_broker_start, fal_server, local_listener, log_archive_config, log_archive_dest_2, log_archive_dest_state_2, service_names from this pfile
sqlplus / as sysdba
startup force nomount;
exit
ls -lrt /backup/db42-prod/BP3/autobackup | tail -n 1
/backup/db42-prod/BP3/autobackup/c-2226533455-20200219-01
rman target /
restore controlfile from '/backup/db42-prod/BP3/autobackup/c-2226533455-20200219-01';
alter database mount;
CONFIGURE DEVICE TYPE DISK PARALLELISM 8 BACKUP TYPE TO BACKUPSET;
restore database;
...
recover database;
exit;
On source server:
Take a last backup of archivelogs with your own script: the one used in scheduled tasks.
On temporary server:
su - oracle
. oraenv <<< BP3
rman target /
select current_scn from v$database;
CURRENT_SCN
-----------
11089172427
catalog start with '/backup/db42-prod/BP3/backupset/';
recover database;
select current_scn from v$database;
CURRENT_SCN
-----------
11089175474
alter database open resetlogs;
exit;
sqlplus / as sysdba
create or replace directory migration as '/backup/dumps/';
exit
expdp '/ as sysdba' full=y directory=migration dumpfile=expfull_BP3_`date +%Y%m%d_%H%M`_%U.dmp parallel=24 logfile=expfull_BP3_`date +%Y%m%d_%H%M`.log
The export was done in less than 5 hours, 4 times faster than on the source database. The migration could now fit in one night. Much better, isn’t it?
Other solutions
If you’re used to Data Guard, you can create a standby on this temporary server dedicated to this purpose. No need to manually apply the latest archivelog backup of the day, because it’s already in sync: just convert this standby to a primary without impacting the source database, or do a simple switchover, then run the datapump export.
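If you go for the physical standby option and this copy is only meant to be exported then thrown away, one possible sequence is to stop the apply and activate the standby. This is only a sketch to validate against your own Data Guard configuration:
sqlplus / as sysdba
-- run this on the temporary standby only, never on the source primary
alter database recover managed standby database cancel;
alter database activate physical standby database;
alter database open;
exit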
Transportable tablespaces are a mixed solution: datafiles are copied to the destination database, and only the metadata is exported and imported. But don’t expect any kind of reorganization here.
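For the record, a transportable tablespace move roughly looks like this. The tablespace name and datafile path are made up, and copying the datafiles themselves is up to you:
sqlplus / as sysdba
-- DATA is a hypothetical tablespace name
alter tablespace data read only;
exit
expdp '/ as sysdba' directory=migration dumpfile=tts_data.dmp transport_tablespaces=DATA logfile=tts_exp.log
# on the target, after copying the datafiles:
impdp '/ as sysdba' directory=migration dumpfile=tts_data.dmp transport_datafiles='/u01/oradata/NEWDB/data01.dbf' logfile=tts_imp.log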
If you cannot afford several hours of downtime for the migration, you should think about logical replication. Solutions like Golden Gate are perfect for keeping the application running, but as you probably know, they come at a cost.
Conclusion
If several hours of downtime is acceptable, datapump is still a good option for migration. Downtime is all about disk speed and parallelism.