Is this a good way to contend with a start point restore for an N-tier replication task.


Here is a n-tier data migration orchestration that we are working with. I apologize for the verbosity. I am attempting to describe a sizable migration in a few sentences.


(on-prem-sql-server-dms-task)---->(s3 B1)----T1--->(s3 B2)---->( RDS sql-server-task (N-1 table replication))

on-prem-sql-server-db-dms-task(1 to 1) - Target is a single database in a multi-tenant, multi-database on prem direct connect sql-server instance. For each database participating, a new full load/cont. sync dms-task will be created. It is important to note that there is a transformation that adds the database Identifier for the source to every table from cdc. The target (s3 B1) will have a folder for each database in the replication task. If 150 databases are involved there will be 150 tasks here and 150 database dbo/cdc folders in (s3 B1).

s3 B1 - Target of the task above.

T1 - Trasformation1: Lambda triggered for s3 B1 put/post. This lambda adds a unique identifier for every incoming row for tracking and more importantly copies the dbo/cdc file over to corresponding database location in (s3 B2).

s3 B2 - Target of a new full load/cont. sync dms-task described below.

RDS sql-server-task && possibly aurora postgres - There is a task for each source database with a transformation that shoves the data from each table, for each database, into one large table in the target endpoint. That is why adding the database field is key in step one. It's like a N-1 table replication.

Question This will likely be running in a dev/test/stage/prod environment and the dev/test/stage databases are frequently restored from prod. In order to contend with database restores at the point of origin, I am hopeful that the following steps will be workable. Does anyone see any undocumented gotchas that may be encountered?

Origin Database Restore Contention

  1. Make sure that all replication tasks of type (on-prem-sql-server-dms-task) and (many to 1-RDS sql-server) have the StartReplicationTaskType set to reload-target.
  2. Expose a fire and forget API endpoint that will allow internal tooling to call into and perform the following stepsL
  • Using aws dms api call method to StopReplicationTask for (on-prem-sql-server-dms-task).
  • Using aws dms api call method to StopReplicationTask for (RDS sql-server-task).
  • Foreach table in the RDS sql-server, remove records related to source database, or mask them and delete later.
  • Using aws dms api call method to StartReplicationTask for (on-prem-sql-server-dms-task).
  • Using aws dms api call method to StartReplicationTask for (RDS sql-server-task).
  1. Test?!

I would like to solution this without having the API part, however, I can't figure out how to trigger a lambda or some other process to stop the tasks, delete the data and restart the tasks when a restore event occurs at source. That would be the ideal solution.

  • Can you explain more what is it that you are trying to achieve? Is the goal to transfer data from a source database to a target database with an additional column for every table containing an unique identifier for each row?

    Can you explain the why instead of the how? Please also provide some info on how the unique identifier is created.

  • @AWS-VG - Thanks for your reply. After realizing this question was very broad, another question with a narrower focus on the key problem was added. An attempt was made to delete this question, however, the functionality for that could not be found. The more condensed question is here->

1 Answer
answered 18 days ago

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions