Shared data model in real time


Hi team,

Our organization is embarking on a project to design a shared data model where we can efficiently store and manage a large payload.

Multiple lines of business will be both updating this payload and consuming the relevant data from it.

Our goal also is to perform data mapping to tailor the shared payload to the specific data models of each line of business.

It's essential for us to ensure that this shared payload does not become a single point of failure (SPOF). Instead, we aim to build a system that is highly available, responsive, capable of handling large volumes of data, and able to reflect real-time (most recent) data for consumers.

Our objective is to have the shared data model reflect real-time and updated data so that other lines of business can access the most current information.

We also want to hide some data from one line of business to another, so that not every line of business can see the same data or the entire data set.

The objective is to allow any line of business to put or update data in the shared data model and to read data pushed by others, while preserving data consistency and integrity and ensuring the shared model always holds the most recent data.

To achieve these goals, we are seeking an event-driven, cost-effective, scalable solution that can handle high loads and large data volumes. With that in mind, I would like to inquire about the best AWS services to implement this shared data model while maintaining data integrity and providing high availability for both readers and consumers.

What would be the best format to store the shared payload? JSON? What is the best option for sharing data: notifying the other lines of business when data changes, or letting them pull the data they are interested in? How can we perform data mapping to adapt the data to each line of business? At the shared model level?

How can we perform data mapping to tailor the data to the requirements of each line of business? Should the mapping be done at the shared model level?

Thank you for your input and expertise in this matter :) :)

1 Answer

Hello,

I understand you would like to inquire about the best AWS services to implement this shared data model while maintaining data integrity and providing high availability for both readers and consumers.

As for your objective of having the shared data model reflect real-time data, you can explore streaming services such as Amazon Kinesis Data Streams or Amazon Managed Streaming for Apache Kafka (Amazon MSK) to capture real-time data.
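As a rough illustration of the streaming approach, the sketch below wraps an update in a JSON envelope and sends it to a Kinesis data stream. The stream name `shared-model-updates`, the envelope fields (`entity_id`, `source_lob`, `updated_at`, `changes`), and the helper names are assumptions made for illustration. `client` is expected to be a `boto3.client("kinesis")`.

```python
import json
import time


def build_update_event(entity_id, source_lob, changes):
    """Wrap a change to the shared model in a small JSON envelope (hypothetical layout)."""
    return {
        "entity_id": entity_id,      # which record in the shared model changed
        "source_lob": source_lob,    # which line of business made the change
        "updated_at": time.time(),   # epoch seconds, lets consumers pick the latest
        "changes": changes,          # only the fields that were updated
    }


def publish_update(client, event, stream_name="shared-model-updates"):
    """Send one event to Kinesis; client is assumed to be boto3.client("kinesis")."""
    return client.put_record(
        StreamName=stream_name,
        Data=json.dumps(event).encode("utf-8"),
        # Same entity -> same shard, so updates to one entity stay ordered.
        PartitionKey=event["entity_id"],
    )
```

Using the entity identifier as the partition key is one possible design choice: Kinesis preserves order within a shard, so consumers would see the updates for a given entity in the order they were written.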

Addressing your concern below:

What would be the best format to store the shared payload?

=> Parquet is generally recommended for analytical workloads. It is a columnar storage format that offers efficient data retrieval.

However, JSON or other formats might be suitable depending on your specific needs.

What is the best option for sharing data: notifying the other lines of business when data changes?

=> Services like Amazon SNS or Amazon SQS integrated with CloudWatch can alert interested parties about data changes.
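To make the notification option more concrete, here is a minimal sketch that publishes a change notification to an SNS topic. The topic ARN, the message layout, and the `source_lob` attribute name are hypothetical. `client` is expected to be a `boto3.client("sns")`.

```python
import json


def notify_change(client, topic_arn, entity_id, source_lob, changed_fields):
    """Publish a change notification; client is assumed to be boto3.client("sns")."""
    return client.publish(
        TopicArn=topic_arn,
        Message=json.dumps(
            {"entity_id": entity_id, "changed_fields": changed_fields}
        ),
        # Message attributes let each subscription apply its own filter policy,
        # so a line of business only receives the changes it cares about.
        MessageAttributes={
            "source_lob": {"DataType": "String", "StringValue": source_lob},
        },
    )
```

Under these assumptions, each line of business could subscribe its own SQS queue to the topic with a filter policy such as `{"source_lob": ["lending"]}`, then read the full record from the shared store when a notification arrives.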

Or letting them pull the data they are interested in? How can we perform data mapping to adapt the data to each line of business? At the shared model level? We also want to hide some data from one line of business to another, so that not all of them can see the same data or the entire data.

=> To achieve the above, you can consider populating AWS Glue Data Catalog tables from the streaming data using AWS Glue jobs. With AWS Lake Formation, you can then create data filters that provide column-level, row-level, and cell-level security to restrict data access for different lines of business.
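As a sketch of what such a data filter could look like, the helper below builds the `TableData` argument for the Lake Formation `create_data_cells_filter` call. The database, table, filter, and column names in the usage line are hypothetical, and `client` is expected to be a `boto3.client("lakeformation")`.

```python
def build_data_filter(catalog_id, database, table, name, row_expression, hidden_columns):
    """Build a cell-level filter: keep matching rows, hide the listed columns."""
    return {
        "TableCatalogId": catalog_id,   # AWS account ID that owns the catalog
        "DatabaseName": database,
        "TableName": table,
        "Name": name,
        # Row-level restriction, written as a SQL-style predicate.
        "RowFilter": {"FilterExpression": row_expression},
        # Column-level restriction: every column except the excluded ones.
        "ColumnWildcard": {"ExcludedColumnNames": list(hidden_columns)},
    }


def create_data_filter(client, table_data):
    """Register the filter; client is assumed to be boto3.client("lakeformation")."""
    return client.create_data_cells_filter(TableData=table_data)
```

For example, `build_data_filter("111122223333", "shared_model", "customers", "marketing_view", "region = 'EU'", ["ssn"])` would describe a view that exposes only EU rows and hides an `ssn` column. The filter takes effect only once SELECT permission on it is granted to the relevant line of business's principal.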

How can we perform data mapping to tailor the data to the requirements of each line of business? Should the mapping be done at the shared model level?

=> AWS Glue ETL jobs can transform the shared data model to meet the specific requirements of each line of business. Once the Glue ETL job is done, you can use Amazon Athena or Amazon QuickSight to build analytical reports.
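The mapping idea can be illustrated in plain Python, independent of Glue: each line of business declares which shared fields it needs and what they are called in its own model. The field names and the `LOB_MAPPINGS` table below are made up for illustration. In an actual Glue job, the equivalent step would typically be a transform such as `ApplyMapping`.

```python
# Hypothetical per-line-of-business mappings: shared field name -> local field name.
LOB_MAPPINGS = {
    "lending": {"customer_id": "borrower_id", "balance": "outstanding_balance"},
    "marketing": {"customer_id": "contact_id", "segment": "segment"},
}


def map_for_lob(record, lob):
    """Project a shared-model record onto one line of business's own field names."""
    mapping = LOB_MAPPINGS[lob]
    # Fields that are not in the mapping are dropped, so the projection
    # also keeps each line of business from seeing data it did not ask for.
    return {
        target: record[source]
        for source, target in mapping.items()
        if source in record
    }
```

Keeping the mappings as data rather than code is one possible design: adding a line of business then means adding a mapping entry, while the shared model itself stays unchanged.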

Additional Considerations:

For complex architectures, consider involving an AWS Solutions Architect. AWS Support is available to answer further questions about specific services.

These are some helpful blogs to refer:

Thank you!

answered 19 days ago
EXPERT
reviewed 19 days ago
