1 Answer
Hi, could you please share some additional details? Do the two datasets have the same schema? Does either dataset have more columns than the other? Are you expecting one table or two?
If you expect two tables to be cataloged and the datasets are not too different, you should place each dataset under its own prefix (folder).
Some of the files might have more columns than you are aware of.
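One way to check for unexpected extra columns before running the crawler is to scan each CSV file locally. The following is only a minimal sketch using the Python standard library. It assumes the header is id,title,authors,venue,year with "," as the delimiter and double quote as the quote symbol (as described in the question); the function names are hypothetical.

```python
import csv
import io

# Assumed header, taken from the classifier settings in the question.
EXPECTED_HEADER = ["id", "title", "authors", "venue", "year"]


def header_matches(text, expected_header=EXPECTED_HEADER):
    """Return True if the first CSV row equals the expected header."""
    reader = csv.reader(io.StringIO(text), delimiter=",", quotechar='"')
    first_row = next(reader, [])
    return [cell.strip() for cell in first_row] == list(expected_header)


def find_column_mismatches(text, expected_header=EXPECTED_HEADER):
    """Return (line_number, column_count) for each row whose width
    differs from the expected header width."""
    reader = csv.reader(io.StringIO(text), delimiter=",", quotechar='"')
    mismatches = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        if len(row) != len(expected_header):
            mismatches.append((reader.line_num, len(row)))
    return mismatches


# Small in-memory sample: the last row has one extra column.
sample = (
    "id,title,authors,venue,year\n"
    '1,"A, B",X,V,2001\n'
    "2,T,Y,V,2002,extra\n"
)
print(header_matches(sample))
print(find_column_mismatches(sample))
```

To use it on real data, read each downloaded file with `open(path, newline="")` and pass its contents to both functions. Any reported rows are candidates for why the crawler infers a different schema for some files.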
Any other details on the classifier and the crawler you created, and on the schema of the two datasets, would help us provide better guidance.
Thank you.
I am trying to follow this example: https://aws.amazon.com/blogs/big-data/integrate-and-deduplicate-datasets-using-aws-lake-formation-findmatches/. The two datasets have the same schema, and I expect the crawler to produce one table so I can complete the task. I have created a CSV classifier with "," as the delimiter, double quote as the quote symbol, and column headings set to "Has headings" with the header id,title,authors,venue,year.
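For reference, the classifier settings described here correspond roughly to the request below. This is a sketch, not the exact configuration used: the classifier name is a hypothetical placeholder, and the client passed in is assumed to be `boto3.client("glue")`.

```python
def build_csv_classifier_request(name, header):
    """Build the keyword arguments for the Glue create_classifier call."""
    return {
        "CsvClassifier": {
            "Name": name,                # hypothetical name, choose your own
            "Delimiter": ",",
            "QuoteSymbol": '"',
            "ContainsHeader": "PRESENT",  # the "Has headings" option
            "Header": list(header),
        }
    }


def create_csv_classifier(glue_client, request):
    """Send the request; glue_client is assumed to be boto3.client("glue")."""
    return glue_client.create_classifier(**request)


request = build_csv_classifier_request(
    "findmatches-csv-classifier",
    ["id", "title", "authors", "venue", "year"],
)
print(request)
```

Calling `create_csv_classifier(boto3.client("glue"), request)` would need boto3 installed and AWS credentials with Glue permissions, so that call is left out of the block above.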
Thank you in advance