The inaccurate Amazon Forecast results might have the following cause. When you process the data for each city individually, Amazon Forecast builds one model for the first city, another model for the second city, and so on. When you process the merged data, Amazon Forecast builds a single model that tries to minimize the forecasting errors across both cities. If the merged data sets are very different, the result might be an "average" model that fits neither city well. This explanation is oversimplified. However, understanding the statistical distribution of your data can be critical in deciding whether a single global model can be applied to all cities or whether individual models are required to get accurate predictions.
This is an example of the common debate between "one global model for every situation" and "N specific models, one for each situation". There's no one-size-fits-all solution to this issue.
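The "average" model effect can be illustrated with a toy sketch. This is not how Amazon Forecast works internally: it assumes the simplest possible predictor (a single constant chosen to minimize squared error, which is the mean), and the city names and demand values are made up for illustration.

```python
from statistics import mean

# Hypothetical daily demand for two cities with very different levels.
# These numbers are invented for illustration only.
demand = {
    "city_a": [100, 102, 98, 101, 99],
    "city_b": [10, 12, 8, 11, 9],
}


def mse(values, prediction):
    """Mean squared error of a constant prediction against the values."""
    return mean((v - prediction) ** 2 for v in values)


# Global model: one constant fitted on the merged data.
# The mean is the constant that minimizes squared error over all points.
merged = [v for series in demand.values() for v in series]
global_prediction = mean(merged)

# Local models: one constant per city, fitted on that city's data only.
local_predictions = {city: mean(series) for city, series in demand.items()}

for city, series in demand.items():
    print(
        f"{city}: global MSE = {mse(series, global_prediction)}, "
        f"local MSE = {mse(series, local_predictions[city])}"
    )
```

With these made-up numbers, the global constant lands at 55, halfway between the two cities, so its error is large for both, while each per-city constant tracks its own series closely. If the two series had similar distributions, the gap would shrink, and a global model could benefit from the extra data. Comparing the distributions first is what tells you which case you are in.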