A memory error occurs in Amazon SageMaker when preprocessing 400 MB of data stored in S3. Loading the data works without problems. The data has 8-9 million rows and 4 columns, but I get 7,000 columns after applying one-hot encoding. When trying to train the model I get the following memory error: MemoryError: Unable to allocate 946. MiB for an array with shape (123962752,) and data type float64

The notebook instance is ml.t2.medium. How can I solve this issue?

Hello, without changing your code, one option would be to switch to an instance type with more RAM. The usage pricing for each instance type is detailed on
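To judge how much RAM would be enough, a rough size estimate of the dense one-hot matrix can help. The sketch below is only back-of-the-envelope arithmetic, not a measurement: it assumes the figures from the question (about 8 million rows, 7,000 columns, float64 values), and `dense_size_gib` is a hypothetical helper name introduced for illustration.

```python
def dense_size_gib(n_rows: int, n_cols: int, bytes_per_value: int = 8) -> float:
    """Approximate size in GiB of a dense 2-D array (float64 by default)."""
    return n_rows * n_cols * bytes_per_value / 2**30


# The failed allocation from the error message: 123,962,752 float64 values.
failed_mib = 123_962_752 * 8 / 2**20

# Figures assumed from the question: ~8 million rows, ~7,000 one-hot columns.
full_gib = dense_size_gib(8_000_000, 7_000)

print(f"failed allocation: {failed_mib:.1f} MiB")
print(f"dense one-hot matrix: {full_gib:.1f} GiB")
```

Under these assumptions the full dense matrix would need several hundred GiB, far more than the 4 GiB of an ml.t2.medium, so a larger instance alone may not be enough if the encoded data is kept dense.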

There are also other options for performing one-hot encoding and other feature engineering transformations, such as SageMaker Data Wrangler and AWS Glue DataBrew.
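If you stay in the notebook, another approach that may help is to keep the one-hot result sparse instead of dense. This is a minimal sketch, assuming scikit-learn and NumPy are available and using a tiny made-up array in place of the real data; it is not taken from the original code in the question.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Tiny stand-in for the real data: 6 rows, 2 categorical columns (made-up values).
X_raw = np.array([
    ["red", "S"],
    ["blue", "M"],
    ["red", "L"],
    ["green", "M"],
    ["blue", "S"],
    ["red", "M"],
])

# OneHotEncoder returns a SciPy sparse matrix by default, so only the
# non-zero entries (one per row and input column) are stored.
encoder = OneHotEncoder(handle_unknown="ignore", dtype=np.float32)
X_sparse = encoder.fit_transform(X_raw)

n_rows, n_cols = X_sparse.shape
dense_bytes = n_rows * n_cols * 8  # what a dense float64 array would need
sparse_bytes = (
    X_sparse.data.nbytes + X_sparse.indices.nbytes + X_sparse.indptr.nbytes
)

print(X_sparse.shape, X_sparse.nnz)
print(f"dense float64: {dense_bytes} bytes, sparse CSR: {sparse_bytes} bytes")
```

On a toy array the saving is small, but with thousands of one-hot columns almost every entry is zero, so the sparse form grows with the number of rows rather than rows times columns. Whether this helps depends on the model accepting sparse input; converting back with `.toarray()` would reintroduce the memory problem.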

