Inconsistent Keras model.summary() output shapes on AWS SageMaker and EC2
I have the following model in a Jupyter notebook:
```python
import tensorflow as tf
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam
from tensorflow.keras import layers
# Let TensorFlow grow GPU memory as needed instead of grabbing it all up front.
physical_devices = tf.config.list_physical_devices('GPU')
tf.config.experimental.set_memory_growth(physical_devices[0], True)

SIZE = (549, 549)
SHUFFLE = False
BATCH = 32
EPOCHS = 20

# DataGenerator is my own generator class, defined elsewhere in the notebook
# (not shown here); it is not needed to reproduce the summary output below.
train_datagen = DataGenerator(train_files, batch_size=BATCH, dim=SIZE, n_channels=1, shuffle=SHUFFLE)
test_datagen = DataGenerator(test_files, batch_size=BATCH, dim=SIZE, n_channels=1, shuffle=SHUFFLE)

inp = layers.Input(shape=(*SIZE, 1))
x = layers.Conv2D(filters=549, kernel_size=(5, 5), padding="same", activation="relu")(inp)
x = layers.BatchNormalization()(x)
x = layers.Conv2D(filters=549, kernel_size=(3, 3), padding="same", activation="relu")(x)
x = layers.BatchNormalization()(x)
x = layers.Conv2D(filters=549, kernel_size=(1, 1), padding="same", activation="relu")(x)
x = layers.BatchNormalization()(x)
x = layers.Conv2D(filters=549, kernel_size=(3, 3), padding="same", activation="sigmoid")(x)
model = Model(inp, x)
model.compile(loss=tf.keras.losses.binary_crossentropy, optimizer=Adam())
model.summary()
```
Both SageMaker and EC2 are running TensorFlow 2.7.1. The EC2 instance is a p3.2xlarge with the Deep Learning AMI GPU TensorFlow 2.7.0 (Amazon Linux 2) 20220607. The SageMaker notebook instance is an ml.p3.2xlarge and I am using the conda_tensorflow2_p38 kernel. The notebook lives on an FSx for Lustre file system that is mounted on both SageMaker and EC2, so it is definitely the same code running on both machines.
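For reference, this is the kind of quick check I could run in both notebooks to compare the two environments (just a sanity-check sketch, nothing specific to my model):

```python
import tensorflow as tf

# Compare library versions and backend settings between the two environments.
print(tf.version.VERSION)                      # TensorFlow version
print(tf.keras.__version__)                    # bundled Keras version
print(tf.keras.backend.image_data_format())    # backend image data format
print(tf.config.list_physical_devices('GPU'))  # visible GPUs
```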
nvidia-smi output on SageMaker:
```
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.47.03    Driver Version: 510.47.03    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla V100-SXM2...  On   | 00000000:00:1E.0 Off |                    0 |
| N/A   37C    P0    24W / 300W |      0MiB / 16384MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
```
nvidia-smi output on EC2:
```
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.47.03    Driver Version: 510.47.03    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla V100-SXM2...  On   | 00000000:00:1E.0 Off |                    0 |
| N/A   42C    P0    51W / 300W |   2460MiB / 16384MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A     11802      C   /bin/python3.8                    537MiB |
|    0   N/A  N/A     26391      C   python3.8                        1921MiB |
+-----------------------------------------------------------------------------+
```
The model.summary() output on SageMaker is:
```
Model: "model"
_________________________________________________________________
 Layer (type)                 Output Shape              Param #
=================================================================
 input_1 (InputLayer)         [(None, 549, 549, 1)]     0
 conv2d (Conv2D)              (None, 549, 549, 1)       7535574
 batch_normalization (BatchN  (None, 549, 549, 1)       4
 ormalization)
 conv2d_1 (Conv2D)            (None, 549, 549, 1)       2713158
 batch_normalization_1 (Batc  (None, 549, 549, 1)       4
 hNormalization)
 conv2d_2 (Conv2D)            (None, 549, 549, 1)       301950
 batch_normalization_2 (Batc  (None, 549, 549, 1)       4
 hNormalization)
 conv2d_3 (Conv2D)            (None, 549, 549, 1)       2713158
=================================================================
Total params: 13,263,852
Trainable params: 13,263,846
Non-trainable params: 6
_________________________________________________________________
```
The model.summary() output on EC2 is (notice the shape change):
```
Model: "model"
_________________________________________________________________
 Layer (type)                 Output Shape              Param #
=================================================================
 input_1 (InputLayer)         [(None, 549, 549, 1)]     0
 conv2d (Conv2D)              (None, 549, 549, 549)     14274
 batch_normalization (BatchN  (None, 549, 549, 549)     2196
 ormalization)
 conv2d_1 (Conv2D)            (None, 549, 549, 549)     2713158
 batch_normalization_1 (Batc  (None, 549, 549, 549)     2196
 hNormalization)
 conv2d_2 (Conv2D)            (None, 549, 549, 549)     301950
 batch_normalization_2 (Batc  (None, 549, 549, 549)     2196
 hNormalization)
 conv2d_3 (Conv2D)            (None, 549, 549, 549)     2713158
=================================================================
Total params: 5,749,128
Trainable params: 5,745,834
Non-trainable params: 3,294
_________________________________________________________________
```
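For what it's worth, the EC2 parameter counts are exactly what I would compute by hand for channels-last Conv2D layers (kh * kw * in_channels weights plus one bias, per filter) and BatchNormalization (4 parameters per channel), whereas the counts SageMaker prints do not line up with the output shapes it prints for the same code. A quick throwaway check (the conv2d_params helper is just for illustration):

```python
def conv2d_params(kh, kw, in_ch, filters):
    # Standard Conv2D parameter count: one kh x kw x in_ch kernel plus a bias, per filter.
    return (kh * kw * in_ch + 1) * filters

print(conv2d_params(5, 5, 1, 549))    # 14274   -> matches EC2's conv2d
print(conv2d_params(3, 3, 549, 549))  # 2713158 -> matches EC2's conv2d_1 and conv2d_3
print(conv2d_params(1, 1, 549, 549))  # 301950  -> matches EC2's conv2d_2
print(4 * 549)                        # 2196    -> matches EC2's batch_normalization layers
```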
One other interesting thing: if I change my model on the EC2 instance to use a single filter in every Conv2D layer:
```python
inp = layers.Input(shape=(*SIZE, 1))
x = layers.Conv2D(filters=1, kernel_size=(5, 5), padding="same", activation="relu")(inp)
x = layers.BatchNormalization()(x)
x = layers.Conv2D(filters=1, kernel_size=(3, 3), padding="same", activation="relu")(x)
x = layers.BatchNormalization()(x)
x = layers.Conv2D(filters=1, kernel_size=(1, 1), padding="same", activation="relu")(x)
x = layers.BatchNormalization()(x)
x = layers.Conv2D(filters=1, kernel_size=(3, 3), padding="same", activation="sigmoid")(x)
model = Model(inp, x)
model.compile(loss=tf.keras.losses.binary_crossentropy, optimizer=Adam())
model.summary()
```
My model.summary() output becomes:
```
Model: "model_2"
_________________________________________________________________
 Layer (type)                 Output Shape              Param #
=================================================================
 input_3 (InputLayer)         [(None, 549, 549, 1)]     0
 conv2d_8 (Conv2D)            (None, 549, 549, 1)       26
 batch_normalization_6 (Batc  (None, 549, 549, 1)       4
 hNormalization)
 conv2d_9 (Conv2D)            (None, 549, 549, 1)       10
 batch_normalization_7 (Batc  (None, 549, 549, 1)       4
 hNormalization)
 conv2d_10 (Conv2D)           (None, 549, 549, 1)       2
 batch_normalization_8 (Batc  (None, 549, 549, 1)       4
 hNormalization)
 conv2d_11 (Conv2D)           (None, 549, 549, 1)       10
=================================================================
Total params: 60
Trainable params: 54
Non-trainable params: 6
_________________________________________________________________
```
In this last model the output shapes look like the SageMaker output for the original model, but the trainable parameter count is of course tiny.
Any ideas why the output shapes differ between the two machines and why the number of filters is being treated this way? When I run the original model on my personal computer, the shapes match the EC2 output, so I think there might be an issue with SageMaker.
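If it helps narrow things down, here is a stripped-down snippet (no custom DataGenerator, just a single Conv2D) that should show the same behavior; on EC2 and on my personal computer the layer produces a 549-channel output:

```python
import tensorflow as tf
from tensorflow.keras import layers

# Single Conv2D with the same settings as the first layer of the real model.
inp = layers.Input(shape=(549, 549, 1))
out = layers.Conv2D(filters=549, kernel_size=(5, 5), padding="same")(inp)

# On EC2 and on my personal machine this prints (None, 549, 549, 549).
print(out.shape)
```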