Kaggle Competition - Airbus Ship Detection Challenge - Mask-RCNN and COCO Transfer Learning

Yup, as mentioned, I'm going to test out one more Kaggle competition: the Airbus Ship Detection Challenge.
I believe you've already grown accustomed to the data preparation by now.

Preparation

Required Python Packages

We FIRST make sure Matterport's package - Mask_RCNN - has been successfully installed.

➜  ~ pip show Mask_RCNN
Name: mask-rcnn
Version: 2.1
Summary: Mask R-CNN for object detection and instance segmentation
Home-page: https://github.com/matterport/Mask_RCNN
Author: Matterport
Author-email: waleed.abdulla@gmail.com
License: MIT
Location: /home/jiapei/.local/lib/python3.6/site-packages/mask_rcnn-2.1-py3.6.egg
Requires:
Required-by:
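
To be extra safe, we can confirm that the package actually imports (a quick sanity check of mine; importing mrcnn.model assumes Keras and TensorFlow are already installed):

➜  ~ python
>>> from mrcnn import utils, visualize
>>> import mrcnn.model as modellib
>>>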

We also need to make sure OpenCV's Python package - cv2 - has been successfully installed.

➜  ~ pip show cv2
➜ ~ python
Python 3.6.7 (default, Oct 22 2018, 11:32:17)
[GCC 8.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cv2
>>> cv2.__version__
'4.0.0'
>>> cv2.__file__
'/home/jiapei/.local/lib/python3.6/site-packages/opencv_python-4.0.0.21-py3.6-linux-x86_64.egg/cv2/cv2.cpython-36m-x86_64-linux-gnu.so'
>>>

Models

Then, we manually download the pre-trained weights directly from the Matterport Mask_RCNN GitHub releases page.
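
Alternatively, the COCO weights can be fetched programmatically. Here is a minimal sketch (my assumption: Mask_RCNN 2.1, whose mrcnn.utils module provides a download_trained_weights helper):

import os
from mrcnn import utils

COCO_WEIGHTS_PATH = 'mask_rcnn_coco.h5'
if not os.path.exists(COCO_WEIGHTS_PATH):
    # Fetches mask_rcnn_coco.h5 from the Mask_RCNN GitHub releases page
    utils.download_trained_weights(COCO_WEIGHTS_PATH)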

Test

The Code

After the above preparation, we made some trivial modifications to the kernel Airbus Mask-RCNN and COCO Transfer Learning, as follows:

#!/usr/bin/env python
# coding: utf-8

# **Mask-RCNN Starter Model for the Airbus Ship Detection Challenge with transfer learning**
#
# Using pre-trained COCO weights trained on http://cocodataset.org as in https://github.com/matterport/Mask_RCNN/tree/master/samples/balloon
#
# We get some amazing performance training only within the 6-hour Kaggle kernel limit.

debug = False
# debug = True

import os
import sys
import random
import math
import numpy as np
import cv2
import matplotlib.pyplot as plt
import json
from imgaug import augmenters as iaa
from tqdm import tqdm
import pandas as pd
import glob


DATA_DIR = '....../airbus-ship-detection'

# Directory to save logs and trained model
ROOT_DIR = './'


# ### Install Matterport's Mask-RCNN model from github.
# See [Matterport's implementation of Mask-RCNN](https://github.com/matterport/Mask_RCNN).


# Import Mask RCNN
sys.path.append(os.path.join(ROOT_DIR, 'Mask_RCNN')) # To find local version of the library
from mrcnn.config import Config
from mrcnn import utils
import mrcnn.model as modellib
from mrcnn import visualize
from mrcnn.model import log


train_dicom_dir = os.path.join(DATA_DIR, 'train_v2')
test_dicom_dir = os.path.join(DATA_DIR, 'test_v2')


# ### Download COCO pre-trained weights

COCO_WEIGHTS_PATH = "mask_rcnn_coco.h5"
BALLOON_WEIGHTS_PATH = "mask_rcnn_balloon.h5"

# ### Some setup functions and classes for Mask-RCNN
#
# - image_fps is a list of the image paths and filenames
# - image_annotations is a dictionary of the annotations keyed by the filenames
# - parsing the dataset returns a list of the image filenames and the annotations dictionary
# (The *_dicom_dir variable names are leftovers from a DICOM-based kernel this
# script was adapted from; the Airbus images are ordinary JPEGs.)

# The following parameters have been selected to reduce running time for demonstration purposes
# These are not optimal

class DetectorConfig(Config):
    # Give the configuration a recognizable name
    NAME = 'airbus'

    GPU_COUNT = 1
    IMAGES_PER_GPU = 10

    BACKBONE = 'resnet50'

    NUM_CLASSES = 2  # background and ship classes

    IMAGE_MIN_DIM = 384
    IMAGE_MAX_DIM = 384
    RPN_ANCHOR_SCALES = (8, 16, 32, 64)
    TRAIN_ROIS_PER_IMAGE = 126
    MAX_GT_INSTANCES = 14
    DETECTION_MAX_INSTANCES = 14
    DETECTION_MIN_CONFIDENCE = 0.95
    DETECTION_NMS_THRESHOLD = 0.0

    STEPS_PER_EPOCH = 12 if debug else 120
    VALIDATION_STEPS = 10 if debug else 100

config = DetectorConfig()
config.display()
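
# Note (my own observation): the effective batch size is
# GPU_COUNT * IMAGES_PER_GPU = 10 images of 384x384 per step; on a small GPU
# this can be too large (see the OOM error in the Outcome section below), so
# IMAGES_PER_GPU is the first knob to turn down.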


from skimage.io import imread
from skimage.segmentation import mark_boundaries
from skimage.util import montage
from skimage.morphology import binary_opening, disk, label
import gc; gc.enable()  # memory is tight

montage_rgb = lambda x: np.stack([montage(x[:, :, :, i]) for i in range(x.shape[3])], -1)
# Unused leftovers from the source kernel; the actual image directories are
# train_dicom_dir / test_dicom_dir defined above.
ship_dir = '../input'
train_image_dir = os.path.join(ship_dir, 'train')
test_image_dir = os.path.join(ship_dir, 'test')

def multi_rle_encode(img, **kwargs):
    '''
    Encode connected regions as separated masks
    '''
    labels = label(img)
    if img.ndim > 2:
        return [rle_encode(np.sum(labels==k, axis=2), **kwargs) for k in np.unique(labels[labels>0])]
    else:
        return [rle_encode(labels==k, **kwargs) for k in np.unique(labels[labels>0])]

# ref: https://www.kaggle.com/paulorzp/run-length-encode-and-decode
def rle_encode(img, min_max_threshold=1e-3, max_mean_threshold=None):
    '''
    img: numpy array, 1 - mask, 0 - background
    Returns run length as a formatted string
    '''
    if np.max(img) < min_max_threshold:
        return ''  ## no need to encode if it's all zeros
    if max_mean_threshold and np.mean(img) > max_mean_threshold:
        return ''  ## ignore overfilled mask
    pixels = img.T.flatten()
    pixels = np.concatenate([[0], pixels, [0]])
    runs = np.where(pixels[1:] != pixels[:-1])[0] + 1
    runs[1::2] -= runs[::2]
    return ' '.join(str(x) for x in runs)

def rle_decode(mask_rle, shape=(768, 768)):
    '''
    mask_rle: run-length as formatted string (start length)
    shape: (height, width) of array to return
    Returns numpy array, 1 - mask, 0 - background
    '''
    s = mask_rle.split()
    starts, lengths = [np.asarray(x, dtype=int) for x in (s[0:][::2], s[1:][::2])]
    starts -= 1
    ends = starts + lengths
    img = np.zeros(shape[0]*shape[1], dtype=np.uint8)
    for lo, hi in zip(starts, ends):
        img[lo:hi] = 1
    return img.reshape(shape).T  # Needed to align to RLE direction
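
# Quick round-trip sanity check on a toy mask (my own addition): rle_encode()
# scans column-major via img.T.flatten(), which is why rle_decode() transposes
# on the way back out.
if debug:
    _toy = np.zeros((768, 768), dtype=np.uint8)
    _toy[100:110, 200:205] = 1
    assert np.array_equal(rle_decode(rle_encode(_toy)), _toy)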

def masks_as_image(in_mask_list):
    # Take the individual ship masks and create a single mask array for all ships
    all_masks = np.zeros((768, 768), dtype=np.uint8)
    for mask in in_mask_list:
        if isinstance(mask, str):
            all_masks |= rle_decode(mask)
    return all_masks

def masks_as_color(in_mask_list):
    # Take the individual ship masks and create a color mask array for each ship
    all_masks = np.zeros((768, 768), dtype=np.float)
    scale = lambda x: (len(in_mask_list)+x+1) / (len(in_mask_list)*2)  ## scale the heatmap image to shift
    for i, mask in enumerate(in_mask_list):
        if isinstance(mask, str):
            all_masks[:, :] += scale(i) * rle_decode(mask)
    return all_masks


from PIL import Image
from sklearn.model_selection import train_test_split

exclude_list = ['6384c3e78.jpg', '13703f040.jpg', '14715c06d.jpg', '33e0ff2d5.jpg',
                '4d4e09f2a.jpg', '877691df8.jpg', '8b909bb20.jpg', 'a8d99130e.jpg',
                'ad55c3143.jpg', 'c8260c541.jpg', 'd6c7f17c7.jpg', 'dc3e7c901.jpg',
                'e44dffe88.jpg', 'ef87bad36.jpg', 'f083256d8.jpg']  # corrupted images

train_names = [f for f in os.listdir(train_dicom_dir)]
test_names = [f for f in os.listdir(test_dicom_dir)]
for el in exclude_list:
    if el in train_names: train_names.remove(el)
    if el in test_names: test_names.remove(el)


# training dataset
SEGMENTATION = '....../airbus-ship-detection/train_ship_segmentations_v2.csv'
anns = pd.read_csv(SEGMENTATION)
anns.head()


train_names = anns[anns.EncodedPixels.notnull()].ImageId.unique().tolist() ## override with ships

test_size = config.VALIDATION_STEPS * config.IMAGES_PER_GPU
image_fps_train, image_fps_val = train_test_split(train_names, test_size=test_size, random_state=42)

if debug:
    image_fps_train = image_fps_train[:100]
    image_fps_val = image_fps_val[:100]
    test_names = test_names[:100]

print(len(image_fps_train), len(image_fps_val), len(test_names))


class DetectorDataset(utils.Dataset):
    """Dataset class for training our dataset.
    """

    def __init__(self, image_fps, image_annotations, orig_height, orig_width):
        super().__init__()

        # Add classes
        self.add_class('ship', 1, 'Ship')

        # add images
        for i, fp in enumerate(image_fps):
            annotations = image_annotations.query('ImageId=="' + fp + '"')['EncodedPixels']
            self.add_image('ship', image_id=i, path=os.path.join(train_dicom_dir, fp),
                           annotations=annotations, orig_height=orig_height, orig_width=orig_width)

    def image_reference(self, image_id):
        info = self.image_info[image_id]
        return info['path']

    def load_image(self, image_id):
        info = self.image_info[image_id]
        fp = info['path']
        image = imread(fp)
        # If grayscale. Convert to RGB for consistency.
        if len(image.shape) != 3 or image.shape[2] != 3:
            image = np.stack((image,) * 3, -1)
        return image

    def load_mask(self, image_id):
        info = self.image_info[image_id]
        annotations = info['annotations']
        # print(image_id, annotations)
        count = len(annotations)
        if count == 0:
            mask = np.zeros((info['orig_height'], info['orig_width'], 1), dtype=np.uint8)
            class_ids = np.zeros((1,), dtype=np.int32)
        else:
            mask = np.zeros((info['orig_height'], info['orig_width'], count), dtype=np.uint8)
            class_ids = np.zeros((count,), dtype=np.int32)
            for i, a in enumerate(annotations):
                mask[:, :, i] = rle_decode(a)
                class_ids[i] = 1
        return mask.astype(np.bool), class_ids.astype(np.int32)
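
# Quick smoke test of the dataset class (my own addition; debug mode only):
# each RLE annotation becomes one channel of the returned boolean mask.
if debug:
    _ds = DetectorDataset(image_fps_train[:4], anns, 768, 768)
    _ds.prepare()
    _mask, _ids = _ds.load_mask(0)
    print(_mask.shape, _ids)  # e.g. (768, 768, n_ships) and an array of ones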


# ### Examine the annotation data, parse the dataset, and view a sample image

image_fps, image_annotations = train_names, anns

ds = imread(os.path.join(train_dicom_dir, image_fps[0])) # read image from filepath
_ = plt.imshow(ds)


# Original image size: 768 x 768
ORIG_SIZE = ds.shape[0]
ORIG_SIZE


# ### Create and prepare the training dataset using the DetectorDataset class.

# prepare the training dataset
dataset_train = DetectorDataset(image_fps_train, image_annotations, ORIG_SIZE, ORIG_SIZE)
dataset_train.prepare()

# prepare the validation dataset
dataset_val = DetectorDataset(image_fps_val, image_annotations, ORIG_SIZE, ORIG_SIZE)
dataset_val.prepare()

# ### Display a random image with bounding boxes

# Load and display random sample and their bounding boxes

class_ids = [0]
while class_ids[0] == 0:  ## look for a mask
    image_id = random.choice(dataset_val.image_ids)
    image_fp = dataset_val.image_reference(image_id)
    image = dataset_val.load_image(image_id)
    mask, class_ids = dataset_val.load_mask(image_id)

print(image.shape)

plt.figure(figsize=(10, 10))
plt.subplot(1, 2, 1)
plt.imshow(image)
plt.axis('off')

plt.subplot(1, 2, 2)
masked = np.zeros(image.shape[:2])
for i in range(mask.shape[2]):
    masked += mask[:, :, i]  ## * image[:, :, 0]
plt.imshow(masked, cmap='gray')
plt.axis('off')

print(image_fp)
print(class_ids)


# ### Image Augmentation. Try fine-tuning some variables to custom values

# Image augmentation (light but constant)
augmentation = iaa.Sequential([
    iaa.OneOf([  ## rotate
        iaa.Affine(rotate=0),
        iaa.Affine(rotate=90),
        iaa.Affine(rotate=180),
        iaa.Affine(rotate=270),
    ]),
    iaa.Fliplr(0.5),
    iaa.Flipud(0.5),
    iaa.OneOf([  ## brightness or contrast
        iaa.Multiply((0.9, 1.1)),
        iaa.ContrastNormalization((0.9, 1.1)),
    ]),
    iaa.OneOf([  ## blur or sharpen
        iaa.GaussianBlur(sigma=(0.0, 0.1)),
        iaa.Sharpen(alpha=(0.0, 0.1)),
    ]),
])

# test on the same image as above
imggrid = augmentation.draw_grid(image, cols=5, rows=2)
plt.figure(figsize=(30, 12))
_ = plt.imshow(imggrid.astype(int))


# ### Now it's time to train the model. Note that training even a basic model can take a few hours.
#
# Note: the following model is for demonstration purposes only. We have limited the number of training epochs, and have set nominal values in the DetectorConfig to reduce run-time.
#
# - dataset_train and dataset_val are derived from DetectorDataset
# - DetectorDataset loads images from image filenames and masks from the annotation data
# - model is Mask-RCNN

model = modellib.MaskRCNN(mode='training', config=config, model_dir=ROOT_DIR)

# Exclude the last layers because they require a matching
# number of classes
model.load_weights(COCO_WEIGHTS_PATH, by_name=True, exclude=[
    "mrcnn_class_logits", "mrcnn_bbox_fc",
    "mrcnn_bbox", "mrcnn_mask"])


LEARNING_RATE = 0.004

# Train Mask-RCNN Model
import warnings
warnings.filterwarnings("ignore")

## train heads with a higher lr to speed up the learning
model.train(dataset_train, dataset_val,
            learning_rate=LEARNING_RATE*2,
            epochs=2,
            layers='heads',
            augmentation=None)  ## no need to augment yet

history = model.keras_model.history.history

model.train(dataset_train, dataset_val,
            learning_rate=LEARNING_RATE,
            epochs=4 if debug else 12,
            layers='all',
            augmentation=augmentation)

new_history = model.keras_model.history.history
for k in new_history: history[k] = history[k] + new_history[k]

epochs = range(1, len(history['loss'])+1)
pd.DataFrame(history, index=epochs)

plt.figure(figsize=(17,5))

plt.subplot(131)
plt.plot(epochs, history["loss"], label="Train loss")
plt.plot(epochs, history["val_loss"], label="Valid loss")
plt.legend()
plt.subplot(132)
plt.plot(epochs, history["mrcnn_class_loss"], label="Train class ce")
plt.plot(epochs, history["val_mrcnn_class_loss"], label="Valid class ce")
plt.legend()
plt.subplot(133)
plt.plot(epochs, history["mrcnn_bbox_loss"], label="Train box loss")
plt.plot(epochs, history["val_mrcnn_bbox_loss"], label="Valid box loss")
plt.legend()

plt.show()


best_epoch = np.argmin(history["val_loss"])
score = history["val_loss"][best_epoch]
print(f'Best Epoch:{best_epoch+1} val_loss:{score}')


# select trained model
dir_names = next(os.walk(model.model_dir))[1]
key = config.NAME.lower()
dir_names = filter(lambda f: f.startswith(key), dir_names)
dir_names = sorted(dir_names)

if not dir_names:
    import errno
    raise FileNotFoundError(
        errno.ENOENT,
        "Could not find model directory under {}".format(model.model_dir))

fps = []
# Pick last directory
for d in dir_names:
    dir_name = os.path.join(model.model_dir, d)
    # Find the last checkpoint
    checkpoints = next(os.walk(dir_name))[2]
    checkpoints = filter(lambda f: f.startswith("mask_rcnn"), checkpoints)
    checkpoints = sorted(checkpoints)
    if not checkpoints:
        print('No weight files in {}'.format(dir_name))
    else:
        checkpoint = os.path.join(dir_name, checkpoints[best_epoch])
        fps.append(checkpoint)

model_path = sorted(fps)[-1]
print('Found model {}'.format(model_path))


class InferenceConfig(DetectorConfig):
    GPU_COUNT = 1
    IMAGES_PER_GPU = 1

inference_config = InferenceConfig()

# Recreate the model in inference mode
model = modellib.MaskRCNN(mode='inference',
                          config=inference_config,
                          model_dir=ROOT_DIR)

# Load trained weights (fill in path to trained weights here)
assert model_path != "", "Provide path to trained weights"
print("Loading weights from ", model_path)
model.load_weights(model_path, by_name=True)


# set color for class
def get_colors_for_class_ids(class_ids):
    colors = []
    for class_id in class_ids:
        if class_id == 1:
            colors.append((.941, .204, .204))
    return colors


# ### How do the predicted boxes compare to the ground truth? Let's use the validation dataset to check.

# Show a few examples of ground truth vs. predictions on the validation dataset
dataset = dataset_val
fig = plt.figure(figsize=(10, 40))

for i in range(8):

    image_id = random.choice(dataset.image_ids)

    original_image, image_meta, gt_class_id, gt_bbox, gt_mask = modellib.load_image_gt(dataset_val, inference_config,
                                                                                       image_id, use_mini_mask=False)

    print(original_image.shape)
    plt.subplot(8, 2, 2*i + 1)
    visualize.display_instances(original_image, gt_bbox, gt_mask, gt_class_id,
                                dataset.class_names,
                                colors=get_colors_for_class_ids(gt_class_id), ax=fig.axes[-1])

    plt.subplot(8, 2, 2*i + 2)
    results = model.detect([original_image])  #, verbose=1)
    r = results[0]
    visualize.display_instances(original_image, r['rois'], r['masks'], r['class_ids'],
                                dataset.class_names, r['scores'],
                                colors=get_colors_for_class_ids(r['class_ids']), ax=fig.axes[-1])


# Get filenames of test dataset images
test_image_fps = test_names


# ### Final steps - Create the submission file

# Make predictions on test images, write out sample submission
def predict(image_fps, filepath='submission.csv', min_conf=config.DETECTION_MIN_CONFIDENCE):
    # assume square image
    resize_factor = ORIG_SIZE / config.IMAGE_SHAPE[0]
    # resize_factor = ORIG_SIZE
    with open(filepath, 'w') as file:
        file.write("ImageId,EncodedPixels\n")

        for image_id in tqdm(image_fps):
            found = False

            image = imread(os.path.join(test_dicom_dir, image_id))
            # If grayscale. Convert to RGB for consistency.
            if len(image.shape) != 3 or image.shape[2] != 3:
                image = np.stack((image,) * 3, -1)
            image, window, scale, padding, crop = utils.resize_image(
                image,
                min_dim=config.IMAGE_MIN_DIM,
                min_scale=config.IMAGE_MIN_SCALE,
                max_dim=config.IMAGE_MAX_DIM,
                mode=config.IMAGE_RESIZE_MODE)

            results = model.detect([image])
            r = results[0]

            assert len(r['rois']) == len(r['class_ids']) == len(r['scores'])
            if len(r['rois']) == 0:
                pass
            else:
                num_instances = len(r['rois'])

                for i in range(num_instances):
                    if r['scores'][i] > min_conf:
                        # print(r['scores'][i], r['rois'][i], r['masks'][i].shape, np.sum(r['masks'][...,i]), r['masks'][i], r.keys())
                        file.write(image_id + "," + rle_encode(r['masks'][..., i]) + "\n")
                        found = True

            if not found:
                file.write(image_id + ",\n")


submission_fp = os.path.join(ROOT_DIR, 'submission.csv')
predict(test_image_fps, filepath=submission_fp)
print(submission_fp)


output = pd.read_csv(submission_fp)
output.head(50)


# show a few test image detection examples
def visualize_test():
    image_id = random.choice(test_image_fps)

    # original image
    # print(image_id)
    image = imread(os.path.join(test_dicom_dir, image_id))

    # assume square image
    resize_factor = ORIG_SIZE / config.IMAGE_SHAPE[0]

    # If grayscale. Convert to RGB for consistency.
    if len(image.shape) != 3 or image.shape[2] != 3:
        image = np.stack((image,) * 3, -1)
    resized_image, window, scale, padding, crop = utils.resize_image(
        image,
        min_dim=config.IMAGE_MIN_DIM,
        min_scale=config.IMAGE_MIN_SCALE,
        max_dim=config.IMAGE_MAX_DIM,
        mode=config.IMAGE_RESIZE_MODE)

    results = model.detect([resized_image])
    r = results[0]
    for bbox in r['rois']:
        # print(bbox)
        x1 = int(bbox[1] * resize_factor)
        y1 = int(bbox[0] * resize_factor)
        x2 = int(bbox[3] * resize_factor)
        y2 = int(bbox[2] * resize_factor)
        cv2.rectangle(image, (x1, y1), (x2, y2), (77, 255, 9), 3, 1)
        width = x2 - x1
        height = y2 - y1
        # print("x {} y {} h {} w {}".format(x1, y1, width, height))
    fig, ax = plt.subplots()
    ax.set_title(f"{len(r['rois'])}: {image_id}")
    plt.imshow(image)

for i in range(8):
    visualize_test()

Outcome

➜  airbus-ship-detection python airbus_ship_detection.py
Using TensorFlow backend.

Configurations:
BACKBONE resnet50
BACKBONE_STRIDES [4, 8, 16, 32, 64]
BATCH_SIZE 10
BBOX_STD_DEV [0.1 0.1 0.2 0.2]
COMPUTE_BACKBONE_SHAPE None
DETECTION_MAX_INSTANCES 14
DETECTION_MIN_CONFIDENCE 0.95
DETECTION_NMS_THRESHOLD 0.0
FPN_CLASSIF_FC_LAYERS_SIZE 1024
GPU_COUNT 1
GRADIENT_CLIP_NORM 5.0
IMAGES_PER_GPU 10
IMAGE_CHANNEL_COUNT 3
IMAGE_MAX_DIM 384
IMAGE_META_SIZE 14
IMAGE_MIN_DIM 384
IMAGE_MIN_SCALE 0
IMAGE_RESIZE_MODE square
IMAGE_SHAPE [384 384 3]
LEARNING_MOMENTUM 0.9
LEARNING_RATE 0.001
LOSS_WEIGHTS {'rpn_class_loss': 1.0, 'rpn_bbox_loss': 1.0, 'mrcnn_class_loss': 1.0, 'mrcnn_bbox_loss': 1.0, 'mrcnn_mask_loss': 1.0}
MASK_POOL_SIZE 14
MASK_SHAPE [28, 28]
MAX_GT_INSTANCES 14
MEAN_PIXEL [123.7 116.8 103.9]
MINI_MASK_SHAPE (56, 56)
NAME airbus
NUM_CLASSES 2
POOL_SIZE 7
POST_NMS_ROIS_INFERENCE 1000
POST_NMS_ROIS_TRAINING 2000
PRE_NMS_LIMIT 6000
ROI_POSITIVE_RATIO 0.33
RPN_ANCHOR_RATIOS [0.5, 1, 2]
RPN_ANCHOR_SCALES (8, 16, 32, 64)
RPN_ANCHOR_STRIDE 1
RPN_BBOX_STD_DEV [0.1 0.1 0.2 0.2]
RPN_NMS_THRESHOLD 0.7
RPN_TRAIN_ANCHORS_PER_IMAGE 256
STEPS_PER_EPOCH 120
TOP_DOWN_PYRAMID_SIZE 256
TRAIN_BN False
TRAIN_ROIS_PER_IMAGE 126
USE_MINI_MASK True
USE_RPN_ROIS True
VALIDATION_STEPS 100
WEIGHT_DECAY 0.0001


41556 1000 15606
(768, 768, 3)
....../airbus-ship-detection/train_v2/ae490e3fb.jpg
[1]
2019-01-23 12:19:43.634701: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:993] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-01-23 12:19:43.635112: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 0 with properties:
name: GeForce GTX 980M major: 5 minor: 2 memoryClockRate(GHz): 1.1265
pciBusID: 0000:01:00.0
totalMemory: 3.94GiB freeMemory: 2.93GiB
2019-01-23 12:19:43.635131: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
2019-01-23 12:19:43.918100: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-01-23 12:19:43.918138: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988] 0
2019-01-23 12:19:43.918145: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0: N
2019-01-23 12:19:43.918284: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 2636 MB memory) -> physical GPU (device: 0, name: GeForce GTX 980M, pci bus id: 0000:01:00.0, compute capability: 5.2)

Starting at epoch 0. LR=0.008

Checkpoint Path: ./airbus20190123T1219/mask_rcnn_airbus_{epoch:04d}.h5
Selecting layers to train
fpn_c5p5 (Conv2D)
fpn_c4p4 (Conv2D)
fpn_c3p3 (Conv2D)
fpn_c2p2 (Conv2D)
fpn_p5 (Conv2D)
fpn_p2 (Conv2D)
fpn_p3 (Conv2D)
fpn_p4 (Conv2D)
In model: rpn_model
rpn_conv_shared (Conv2D)
rpn_class_raw (Conv2D)
rpn_bbox_pred (Conv2D)
mrcnn_mask_conv1 (TimeDistributed)
mrcnn_mask_bn1 (TimeDistributed)
mrcnn_mask_conv2 (TimeDistributed)
mrcnn_mask_bn2 (TimeDistributed)
mrcnn_class_conv1 (TimeDistributed)
mrcnn_class_bn1 (TimeDistributed)
mrcnn_mask_conv3 (TimeDistributed)
mrcnn_mask_bn3 (TimeDistributed)
mrcnn_class_conv2 (TimeDistributed)
mrcnn_class_bn2 (TimeDistributed)
mrcnn_mask_conv4 (TimeDistributed)
mrcnn_mask_bn4 (TimeDistributed)
mrcnn_bbox_fc (TimeDistributed)
mrcnn_mask_deconv (TimeDistributed)
mrcnn_class_logits (TimeDistributed)
mrcnn_mask (TimeDistributed)
Epoch 1/2
......
2019-01-23 12:21:14.922200: I tensorflow/core/common_runtime/bfc_allocator.cc:647] Stats:
Limit: 2764636160
InUse: 2422788608
MaxInUse: 2500490496
NumAllocs: 3411
MaxAllocSize: 1257242624

2019-01-23 12:21:14.922287: W tensorflow/core/common_runtime/bfc_allocator.cc:271] *********************x*************************_**___******************************************_____
2019-01-23 12:21:14.922318: W tensorflow/core/framework/op_kernel.cc:1333] OP_REQUIRES failed at transpose_op.cc:199 : Resource exhausted: OOM when allocating tensor with shape[1260,14,14,256] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
Traceback (most recent call last):
File "airbus_ship_detection.py", line 356, in <module>
augmentation=None) ## no need to augment yet
File "/home/jiapei/.local/lib/python3.6/site-packages/mask_rcnn-2.1-py3.6.egg/mrcnn/model.py", line 2375, in train
File "/home/jiapei/.local/lib/python3.6/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "/home/jiapei/.local/lib/python3.6/site-packages/keras/engine/training.py", line 1418, in fit_generator
initial_epoch=initial_epoch)
File "/home/jiapei/.local/lib/python3.6/site-packages/keras/engine/training_generator.py", line 217, in fit_generator
class_weight=class_weight)
File "/home/jiapei/.local/lib/python3.6/site-packages/keras/engine/training.py", line 1217, in train_on_batch
outputs = self.train_function(ins)
File "/home/jiapei/.local/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2715, in __call__
return self._call(inputs)
File "/home/jiapei/.local/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2675, in _call
fetched = self._callable_fn(*array_vals)
File "/home/jiapei/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1439, in __call__
run_metadata_ptr)
File "/home/jiapei/.local/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 528, in __exit__
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[1260,14,14,256] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node mrcnn_mask_bn1/FusedBatchNorm-0-0-TransposeNCHWToNHWC-LayoutOptimizer}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

[[{{node mrcnn_mask_loss/Shape_1/_4457}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

I haven't quite figured out how to solve the Resource exhausted issue yet, but some resultant images can be viewed FIRST:

sample airbus ship
sample airbus ship and bounding box
sample airbus ships
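
As for the Resource exhausted error itself, the allocator stats above suggest the 4 GB GTX 980M simply cannot hold a batch of ten 384x384 images with 126 training ROIs each. A minimal sketch of a lower-memory configuration I would try next (an assumption on my part, not yet verified on this machine):

class LowMemDetectorConfig(DetectorConfig):
    IMAGES_PER_GPU = 2           # effective batch size drops from 10 to 2
    TRAIN_ROIS_PER_IMAGE = 64    # fewer sampled ROIs per image
    # Scale the steps so roughly the same number of images is seen per epoch
    STEPS_PER_EPOCH = 60 if debug else 600
    VALIDATION_STEPS = 50 if debug else 500

config = LowMemDetectorConfig()
config.display()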