Mask R-CNN

  • Link : https://arxiv.org/abs/1703.06870

Object Detection vs Semantic Segmentation vs Instance Segmentation

image

image

Object Detection

  • Bounding box (BBox) + classification: can separate individual objects, cannot segment them
  • Object detection marks the location of each object with a bounding box (BBox) and tells the individual objects apart. Faster R-CNN is a representative model.

Semantic Segmentation

  • Segmentation + classification: cannot separate individual objects, can segment them
  • Semantic segmentation assigns a category to every pixel, but it does not distinguish individual objects. For example, in a photo of two cats, the pixels of both cats are all classified into the single category "cat".

Object Detection + Semantic Segmentation → Instance Segmentation

  • FCN on BBOX!
  • Instance segmentation (also called object segmentation) combines the separation that object detection provides with the segmentation that semantic segmentation provides: it labels objects at the pixel level and also tells individual objects apart.

FCN

image image image

Fully Convolutional Networks (FCN) were proposed for semantic segmentation and have three main characteristics.

First, FCN is the first end-to-end model designed for semantic segmentation. End-to-end means that everything from the input layer to the output layer is a trainable neural network. A conventional fully connected layer takes the flattened feature map from the convolutional layers as its input, so it ignores the spatial layout of the image. FCN instead uses 1x1 convolutions, which operate along the channel axis at each pixel position and produce one vector per position. Each filter then behaves like one weight column of a fully connected layer while spatial information is preserved, and the output contains as many feature maps as there are output channels.
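The idea that a 1x1 convolution is a fully connected layer applied separately at every pixel can be sketched in plain Python. This is only an illustration under assumed names and toy values (`conv1x1`, `fmap`, `W`, `b` are all hypothetical), not FCN's actual implementation.

```python
def conv1x1(feature_map, weights, bias):
    """Apply a 1x1 convolution to a feature map stored as [H][W][C_in].

    weights has shape [C_out][C_in], bias has shape [C_out].
    Returns a map of shape [H][W][C_out]; the H x W layout is untouched.
    """
    out = []
    for row in feature_map:
        out_row = []
        for pixel in row:  # pixel is the C_in-dimensional vector at one location
            out_row.append([
                sum(w * v for w, v in zip(w_row, pixel)) + b_k
                for w_row, b_k in zip(weights, bias)
            ])
        out.append(out_row)
    return out


# Toy 2x2 feature map with 3 channels, mapped to 2 output channels.
fmap = [[[1.0, 0.0, 2.0], [0.0, 1.0, 0.0]],
        [[1.0, 1.0, 1.0], [2.0, 0.0, 0.0]]]
W = [[1.0, 0.0, 0.5],   # filter for output channel 0
     [0.0, 2.0, 0.0]]   # filter for output channel 1
b = [0.0, 1.0]

scores = conv1x1(fmap, W, b)
```

Every position is multiplied by the same weight matrix, which is why the spatial grid survives and the input can be any height and width.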

Second, FCN adds upsampling layers. Earlier networks pooled repeatedly to obtain a wide receptive field, which left them with a low-resolution output; upsampling restores the resolution and solves this problem.

Finally, FCN uses skip connections. The shallow layers of an FCN hold fine-grained information about detailed features, while the deep layers hold coarser information about global, semantic features. Semantic segmentation needs both kinds of information, so FCN is designed with skip connections that let the output draw directly on the feature maps of the shallower layers.
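A minimal sketch of how upsampling and a skip connection fit together: a coarse score map from a deep layer is upsampled by 2 and added element-wise to a finer score map from a shallower layer. FCN itself learns its upsampling; nearest-neighbour repetition is used here only as a simple stand-in, and all names and values are hypothetical.

```python
def upsample2x_nearest(score_map):
    """Double the height and width of a 2-D map by repeating each value."""
    out = []
    for row in score_map:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def fuse_with_skip(coarse, fine):
    """Upsample the coarse (deep) map 2x and add the fine (shallow) map."""
    up = upsample2x_nearest(coarse)
    return [[u + f for u, f in zip(up_row, fine_row)]
            for up_row, fine_row in zip(up, fine)]


coarse_scores = [[1.0, 2.0],
                 [3.0, 4.0]]            # 2x2 map from a deep layer
fine_scores = [[0.1, 0.0, 0.0, 0.2],
               [0.0, 0.0, 0.0, 0.0],
               [0.0, 0.5, 0.0, 0.0],
               [0.0, 0.0, 0.0, 0.3]]    # 4x4 map from a shallower layer

fused = fuse_with_skip(coarse_scores, fine_scores)
```

The fused map has the resolution of the finer map, with the coarse prediction adjusted locally by the detail the shallower layer contributes.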

R-CNN Family

image

The models in the R-CNN family were designed for object detection.

R-CNN

image

  • R-CNN is the model that brought CNNs to object detection. It obtains region proposals with selective search, extracts features from each proposed region with a CNN, and classifies each region with an SVM. However, R-CNN has to run this per-region pipeline for every single proposal, so it is very slow, and because it is not an end-to-end architecture there is a limit to how much training can improve its performance.

Fast R-CNN

image

  • Fast R-CNN addresses the slowness of R-CNN by extracting features for the whole image once and reusing them to detect multiple objects. It computes a feature map of the entire image with convolutional layers and uses RoI (Region of Interest) pooling to extract only the part of the feature map that corresponds to each RoI. From that, FC layers classify the region and perform bounding box regression to obtain a more accurate box. The result is roughly 18 times faster than R-CNN, but because region proposal still relies on a heuristic method, performance could not be improved much further.

Faster R-CNN

image

  • Faster R-CNN proposes the first end-to-end object detection architecture, in which even region proposal is done by a neural network. Instead of the time-consuming selective search, it generates proposals with a Region Proposal Network (RPN). The RPN slides a window over the feature map and considers k anchor boxes at each position. Anchor boxes can be seen as a predefined set of candidate bounding boxes that are likely to occur at each position. At each position a 256-dimensional feature vector is extracted and fed to a classification layer, which outputs 2k scores indicating whether each anchor contains an object, and to a regression layer, which outputs 4k coordinate values.
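The anchor bookkeeping can be sketched as follows. The three scales and three aspect ratios (so k = 9) are the defaults reported in the Faster R-CNN paper and are not stated in this post; the function name and the centre point are hypothetical.

```python
import math


def make_anchors(cx, cy, scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return k = len(scales) * len(ratios) anchors centred on (cx, cy).

    Each anchor is (x1, y1, x2, y2). A ratio is height / width, and every
    anchor keeps an area of about scale * scale.
    """
    anchors = []
    for s in scales:
        for r in ratios:
            w = s / math.sqrt(r)
            h = s * math.sqrt(r)
            anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors


anchors = make_anchors(cx=256, cy=256)
k = len(anchors)
num_scores = 2 * k   # object / not-object score for every anchor
num_coords = 4 * k   # four box offsets for every anchor
```

With k = 9 the classification layer emits 18 values and the regression layer 36 values at each sliding-window position.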

Mask R-CNN

image

Mask R-CNN is a model designed for instance segmentation. As the name suggests, its structure is similar to Faster R-CNN, with a few improvements. Faster R-CNN applies RoI pooling to the RPN's region proposals and therefore works only with integer coordinates. Mask R-CNN proposes a new pooling layer, RoIAlign, which uses interpolation to pool at sub-pixel (fractional) positions.

In addition, Mask R-CNN adds a separate mask branch alongside Faster R-CNN's classification and box-regression heads. For each bounding box the mask branch generates a binary mask for every class, and the prediction of the classification head decides which of those masks is used.
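How the classification head picks one of the per-class masks can be sketched like this. The class names, scores and mask values are made up, and the 0.5 threshold for binarising the mask follows the Mask R-CNN paper rather than this post.

```python
def select_mask(class_scores, masks_per_class):
    """Return the index of the top-scoring class and that class's mask."""
    k = max(range(len(class_scores)), key=lambda i: class_scores[i])
    return k, masks_per_class[k]


class_names = ["cat", "dog", "person"]     # hypothetical classes
class_scores = [0.1, 0.7, 0.2]             # output of the classification head
masks_per_class = [                        # one 2x2 probability mask per class
    [[0.9, 0.1], [0.2, 0.1]],
    [[0.2, 0.8], [0.6, 0.9]],
    [[0.5, 0.5], [0.5, 0.5]],
]

k, mask_probs = select_mask(class_scores, masks_per_class)
binary_mask = [[1 if p >= 0.5 else 0 for p in row] for row in mask_probs]
```

Only the mask of the predicted class ("dog" here) is kept; the masks for the other classes are discarded for this box.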

Mask Branch

image

  • When an RoI comes in as input, it passes through a convolutional network built as an FCN. One mask is generated per class.
  • An ordinary FCN outputs a class label for each pixel, whereas the FCN in the mask branch outputs a binary value for each pixel indicating whether the object is present. It therefore uses a different loss function from the ordinary FCN.
    • Normal FCN: per-pixel softmax, multinomial cross-entropy loss
    • Mask branch FCN: per-pixel sigmoid, binary cross-entropy loss
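A small sketch of the mask-branch loss: a per-pixel sigmoid followed by binary cross-entropy, averaged over the mask. Following the Mask R-CNN paper, only the mask of the ground-truth class contributes to the loss; the function names and toy logits are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def mask_bce_loss(mask_logits, gt_mask, gt_class):
    """Average per-pixel binary cross-entropy on the ground-truth class's mask."""
    logits = mask_logits[gt_class]          # masks of other classes are ignored
    total, count = 0.0, 0
    for logit_row, gt_row in zip(logits, gt_mask):
        for z, y in zip(logit_row, gt_row):
            p = sigmoid(z)
            total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
            count += 1
    return total / count


mask_logits = [
    [[5.0, 5.0], [5.0, 5.0]],    # class 0 logits: unused when gt_class == 1
    [[0.0, 0.0], [0.0, 0.0]],    # class 1 logits: every pixel has p = 0.5
]
gt_mask = [[1, 0], [0, 1]]

loss = mask_bce_loss(mask_logits, gt_mask, gt_class=1)
```

Because each class has its own sigmoid output, the masks do not compete with each other the way they would under a per-pixel softmax.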

Mask R-CNN Loss

image

  • The mask branch generates each mask according to whether the object is present at a pixel, not according to which class the object belongs to; the class decision is left to the classification head.
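For reference, the multi-task loss as defined in the Mask R-CNN paper, written out here because the figure is not reproduced. The symbols m (mask resolution), y (ground-truth mask) and ŷ^k (predicted mask for ground-truth class k) follow the paper's description and do not appear in this post.

```latex
L = L_{cls} + L_{box} + L_{mask},
\qquad
L_{mask} = -\frac{1}{m^{2}} \sum_{i=1}^{m} \sum_{j=1}^{m}
  \left[ y_{ij} \log \hat{y}^{k}_{ij} + (1 - y_{ij}) \log\left(1 - \hat{y}^{k}_{ij}\right) \right]
```

L_mask is computed only on the mask of the RoI's ground-truth class k, so the other class masks contribute nothing to the loss.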

RoIAlign

RoI, Region of Interest

Feature Extraction

image

  • Before RoIs can be located, Fast R-CNN extracts a feature map from the image. The feature map compresses the image information: its width and height are the input's divided by 32. In the example in the figure above, a 512x512x3 input is compressed to 16x16x512.

Get RoIs from the Feature Map

image

  • Every RoI is described by its coordinates and its size.

Quantization of coordinates on the feature map

  • Quantization: the process of constraining an input from a large set of values (such as real numbers) to a discrete set (such as integers)

image

  • In the example in the picture, the bbox is 145x200 and its top-left corner is at (192,226). Our feature map is 16x16, so the box has to be scaled down by 32, but numbers such as 200 and 145 are not divisible by 32.

image

  • The fractional part is therefore discarded. As a result, the information in the blue area of the picture is lost and new information from the green area is picked up, so the region that is actually used no longer matches the original RoI.
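The arithmetic behind this quantization can be checked with a few lines of Python. The numbers come from the example above; reading (192,226) as (row, column) and 145x200 as height x width is an assumption, as are the function names.

```python
STRIDE = 32  # input size / feature-map size = 512 / 16


def to_feature_coords(y, x, h, w, stride=STRIDE):
    """Map an RoI from image pixels to fractional feature-map cells."""
    return (y / stride, x / stride, h / stride, w / stride)


def quantize(coords):
    """Drop the fractional part of every value, as RoI pooling does."""
    return tuple(int(c) for c in coords)


roi_image = (192, 226, 145, 200)            # y, x, height, width (assumed order)
roi_exact = to_feature_coords(*roi_image)   # what the RoI really covers
roi_quantized = quantize(roi_exact)         # what RoI pooling actually reads
```

The gap between `roi_exact` and `roi_quantized` is exactly the misalignment described above: part of the true region is cut off and part of the neighbouring area is included.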

RoI Pooling

image

image

  • ์‚ฌ์ด์ฆˆ๊ฐ€ ๋‹ค๋ฅธ ๊ฐ RoI๋“ค์˜ ํฌ๊ธฐ๋ฅผ ๊ณ ์ •๋œ ์‚ฌ์ด์ฆˆ๋กœ ๋งž์ถฐ์ฃผ๊ธฐ ์œ„ํ•ด pooling์„ ์ง„ํ–‰ํ•œ๋‹ค. Pooling์„ ์ง„ํ–‰ํ•˜๋ฉฐ ๊ทธ๋ฆผ์˜ ๋งˆ์ง€๋ง‰ ํ–‰๊ณผ ๊ฐ™์ด ๋˜ ํ•œ๋ฒˆ ์ •๋ณด๋ฅผ ์žƒ๊ฒŒ ๋œ๋‹ค.

RoI Align

image image

  • RoI Align is a new method that addresses the information loss of RoI quantization and RoI pooling. Mask R-CNN performs instance segmentation, where the relationship between pixels matters, so this loss has to be avoided. RoI Align does not quantize; it uses the RoI's fractional coordinates as they are. To produce a 3x3 output, the RoI's width and height are each divided into 3, giving 9 bins. Each bin is divided into 4, and the value at each of the 4 sample points is computed with bilinear interpolation. Pooling the 4 sampled values in each bin then yields the 3x3 feature map. Because nothing is rounded along the way, RoI Align pools without the loss that quantization causes.
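The steps above can be sketched in plain Python. This is a simplified illustration, not the paper's or any library's implementation: feature values are treated as sitting at integer coordinates, the 4 samples per bin are averaged (the paper reports that max or average both work), and all names and toy values are assumptions.

```python
import math


def bilinear(feature, y, x):
    """Bilinearly interpolate a 2-D map at the fractional position (y, x)."""
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1 = min(y0 + 1, len(feature) - 1)
    x1 = min(x0 + 1, len(feature[0]) - 1)
    dy, dx = y - y0, x - x0
    top = feature[y0][x0] * (1 - dx) + feature[y0][x1] * dx
    bottom = feature[y1][x0] * (1 - dx) + feature[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def roi_align(feature, y, x, h, w, out_size=3, samples=2):
    """Pool a fractional RoI into out_size x out_size values without rounding.

    Each bin is sampled on a samples x samples grid (4 points by default)
    and the sampled values are averaged.
    """
    bin_h, bin_w = h / out_size, w / out_size
    pooled = []
    for by in range(out_size):
        row = []
        for bx in range(out_size):
            values = []
            for sy in range(samples):
                for sx in range(samples):
                    py = y + (by + (sy + 0.5) / samples) * bin_h
                    px = x + (bx + (sx + 0.5) / samples) * bin_w
                    values.append(bilinear(feature, py, px))
            row.append(sum(values) / len(values))
        pooled.append(row)
    return pooled


# 16x16 toy map with value = 10 * row + column, and the unquantized RoI
# obtained by dividing (192, 226, 145, 200) by the stride of 32.
feature = [[10.0 * r + c for c in range(16)] for r in range(16)]
pooled = roi_align(feature, y=6.0, x=7.0625, h=4.53125, w=6.25)
```

Since the bins are defined on the fractional RoI, every part of the original region contributes to the output and nothing outside it is read.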

Reference

  • https://www.youtube.com/watch?v=RtSZALC9DlU
  • https://velog.io/@imfromk/CV-Understanding-RoIsRegion-of-Interest
  • https://kimdoing.medium.com/mask-r-cnn-649f78083547
  • https://kimdoing.medium.com/fully-convolutional-networks-for-semantic-segmentation-4ca085ccb1b1
  • https://wikidocs.net/148635
  • https://panggu15.github.io/detection/Mask-R-CNN/
