How to Sort Shuffled Bounding Boxes into Text Reading Order


방황하는 데이터불도저 2023. 2. 6. 20:02

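These are notes on handling bounding boxes, something I run into often while studying computer vision.

The term "bounding box" comes from drawing a box around the region of an object you want to detect in an image.

The data usually comes as coordinates in the form (startX, startY, endX, endY).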
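As a small illustration of that format, the sketch below unpacks a single box and derives its size. The helper name `box_size` and the sample values are made up for this example and are not part of any library.

```python
def box_size(box):
    """Return (width, height) of a (startX, startY, endX, endY) box."""
    start_x, start_y, end_x, end_y = box
    return end_x - start_x, end_y - start_y


# Hypothetical box: top-left corner at (95, 141), bottom-right corner at (167, 163).
sample_box = (95, 141, 167, 163)
width, height = box_size(sample_box)
```

Note that the y axis points down in image coordinates, so a smaller startY means the box sits higher in the image.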

I used the EAST detector to extract bounding boxes for the text regions in an image.

The EAST detector returns a list of coordinates for every region of the image that it recognizes as text,

but the order of that list turned out to be thoroughly shuffled.

(Example image)

# bounding boxes list

[array([ 95, 141, 167, 163]),
 array([335,  11, 392,  35]),
 array([274,  11, 331,  38]),
 array([ 97, 114, 180, 139]),
 array([ 95,  88, 144, 111]),
 array([396,  15, 469,  36]),
 array([166, 114, 229, 138]),
 array([492,  13, 565,  37]),
 array([ 95, 166, 180, 190]),
 array([146,  91, 235, 115]),
 array([498, 582, 548, 601]),
 array([547, 586, 598, 602]),
 array([ 90, 194, 236, 251])]

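I planned to extract the text from the image and feed it into a machine translation model,

so the order of the extracted bounding boxes mattered a great deal for my work.

After looking into several approaches, I ended up using the code below.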

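def get_box_precedence(box, cols):
    # Round startY down to a multiple of tolerance_factor so that boxes in the
    # same horizontal band share a row value, then break ties with startX.
    tolerance_factor = 10
    return ((box[1] // tolerance_factor) * tolerance_factor) * cols + box[0]


def get_sorted_bb(boxes):
    # `image` is the input image loaded earlier (a NumPy array);
    # image.shape[1] is its width in pixels.
    sorted_boxes = sorted(boxes, key=lambda x: get_box_precedence(x, image.shape[1]))
    return sorted_boxes


sorted_boxes = get_sorted_bb(boxes)
sorted_boxes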

# output
[array([274,  11, 331,  38]), 
 array([335,  11, 392,  35]), 
 array([396,  15, 469,  36]), 
 array([492,  13, 565,  37]), 
 array([ 95,  88, 144, 111]), 
 array([146,  91, 235, 115]), 
 array([ 97, 114, 180, 139]), 
 array([166, 114, 229, 138]), 
 array([ 95, 141, 167, 163]), 
 array([ 95, 166, 180, 190]), 
 array([ 90, 194, 236, 251]), 
 array([498, 582, 548, 601]), 
 array([547, 586, 598, 602])]

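▼ Parameters and variables

[get_box_precedence function]

- box : a single bounding box as an array / tuple / list of the form (startX, startY, endX, endY)

* box[0] = startX (x coordinate), box[1] = startY (y coordinate)

- tolerance_factor, cols : values used to compute a sort key so that bounding boxes that look like they are on the same row (same y band) are sorted the way they appear. cols is the image width in pixels (image.shape[1]).

ex. In a 500x500 image, text at (24, 46) and text at (60, 45) appear to be on the same line, but sorting strictly by y and then by x (top to bottom, then left to right) puts (60, 45) first. So points whose y value falls in the range 40 to 49 are treated as one line (tolerance_factor = 10), and within that line they are sorted by x value.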
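The example above can be checked with a few lines of code. This is only a sketch: the key function is the same formula with `tolerance_factor` exposed as a parameter, and the endX/endY values of the two boxes are invented, since only startX and startY affect the key.

```python
def get_box_precedence(box, cols, tolerance_factor=10):
    # Same formula as in the post, with the tolerance passed in as an argument.
    return ((box[1] // tolerance_factor) * tolerance_factor) * cols + box[0]


cols = 500                 # image width from the example
a = (24, 46, 54, 60)       # starts at (24, 46); end coordinates are hypothetical
b = (60, 45, 90, 59)       # starts at (60, 45); end coordinates are hypothetical

# Plain (y, x) ordering: b wins because 45 < 46, even though it is to the right.
naive = sorted([a, b], key=lambda box: (box[1], box[0]))

# Banded ordering: both y values round down to 40, so x decides the order.
banded = sorted([a, b], key=lambda box: get_box_precedence(box, cols))

key_a = get_box_precedence(a, cols)   # 40 * 500 + 24
key_b = get_box_precedence(b, cols)   # 40 * 500 + 60
```

Multiplying the band value by `cols` keeps boxes from different bands apart, because startX is always smaller than the image width.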

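[get_sorted_bb function]

- boxes : a list containing multiple bounding boxes

- sorted_boxes : the bounding box list re-sorted using get_box_precedence, defined above, as the sort key

* For an explanation of the sorted() function, see → https://kyull-it.tistory.com/83

The function below draws each box's index on the image so you can check that the list is sorted correctly.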

import numpy as np
from PIL import Image, ImageDraw, ImageFont


def putText(cv_img, text, x, y, color=(0, 255, 0), font_size=3):
    # Draw with PIL so that non-ASCII (e.g. Korean) text renders.
    # The font path below must exist on your system.
    font = ImageFont.truetype('/usr/share/fonts/truetype/nanum/NanumGothicBold.ttf', font_size)    # Korean
    img = Image.fromarray(cv_img)

    draw = ImageDraw.Draw(img)
    draw.text((x, y), text, font=font, fill=color)

    cv_img = np.array(img)

    return cv_img


textimage = image.copy()

# Label each box with its position in the sorted list.
for i, box in enumerate(sorted_boxes):
    textimage = putText(textimage, str(i), int(box[0]) - 5, int(box[1]) + 10,
                        color=(0, 255, 0), font_size=20)

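**** However, there is a catch.

For example, take the two points (24, 40) and (60, 39). (24, 40) should come first, but the code above sorts (60, 39) first, because 40 falls in the 40 to 49 band while 39 falls in the 30 to 39 band.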
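To see this boundary effect concretely, the sketch below applies the same key to two boxes starting at those points. The image width of 500 and the end coordinates of the boxes are assumptions chosen for illustration.

```python
def get_box_precedence(box, cols, tolerance_factor=10):
    # Same formula as in the post, with the tolerance passed in as an argument.
    return ((box[1] // tolerance_factor) * tolerance_factor) * cols + box[0]


cols = 500                    # hypothetical image width
left = (24, 40, 54, 54)       # visually first; startY = 40 rounds down to 40
right = (60, 39, 90, 53)      # visually second; startY = 39 rounds down to 30

ordered = sorted([left, right], key=lambda box: get_box_precedence(box, cols))
# `right` lands in an earlier band, so it is placed before `left`.
```

A one-pixel difference in startY is enough to split the two boxes into different bands whenever they straddle a multiple of the tolerance.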

For a typical large, high-resolution image the code above is probably fine, but for a very small or finely detailed image, boxes can end up in the wrong order. I still need to look for code that handles these cases.
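One possible direction, offered only as a sketch and not as the method from this post or its references, is to group boxes into rows by their vertical position instead of using fixed bands. The function name `sort_reading_order` and its row rule are assumptions: a box joins the current row when its vertical center lies inside the vertical span of that row's topmost box.

```python
def sort_reading_order(boxes):
    """Sort (startX, startY, endX, endY) boxes top to bottom, then left to right.

    Assumed row rule: a box belongs to the current row when its vertical
    center lies within [startY, endY] of the row's first (topmost) box.
    """
    rows = []
    for box in sorted(boxes, key=lambda b: b[1]):      # scan from top to bottom
        center_y = (box[1] + box[3]) / 2
        if rows and rows[-1][0][1] <= center_y <= rows[-1][0][3]:
            rows[-1].append(box)                        # same visual line
        else:
            rows.append([box])                          # start a new line
    # Rows are already ordered top to bottom; order each row by startX.
    return [box for row in rows for box in sorted(row, key=lambda b: b[0])]


# Hypothetical boxes: the first two sit on one line, the third on the next line.
boxes = [(24, 40, 54, 54), (60, 39, 90, 53), (30, 80, 70, 95)]
result = sort_reading_order(boxes)
```

Because rows are formed from the boxes themselves, there is no fixed boundary at multiples of 10 for neighbouring boxes to straddle. Skewed text or multi-column layouts would likely still need a different approach.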

References

1) https://github.com/clovaai/CRAFT-pytorch/issues/43

Bounding box is shuffled. How to sort it? · Issue #43 · clovaai/CRAFT-pytorch

2) https://github.com/clovaai/CRAFT-pytorch/issues/52

Need help -- Have you sorted boundry boxes? · Issue #52 · clovaai/CRAFT-pytorch

3) https://stackoverflow.com/questions/39403183/python-opencv-sorting-contours

Python opencv sorting contours

4) https://stackoverflow.com/questions/58903071/i-want-to-sort-the-words-extracted-from-image-in-order-of-their-occurence-using%EF%BB%BF

I want to sort the words extracted from image in order of their occurence using contours detection

5) https://stackoverflow.com/questions/68220867/opencv-contour-sorting

Opencv contour sorting

6) https://stackoverflow.com/questions/66946804/python-sorting-items-from-top-left-to-bottom-right-with-opencv/67008153#67008153

Python: Sorting items from top left to bottom right with OpenCV
