[Coursera] Self-Driving Car, Visual Perception (Assignment 1)


dhpark 2022. 3. 9. 13:53

Source: https://www.coursera.org/learn/visual-perception-self-driving-cars/home/week/1


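1. Prerequisites

- Learn how to use the OpenCV functions that compute a disparity map

- Learn how to use the OpenCV function that decomposes a camera matrix

StereoBM: computes stereo correspondences using the block matching algorithm.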

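// CUDA variant; the CPU cv::StereoMatcher::compute takes the same arguments without the stream
virtual void cv::cuda::StereoBM::compute
(InputArray left,
 InputArray right,
 OutputArray disparity,
 Stream & stream
)

static Ptr<StereoBM> cv::StereoBM::create
(int numDisparities = 0,  // Disparity search range
 int blockSize = 21       // Linear size of the blocks compared by the algorithm (must be odd).
                          // Larger blocks give a smoother but less accurate disparity map;
)                         // smaller blocks give more detail but are more likely to find wrong correspondences.

Python:
	retval = cv.StereoBM_create([, numDisparities[, blockSize]])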

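StereoSGBM: a semi-global block matching algorithm; compared with StereoBM it adds further pre- and post-processing steps.
decomposeProjectionMatrix(): decomposes a camera (projection) matrix into a calibration matrix, a rotation matrix and the camera position, using RQ decomposition.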
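
To make the block matching idea behind StereoBM concrete, here is a minimal brute-force sketch in NumPy. It is not OpenCV's implementation: the function name naive_block_matching and its parameters are hypothetical, it uses a plain sum of absolute differences (SAD) as the cost, and it assumes rectified grayscale images, so that a pixel at column x in the left image appears at column x - d in the right image.

```python
import numpy as np

def naive_block_matching(left, right, num_disparities=8, block_size=3):
    """Brute-force SAD block matching (illustrative only, not OpenCV's StereoBM)."""
    half = block_size // 2
    h, w = left.shape
    left = left.astype(np.float32)
    right = right.astype(np.float32)
    disparity = np.zeros((h, w), dtype=np.float32)

    for y in range(half, h - half):
        # Start far enough from the left border that every candidate shift stays inside the image
        for x in range(half + num_disparities - 1, w - half):
            block = left[y - half:y + half + 1, x - half:x + half + 1]
            # SAD cost of each candidate disparity d (block centered at x - d in the right image)
            costs = [
                np.abs(block - right[y - half:y + half + 1,
                                     x - d - half:x - d + half + 1]).sum()
                for d in range(num_disparities)
            ]
            # Keep the disparity with the lowest matching cost
            disparity[y, x] = np.argmin(costs)
    return disparity
```

Pixels near the border are left at 0 here because no full block or search range fits there. The loops are far too slow for real images, which is one reason to use the OpenCV matchers in practice.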

2. Running the Code

Two stereo images are given, and the camera parameters (projection matrices) of each image are as follows.

p_left 
 [[ 640.     0.   640.  2176. ]
 [   0.   480.   480.   552. ]
 [   0.     0.     1.     1.4]]

p_right 
 [[ 640.     0.   640.  2176. ]
 [   0.   480.   480.   792. ]
 [   0.     0.     1.     1.4]]

2.1 Computing the Disparity

import cv2
import numpy as np

def compute_left_disparity_map(img_left, img_right):
    # Parameters
    num_disparities = 6*16
    block_size = 11
    
    min_disparity = 0
    window_size = 6
    
    img_left = cv2.cvtColor(img_left, cv2.COLOR_BGR2GRAY)
    img_right = cv2.cvtColor(img_right, cv2.COLOR_BGR2GRAY)
    
    # Stereo BM matcher (alternative; not used for the result returned below)
    left_matcher_BM = cv2.StereoBM_create(
        numDisparities=num_disparities,
        blockSize=block_size
    )

    # Stereo SGBM matcher
    left_matcher_SGBM = cv2.StereoSGBM_create(
        minDisparity=min_disparity,
        numDisparities=num_disparities,
        blockSize=block_size,
        P1=8 * 3 * window_size ** 2,
        P2=32 * 3 * window_size ** 2,
        mode=cv2.STEREO_SGBM_MODE_SGBM_3WAY
    )

    # Compute the left disparity map (SGBM returns fixed-point values scaled by 16, hence the division)
    disp_left = left_matcher_SGBM.compute(img_left, img_right).astype(np.float32)/16
    
    return disp_left
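
The division by 16 above converts the matcher's fixed-point output to pixel units. The following sketch illustrates that conversion on a hand-made array. The helper name to_float_disparity is hypothetical, and it assumes that pixels without a valid match are reported as minDisparity - 1 before scaling, which is where the -1 values handled later in the depth computation come from.

```python
import numpy as np

def to_float_disparity(raw_disparity, min_disparity=0):
    """Convert 16-bit fixed-point disparities (4 fractional bits) to float pixels."""
    disp = raw_disparity.astype(np.float32) / 16.0
    # Assumption: anything below min_disparity marks a pixel with no valid match
    valid = disp >= min_disparity
    return disp, valid

# Hand-made raw values: an invalid pixel, zero disparity, 1.5 px and 64 px
raw = np.array([[-16, 0, 24, 1024]], dtype=np.int16)
disp, valid = to_float_disparity(raw)
```

Keeping the validity mask separate makes it possible to exclude unmatched pixels later instead of assigning them an arbitrary depth.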

2.2 Decomposing the Camera Matrix

def decompose_projection_matrix(p):
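    # Returns the calibration matrix k, the rotation matrix r and a homogeneous 4x1 vector t
    k, r, t, _, _, _, _ = cv2.decomposeProjectionMatrix(p)
    # Normalize t so that its last (homogeneous) component is 1
    t = t / t[3]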
    return k, r, t
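
As a cross-check that does not depend on OpenCV, the camera position can also be recovered directly from a projection matrix P = [M | p4], because the camera center C satisfies P [C; 1] = 0, i.e. C = -M^{-1} p4. The sketch below applies this to the two matrices given above. The helper name camera_center is hypothetical, and the comparison with decompose_projection_matrix assumes that the normalized t it returns is this camera center.

```python
import numpy as np

def camera_center(p):
    """Return the 3D camera center C that satisfies P @ [C, 1] = 0."""
    m, p4 = p[:, :3], p[:, 3]
    return -np.linalg.solve(m, p4)

p_left = np.array([[640.0, 0.0, 640.0, 2176.0],
                   [0.0, 480.0, 480.0, 552.0],
                   [0.0, 0.0, 1.0, 1.4]])
p_right = np.array([[640.0, 0.0, 640.0, 2176.0],
                    [0.0, 480.0, 480.0, 792.0],
                    [0.0, 0.0, 1.0, 1.4]])

c_left = camera_center(p_left)
c_right = camera_center(p_right)

# The two matrices differ only in the second row, so the cameras are offset along index 1
baseline = c_left[1] - c_right[1]

# The left 3x3 block is already upper triangular here, so its top-left entry is the focal length
focal_length = p_left[0, 0]
```

For these matrices the two centers differ only in their second component, giving a baseline of 0.5 in the units of the projection matrices, with a focal length of 640 pixels.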

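2.3 Generating the Depth Map

- Decomposing the camera matrix gave us the focal length (f), and the stereo images together with the OpenCV functions gave us the disparity map.

- The baseline (b) is obtained from the translation vectors of the left and right cameras.

- The depth map (Z) is then computed from the formula below and the quantities obtained above.

※ Because the disparity appears in the denominator, elements that are 0 or negative are replaced with a small minimum value (e.g. 0.1) to avoid dividing by zero or by a negative number.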

$$ Z = \frac{fb}{X_L - X_R} $$

def calc_depth_map(disp_left, k_left, t_left, t_right):
    # Get the focal length from the K matrix
    f = k_left[0, 0]

    # Get the distance between the cameras from the t matrices (baseline)
    b = t_left[1] - t_right[1]

    # Replace all instances of 0 and -1 disparity with a small minimum value (to avoid div by 0 or negatives)
    disp_left[disp_left == 0] = 0.1
    disp_left[disp_left == -1] = 0.1

    # Initialize the depth map to match the size of the disparity map
    depth_map = np.ones(disp_left.shape, np.single)

    # Calculate the depths
    depth_map[:] = f * b / disp_left[:]
    
    return depth_map
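
The function above modifies disp_left in place and only replaces the exact values 0 and -1. As a sketch of an alternative (not the assignment's reference solution), the variant below leaves its input untouched and clamps every non-positive disparity. The name depth_from_disparity and the min_disparity parameter are hypothetical.

```python
import numpy as np

def depth_from_disparity(disparity, focal_length, baseline, min_disparity=0.1):
    """Compute Z = f * b / d without modifying the input disparity map."""
    disparity = np.asarray(disparity, dtype=np.float32)
    # Replace zero or negative disparities with a small positive value
    safe = np.where(disparity <= 0, np.float32(min_disparity), disparity)
    return focal_length * baseline / safe

# Small worked example with f = 640 px and b = 0.5
disp_example = np.array([[64.0, 0.0],
                         [-1.0, 32.0]], dtype=np.float32)
depth_example = depth_from_disparity(disp_example, focal_length=640.0, baseline=0.5)
```

With these numbers a disparity of 64 pixels maps to a depth of 5 and 32 pixels to a depth of 10, while the clamped entries end up at 3200, which is effectively "very far away".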

2.4 Application - Computing the Distance to an Object

- OpenCV's cv2.matchTemplate() function computes a cross-correlation map.

The cross-correlation result is used to find the position of the object.

def locate_obstacle_in_image(image, obstacle_image):
    # Run the template matching from OpenCV
    cross_corr_map = cv2.matchTemplate(image, obstacle_image, method=cv2.TM_CCOEFF)
    
    # Locate the position of the obstacle using the minMaxLoc function from OpenCV
    _, _, _, obstacle_location = cv2.minMaxLoc(cross_corr_map)

    return cross_corr_map, obstacle_location

In this way, the template matching algorithm locates the object in the image.
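
To show what the cross-correlation map contains, here is a naive NumPy sketch of a mean-subtracted cross-correlation in the spirit of cv2.TM_CCOEFF. It is an illustration for single-channel images, not OpenCV's implementation, and the names match_template_ccoeff and locate_max are hypothetical.

```python
import numpy as np

def match_template_ccoeff(image, template):
    """Mean-subtracted cross-correlation of a template against every image window."""
    image = np.asarray(image, dtype=np.float64)
    template = np.asarray(template, dtype=np.float64)
    th, tw = template.shape
    ih, iw = image.shape
    t_centered = template - template.mean()
    scores = np.empty((ih - th + 1, iw - tw + 1))
    for y in range(scores.shape[0]):
        for x in range(scores.shape[1]):
            window = image[y:y + th, x:x + tw]
            scores[y, x] = np.sum(t_centered * (window - window.mean()))
    return scores

def locate_max(scores):
    """Return the best match as (x, y), the ordering used by cv2.minMaxLoc."""
    row, col = np.unravel_index(np.argmax(scores), scores.shape)
    return int(col), int(row)

# Toy example: a 2x2 pattern placed at row 3, column 5 of an otherwise empty image
toy_template = np.array([[1.0, 2.0],
                         [3.0, 4.0]])
toy_image = np.zeros((8, 10))
toy_image[3:5, 5:7] = toy_template
toy_scores = match_template_ccoeff(toy_image, toy_template)
toy_location = locate_max(toy_scores)
```

Note that the location is returned as (x, y), i.e. (column, row), which matters when it is later used to index a depth map stored as rows by columns.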

2.5 Application - Extracting the Bounding Box

from matplotlib import patches

def calculate_nearest_point(depth_map, obstacle_location, obstacle_img):
    # obstacle_location is the (x, y) top-left corner returned by cv2.minMaxLoc;
    # image shapes are (rows, columns), i.e. (height, width)
    obstacle_height, obstacle_width = obstacle_img.shape[:2]
    obstacle_min_x_pos = obstacle_location[0]
    obstacle_max_x_pos = obstacle_location[0] + obstacle_width
    obstacle_min_y_pos = obstacle_location[1]
    obstacle_max_y_pos = obstacle_location[1] + obstacle_height

    # Get the depth of the pixels within the bounds of the obstacle image (rows = y, columns = x),
    # then find the closest point in this rectangle
    obstacle_depth = depth_map[obstacle_min_y_pos:obstacle_max_y_pos, obstacle_min_x_pos:obstacle_max_x_pos]
    closest_point_depth = obstacle_depth.min()

    # Create the obstacle bounding box (Rectangle takes the (x, y) corner, then width and height)
    obstacle_bbox = patches.Rectangle((obstacle_min_x_pos, obstacle_min_y_pos), obstacle_width, obstacle_height,
                                 linewidth=1, edgecolor='r', facecolor='none')
    
    return closest_point_depth, obstacle_bbox
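
The core of this step, cropping the depth map to the obstacle's box and taking the minimum, can be checked on a small synthetic depth map without any plotting. The sketch below also reports where the nearest pixel lies. The function name nearest_point_in_box and the toy values are made up for illustration.

```python
import numpy as np

def nearest_point_in_box(depth_map, top_left_xy, box_shape):
    """Return the smallest depth inside a box and its (x, y) pixel position."""
    x0, y0 = top_left_xy
    box_h, box_w = box_shape[:2]
    # Rows are indexed by y and columns by x
    region = depth_map[y0:y0 + box_h, x0:x0 + box_w]
    row, col = np.unravel_index(np.argmin(region), region.shape)
    return float(region[row, col]), (x0 + int(col), y0 + int(row))

# Synthetic depth map: background at 50, an obstacle at 12 with one closer pixel at 7.5
toy_depth = np.full((6, 8), 50.0)
toy_depth[2:4, 3:6] = 12.0
toy_depth[3, 4] = 7.5
toy_depth[0, 0] = 1.0  # closer, but outside the box, so it must be ignored

closest_depth, closest_xy = nearest_point_in_box(toy_depth, (3, 2), (2, 3))
```

The pixel at the top-left of the image is closer than anything in the box, but it is ignored because only the cropped region is searched.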

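3. Conclusion

- In this experiment, a disparity map was computed from a pair of stereo images using OpenCV's block-matching-based functions (here StereoSGBM).

- The given camera matrices P were decomposed to obtain the baseline and the focal length, and the depth map was computed from the stereo triangulation equation.

- As an extra, the template matching algorithm was used to locate an object, and the distance to it was extracted from the depth map.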
