Computer Vision

Lectures: on Tuesdays, 8:15-9:45, lecture hall 0-805 ‘Fejér Lipót’

Practices:

  • On Tuesdays, 14:15-15:45, South Building 2-219 (Graphics lab — Grafika labor), or
  • On Fridays, 8:30-10:00, North Building 7.89-90 (Bosch lab), or
  • On Fridays, 12:15-13:45, North Building 7.89-90 (Bosch lab)

Teachers:

Dates for Assignment Presentation (MS Teams)

  • 19th December, 14:00 MS Teams
  • 2nd January, 14:00 MS Teams
  • 9th January, 14:00 MS Teams
  • 22nd January, 10:00 MS Teams
  • 30th January, 14:00 MS Teams

Dates for Oral Exams

Location: ELTE South Building, room 0-222

  • 20th December, 10:00 (in person)
  • 3rd January, 10:00 (in person)
  • 10th January, 10:00 (in person)
  • 22nd January, 14:00 (in person)
  • 31st January, 10:00 (in person)

Timetable of the semester

Week             | Lecture                                                          | Practice
1st (12th Sept)  | Introduction                                                     | Demonstrations: ELTECar and ELTEKart
2nd (19th Sept)  | Over-determined inhomogeneous linear systems of equations        | GUI for OpenCV
3rd (26th Sept)  | Lagrange multiplier, homogeneous linear systems, plane/line fitting | Affine Transformations
4th (3rd Oct)    | Random Sample Consensus (RANSAC)                                 | Robust Fitting by RANSAC
5th (10th Oct)   | Optimal line/plane fitting, Singular Value Decomposition         | Multi-Model Fitting
6th (17th Oct)   | Camera models, projection models                                 | Point Cloud Visualization
7th (24th Oct)   | Introduction to homographies, homography estimation              | Tomasi-Kanade factorization
Fall break
8th (7th Nov)    | Camera calibration                                               | Sphere and cylinder detection
9th (14th Nov)   | Introduction to Stereo Vision                                    | Homography estimation
10th (21st Nov)  | Fundamental matrix estimation; triangulation for standard stereo | Camera Calibration
11th (28th Nov)  | Triangulation, essential matrix decomposition                    | Assignment presentations
12th (5th Dec)   | Planar motion, point registration                                | Full stereo reconstruction
13th (12th Dec)  | Point registration, Bundle Adjustment                            | Point Registration

Course materials:

Presentations:

Materials for practices:

Videos for the labs can be downloaded from here.

The latest version of OpenCV can be downloaded from the project’s website or from our page.

Questions for the oral exam:

  1. Basic estimation theory: solutions of homogeneous and inhomogeneous linear systems of equations.
  2. Robust fitting methods: RANSAC, LO-RANSAC.
  3. Multi-model fitting: sequential RANSAC, MultiRANSAC
  4. Camera models: perspective camera, orthogonal projection, weak-perspective camera.
  5. Calibration of perspective camera using a spatial calibration object: Estimation and decomposition of projection matrix.
  6. Chessboard-based calibration of perspective cameras. Radial and tangential distortion.
  7. Plane-plane homography. Panoramic images by homographies.
  8. Estimation of homography. Data normalization.
  9. Basics of epipolar geometry. Essential and fundamental matrices. Rectification.
  10. Estimation of essential and fundamental matrices. Data normalization. Decomposition of essential matrix.
  11. Triangulation for both standard and general stereo vision.
  12. Stereo vision for planar motion.
  13. Tomasi-Kanade factorization: multi-view reconstruction by orthogonal and weak-perspective camera models.
  14. Reconstruction by merging stereo reconstructions. Registration of two point sets by a similarity transformation.
  15. Numerical multi-view reconstruction: bundle adjustment.
  16. Tomasi-Kanade factorization with missing data.
  17. Reconstruction by special devices: laser scanning, depth camera, LiDAR
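As a minimal illustration of question 1 (toy data only, not from the course materials): an inhomogeneous system A x = b is solved in the least-squares sense, while a homogeneous system M x = 0 with ||x|| = 1 takes the right singular vector belonging to the smallest singular value.

```python
import numpy as np

# Toy over-determined system: three equations, two unknowns.
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([2.0, 3.0, 4.0])

# Inhomogeneous case A x = b: least-squares solution.
x_inhom, *_ = np.linalg.lstsq(A, b, rcond=None)

# Homogeneous case M x = 0, ||x|| = 1: the right singular vector
# belonging to the smallest singular value of M.
M = np.column_stack([A, -b])
_, _, Vt = np.linalg.svd(M)
x_hom = Vt[-1]
```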

Assignments (Obsolete — 2022 fall)

1. Point Cloud Colouring (30%)

A LiDAR device and a digital camera are given. They are calibrated to each other. The aim of this assignment is to merge information from different sensors. 

LiDAR gives 3D coordinates without colours. The colours can be added from the camera images.  

The transformation from LiDAR to camera coordinates is written as p_c = R^T * p_l, where R^T is the transpose of R, and p_c and p_l are the 3D coordinates in the camera and LiDAR frames, respectively. The intrinsic and extrinsic parameters of all cameras, with respect to the LiDAR, are given in a text file.

Colour the LiDAR points that are visible in the camera images. If a point is behind all cameras, its colour should be black. If a point is visible in more than one image, use the average of the colours.

There are three different scenes. Save the coloured point clouds into PLY files.
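The projection for a single camera can be sketched as below. This is only a sketch: the function name and signature are our own, the rotation-only formula p_c = R^T * p_l follows the text, and in practice the full extrinsics read from the provided parameter file (including translation) should be applied.

```python
import numpy as np

def color_lidar_points(points_l, R, K, image):
    """Colour LiDAR points from one camera image.

    points_l : (N, 3) LiDAR coordinates
    R        : (3, 3) rotation, so that p_c = R^T @ p_l
    K        : (3, 3) intrinsic camera matrix
    image    : (H, W, 3) colour image
    Returns (N, 3) colours; points behind the camera or projecting
    outside the image stay black.
    """
    h, w = image.shape[:2]
    colors = np.zeros((len(points_l), 3))
    p_c = points_l @ R                 # row-wise (R^T @ p_l)^T = p_l^T @ R
    in_front = p_c[:, 2] > 0           # keep only points in front of the camera
    uvw = p_c[in_front] @ K.T          # pinhole projection
    uv = np.round(uvw[:, :2] / uvw[:, 2:3]).astype(int)
    inside = (uv[:, 0] >= 0) & (uv[:, 0] < w) & (uv[:, 1] >= 0) & (uv[:, 1] < h)
    idx = np.flatnonzero(in_front)[inside]
    colors[idx] = image[uv[inside, 1], uv[inside, 0]]
    return colors
```

With several cameras, accumulate per-point colour sums and visibility counts across the images, then divide to obtain the required average colour.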

Upload the resulting files and your implementation to Canvas in a zip file. 

https://cg.inf.elte.hu/~hajder/vision/assignments/First_PointCloudColorization/

2. Inverse Perspective Mapping (40%)

Compute a bird’s-eye-view image by Inverse Perspective Mapping (IPM).

Four cameras and a LiDAR are mounted on the ELTECar. The documentation of the sensor kit is available at http://cv.inf.elte.hu/wp-content/uploads/2022/10/sensor-pack-summary.docx. Only the two frontal cameras should be processed; the other cameras with fisheye lenses and the LiDAR should be omitted.

We put markers in front of the car; the recorded images containing the markers can be seen on our webpage. The markers form a 3×2 grid with a regular edge size of 3 metres.

Your task is threefold:  

Upload at least ten frames of the resulting video and the source code to CANVAS. Save and store the final videos, as you will have to show them in the final presentation, but please do not upload the videos themselves to CANVAS.

Note that you can also download the data files from our alternative webpage.

3. Planar Motion (30%)

Implement the special stereo algorithm for planar motion, described at the end of the second presentation. Implement a RANSAC framework in which two point correspondences are required to obtain the initial (minimal) model, and put the planar stereo algorithm into your own RANSAC framework. Apply symmetric point-to-epipolar-line distances to separate inliers from outliers.
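The symmetric point-to-epipolar-line distance used for the inlier test can be sketched in NumPy as follows (the function name is ours; F is the fundamental matrix of the current model hypothesis):

```python
import numpy as np

def symmetric_epipolar_distance(F, x1, x2):
    """Symmetric point-to-epipolar-line distance for inlier selection.

    F      : (3, 3) fundamental matrix
    x1, x2 : (N, 2) matched points in the first and second images
    Returns (N,) distances; threshold them inside the RANSAC loop to
    split inliers from outliers.
    """
    h1 = np.hstack([x1, np.ones((len(x1), 1))])   # homogeneous coordinates
    h2 = np.hstack([x2, np.ones((len(x2), 1))])
    l2 = h1 @ F.T    # epipolar lines in image 2: rows are (F @ x1)^T
    l1 = h2 @ F      # epipolar lines in image 1: rows are (F^T @ x2)^T
    d2 = np.abs(np.sum(h2 * l2, axis=1)) / np.hypot(l2[:, 0], l2[:, 1])
    d1 = np.abs(np.sum(h1 * l1, axis=1)) / np.hypot(l1[:, 0], l1[:, 1])
    return d1 + d2
```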

Input image pairs can be downloaded from here. Apply any kind of feature detector and matcher, such as ASIFT, to obtain point pairs. Run the RANSAC + planar method on the five image pairs and estimate the dominant motion parameters. Write the extrinsic parameters to a file.

Upload the implementation and the parameter files to CANVAS. At least two good results are required for the maximal score.

The cameras are calibrated, the intrinsic camera parameters are as follows: 

DEV1 Intrinsics: 
    – fx = 1280.7 
    – fy = 1281.2 
    – [u0, v0] = [969.4257, 639.7227] 
 
    K = | 1280.7     0.0  969.4257 |
        |    0.0  1281.2  639.7227 |
        |    0.0     0.0       1.0 |
 
DEV2 Intrinsics: 
    – fx = 1276.1 
    – fy = 1275.4 
    – [u0, v0] = [965.9650, 618.2222] 
 
    K = | 1276.1     0.0  965.9650 |
        |    0.0  1275.4  618.2222 |
        |    0.0     0.0       1.0 |
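For reference, the two intrinsic matrices can be entered directly as NumPy arrays. The helper below is our own addition, not part of the assignment text: it applies K^{-1} to move pixel coordinates into normalized camera coordinates, as needed before essential-matrix estimation.

```python
import numpy as np

K1 = np.array([[1280.7,    0.0, 969.4257],
               [   0.0, 1281.2, 639.7227],
               [   0.0,    0.0,      1.0]])

K2 = np.array([[1276.1,    0.0, 965.9650],
               [   0.0, 1275.4, 618.2222],
               [   0.0,    0.0,      1.0]])

def normalize(K, pts):
    # Apply K^{-1} to (N, 2) pixel coordinates, yielding normalized
    # camera coordinates.
    h = np.hstack([pts, np.ones((len(pts), 1))])
    n = h @ np.linalg.inv(K).T
    return n[:, :2]
```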
 

Final grade

The final grade is the sum of the oral exam score (max. 100%) and the assignment scores.

The thresholds for the marks are as follows:

  • Excellent (5): >=170%
  • Good (4): 140-169%
  • Satisfactory (3): 110-139%
  • Pass (2): 80-109%
  • Fail (1): 0-79%