Computer Vision (Spring 2021)


Administrative Matters

Instructor: Lin ZHANG (张林),

TA: Yang CHEN, Email:

Evaluation: homework (3 assignments) 30%, final project 30%, final written exam 40%; extra bonus 5%.


Latest Notices

1. Scores and reference solutions for Assignment 1 are now available.

2. Assignment 2 is available now (Due: May 31, 2021)

3. Assignment 1 is available now (Due: Apr. 18, 2021)

4. Course website is online! (Feb. 19, 2021)


Lecture Slides


Reading Materials

Introduction to Computer Vision

Handout 01

1. Computer Vision, Wiki Page,

2. Erlangen Program,

Local Interest Point Detectors

Handout 02

1. HarrisCornerDetector: this program implements the Harris corner detector and generates an example for "corner detection" mentioned in our lecture.

2. C. Harris and M. Stephens, A combined corner and edge detector, 1988

3. D.G. Lowe, Distinctive image features from scale-invariant keypoints, IJCV, 2004
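As a rough companion to the HarrisCornerDetector program listed above, here is a minimal pure-Python sketch of the Harris response R = det(M) - k * trace(M)^2. It is not the course program: the function name `harris_response`, the box window (instead of a Gaussian one), the central-difference gradients, and the value k = 0.04 are all assumptions made for illustration.

```python
def harris_response(img, k=0.04, radius=1):
    """Return the Harris response R = det(M) - k * trace(M)^2 per pixel.

    img is a list of rows of grey levels. Pixels too close to the
    border to hold a full window keep a response of 0.
    """
    h, w = len(img), len(img[0])
    # Central-difference gradients (left at zero on the one-pixel border).
    ix = [[0.0] * w for _ in range(h)]
    iy = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ix[y][x] = (img[y][x + 1] - img[y][x - 1]) / 2.0
            iy[y][x] = (img[y + 1][x] - img[y - 1][x]) / 2.0

    resp = [[0.0] * w for _ in range(h)]
    for y in range(radius, h - radius):
        for x in range(radius, w - radius):
            # Entries of the structure tensor M, summed over a box window.
            sxx = sxy = syy = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    gx = ix[y + dy][x + dx]
                    gy = iy[y + dy][x + dx]
                    sxx += gx * gx
                    sxy += gx * gy
                    syy += gy * gy
            det = sxx * syy - sxy * sxy
            trace = sxx + syy
            resp[y][x] = det - k * trace * trace
    return resp


# Toy image: a bright 8x8 square on a dark 16x16 background.
image = [[1.0 if 4 <= y < 12 and 4 <= x < 12 else 0.0 for x in range(16)]
         for y in range(16)]
R = harris_response(image)
# R[4][4] lies on a corner of the square, R[8][4] on an edge,
# and R[8][8] inside the flat interior.
```

On this toy image the response is positive at the corner, negative on the edge, and zero in the flat interior, which is the behaviour used to separate corners from edges and flat regions.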

Local Feature Descriptors and Matching

Handout 03

1. SIFT Implementation (Matlab)

2. SIFT Implementation (C++)

3. HomographyEstimation: this program estimates the homography matrix between two images from matched local interest points and their descriptors. The homography is fitted with the RANSAC algorithm.
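The idea behind RANSAC-based homography fitting can be sketched in a few lines of standard-library Python. This is not the HomographyEstimation program itself: the function names, the choice to fix h33 = 1 and solve an 8x8 linear system from four matches, the inlier tolerance, and the synthetic matches are all assumptions made for illustration. A real pipeline would take its matches from descriptors such as SIFT.

```python
import random


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return None  # degenerate sample, e.g. collinear points
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def homography_from_4(pairs):
    """Fit H (with h33 fixed to 1) from four ((x, y), (u, v)) matches."""
    a, b = [], []
    for (x, y), (u, v) in pairs:
        a.append([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y])
        b.append(u)
        a.append([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return None if h is None else [h[0:3], h[3:6], h[6:8] + [1.0]]


def project(h, p):
    """Map point p through H, dividing by the homogeneous coordinate."""
    x, y = p
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    if abs(w) < 1e-12:
        return float("inf"), float("inf")
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


def ransac_homography(matches, iters=200, tol=1.0, seed=0):
    """Keep the 4-point hypothesis with the most matches within tol pixels."""
    rng = random.Random(seed)
    best_h, best_inliers = None, []
    for _ in range(iters):
        h = homography_from_4(rng.sample(matches, 4))
        if h is None:
            continue
        inliers = []
        for src, dst in matches:
            u, v = project(h, src)
            du, dv = u - dst[0], v - dst[1]
            if du * du + dv * dv <= tol * tol:
                inliers.append((src, dst))
        if len(inliers) > len(best_inliers):
            best_h, best_inliers = h, inliers
    return best_h, best_inliers


# Synthetic data (hypothetical): 25 exact matches plus 6 gross outliers.
H_true = [[1.2, 0.1, 5.0], [-0.05, 0.9, -3.0], [0.001, 0.0005, 1.0]]
grid = [(float(x), float(y))
        for x in range(0, 100, 20) for y in range(0, 100, 20)]
matches = [(p, project(H_true, p)) for p in grid]
matches += [((10.0 * i + 5, 7.0 * i + 3), (300.0 - 20 * i, 15.0 * i * i))
            for i in range(6)]
H, inliers = ransac_homography(matches)
```

In practice the winning hypothesis is usually refined by a least-squares fit over all of its inliers, and point coordinates are normalised first for numerical stability. Both steps are omitted here to keep the sketch short.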

Math Prerequisite I: Projective Geometry

Introduction to Projective Geometry (Wikipedia)

Math Prerequisite II: Nonlinear Least-squares

K. Madsen et al., Methods for nonlinear least-squares problems, Technical Univ. Denmark, 2004
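For a feel of what a damped Gauss-Newton (Levenberg-Marquardt) iteration looks like, here is a minimal sketch that fits y = a * exp(b * x) to samples. The model, the starting point, and the simple "multiply or divide the damping by 10" rule are assumptions chosen for brevity. The update strategies discussed in the reference above are more refined.

```python
import math


def fit_exponential(xs, ys, a, b, iters=100, lam=1e-3):
    """Fit y = a * exp(b * x) with a basic Levenberg-Marquardt loop."""

    def cost(a_, b_):
        total = 0.0
        for x, y in zip(xs, ys):
            r = a_ * math.exp(b_ * x) - y
            total += r * r
        return total

    current = cost(a, b)
    for _ in range(iters):
        # Accumulate J^T J (h11, h12, h22) and J^T r (g1, g2).
        h11 = h12 = h22 = g1 = g2 = 0.0
        for x, y in zip(xs, ys):
            e = math.exp(b * x)
            r = a * e - y
            ja, jb = e, a * x * e  # partial derivatives w.r.t. a and b
            h11 += ja * ja
            h12 += ja * jb
            h22 += jb * jb
            g1 += ja * r
            g2 += jb * r
        # Damped normal equations: (J^T J + lam * diag(J^T J)) d = -J^T r.
        d11, d22 = h11 * (1.0 + lam), h22 * (1.0 + lam)
        det = d11 * d22 - h12 * h12
        da = (-g1 * d22 + g2 * h12) / det
        db = (-g2 * d11 + g1 * h12) / det
        if abs(da) + abs(db) < 1e-12:
            break  # step is negligible, treat as converged
        trial = cost(a + da, b + db)
        if trial < current:
            a, b, current = a + da, b + db, trial
            lam *= 0.1  # good step: behave more like Gauss-Newton
        else:
            lam *= 10.0  # bad step: behave more like gradient descent
    return a, b


# Noise-free samples of y = 2 * exp(0.5 * x), fitted from a rough guess.
xs = [0.5 * i for i in range(7)]
ys = [2.0 * math.exp(0.5 * x) for x in xs]
a_fit, b_fit = fit_exponential(xs, ys, a=1.0, b=0.3)
```

From this starting point the undamped Gauss-Newton step overshoots and increases the cost, so the first few trial steps are rejected and the damping grows until a step is accepted.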

Measurement Using a Single Camera

1. Z. Zhang, A Flexible New Technique for Camera Calibration, IEEE T-PAMI, 2000

2. Rodrigues' rotation formula

3. Why do we need at least two calibration board images?

4. Demo code for single-camera calibration. The demo is based on the OpenCV source code and follows the theoretical discussion in our lectures. It was compiled with VS2017 + OpenCV 4.2 on Windows 10. Since it is a pure C++ project, it can be ported to another platform (macOS or Ubuntu) with little effort.
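Rodrigues' rotation formula (item 2 above) converts a rotation vector, whose direction is the rotation axis and whose length is the angle, into a rotation matrix: R = I + sin(theta) K + (1 - cos(theta)) K^2, where K is the cross-product matrix of the unit axis. A minimal sketch follows. The function name `rodrigues` is chosen here for illustration and this is not the OpenCV implementation.

```python
import math


def rodrigues(rvec):
    """Convert a rotation vector (axis * angle, in radians) to a 3x3 matrix."""
    theta = math.sqrt(sum(c * c for c in rvec))
    identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    if theta < 1e-12:
        return identity  # zero vector: no rotation
    kx, ky, kz = (c / theta for c in rvec)
    # Cross-product (skew-symmetric) matrix of the unit axis k.
    K = [[0.0, -kz, ky], [kz, 0.0, -kx], [-ky, kx, 0.0]]
    K2 = [[sum(K[i][m] * K[m][j] for m in range(3)) for j in range(3)]
          for i in range(3)]
    s, c = math.sin(theta), math.cos(theta)
    # R = I + sin(theta) K + (1 - cos(theta)) K^2
    return [[identity[i][j] + s * K[i][j] + (1.0 - c) * K2[i][j]
             for j in range(3)] for i in range(3)]


# A 90-degree rotation about the z axis maps the x axis onto the y axis.
R_z = rodrigues([0.0, 0.0, math.pi / 2])
x_rotated = [sum(R_z[i][j] * v for j, v in enumerate([1.0, 0.0, 0.0]))
             for i in range(3)]
```

The three-parameter form is convenient in calibration because a rotation has only three degrees of freedom, so it can be optimised without extra orthogonality constraints.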

Basics for Machine Learning and A Special Emphasis on CNN

1. Demo for linear regression
2. Demo for softmax regression
3. Caffe: a widely used deep learning framework
4. Windows Caffe Installation Guide
5. Digit classification demo. Classify an image with a digit using your trained LeNet. (For instructions, refer to Installation Guide)
6. K. He et al., Deep Residual Learning for Image Recognition, CVPR 2016
7. G. Huang et al., Densely Connected Convolutional Networks, CVPR 2017
8. J. Redmon and A. Farhadi, YOLO9000: Better, Faster, Stronger, CVPR 2017
9. Learn to configure YOLOv2 and try to solve your own detection task
10. The evolution of typical convolutional neural network architectures (in Chinese)

Applications of CNN

1. Lin Zhang et al., Vision-based parking-slot detection: A DCNN-based approach and a large-scale benchmark dataset, IEEE Trans. Image Processing, 2018.
2. Z. Cao et al., Realtime multi-person 2D pose estimation using part affinity fields, CVPR 2017

Introduction to Numerical Geometry

1. Example to demonstrate fast marching (FastMarching.rar)
2. Example to show Euclidean isometry removal by PCA (EuclideanIsometryRemoval.rar)
3. Code for ICP-based 3D shape matching (ICP.rar)
4. Geomagic Studio 2015 (forWin64)
5. Lin Zhang et al., 3D face recognition based on multiple keypoint descriptors and sparse representation, PLoS ONE, 2014

6. Lin Zhang et al., 3D palmprint identification using block-wise features and collaborative representation, IEEE Trans. Pattern Analysis and Machine Intelligence, 2015.



Assignments

1. Assignment 1, Scores, reference solution for Q1 (credit to Jiajie Li), reference solution for Q2 (credit to Jiajie Li), reference solution for Q3

2. Assignment 2, Due: May 31, 2021


1. Compress all files into a single .rar file named with your name and student ID.

2. For the programming assignments, please use Matlab, C++, or Python, and make sure your program runs successfully on the TA's machine.

3. All the documents you hand in, including comments in the source code, should be in English.

4. Please send your solutions to the TA and confirm with the TA that she has received your email.




Final Project

1. Form a group of two or three students to work on a selected topic.

2. At the end of the semester, hand in the project source code and a report, and then give a presentation of your results. All documents, including the comments in the program, should be in English. The source code should be neat and clear, with clear comments on the key components, functions, and statements. The report should contain at least the following parts: background introduction, system structure design, key algorithms used, experimental results, and references.

3. Try your best to make the system as good as possible. Creative ideas are highly encouraged. If the work is sufficiently innovative, we could prepare a conference paper!



1. Panorama Stitching

2. Palmprint verification on mobile phones

3. Parking-slot detection on NVIDIA Jetson TX2 with TensorRT

4. Detection and distance measurement of speed bumps

5. Interaction between a mobile device and a ROS host

6. Depth Estimation and Dense Reconstruction with the Monocular Camera

Group-topic-pairs list


Main References

D. Forsyth and J. Ponce
Computer Vision -- A Modern Approach (2nd Edition),
Prentice Hall, 2013
Online version available here

Richard Hartley and Andrew Zisserman
Multiple View Geometry in Computer Vision (2nd Edition),
Cambridge University Press, 2004
Online version available here

Milan Sonka, Vaclav Hlavac, and Roger Boyle
Image Processing, Analysis, and Machine Vision,
Thomson, 2008


Created on: Feb. 19, 2021

Last updated on: Apr. 25, 2021