This project aims to identify and track lanes in more challenging highway scenarios (with lighting changes, color changes, shadows, underpasses) using a single camera mounted on the center of a vehicle.
File | Description | Supported flags |
---|---|---|
config.py | Global parameters for the project | No flags |
calibrations.py | Code for calibrating and undistorting images | `--corners`, `--undistort`, `--filename=RELATIVE_PATH_TO_IMAGE` |
transforms.py | Code for perspective transformation | `--warp=RELATIVE_PATH_TO_IMAGE`, `--unwarp=RELATIVE_PATH_TO_IMAGE`, `--comp` |
thresholds.py | Code for binary thresholding of images | No flags |
lanes.py | Code for lane detection and lane line state management | No flags |
main.py | Code for the image and video pipeline | `RELATIVE_PATH_TO_IMAGE`, `--debug`, `--timeslot=1-10` |
The first step is the correction of image distortion. Distortion is introduced by the camera lens and its position relative to the scene while capturing the image, and it changes the apparent size and shape of objects in an image. Calibration is a process which measures and corrects these distortion errors based on measurements of standard shapes. In this project, we use a chessboard to calibrate the camera; its regular, high-contrast pattern makes it an ideal candidate for calibration.
The project repository provided by Udacity contains a set of calibration images taken at different angles and distances. For each calibration image, the following steps were performed:

- Define the 3D real-world coordinates of the chessboard corners. These points are called object points.
- Detect the corresponding 2D coordinates of these corners in the image with the OpenCV function `cv2.findChessboardCorners(...)`. These are the locations where two black squares touch each other on the chessboard, and are called image points.
- With the identified object points and image points, calibrate the camera using `cv2.calibrateCamera(...)`, which returns the camera matrix and distortion coefficients.
- Convert the object points, image points, camera matrix and distortion coefficients to a byte stream and save them to a pickle file.
The above steps are implemented in the calibrations.py file. The results can be verified by running the following command from a conda environment:
$ python calibrations.py --corners
For each calibration*.jpg file in the calibration folder, a corresponding result file with the postfix _corners.jpg is created in the output_images folder. The result files show a side-by-side view of the original distorted chessboard image and the resulting image with the identified corners. A sample output for one of the calibration images is shown below:
As a side note, some of the chessboard images could not be used for calibration because `cv2.findChessboardCorners` was unable to detect the desired number of internal corners.
This step unpickles the camera matrix and distortion coefficients from the previously cached pickle file. The raw images are then undistorted with the OpenCV function `cv2.undistort(image, mtx, dist, None, None)`. The output can be verified by running the following command from a conda environment:
$ python calibrations.py --undistort
For each calibration*.jpg file in the calibration folder, a corresponding result file with the postfix _undistorted.jpg is created in the output_images folder. The result files show a side-by-side view of the original distorted chessboard image and the resulting undistorted image. A sample output for one of the calibration images is shown below:
The lane detection pipeline consists of the following stages:
- Distortion correction
- Perspective transformation
- Binary thresholding
- Lane line detection
- Radius of curvature calculation
- Vehicle position calculation
The calibration routines discussed in the previous section are encapsulated in the class `cameraCalibration`. A given image is undistorted by simply initializing a `cameraCalibration` object and calling its `undistort(filename)` function.
The distortion correction on a given image can be executed by the following command:
$ python calibrations.py --filename=<RELATIVE_PATH_TO_IMAGE>
An example run with `python calibrations.py --filename=test_images\signs_vehicles_xygrad.png` gives the following result:
The undistorted image seen above still exhibits the so-called perspective phenomenon: objects farther down the road appear smaller, and parallel lines seem to converge to a point. A perspective transform warps the image to change the apparent viewpoint, effectively remapping the objects' 2D representation; here it produces a top-down view in which the lane lines stay parallel. Perspective transformation involves the following steps:
- Define the source points (`src`) that form a rectangle in the given image.
- Define the corresponding destination points (`dst`) on the transformed/warped image.
- Map the `src` and `dst` points with the function `cv2.getPerspectiveTransform(...)` to derive the perspective mapping matrix `M`.
- Apply the transformation matrix `M` to the original undistorted image by calling the `cv2.warpPerspective(...)` function to derive the warped image.
- To reverse the perspective transform, swap the `src` and `dst` points in `cv2.getPerspectiveTransform(...)` to derive the inverse matrix `Minv`.
- A call to `cv2.warpPerspective(...)` with the `Minv` matrix unwarps the image.
The above steps are encapsulated in the class `perspectiveTranform`. The perspective transformation on a given image can be performed by executing one of the following commands:
$ python transforms.py --warp=<RELATIVE_PATH_TO_IMAGE> --comp
$ python transforms.py --unwarp=<RELATIVE_PATH_TO_IMAGE> --comp
Sample transformations on a straight and a curved road are shown below:
To efficiently extract the lane lines, several thresholding techniques were discussed in Lesson 7 (Gradients and Color Spaces). An exploratory study of several combinations of Sobel gradient thresholds, magnitude thresholds, and color thresholds in different color spaces such as RGB, HSV, Luv and Lab was carried out. The standard combination of Sobel gradients and S-channel binary thresholds gave reasonably good results for detecting yellow and white lines, but failed to detect the lines on low-contrast roads and in scenarios with shadows. The performance of the Sobel gradient and S-channel thresholds can be seen in the picture below.
For robust performance under shadows and different lighting scenarios, I explored color spaces beyond HSV and HSL, with help from the Udacity Mentor network. I got a hint that the b channel of the Lab color space and the L channel of the Luv color space, each with a specific threshold range, give good results for yellow and white lines under normal lighting as well as under shadows and on low-contrast surfaces. So I decided to do the binary thresholding only in the Lab and Luv color spaces. The color thresholds performed well:
The code for the binary thresholds is available in the class `thresholdedImage`. The functions `luv_l_thresh` and `lab_b_thresh` extract the L and b channels respectively, apply the given thresholds and return the thresholded binary images. The function `applyThresholds` ORs the outputs of the two functions and returns the final binary thresholded image.
An extract of the code is shown below:
```python
# Returns the binary image combining (OR) the binary thresholded L channel
# from the Luv color space and the b channel from the Lab color space
def applyThresholds(self):
    l_binary_output = self.luv_l_thresh(self.image)
    b_binary_output = self.lab_b_thresh(self.image)
    # Combine the Luv L and Lab b channel thresholds
    combined = np.zeros_like(l_binary_output)
    combined[(l_binary_output == 1) | (b_binary_output == 1)] = 1
    return combined
```
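The per-channel thresholding performed by `luv_l_thresh` and `lab_b_thresh` boils down to a range test on a single channel. A simplified, numpy-only sketch (the real functions additionally do the `cv2.cvtColor` conversion to Luv/Lab before thresholding):

```python
import numpy as np

def channel_threshold(channel, thresh):
    """Return a binary mask that is 1 where the channel value lies in (lo, hi]."""
    binary = np.zeros_like(channel, dtype=np.uint8)
    binary[(channel > thresh[0]) & (channel <= thresh[1])] = 1
    return binary
```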
After identifying the edges on the warped images, the next step is to identify the potential lane lines by plotting a histogram of the binary pixels in the warped, binary-thresholded image. This serves as a starting position for the lanes. I extensively re-used the code from Lesson 8 (Advanced Computer Vision). As suggested in the lesson, I defined a class `line` which represents the internal state of a lane line. In addition, I defined a class `drivingLane` which encapsulates the detection of the left and right lines, the validation of the lines and the filling of the lane area with a defined color.
- The `track` object (an instance of `class drivingLane`) reads in an image and instantiates one `line` object per detected line.
- For the first image frame, the `track` object uses the sliding-window method as discussed in Lesson 8 (Advanced Computer Vision: Finding the Lines: Sliding Window) to detect a set of points (x and y) which could form a lane line.
- The detected points are fit to a second-degree polynomial using the function `np.polyfit(y, x, 2)` and forwarded to the respective `line` object.
- The `line` object internally validates the recent fit and adds it to an array of line fits.
- The `line` object calculates a best fit based on the weighted average of the line fit array (10 elements). The weights are determined by the number of pixels that were used for each `np.polyfit`.
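The weighted best-fit idea can be sketched in a few lines of numpy; `fits` and `pixel_counts` below are illustrative stand-ins for the `line` object's internal arrays:

```python
import numpy as np

def best_fit(fits, pixel_counts):
    """Weighted average of recent polynomial fits.

    fits: shape (n, 3), one row of coefficients (A, B, C) per frame.
    pixel_counts: number of lane pixels behind each fit, used as the weight,
    so fits supported by more pixels contribute more to the average.
    """
    fits = np.asarray(fits, dtype=float)
    weights = np.asarray(pixel_counts, dtype=float)
    return np.average(fits, axis=0, weights=weights)
```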
A sample of lane line detection is shown in the short video frame below. The debug video can be prepared by running the following command:
$ python main.py test_images\project_video.mp4 --debug --timeslot=1-3
In the output frame, the lane pixels for the left and right lines are drawn in blue and red respectively. The fitted line is drawn in green.
After the lines are detected and validated, the lane is unwarped and overlaid back onto the original image using the function `cv2.warpPerspective`:

```python
cv2.warpPerspective(image, self.Minv, (image.shape[1], image.shape[0]), flags=cv2.INTER_LINEAR)
```
The code for the overlay is available in the module lanes.py. The function `overlay_lanes` prepares the sets of x and y points as stacked arrays and uses the function `cv2.fillPoly` to draw a closed region enclosing the x and y coordinates. In addition, the polynomial-fitted lane lines are drawn with a pre-defined thickness.
Since the radius of curvature is measured in world-space coordinates (metres), I had to normalize the lane dimensions to metres. For this, I took one of the warped straight-line images as a reference and manually measured the horizontal distance between the two lane lines and the vertical length of one lane segment in pixels. These pixel distances correspond to the real-world values of 3.7 m and 3 m respectively. The `track` object initializes two variables in real-world space as below:

```python
self.ym_per_pix = 3 / 110    # 110 is the number of pixels for one lane segment in straight_line1_warped.jpg
self.xm_per_pix = 3.7 / 380  # 380 is the number of pixels for one lane width in straight_line1_warped.jpg
```
With the above normalization values, each line calculates its radius of curvature by the following steps:

- Normalize the y and x points by multiplying the values by `ym_per_pix` and `xm_per_pix` respectively.
- Fit a second-degree polynomial `x = A*y**2 + B*y + C` to the real-world values of y and x.
- With the returned coefficients A and B, the radius of curvature can be calculated with the formula `R = (1 + (2*A*y_eval + B)**2)**1.5 / abs(2*A)`.
- Since the radius of curvature is measured closest to the vehicle, the `y_eval` value corresponding to the bottom of the image is used.
The following code shows the radius of curvature calculation:

```python
def calc_curavture(self, ym_per_pix, xm_per_pix):
    ploty = np.linspace(0, config.IMAGE_HEIGHT - 1, config.IMAGE_HEIGHT)
    # Fit a second-degree polynomial to the current x and y points in metres
    fit_cr = np.polyfit(self.ally * ym_per_pix, self.allx * xm_per_pix, 2)
    # Evaluate at the bottom of the image, closest to the vehicle
    y_eval = np.max(ploty) * ym_per_pix
    self.radius_of_curvature = ((1 + (2*fit_cr[0]*y_eval + fit_cr[1])**2)**1.5) / abs(2*fit_cr[0])
    return self.radius_of_curvature
```
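The calculation can be checked against a synthetic parabola. A standalone sketch of the same formula (the function name and arguments are illustrative, not the exact code in lanes.py):

```python
import numpy as np

def radius_of_curvature(xs, ys, xm_per_pix, ym_per_pix):
    """Radius of curvature in metres, evaluated at the largest y (image bottom)."""
    # Fit x = A*y^2 + B*y + C in real-world (metre) coordinates
    fit = np.polyfit(ys * ym_per_pix, xs * xm_per_pix, 2)
    y_eval = np.max(ys) * ym_per_pix
    return (1 + (2 * fit[0] * y_eval + fit[1]) ** 2) ** 1.5 / abs(2 * fit[0])
```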
I took support from the Udacity Knowledge portal to calculate the vehicle position. The vehicle position is calculated with the following steps:

- Assuming the camera is mounted at the centre of the vehicle, the vehicle position is the horizontal midpoint at the bottom of the image: `vehicle_position = image_shape[1]/2`
- The center of the lane is calculated as the average of the left and right line x-intercepts, i.e. the best fits evaluated at the bottom of the image:

```python
leftline_intercept = leftline.best_fit[0]*height**2 + leftline.best_fit[1]*height + leftline.best_fit[2]
rightline_intercept = rightline.best_fit[0]*height**2 + rightline.best_fit[1]*height + rightline.best_fit[2]
lane_center = (leftline_intercept + rightline_intercept) / 2
```

- The vehicle's offset from the center is the difference between the vehicle position and the lane center; its sign indicates whether the vehicle is left or right of the lane center: `self.vehicle_pos = (vehicle_position - lane_center) * self.xm_per_pix`
- Since the image positions and lane center are in pixel space, the difference is multiplied by the `xm_per_pix` scaling factor to convert it to real-world coordinates.
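The steps above can be sketched as a standalone function; the function name and the coefficient values used in the example are illustrative, not the exact code in lanes.py:

```python
def vehicle_offset_m(left_fit, right_fit, image_width, image_height, xm_per_pix):
    """Signed offset of the vehicle from the lane center, in metres.

    left_fit and right_fit are (A, B, C) coefficients of x = A*y^2 + B*y + C;
    a positive result means the vehicle sits to the right of the lane center.
    """
    h = image_height
    # Evaluate both fits at the bottom of the image (the x-intercepts)
    left_x = left_fit[0] * h ** 2 + left_fit[1] * h + left_fit[2]
    right_x = right_fit[0] * h ** 2 + right_fit[1] * h + right_fit[2]
    lane_center = (left_x + right_x) / 2
    vehicle_position = image_width / 2  # camera assumed at the vehicle center
    return (vehicle_position - lane_center) * xm_per_pix
```

For example, with straight-line fits intercepting at x = 300 and x = 980 on a 1280-pixel-wide frame, the lane center is 640 and the offset is zero.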
The links to the output videos can be found here. The pipeline worked reasonably well on project_video.mp4. Whenever the vehicle comes out of a low-contrast stretch of road, the detected lane wobbles, but recovers after a few frames. This shows that my smoothing works well, though I can think of better techniques for the next projects.