OpenCV Machine Vision¶
New for 2025.
OpenCV is an open-source package providing a wide variety of low-level machine vision algorithms. It is widely available on many platforms and is optimized for speed. The underlying code is in C++ but the Python bindings unify the image data types with the Python numpy API very conveniently. The library hosts a great variety of algorithms, but the documentation can be spotty in places and requires some understanding of the underlying C++ notation. Nevertheless, there are many tutorials and code samples to assist with getting started.
A notable alternative is the scikit-image library. This library is part of the SciPy project, has excellent documentation, and is written in native Python built on numpy. However, our project goal centers on real-time control using camera input so we will stick with the more optimized OpenCV.
The sample code included here may be downloaded as opencv_examples.zip or browsed in Python/opencv_examples.
Installation Notes¶
We will only use the Python API for OpenCV, so the simplest way to install it is
using pip as detailed under Python 3 Installation. On most platforms this
will import efficient pre-compiled binaries and install all required package
dependencies.
The specifics of this process may vary with your installation. Note that the recommended practice is to set up a virtual environment so the package and its dependencies can be kept locally, but that is outside the scope of these instructions.
OpenCV Documentation¶
The documentation index for the complete system can be found at https://docs.opencv.org/4.x/
Note that the Python API documentation is included alongside the C++ documentation; it typically appears below the corresponding C++ documentation. The internal links on the site are not especially stable, so the following sample links may become stale.
Some code samples from previous iterations of this course can be found under Camera Input.
Command Line Usage¶
All of the demonstration scripts are intended to be run from a command line. On macOS, this is typically the Terminal application, on Windows the Console, on Linux a shell. If you are running the sample code from within an IDE (e.g. Visual Studio/VS Code), you may need to figure out how to specify the current working directory and command line arguments.
Each sample has defaults so it can be run without any additional parameters. For example:
python3 capture_sequence.py
On your installation, the interpreter might be named python instead of python3:
python capture_sequence.py
On macOS or Linux, a ‘shebang’ line is included, so you should be able to run the script directly if it has execute permissions:
./capture_sequence.py
All the samples include argparse support for command line interpretation. The
specific options available can be displayed using the --help option:
python3 capture_sequence.py --help
So for example, the --verbose option will enable more console diagnostic text, and an output filename can be included:
python3 capture_sequence.py --verbose myfilename.avi
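The way an optional flag and an optional output filename are declared can be sketched in isolation. The parser below is illustrative only and is not one of the course scripts; it passes explicit argument lists to parse_args so the behavior can be tried without a camera.

```python
import argparse

def build_parser():
    """Build a small parser with one flag and one optional positional argument."""
    parser = argparse.ArgumentParser(description="Illustrative parser for experimentation.")
    parser.add_argument('--verbose', action='store_true',
                        help="Enable more detailed diagnostic output.")
    # nargs='?' makes the positional argument optional, so the default
    # is used whenever no filename is given on the command line
    parser.add_argument('out', default="frames.avi", type=str, nargs='?',
                        help="Specify output file name (default: %(default)s).")
    return parser

parser = build_parser()

# parse_args accepts an explicit list in place of the real command line
defaults = parser.parse_args([])
custom = parser.parse_args(['--verbose', 'myfilename.avi'])

print(defaults.verbose, defaults.out)   # False frames.avi
print(custom.verbose, custom.out)       # True myfilename.avi
```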
Example: Capturing a Video Frame¶
This sample opens the default camera, waits for a single frame, and writes it to a file.
#!/usr/bin/env python3
"""capture_frame.py : capture a single camera image and save in a file
"""
import argparse
import numpy as np
import cv2 as cv

#================================================================
def read_one_frame(capture, args):
    """Wait for the first available frame from a camera."""
    while True:
        success, frame = capture.read()
        if success:
            if args.verbose:
                print("read image of shape %s" % (str(frame.shape)))
            # returning also leaves the while loop
            return frame

#================================================================
if __name__ == "__main__":
    parser = argparse.ArgumentParser( description = "Open a camera and save a single video frame.")
    parser.add_argument('--camera', default=0, type=int, help="Select camera by number (default: %(default)s).")
    parser.add_argument('--verbose', action='store_true', help='Enable more detailed diagnostic output.' )
    parser.add_argument('out', default="frame.png", type=str, nargs='?', help="Specify output image file name (default: %(default)s).")
    args = parser.parse_args()

    capture = cv.VideoCapture(args.camera)

    if not capture.isOpened():
        print("Unable to open camera, exiting.")
        exit(1)

    try:
        frame = read_one_frame(capture, args)

    except KeyboardInterrupt:
        print("User quit.")
        capture.release()
        exit()

    # flip the image left to right for 'selfie' mode
    # frame = frame[:,::-1]

    # reduce image size and flip
    # frame = frame[::2,::-2]

    cv.imwrite(args.out, frame)
    print("wrote %s" % (args.out))

    capture.release()
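The two commented-out slicing expressions in the listing can be tried on a small synthetic array to see exactly which pixels they select. The 4 x 6 array below is made up for illustration; a real frame would have a third axis for the color channels, which these slices leave untouched.

```python
import numpy as np

# a tiny 4 x 6 single-channel 'image' whose pixel values encode position
frame = np.arange(24, dtype=np.uint8).reshape(4, 6)

# all rows, columns in reverse order: a left-to-right mirror
mirrored = frame[:, ::-1]

# every other row, and every other column taken in reverse order
reduced = frame[::2, ::-2]

print(mirrored[0].tolist())   # [5, 4, 3, 2, 1, 0]
print(reduced.shape)          # (2, 3)
print(reduced.tolist())       # [[5, 3, 1], [17, 15, 13]]
```

Both results are views onto the original array rather than copies. If a later call objects to the memory layout, np.ascontiguousarray can be used to make a compact copy.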
Example: Capturing a Video Sequence¶
This sample opens the default camera, then captures frames to a file until the user interrupts the process.
#!/usr/bin/env python3
"""capture_sequence.py : capture a series of camera images and save to an AVI file
"""
import argparse, time
import numpy as np
import cv2 as cv

#================================================================
def read_one_frame(capture, args):
    """Wait for the first available frame from a camera or file."""
    while True:
        success, frame = capture.read()
        if success:
            if args.verbose:
                print("read image of shape %s" % (str(frame.shape)))
            return frame

#================================================================
def open_video_output(frame, args, frame_rate=30):
    """Open a video output file sized for the given sample frame. The frame is not written."""

    codec_code = cv.VideoWriter.fourcc(*'MJPG') # motion JPEG
    frame_width = frame.shape[1] # cols
    frame_height = frame.shape[0] # rows
    return cv.VideoWriter(args.out, codec_code, frame_rate, (frame_width, frame_height))

#================================================================
if __name__ == "__main__":
    parser = argparse.ArgumentParser( description = "Save camera frames to a video file.")
    parser.add_argument('--camera', default=0, type=int, help="Select camera by number (default: %(default)s).")
    parser.add_argument('-q','--quiet', action='store_true', help="Run without viewer window or console output.")
    parser.add_argument('--verbose', action='store_true', help='Enable even more detailed logging output.' )
    parser.add_argument('out', default="frames.avi", type=str, nargs='?', help="Specify output AVI file name (default: %(default)s).")
    args = parser.parse_args()

    capture = cv.VideoCapture(args.camera)

    if not capture.isOpened():
        print("Unable to open camera, exiting.")
        exit(1)

    if not args.quiet:
        cv.namedWindow("capture")

    # the output file will be created after the frame size is known
    output = None

    try:
        while True:
            frame = read_one_frame(capture, args)

            if not args.quiet:
                cv.imshow("capture", frame) # display the frame

            if output is None:
                output = open_video_output(frame, args)

            if output is not None:
                output.write(frame)
                if args.verbose:
                    print("wrote frame to file.")

            # If using the GUI, calling waitKey allows the window system event loop to run and update
            # the display. The waitKey argument is in milliseconds.
            if not args.quiet:
                key = cv.waitKey(33) & 0xFF
                if key == ord('q'):
                    break

    except KeyboardInterrupt:
        print("User quit.")

    # the output only exists if at least one frame was captured before quitting
    if output is not None:
        output.release()
    capture.release()
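Two small conversions in this listing are easy to get backwards: a numpy frame is indexed as (rows, cols, channels), while the video writer is given its size as (width, height), and the waitKey delay of 33 ms corresponds roughly to the default frame rate of 30. The helpers below are a sketch with hypothetical names (video_size, frame_delay_ms) that restate that arithmetic without needing a camera.

```python
import numpy as np

def video_size(frame):
    """Return (width, height) for a frame stored as rows x cols [x channels]."""
    rows, cols = frame.shape[:2]
    return (cols, rows)

def frame_delay_ms(frame_rate):
    """Return the frame period in whole milliseconds for a rate in frames per second."""
    return int(round(1000.0 / frame_rate))

# a blank frame with the shape of a typical 640 x 480 camera image
frame = np.zeros((480, 640, 3), dtype=np.uint8)

print(video_size(frame))     # (640, 480)
print(frame_delay_ms(30))    # 33
```

The delay is only a nominal figure: the time spent reading and writing each frame is added on top of it, so the captured rate may be lower than the rate recorded in the file header.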
Example: Processing Video to Video¶
This sample reads frames from one video file, applies optional transformations, and writes a new video file.
#!/usr/bin/env python3
"""process_sequence.py : process a series of images from an input video to output video
"""
import argparse, time
import numpy as np
import cv2 as cv

#================================================================
def open_video_output(frame, args, frame_rate=30):
    """Open a video output file sized for the given sample frame. The frame is not written."""

    codec_code = cv.VideoWriter.fourcc(*'MJPG') # motion JPEG
    frame_width = frame.shape[1] # cols
    frame_height = frame.shape[0] # rows
    return cv.VideoWriter(args.out, codec_code, frame_rate, (frame_width, frame_height))

#================================================================
if __name__ == "__main__":
    parser = argparse.ArgumentParser( description = "Process one video file to another.")
    parser.add_argument('-q','--quiet', action='store_true', help="Run without viewer window or console output.")
    parser.add_argument('--verbose', action='store_true', help='Enable even more detailed logging output.' )
    parser.add_argument('input', default="frames.avi", type=str, nargs='?', help="Specify input AVI file name (default: %(default)s).")
    parser.add_argument('out', default="processed.avi", type=str, nargs='?', help="Specify output AVI file name (default: %(default)s).")

    # various processing options follow
    parser.add_argument( '-s','--saturation', action='store_true', help='Use only saturation channel.')

    args = parser.parse_args()

    capture = cv.VideoCapture(args.input)

    if not capture.isOpened():
        print("Unable to open input file, exiting.")
        exit(1)

    if not args.quiet:
        cv.namedWindow("capture")

    # the output file will be created after the frame size is known
    output = None

    try:
        while True:
            success, frame = capture.read()
            if not success or frame is None:
                break

            if args.verbose:
                print("read image of shape %s" % (str(frame.shape)))

            # reduce the frame size by skipping pixels
            frame = frame[::2,::2]

            if args.saturation:
                # transform the image to hue, saturation, value
                frame = cv.cvtColor(frame, cv.COLOR_BGR2HSV)
                # pick out saturation and convert back to color image
                frame = frame[:,:,1]
                frame = cv.cvtColor(frame, cv.COLOR_GRAY2BGR)

            if not args.quiet:
                cv.imshow("capture", frame) # display the frame

            if output is None:
                output = open_video_output(frame, args)

            if output is not None:
                output.write(frame)
                if args.verbose:
                    print("wrote frame to file.")

            # If using the GUI, calling waitKey allows the window system event loop to run and update
            # the display. The waitKey argument is in milliseconds.
            if not args.quiet:
                key = cv.waitKey(33) & 0xFF
                if key == ord('q'):
                    break

    except KeyboardInterrupt:
        print("User quit.")

    # the output only exists if at least one frame was read from the input
    if output is not None:
        output.release()
    capture.release()
Example: Video Filter Demos¶
The following samples use a common library for I/O to simplify implementation of simple video filters. Each sample is built around a single callback function which transforms a single frame of video.
The common code is documented on Open CV Support Library.
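The callback pattern itself can be sketched without the support library. The apply_filter function below is a hypothetical stand-in written for this illustration; it is not the rcpcv implementation, which also handles video input, output, and display. It only shows the contract the samples rely on: each callback takes a frame and the parsed arguments and returns a new frame.

```python
import argparse
import numpy as np

def apply_filter(frames, callback, args):
    """Hypothetical driver: call callback(frame, args) on each frame in turn."""
    return [callback(frame, args) for frame in frames]

def invert_frame(frame, args):
    """Example callback: photographic negative of an 8-bit frame."""
    return 255 - frame

# stand-ins for parsed command line arguments and a three-frame video
args = argparse.Namespace(verbose=False)
frames = [np.full((2, 3, 3), value, dtype=np.uint8) for value in (0, 100, 255)]

results = apply_filter(frames, invert_frame, args)
print([int(r[0, 0, 0]) for r in results])   # [255, 155, 0]
```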
Example: Subsample Reduction¶
#!/usr/bin/env python
# demo file processor to reduce frame size

import argparse
import rcpcv.demo

#================================================================
# Callback to process a single video frame, returning a new frame.
# The args value is the result of processing the command line arguments.
def reduce_frame(frame, args):
    step = args.step

    # return a matrix view of the frame skipping the specific number of rows and columns
    return frame[::step, ::step]

#================================================================
if __name__ == "__main__":
    parser = argparse.ArgumentParser( description = "Demo to reduce frame size.")
    parser.add_argument('--step', default=2, type=int, help="reduction step size (default: %(default)s)")
    args = rcpcv.demo.parse_args(parser)
    rcpcv.demo.run_filter(args, reduce_frame)
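Because the reduction is done by slicing, the result is a view that shares memory with the original frame rather than a copy. The following sketch, using a blank frame of an assumed 640 x 480 size, shows the resulting shape and layout.

```python
import numpy as np

frame = np.zeros((480, 640, 3), dtype=np.uint8)
step = 2

# a strided view: no pixel data is copied
small = frame[::step, ::step]

print(small.shape)                      # (240, 320, 3)
print(np.shares_memory(frame, small))   # True
print(small.flags['C_CONTIGUOUS'])      # False

# make an independent, densely packed copy when one is needed
compact = np.ascontiguousarray(small)
print(compact.flags['C_CONTIGUOUS'])    # True
```

Subsampling this way discards pixels without any filtering, so fine detail can alias; that is usually acceptable for a quick size reduction, but blurring first gives a cleaner result.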
Example: Cropping¶
#!/usr/bin/env python
# demo of cropping a fixed region

import argparse
import rcpcv.demo

def crop_frame(frame, args):
    # the first two axes are rows and columns for both color and grayscale frames
    rows, cols = frame.shape[:2]

    # center the crop window, keeping the start indices inside the frame
    first_row = max(0, rows//2 - args.height//2)
    first_col = max(0, cols//2 - args.width//2)
    return frame[first_row:first_row+args.height, first_col:first_col+args.width]

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description = "Demo to crop a fixed region.")
    parser.add_argument('--width', default=120, type=int, help="crop width in pixels (default: %(default)s)")
    parser.add_argument('--height', default=120, type=int, help="crop height in pixels (default: %(default)s)")
    args = rcpcv.demo.parse_args(parser)
    rcpcv.demo.run_filter(args, crop_frame)
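The index arithmetic for a centered crop can be checked by hand on an assumed 640 x 480 frame. The helper name center_crop_bounds is hypothetical; this sketch clamps the starting indices at zero so that an oversized request simply returns the whole frame.

```python
import numpy as np

def center_crop_bounds(rows, cols, height, width):
    """Return (first_row, last_row, first_col, last_col) for a centered crop window."""
    first_row = max(0, rows // 2 - height // 2)
    first_col = max(0, cols // 2 - width // 2)
    return first_row, first_row + height, first_col, first_col + width

frame = np.zeros((480, 640, 3), dtype=np.uint8)

r0, r1, c0, c1 = center_crop_bounds(480, 640, 120, 120)
print(r0, r1, c0, c1)             # 180 300 260 380
print(frame[r0:r1, c0:c1].shape)  # (120, 120, 3)

# slicing past the edge of an array is allowed and stops at the boundary
r0, r1, c0, c1 = center_crop_bounds(480, 640, 1000, 1000)
print(frame[r0:r1, c0:c1].shape)  # (480, 640, 3)
```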
Example: Blurring¶
This sample accepts command line arguments to select various blurring filters.
#!/usr/bin/env python
# demo of blurring image without reduction

import argparse
import cv2 as cv
import rcpcv.demo

def blur_frame(frame, args):
    if args.gaussian:
        return cv.GaussianBlur(frame, ksize=(15, 15), sigmaX=0)

    elif args.median:
        return cv.medianBlur(frame, ksize=15)

    else:
        # default is a normalized box blur
        return cv.blur(frame, ksize=(15,15))

if __name__ == "__main__":
    parser = argparse.ArgumentParser( description = "Demo to blur images.")
    parser.add_argument('--gaussian',action='store_true', help="Apply Gaussian blurring (default: %(default)s)")
    parser.add_argument('--median',action='store_true', help="Apply median blurring (default: %(default)s)")
    args = rcpcv.demo.parse_args(parser)
    rcpcv.demo.run_filter(args, blur_frame)
Example: Background Subtraction¶
#!/usr/bin/env python
# demo file processor to remove background

import argparse
import cv2 as cv
import rcpcv.demo

bgs = cv.createBackgroundSubtractorMOG2()

#================================================================
def remove_frame_background(frame, args):
    # update the background subtractor and obtain a new foreground mask
    mask = bgs.apply(frame)

    # keep the source pixels where the mask is non-zero and zero the rest;
    # multiplying 8-bit arrays by the mask would overflow and wrap modulo 256
    return cv.bitwise_and(frame, frame, mask=mask)

#================================================================
if __name__ == "__main__":
    parser = argparse.ArgumentParser( description = "Demo to remove background.")
    args = rcpcv.demo.parse_args(parser)
    rcpcv.demo.run_filter(args, remove_frame_background)
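One detail worth checking when combining an 8-bit mask with an 8-bit frame is that numpy integer arithmetic on uint8 arrays wraps modulo 256 rather than saturating. The following sketch, using a made-up two-pixel image, contrasts a direct product with a boolean selection; the array values are illustrative only.

```python
import numpy as np

# a 1 x 2 BGR image and a matching mask: first pixel foreground, second background
frame = np.array([[[200, 100, 50], [10, 20, 30]]], dtype=np.uint8)
mask = np.array([[255, 0]], dtype=np.uint8)

# add a trailing axis so the mask broadcasts across the three color channels
mask3 = mask[:, :, np.newaxis]

# uint8 products wrap modulo 256, so multiplying by 255 scrambles the foreground
wrapped = frame * mask3
print(wrapped[0, 0].tolist())   # [56, 156, 206]

# selecting on a boolean condition leaves foreground pixels unchanged
masked = np.where(mask3 > 0, frame, 0).astype(np.uint8)
print(masked[0, 0].tolist())    # [200, 100, 50]
print(masked[0, 1].tolist())    # [0, 0, 0]
```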
Example: Temporal Smoothing¶
This sample implements a first-order low-pass filter to smooth a frame sequence over time.
#!/usr/bin/env python
# demo of blending images into an accumulator to smooth across time

import argparse
import cv2 as cv
import rcpcv.demo

# global frame accumulator
average = None

def blend_frame(frame, args):
    global average

    if average is None:
        average = frame * 0

    else:
        average = cv.addWeighted(average, 0.9, frame, 0.1, 0.0)
    return average

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description = "Demo to blend multiple frames.")
    args = rcpcv.demo.parse_args(parser)
    rcpcv.demo.run_filter(args, blend_frame)
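The filter follows the recurrence y[n] = 0.9 y[n-1] + 0.1 x[n], where x is the incoming frame and y is the accumulator. Its response to a sudden change can be sketched with plain numpy in floating point; the function name smooth and the constant test image below are assumptions made for this illustration.

```python
import numpy as np

def smooth(average, frame, alpha=0.1):
    """One step of y[n] = (1 - alpha) * y[n-1] + alpha * x[n]."""
    return (1.0 - alpha) * average + alpha * frame

frame = np.full((2, 2), 100.0)    # constant input 'image'
average = np.zeros_like(frame)    # accumulator starts at zero

history = []
for n in range(1, 23):
    average = smooth(average, frame)
    history.append(float(average[0, 0]))

# after n frames the output has reached 100 * (1 - 0.9**n)
print(round(history[0], 3))    # 10.0
print(round(history[-1], 2))   # 90.15
```

With these weights it takes 22 frames for the output to cover 90% of a step change, which is under a second at 30 frames per second. Keeping the accumulator in floating point also avoids the rounding that an 8-bit accumulator applies on every frame.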
Example: Pixellation¶
A visual effect to make an image blocky.
#!/usr/bin/env python

import argparse
import cv2 as cv
import rcpcv.demo

def blocky_frame(frame, args):
    rows, cols, channels = frame.shape
    # shrink by a factor of 32, then enlarge again without interpolation
    small = cv.resize(frame, dsize=(cols//32,rows//32))
    return cv.resize(small, dsize=(cols, rows), interpolation=cv.INTER_NEAREST)

if __name__ == "__main__":
    parser = argparse.ArgumentParser( description = "Demo to pixellate frames.")
    args = rcpcv.demo.parse_args(parser)
    rcpcv.demo.run_filter(args, blocky_frame)
Example: Contouring¶
Identifies and labels a set of contours in an image. This can be an important step for region extraction or object identification.
#!/usr/bin/env python

import argparse
import cv2 as cv
import rcpcv.demo

def contour_frame(frame, args):
    bw = cv.cvtColor(frame, cv.COLOR_BGR2GRAY)
    ret, thresholded = cv.threshold(bw, 127, 255, cv.THRESH_BINARY)
    dilated = cv.dilate(thresholded, None, iterations=0)
    contours, hierarchy = cv.findContours(dilated, cv.RETR_EXTERNAL, cv.CHAIN_APPROX_SIMPLE)
    for idx,contour in enumerate(contours):
        cv.drawContours(frame, contours, idx, (128,255,255))
    return frame

if __name__ == "__main__":
    parser = argparse.ArgumentParser( description = "Demo to find contours.")
    args = rcpcv.demo.parse_args(parser)
    rcpcv.demo.run_filter(args, contour_frame)
Example: Circle Finding¶
Identifies and labels a set of small circles in an image. This can be an important step for region extraction or object identification.
#!/usr/bin/env python

import argparse
import cv2 as cv
import numpy as np
import rcpcv.demo

def highlight_circles(frame, args):
    gray = cv.cvtColor(frame, cv.COLOR_BGR2GRAY)

    # normalize levels to full range of pixel values
    gray = cv.normalize(gray, None, 0, 255, cv.NORM_MINMAX, dtype=cv.CV_8U)

    # reduce noise
    blur = cv.medianBlur(gray,5)

    # find circles
    circles = cv.HoughCircles(blur,
                              method=cv.HOUGH_GRADIENT,
                              dp=1,          # accumulator has same resolution as the image
                              minDist=8,     # minimum center to center distance
                              param1=200,    # with HOUGH_GRADIENT, Canny edge threshold
                              param2=20,     # with HOUGH_GRADIENT, accumulator threshold
                              minRadius=0,
                              maxRadius=50)  # limit to small circles
    if circles is not None:
        # the result seems to be a 1 x num-circles x 3 matrix
        # this reshapes it to a 2D matrix, each row is then [x y r]
        num_circles = circles.shape[1]
        circles = circles.reshape((num_circles, 3))

        # discretize circle positions to integer pixel coordinates
        rounded = np.uint16(np.round(circles))

        # throw away excess circles
        if num_circles > 150:
            rounded = rounded[0:150]

        # draw each circle in frame
        for i in rounded:
            # draw the perimeter in green
            cv.circle(frame,(i[0],i[1]),i[2],(0,255,0),2)

    return frame

if __name__ == "__main__":
    parser = argparse.ArgumentParser( description = "Demo to find circles.")
    args = rcpcv.demo.parse_args(parser)
    rcpcv.demo.run_filter(args, highlight_circles)
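The reshaping and rounding of the detector result can be examined separately from the detection itself. The array below is a hypothetical stand-in for a two-circle result with the 1 x N x 3 layout described in the listing comments; the numbers are invented for illustration.

```python
import numpy as np

# hypothetical detector output: shape (1, 2, 3), each entry is [x, y, r]
circles = np.array([[[40.5, 30.5, 12.3], [100.2, 80.7, 7.6]]], dtype=np.float32)

# drop the leading axis so each row describes one circle
num_circles = circles.shape[1]
flat = circles.reshape((num_circles, 3))

# round to whole pixels and store as unsigned 16-bit integers
rounded = np.uint16(np.round(flat))

print(flat.shape)         # (2, 3)
print(rounded.tolist())   # [[40, 30, 12], [100, 81, 8]]

# unpacking each row gives plain center and radius values for drawing or logging
for x, y, r in rounded.tolist():
    print("center (%d, %d) radius %d" % (x, y, r))
```

Note that np.round sends exact halves to the nearest even integer, so 40.5 becomes 40 rather than 41; at pixel scale the difference is rarely significant.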