OpenCV Machine Vision ¶

New for 2025.

OpenCV is an open-source package providing a wide variety of low-level machine vision algorithms. It is widely available on many platforms and is optimized for speed. The underlying code is in C++, but the Python bindings conveniently unify the image data types with the Python numpy API. The library hosts a great variety of algorithms, but the documentation can be spotty in places and requires some understanding of the underlying C++ notation. Nevertheless, there are many tutorials and code samples to assist with getting started.

A notable alternative is the scikit-image library. This library is part of the SciPy project, has excellent documentation, and is written in native Python built on numpy. However, our project goal centers on real-time control using camera input so we will stick with the more optimized OpenCV.

The sample code included here may be downloaded as opencv_examples.zip or browsed in Python/opencv_examples.

Installation Notes ¶

We will only use the Python API for OpenCV, so the simplest way to install it is using pip as detailed under Python 3 Installation. On most platforms this will download efficient pre-compiled binaries and install all required package dependencies.

The specifics of this process may vary with your installation. Note that the recommended practice is to set up a virtual environment so the package and its dependencies can be kept locally, but that is outside the scope of these instructions.
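
One quick way to confirm that the installation succeeded is to ask Python whether the cv2 module can be found and loaded. The following is a minimal sketch and not part of the course samples; the helper name opencv_status is hypothetical.

```python
import importlib
import importlib.util

def opencv_status():
    """Return a short string describing whether the cv2 module is usable."""
    # find_spec returns None when the module is not installed
    if importlib.util.find_spec("cv2") is None:
        return "OpenCV (cv2) is not installed in this environment"
    try:
        cv = importlib.import_module("cv2")
    except ImportError as err:
        # the package is present but a binary dependency may be missing
        return "OpenCV (cv2) is installed but failed to load: %s" % err
    return "OpenCV version %s is available" % cv.__version__

if __name__ == "__main__":
    print(opencv_status())
```

If a virtual environment is in use, the check should be run with that environment's interpreter so that it reports on the same packages the samples will see.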

OpenCV Documentation ¶

The documentation index for the complete system can be found at https://docs.opencv.org/4.x/

Please note that the Python API documentation is included alongside the C++ documentation; it typically appears below the corresponding C++ entry. The internal links on the site are not especially stable, so the following sample links may become stale.

Some code samples from previous iterations of this course can be found under Camera Input.

Command Line Usage ¶

All of the demonstration scripts are intended to be run from a command line. On macOS, this is typically the Terminal application, on Windows the Console, on Linux a shell. If you are running the sample code from within an IDE (e.g. Visual Studio/VS Code), you may need to figure out how to specify the current working directory and command line arguments.

Each sample has defaults so it can be run without any additional parameters. For example:

python3 capture_sequence.py

On your installation, the interpreter might be named python instead of python3:

python capture_sequence.py

On macOS or Linux, a ‘shebang’ line is included, so you should be able to run the script directly if it has execute permissions:

./capture_sequence.py

All the samples include argparse support for command line interpretation. The specific options available can be displayed using the --help option:

python3 capture_sequence.py --help

So for example, the --verbose option will enable more console diagnostic text, and an output filename can be included:

python3 capture_sequence.py --verbose myfilename.avi
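
The option style described above can be tried without a camera. The following is a minimal sketch of the same argparse pattern: a store_true flag plus an optional positional argument with a default. The function name build_parser is hypothetical, and passing an explicit list to parse_args stands in for the real command line.

```python
import argparse

def build_parser():
    """Build a parser with a flag and an optional positional output name."""
    parser = argparse.ArgumentParser(description="Sketch of the option style used by the samples.")
    parser.add_argument('--verbose', action='store_true', help="Enable more detailed diagnostic output.")
    # nargs='?' makes the positional argument optional so the default can apply
    parser.add_argument('out', default="frames.avi", type=str, nargs='?',
                        help="Specify output file name (default: %(default)s).")
    return parser

if __name__ == "__main__":
    parser = build_parser()
    defaults = parser.parse_args([])
    custom = parser.parse_args(['--verbose', 'myfilename.avi'])
    print(defaults.verbose, defaults.out)
    print(custom.verbose, custom.out)
```

Because the positional argument uses nargs='?', the script runs with no arguments at all, which is what lets each sample work with its defaults.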

Example: Capturing a Video Frame ¶

This sample opens the default camera, waits for a single frame, and writes it to a file.

#!/usr/bin/env python3
"""capture_frame.py : capture a single camera image and save in a file
"""
import argparse
import numpy as np
import cv2 as cv

#================================================================
def read_one_frame(capture, args):
    """Wait for the first available frame from a camera."""
    while True:
        success, frame = capture.read()
        if success:
            if args.verbose:
                print("read image of shape %s" % (str(frame.shape)))
            return frame
        # otherwise keep polling until the camera delivers a frame

#================================================================
if __name__ == "__main__":
    parser = argparse.ArgumentParser( description = "Open a camera and save a single video frame.")
    parser.add_argument('--camera', default=0, type=int, help="Select camera by number (default: %(default)s).")
    parser.add_argument('--verbose', action='store_true', help='Enable more detailed diagnostic output.' )
    parser.add_argument('out', default="frame.png", type=str, nargs='?', help="Specify output imagefile name (default: %(default)s).")
    args = parser.parse_args()

    capture = cv.VideoCapture(args.camera)

    if not capture.isOpened():
        print("Unable to open camera, exiting.")
        exit(1)

    try:
        frame = read_one_frame(capture, args)

    except KeyboardInterrupt:
        print ("User quit.")
        capture.release()
        exit()

    # flip the image left to right for 'selfie' mode
    # frame = frame[:,::-1]

    # reduce image size and flip
    # frame = frame[::2,::-2]

    cv.imwrite(args.out, frame)
    print("wrote %s" % (args.out))

    capture.release()
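
The two commented-out lines in the sample rely on numpy slicing rather than any OpenCV call. The following sketch shows their effect on a tiny synthetic array standing in for a camera frame; the array contents are arbitrary and chosen only so the result can be inspected.

```python
import numpy as np

# a tiny stand-in for a camera frame: 4 rows x 6 columns x 3 color channels
frame = np.arange(4 * 6 * 3, dtype=np.uint8).reshape((4, 6, 3))

# reverse the column order to mirror the image left to right
mirrored = frame[:, ::-1]

# keep every second row, and every second column counting from the right edge
small_mirrored = frame[::2, ::-2]

print(frame.shape, mirrored.shape, small_mirrored.shape)
```

Both results are views onto the original data rather than copies. If a later call turns out to need contiguous memory, np.ascontiguousarray can be used to make a packed copy.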

Example: Capturing a Video Sequence ¶

This sample opens the default camera, then captures frames to a file until the user interrupts the process.

#!/usr/bin/env python3
"""capture_sequence.py : capture a series of camera images and save to an AVI file
"""
import argparse, time
import numpy as np
import cv2 as cv

#================================================================
def read_one_frame(capture, args):
    """Wait for the first available frame from a camera or file."""
    while True:
        success, frame = capture.read()
        if success:
            if args.verbose:
                print("read image of shape %s" % (str(frame.shape)))
            return frame

#================================================================
def open_video_output(frame, args, frame_rate=30):
    """Open a video output file sized for the given sample frame.  The frame is not written."""

    codec_code = cv.VideoWriter.fourcc(*'MJPG')  # motion JPEG
    frame_width = frame.shape[1]   # cols
    frame_height = frame.shape[0]  # rows
    return cv.VideoWriter(args.out, codec_code, frame_rate, (frame_width, frame_height))

#================================================================
if __name__ == "__main__":
    parser = argparse.ArgumentParser( description = "Save camera frames to a video file.")
    parser.add_argument('--camera', default=0, type=int, help="Select camera by number (default: %(default)s).")
    parser.add_argument('-q','--quiet', action='store_true', help="Run without viewer window or console output.")
    parser.add_argument('--verbose', action='store_true', help='Enable even more detailed logging output.' )
    parser.add_argument('out', default="frames.avi", type=str, nargs='?', help="Specify output AVI file name (default: %(default)s).")
    args = parser.parse_args()

    capture = cv.VideoCapture(args.camera)

    if not capture.isOpened():
        print("Unable to open camera, exiting.")
        exit(1)

    if not args.quiet:
        cv.namedWindow("capture")

    # the output file will be created after the frame size is known
    output = None

    try:
        while True:
            frame = read_one_frame(capture, args)

            if not args.quiet:
                cv.imshow("capture", frame) # display the frame

            if output is None:
                output = open_video_output(frame, args)

            if output is not None:
                output.write(frame)
                if args.verbose:
                    print("wrote frame to file.")

            # If using the GUI, calling waitKey allows the window system event loop to run and update
            # the display.  The waitKey argument is in milliseconds.
            if not args.quiet:
                key = cv.waitKey(33) & 0xFF
                if key == ord('q'):
                    break

    except KeyboardInterrupt:
        print ("User quit.")

    # the output file only exists if at least one frame was captured
    if output is not None:
        output.release()
    capture.release()
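
Two numeric details in this sample may be worth unpacking: the codec identifier and the waitKey delay. The sketch below assumes the usual FOURCC convention of packing four characters into a 32-bit integer with the first character in the lowest byte, and it treats the 33 millisecond delay as the nominal frame period at 30 frames per second. The helper names fourcc_code and frame_period_ms are hypothetical and are not OpenCV functions.

```python
def fourcc_code(name):
    """Pack a four-character codec name into an integer, first character in the low byte."""
    if len(name) != 4:
        raise ValueError("codec name must be exactly four characters")
    code = 0
    for position, char in enumerate(name):
        code |= (ord(char) & 0xFF) << (8 * position)
    return code

def frame_period_ms(frame_rate):
    """Return the nominal time between frames in whole milliseconds."""
    return int(1000 / frame_rate)

if __name__ == "__main__":
    print(fourcc_code('MJPG'))
    print(frame_period_ms(30))
```

Note that the sample only calls waitKey when the viewer window is enabled; in quiet mode the loop is paced by how quickly the camera delivers frames.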

Example: Processing Video to Video ¶

This sample reads frames from one video file, applies optional transformations, and writes a new video file.

#!/usr/bin/env python3
"""process_sequence.py : process a series of images from an input video to output video
"""
import argparse, time
import numpy as np
import cv2 as cv

#================================================================
def open_video_output(frame, args, frame_rate=30):
    """Open a video output file sized for the given sample frame.  The frame is not written."""

    codec_code = cv.VideoWriter.fourcc(*'MJPG')  # motion JPEG
    frame_width = frame.shape[1]   # cols
    frame_height = frame.shape[0]  # rows
    return cv.VideoWriter(args.out, codec_code, frame_rate, (frame_width, frame_height))

#================================================================
if __name__ == "__main__":
    parser = argparse.ArgumentParser( description = "Process one video file to another.")
    parser.add_argument('-q','--quiet', action='store_true', help="Run without viewer window or console output.")
    parser.add_argument('--verbose', action='store_true', help='Enable even more detailed logging output.' )
    parser.add_argument('input', default="frames.avi", type=str, nargs='?', help="Specify input AVI file name (default: %(default)s).")
    parser.add_argument('out', default="processed.avi", type=str, nargs='?', help="Specify output AVI file name (default: %(default)s).")

    # various processing options follow
    parser.add_argument( '-s','--saturation', action='store_true', help='Use only saturation channel.')

    args = parser.parse_args()

    capture = cv.VideoCapture(args.input)

    if not capture.isOpened():
        print("Unable to open input file, exiting.")
        exit(1)

    if not args.quiet:
        cv.namedWindow("capture")

    # the output file will be created after the frame size is known
    output = None

    try:
        while True:
            success, frame = capture.read()
            if not success or frame is None:
                break

            if args.verbose:
                print("read image of shape %s" % (str(frame.shape)))

            # reduce the frame size by skipping pixels
            frame = frame[::2,::2]

            if args.saturation:
                # transform the image
                frame = cv.cvtColor(frame, cv.COLOR_BGR2HSV)
                # pick out saturation and convert back to color image
                frame = frame[:,:,1]
                frame = cv.cvtColor(frame, cv.COLOR_GRAY2BGR)

            if not args.quiet:
                cv.imshow("capture", frame) # display the frame

            if output is None:
                output = open_video_output(frame, args)

            if output is not None:
                output.write(frame)
                if args.verbose:
                    print("wrote frame to file.")

            # If using the GUI, calling waitKey allows the window system event loop to run and update
            # the display.  The waitKey argument is in milliseconds.
            if not args.quiet:
                key = cv.waitKey(33) & 0xFF
                if key == ord('q'):
                    break

    except KeyboardInterrupt:
        print ("User quit.")

    # the output file only exists if at least one frame was read
    if output is not None:
        output.release()
    capture.release()
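
To make the saturation option less of a black box, the following numpy sketch approximates what the saturation channel represents for 8-bit pixels: the spread between the brightest and darkest color component relative to the brightest, scaled to 0..255. It is an illustration under that assumption, not the OpenCV implementation, and its rounding may differ slightly from cv.cvtColor. The function name saturation_channel is hypothetical.

```python
import numpy as np

def saturation_channel(bgr):
    """Approximate the 8-bit HSV saturation channel of a BGR uint8 image."""
    pixels = bgr.astype(np.float64)
    brightest = pixels.max(axis=2)
    darkest = pixels.min(axis=2)
    # saturation is taken as zero for black pixels to avoid dividing by zero
    safe = np.where(brightest > 0, brightest, 1.0)
    saturation = np.where(brightest > 0, (brightest - darkest) / safe, 0.0)
    return np.round(255.0 * saturation).astype(np.uint8)

# one row of test pixels in BGR order: pure red, mid gray, black, pale red
image = np.array([[[0, 0, 255], [128, 128, 128], [0, 0, 0], [128, 128, 255]]], dtype=np.uint8)
print(saturation_channel(image))
```

Strongly colored pixels map to bright values and gray pixels map to black, which is why the saturation channel can help separate colorful objects from a neutral background.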

Example: Video Filter Demos ¶

The following samples use a common library for I/O to simplify implementation of simple video filters. Each sample is built around a single callback function which transforms a single frame of video.

The common code is documented on Open CV Support Library.

Example: Subsample Reduction ¶

#!/usr/bin/env python
# demo  file processor to reduce frame size

import argparse
import rcpcv.demo

#================================================================
# Callback to process a single video frame, returning a new frame.
# The args value is the result of processing the command line arguments.
def reduce_frame(frame, args):
    step = args.step

    # return a matrix view of the frame skipping the specified number of rows and columns
    return frame[::step, ::step]

#================================================================
if __name__ == "__main__":
    parser = argparse.ArgumentParser( description = "Demo to reduce frame size.")
    parser.add_argument('--step', default=2, type=int, help="reduction step size (default: %(default)s)")
    args = rcpcv.demo.parse_args(parser)
    rcpcv.demo.run_filter(args, reduce_frame)
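
The reduction above returns a view, not a copy, and the output size is the input size divided by the step and rounded up. The following sketch demonstrates both points with plain numpy; the function name reduce_by_step is hypothetical and takes the step directly rather than the parsed arguments.

```python
import numpy as np

def reduce_by_step(frame, step):
    """Return a view of the frame keeping every step-th row and column."""
    if step < 1:
        raise ValueError("step must be a positive integer")
    return frame[::step, ::step]

frame = np.zeros((480, 640, 3), dtype=np.uint8)
reduced = reduce_by_step(frame, 3)

print(reduced.shape)
print(np.shares_memory(frame, reduced))

# writing to the original is visible through the view
frame[0, 0] = (1, 2, 3)
print(reduced[0, 0])
```

Plain subsampling discards pixels without averaging, so fine detail can alias; blurring before reduction is one common way to soften that effect.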

Example: Cropping ¶

#!/usr/bin/env python
# demo of cropping a fixed region

import argparse
import rcpcv.demo

def crop_frame(frame, args):
    rows, cols, channels = frame.shape
    first_row   = rows//2 - args.height//2
    first_col   = cols//2 - args.width//2
    return frame[first_row:first_row+args.height, first_col:first_col+args.width]

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description = "Demo to crop a fixed region.")
    parser.add_argument('--width', default=120, type=int, help="crop width in pixels (default: %(default)s)")
    parser.add_argument('--height', default=120, type=int, help="crop height in pixels (default: %(default)s)")
    args = rcpcv.demo.parse_args(parser)
    rcpcv.demo.run_filter(args, crop_frame)
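
The cropping arithmetic above assumes the requested region fits inside the frame. The following is a sketch of a variant that clamps the crop size to the frame size first, so that an oversized request returns the largest centered region available. The function name center_crop is hypothetical.

```python
import numpy as np

def center_crop(frame, width, height):
    """Return a centered crop, limited to the size of the frame."""
    rows, cols = frame.shape[:2]
    # never request more pixels than the frame provides
    height = min(height, rows)
    width = min(width, cols)
    first_row = (rows - height) // 2
    first_col = (cols - width) // 2
    return frame[first_row:first_row + height, first_col:first_col + width]

frame = np.zeros((480, 640, 3), dtype=np.uint8)
print(center_crop(frame, 120, 120).shape)
print(center_crop(frame, 1000, 120).shape)
```

As with the other slicing examples, the result is a view onto the original frame.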

Example: Blurring ¶

This sample accepts command line arguments to select various blurring filters.

#!/usr/bin/env python
# demo of blurring image without reduction

import argparse
import cv2 as cv
import rcpcv.demo

def blur_frame(frame, args):
    if args.gaussian:
        return cv.GaussianBlur(frame, ksize=(15, 15), sigmaX=0)

    elif args.median:
        return cv.medianBlur(frame, ksize=15)

    else:
        # default is a normalized box blur
        return cv.blur(frame, ksize=(15,15))

if __name__ == "__main__":
    parser = argparse.ArgumentParser( description = "Demo to blur images.")
    parser.add_argument('--gaussian',action='store_true', help="Apply Gaussian blurring (default: %(default)s)")
    parser.add_argument('--median',action='store_true', help="Apply median blurring (default: %(default)s)")
    args = rcpcv.demo.parse_args(parser)
    rcpcv.demo.run_filter(args, blur_frame)
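
The difference between the averaging and median filters is easiest to see on a one-dimensional signal with a single noisy sample. The following numpy sketch is an illustration of the two ideas, not the OpenCV implementation; the function names are hypothetical and the window size is assumed to be odd.

```python
import numpy as np

def box_filter_1d(signal, size):
    """Average each sample with its neighbors, repeating the edge values."""
    half = size // 2
    padded = np.pad(signal, half, mode='edge')
    kernel = np.ones(size) / size
    return np.convolve(padded, kernel, mode='valid')

def median_filter_1d(signal, size):
    """Replace each sample with the median of its neighborhood."""
    half = size // 2
    padded = np.pad(signal, half, mode='edge')
    return np.array([np.median(padded[i:i + size]) for i in range(len(signal))])

signal = np.full(9, 10.0)
signal[4] = 100.0   # a single noisy sample

print(box_filter_1d(signal, 3))
print(median_filter_1d(signal, 3))
```

The box filter spreads the outlier across its neighbors, while the median filter removes it entirely, which is why median blurring is often preferred for speckle noise.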

Example: Background Subtraction ¶

#!/usr/bin/env python
# demo file processor to remove background

import argparse
import cv2 as cv
import rcpcv.demo

bgs = cv.createBackgroundSubtractorMOG2()

#================================================================
def remove_frame_background(frame, args):
    # update the background subtractor and obtain a new foreground mask
    mask = bgs.apply(frame)

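    # convert the mask to a three-channel BGR image and apply it to the source image;
    # a bitwise AND keeps pixels where the mask is 255 and clears them where it is 0
    binary = cv.cvtColor(mask, cv.COLOR_GRAY2BGR)
    return frame & binary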

#================================================================
if __name__ == "__main__":
    parser = argparse.ArgumentParser( description = "Demo to remove background.")
    args = rcpcv.demo.parse_args(parser)
    rcpcv.demo.run_filter(args, remove_frame_background)
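
One detail worth checking when applying a mask to 8-bit pixels is the arithmetic used to combine them. The following numpy sketch assumes a mask holding 0 for background and 255 for foreground (a subtractor may also mark shadows with an intermediate value, which this sketch ignores) and compares a bitwise AND with a plain multiplication.

```python
import numpy as np

# a 2 x 1 BGR image with the same color in both pixels
frame = np.array([[[10, 20, 200]], [[10, 20, 200]]], dtype=np.uint8)

# foreground in the first pixel, background in the second
mask = np.array([[255], [0]], dtype=np.uint8)

# expand the single-channel mask to three channels
mask_bgr = np.repeat(mask[:, :, np.newaxis], 3, axis=2)

kept = frame & mask_bgr      # foreground pixels pass through unchanged
wrapped = frame * mask_bgr   # uint8 multiplication silently wraps modulo 256

print(kept[0, 0], kept[1, 0])
print(wrapped[0, 0], wrapped[1, 0])
```

With uint8 arrays the product of a pixel value and 255 does not fit in eight bits, so multiplication distorts the foreground colors, whereas the bitwise AND preserves them.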

Example: Temporal Smoothing ¶

This sample implements a first-order low-pass filter to smooth a frame sequence over time.

#!/usr/bin/env python
# demo of blending images into an accumulator to smooth across time

import argparse
import cv2 as cv
import rcpcv.demo

# global frame accumulator
average = None

def blend_frame(frame, args):
    global average

    if average is None:
        average = frame * 0

    else:
        average = cv.addWeighted(average, 0.9, frame, 0.1, 0.0)
    return average

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description = "Demo to blend multiple frames.")
    args = rcpcv.demo.parse_args(parser)
    rcpcv.demo.run_filter(args, blend_frame)
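
The blending weights define the recurrence y[n] = 0.9 y[n-1] + 0.1 x[n], applied independently to every pixel. The following scalar sketch shows how such a filter responds when the input jumps from 0 to a constant value; the function name low_pass and the parameter alpha are hypothetical, with alpha=0.1 corresponding to the weights in the sample.

```python
def low_pass(samples, alpha=0.1, initial=0.0):
    """Apply a first-order low-pass filter to a sequence of numbers."""
    state = initial
    output = []
    for sample in samples:
        # blend a fraction alpha of the new sample into the running state
        state = (1.0 - alpha) * state + alpha * sample
        output.append(state)
    return output

if __name__ == "__main__":
    response = low_pass([100.0] * 50)
    for n in (0, 9, 21, 49):
        print(n + 1, round(response[n], 2))
```

Starting from zero, the output after n samples of a constant input x is x(1 - 0.9^n), so the smoothed image takes on the order of ten frames to reflect most of a change. A smaller alpha smooths more strongly but responds more slowly.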

Example: Pixellation ¶

A visual effect to make an image blocky.

#!/usr/bin/env python

import argparse
import cv2 as cv
import rcpcv.demo

def blocky_frame(frame, args):
    rows, cols, channels = frame.shape
    small = cv.resize(frame, dsize=(cols//32,rows//32))
    return cv.resize(small, dsize=(cols, rows), interpolation=cv.INTER_NEAREST)

if __name__ == "__main__":
    parser = argparse.ArgumentParser( description = "Demo to pixellate images.")
    args = rcpcv.demo.parse_args(parser)
    rcpcv.demo.run_filter(args, blocky_frame)
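
A similar blocky effect can be sketched with numpy alone, which may help clarify what the two resize calls accomplish. This is a different technique from the sample: it keeps the top-left pixel of each block instead of interpolating while shrinking, so the colors will not match exactly. The function name blocky is hypothetical.

```python
import numpy as np

def blocky(frame, block=32):
    """Replace each block x block tile with the color of its top-left pixel."""
    rows, cols = frame.shape[:2]
    # keep one sample per block
    small = frame[::block, ::block]
    # repeat each sample to fill its block, then trim to the original size
    large = np.repeat(np.repeat(small, block, axis=0), block, axis=1)
    return large[:rows, :cols]

rng = np.random.default_rng(0)
frame = rng.integers(0, 256, size=(70, 100, 3), dtype=np.uint8)
result = blocky(frame)
print(result.shape)
```

The trimming step handles frame sizes that are not a multiple of the block size, leaving partial blocks along the bottom and right edges.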

Example: Contouring ¶

Identifies and labels a set of contours in an image. This can be an important step for region extraction or object identification.

#!/usr/bin/env python

import argparse
import cv2 as cv
import rcpcv.demo

def contour_frame(frame, args):
    bw = cv.cvtColor(frame, cv.COLOR_BGR2GRAY)
    ret, thresholded = cv.threshold(bw, 127, 255, cv.THRESH_BINARY)
    dilated = cv.dilate(thresholded, None, iterations=0)
    contours, hierarchy = cv.findContours(dilated, cv.RETR_EXTERNAL, cv.CHAIN_APPROX_SIMPLE)
    for idx,contour in enumerate(contours):
        cv.drawContours(frame, contours, idx, (128,255,255))
    return frame

if __name__ == "__main__":
    parser = argparse.ArgumentParser( description = "Demo to find and draw contours.")
    args = rcpcv.demo.parse_args(parser)
    rcpcv.demo.run_filter(args, contour_frame)
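
The contour search operates on a binary image, so the thresholding step largely determines what is found. The following numpy sketch reproduces the binary threshold rule used above, in which pixels strictly greater than the threshold become the maximum value and all others become zero. The function name threshold_binary is hypothetical.

```python
import numpy as np

def threshold_binary(gray, thresh=127, maxval=255):
    """Return maxval where gray exceeds thresh and 0 elsewhere."""
    return np.where(gray > thresh, maxval, 0).astype(np.uint8)

gray = np.array([[0, 127, 128, 255]], dtype=np.uint8)
print(threshold_binary(gray))
```

A fixed threshold of 127 works best with evenly lit scenes; under uneven lighting the threshold value would likely need tuning for the camera and setting.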

Example: Circle Finding ¶

Identifies and labels a set of small circles in an image. This can be an important step for region extraction or object identification.

#!/usr/bin/env python

import argparse
import cv2 as cv
import numpy as np
import rcpcv.demo

def highlight_circles(frame, args):
    gray = cv.cvtColor(frame, cv.COLOR_BGR2GRAY)

    # normalize levels to full range of pixel values
    gray = cv.normalize(gray, None, 0, 255, cv.NORM_MINMAX, dtype=cv.CV_8U)

    # reduce noise
    blur = cv.medianBlur(gray,5)

    # find circles
    circles = cv.HoughCircles(blur,
                              method=cv.HOUGH_GRADIENT,
                              dp=1,        # accumulator has same resolution as the image
                              minDist=8,   # minimum center to center distance
                              param1=200,  # with HOUGH_GRADIENT, Canny edge threshold
                              param2=20,   # with HOUGH_GRADIENT, accumulator threshold
                              minRadius=0,
                              maxRadius=50) # limit to small circles
    if circles is not None:
        # the result seems to be a 1 x num-circles x 3 matrix
        # this reshapes it to a 2D matrix, each row is then [x y r]
        num_circles = circles.shape[1]
        circles = circles.reshape((num_circles, 3))

        # discretize circle positions to integer pixel coordinates
        rounded = np.uint16(np.round(circles))

        # throw away excess circles
        if num_circles > 150:
            rounded = rounded[0:150]

        # draw each circle in frame
        for i in rounded:
            # draw the perimeter in green
            cv.circle(frame,(i[0],i[1]),i[2],(0,255,0),2)

    return frame

if __name__ == "__main__":
    parser = argparse.ArgumentParser( description = "Demo to find circles.")
    args = rcpcv.demo.parse_args(parser)
    rcpcv.demo.run_filter(args, highlight_circles)
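
The reshaping and rounding steps can be examined without a camera by substituting a synthetic array for the detector result. The sketch below assumes the same 1 x num-circles x 3 layout noted in the comments above; the circle values are made up for illustration.

```python
import numpy as np

# synthetic stand-in for a detector result: 1 x num-circles x 3, each entry [x y r]
circles = np.array([[[10.4, 20.6, 5.4], [100.7, 50.2, 12.49]]], dtype=np.float32)

# flatten to a 2D matrix with one [x y r] row per circle
num_circles = circles.shape[1]
rows = circles.reshape((num_circles, 3))

# discretize to integer pixel coordinates
rounded = np.uint16(np.round(rows))

# slicing past the end is harmless, so the limit needs no size check
limited = rounded[0:150]

for row in limited:
    # convert to plain Python integers before doing arithmetic or drawing
    x, y, r = (int(value) for value in row)
    print(x, y, r)
```

Converting to plain integers avoids surprises from unsigned arithmetic, since an expression such as x - r computed on uint16 values would wrap around rather than go negative.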