Category Archives: Uncategorized

Finding rotation, translation and scale between two noisy 3D point sets [w/ code]

Hello everyone. In this post I present my implementation for computing the RIGID rotation, translation and scale between two 3D point sets (code at the bottom!).

The function uses Horn’s absolute orientation method exploiting a quaternion, as explained in his paper. There are also some unit quaternion notes that are interesting for further understanding the math behind it here. Finally, I got inspiration for this from ‘Matt J’ Matlab implementation, that can be found here.

After trying some porting and cleaning of the code, I decided to give it a shot and wrote fresh new code. The function also computes the eigen values and eigen vector of the constructed quaternion, thus in the file are also included two little function for this: one computes the 4 real eigen values of any 4×4 symmetric matrix, the second computes the eigen vector for a given matrix and eigen value.

The input of the function are TWO set of CORRESPONDING 3D points, there is a version using OpenCV cv::Point3f structure and one that takes a vector of float (arranged in the x1,y1,z1 x2,y2,z2,… fashion).

  • Input:
    • src: a vector of 3D points, which must be in float precision
    • dst: correspoinding 3D points to align the first set to
    • out: reference to the output data structure
  • Output:
    • src_to_dst: the src points mapped to fit the dst set
    • rotation: 3×3 rotation matrix between the two points sets
    • translation: 3D translation vector between the two points sets
    • scale: scalar representing the scale between src and dst
    • error_lsq: least squares error
    • error_max: maximum error (euclidean distance between corrispondence points)

On a single thread running on a i7-5700HQ @ 2.70GHz, medium execution time is around 0.01 ms. To provide reference, here is the result of the function against the OpenCV function estimateAffine3D:

horn_example (2)horn_example (1)

Here the code file.

Advertisements

Statistical Outlier Removal with OpenCV and FLANN [w/ code]

Hello everyone: as I often manage 3D point clouds in my software, I found helpful filtering them with a statistical outlier removal filter (SOR) as the one found in the famous PCL libraries.

This little function has various parameters:

  • Input: a vector of 3d points, which must be in float precision.
  • Output: a vector of boolean indicating whether or not a point is an outlier (it is when the corresponding boolean value is FALSE)
  • Variables:
    • nn -> Nearest Neighbor to be computed
    • std_threshold -> the multiplication factor of the standard deviation
    • app_steps -> number of time that the filter must run, useful when dealing with very noisy data (use a high threshold and repeat the filter multiple times)
    • exact -> whether or not the computation of nearest neighbor must be perfect or can be done with the FLANN library (thus obtaining a huge speed up)
    • flann_checks -> number of time the flann routine will check to be in the right path: it can be used as a parameter to improve the results of the nearest neighbor, that is not exact most of the time.

I’ve run the code for various amount of points, finding out that with the predefined variables (1 filter step and 160 as a flann_check), when having less than 1400 points, the exact option gives best speed performance and assures a better solution.

Here the code file and the output of the exact versus the flann search.

std::vector<bool> SOR(std::vector<cv::Point3f> p, int nn, double std_threshold, int app_steps = 1, bool exact = true, int flann_checks = 160)
{
 std::vector<bool> ret(p.size());
 std::vector<float> dist1(p.size());
 std::vector<float> dist2(p.size());

 std::vector<int> indices(nn + 1);
 std::vector<float> distances(nn + 1);
 cv::Mat point(1, 3, CV_32FC1);

 cv::flann::Index kd_tree(cv::Mat(p).reshape(1), cv::flann::KDTreeIndexParams(), cvflann::FLANN_DIST_L2);

 if (exact)
 {
 for (int i = 0; i < p.size(); i++)
 {
 ret[i] = true;

 for (int j = 0; j < p.size(); j++)
 {
 if (j == i)
 dist2[j] = 0.0f;
 else
 dist2[j] = cv::norm(p[i] - p[j]);
 }

 std::sort(dist2.begin(), dist2.end());

 dist1[i] = 0.0;
 for (int j = 1; j < nn + 1; j++)
 dist1[i] += dist2[j];
 dist1[i] /= nn;
 }
 }
 else
 {
 for (int i = 0; i < p.size(); i++)
 {
 ret[i] = true;
 point.at<float>(0) = p[i].x;
 point.at<float>(1) = p[i].y;
 point.at<float>(2) = p[i].z;
 kd_tree.knnSearch(point, indices, distances, nn + 1, cv::flann::SearchParams(flann_checks));

 dist1[i] = 0.0f;
 for (int j = 1; j < nn + 1; j++)
 dist1[i] += sqrtf(distances[j]);
 dist1[i] /= (float)nn;
 }
 }

 for (int j = 0; j < app_steps; j++)
 {
 cv::Mat mean_m, stddev_m;
 cv::meanStdDev(dist1, mean_m, stddev_m, ret);
 float distance_threshold = mean_m.at<double>(0) + std_threshold * stddev_m.at<double>(0);

 for (int i = 0; i < p.size(); i++)
 {
 if (ret[i])
 ret[i] = dist1[i] < distance_threshold;
 }
 }

 return ret;
}

 

OpenCV C++ Video to Images [/w code]

Hello, here I would like to present you an easy way to perform conversion from videos (in my case .mp4, but works with any format supported by OpenCV) to images.

You can either compile it and modify it as you please, or download and use this windows binaries (note you will need Microsoft c++ Redistributable installed)

Download binaries or sources!

Sample code follows:

// Alvise Memo, 12/02/2016

#include <opencv2\opencv.hpp>

int main(int argc, char ** argv)
{
 std::cout << "VideoToPNGs w/ OpenCV, Alvise Memo, 12/02/2016" << std::endl;

 if (argc < 3)
 {
 std::cout << "Usage: VideoToPNGs.exe pathtovideoinput.avi pathtooutput" << std::endl;
 std::getchar();
 return 1;
 }

 std::string input_file = argv[1];
 std::string output_folder = argv[2];
 
 std::cout << "Opening input file: " << input_file << std::endl;
 cv::VideoCapture vc;
 vc.open(input_file);
 assert(vc.isOpened() && "Input file DID NOT open correctly");

 cv::Mat input_frame;
 int frame_index = 0;
 while(vc.read(input_frame))
 {
 std::cout << "Reading frame " << frame_index << std::flush;

 std::string frame_name = std::to_string(frame_index);
 while (frame_name.length() < 5) frame_name = "0" + frame_name;

 bool saving = cv::imwrite(output_folder + "\\" + input_file + "_" + frame_name + ".bmp", input_frame);

 std::cout << " saved (" << saving << ")!" << std::endl;
 frame_index++;
 }
 vc.release();

 std::cout << "Exported " << frame_index << "frame/s.\nEnd." << std::endl;

 std::getchar();
 return 1;
}

 

Google HashCode 2016, Qualification Round [w/ code]

Hello everyone!

Alvise here, I wanted to post online my solution for the qualification round of the 2016 Google HashCode challenge that took place yesterday (consider leaving a feedback as a comment) !

First things first: if you havent, HERE is the PDF with the instructions that were given us, if you did not took part in the challenge, I recommend you try it out.

Few words on the solution then: me and my team tried VERY hard to get the big picture and to produce a decent solution, but in the end this was what killed us: the right approach would have been an incremental solution: start easy, start VERY easy, and then go for it. The few points we got where thanks to Paolo Montesel: after few seconds of the deadline we were able to get a nicer result, but of course, time was out.

Lets talk about the logic behind what would have been a good starting point:

  1. At each time-step, for each order select the most appropriate warehouse (distance > object availability)
  2. For each drone, if the drone is free (not busy doing something), select the warehouse closest to him, and the most easy order from that warehouse
  3. Deliver that order, make the drone busy for the right amount of turns

Pretty easy, no optimization needed, and a solution with this steps would have allowed you to get in the 300th position. After this, become optimizing which warehouse for which drone, selecting always the shortest order, try to finish orders sooner as possible, etc..

Here the resulting scores of the code you will find at the bottom:

score

The code is available here:


#!/usr/bin/env python

"""
Google #Hashcode 2016, Qualification Round, Team #############

Beware, this is the worst code I've written in my
entire life. It's also full of bugs and hacks.
- Paolo MONTESEL
"""

import math
import random
import sys

history = list()

class Position:
    def __init__(self, y, x):
        self.y = y
        self.x = x

    def distance(self, obj):
        return distance(self.y, self.x, obj.y, obj.x)

class Drone(Position):
    def __init__(self, name, y, x, max_weight, max_item):
        self.name = name
        self.y = y
        self.x = x
        self.max_weight = max_weight
        self.weight = 0
        self.loaded_items = [0] * max_item
        self.loaded_weight = 0
        self.waits = 0

    def __str__(self):
        return "<drone {} {}>".format(self.y, self.x)

    def __repr__(self):
        return self.__str__()

    def deliver(self, order):
        wait_time = self.distance(order) + 1

        for item_id, item_count in enumerate(self.loaded_items):
            if item_count <= 0: continue order.items[item_id] -= self.loaded_items[item_id] history.append("{} D {} {} {}".format(self.name, order.name, item_id, self.loaded_items[item_id])) self.loaded_items[item_id] = 0 # break self.loaded_weight = self.update_loaded_weight() self.waits += wait_time def load(self, item_id, item_count, order): closest_warehouse_distance = None closest_warehouse = None calc_weight = weights[item_id] * item_count if calc_weight + self.loaded_weight >= self.max_weight:
            return False

        for w in warehouses:
            if w.can_fulfill(item_id, item_count):
                if closest_warehouse is None or self.distance(w) < closest_warehouse_distance: closest_warehouse = w closest_warehouse_distance = self.distance(w) if closest_warehouse is not None: self.loaded_weight += calc_weight self.waits += self.distance(closest_warehouse) + 1 closest_warehouse.reserve(item_id, item_count) self.load_item(item_id, item_count) history.append("{} L {} {} {}".format(self.name, closest_warehouse.name, item_id, item_count)) for added_item_id in xrange(KINDS): added_item_count = order.items[added_item_id] added_weight = added_item_count * weights[added_item_id] if added_item_id != item_id and added_item_count > 0 and closest_warehouse.can_fulfill(added_item_id, added_item_count) > 0 and \
                                        added_weight + self.loaded_weight < self.max_weight: self.loaded_weight += added_weight self.load_item(added_item_id, added_item_count) closest_warehouse.reserve(added_item_id, added_item_count) history.append("{} L {} {} {}".format(self.name, closest_warehouse.name, added_item_id, added_item_count)) return True return False def load_item(self, item_id, item_count): self.loaded_items[item_id] += item_count def step(self, t, orders): if self.waits > 0:
            self.waits -= 1
            return

        found = False
        for order in orders:
            for item_id, item_count in enumerate(order.items):
                if item_count > 0 and self.load(item_id, item_count, order):
                    found = True
                    self.deliver(order)
                    break
            if found:
                break

    def update_loaded_weight(self):
        ret = 0
        for item_id, item_count in enumerate(self.loaded_items):
            ret += item_count * weights[item_id]

        return ret


class Warehouse(Position):
    def __init__(self, name, y, x, max_items):
        self.name = name
        self.y = y
        self.x = x
        self.items = [0] * max_items

    def add_item(self, item_id, item_count):
        self.items[item_id] = item_count

    def can_fulfill(self, item_id, item_count):
        return self.items[item_id] >= item_count

    def reserve(self, item_id, item_count):
        self.items[item_id] -= item_count

        if self.items[item_id] < 0:
            exit(-1)

    def __str__(self):
        return "<WH {} {} {}>".format(self.y, self.x, self.items)

    def __repr__(self):
        return self.__str__()


class Order(Position):
    def __init__(self, name, y, x, max_items):
        self.name = name
        self.y = y
        self.x = x
        self.items = [0] * max_items

    def __str__(self):
        return "<Order {} {} {} {}>".format(self.y, self.x, len(self.items), self.items)

    def __repr__(self):
        return self.__str__()

    def add_item(self, kind, count):
        if kind not in self.items:
            self.items[kind] = 0
        self.items[kind] += count


def parse_line(line):
    return map(int, line.strip().split())


def distance(r1, c1, r2, c2):
    return math.ceil(math.sqrt((r1- r2) ** 2 + (c1 - c2) ** 2))

in_file = open(sys.argv[1], "rb")

M, N, DRONES, TURNS, MAX_LOAD = parse_line(in_file.readline())
KINDS = parse_line(in_file.readline())[0]

print M, N, DRONES, TURNS, MAX_LOAD
print "KINDS:", KINDS

weights = parse_line(in_file.readline())
print "WEIGHTS:", len(weights)

WAREHOUSES = parse_line(in_file.readline())[0]

print "WAREHOUSES:", WAREHOUSES

warehouses = list()
for i in xrange(WAREHOUSES):
    row, col = parse_line(in_file.readline())
    warehouse = Warehouse(i, row, col, KINDS)

    items_count = parse_line(in_file.readline())
    for item_id, item_count in enumerate(items_count):
        warehouse.add_item(item_id, item_count)

    warehouses.append(warehouse)

ORDERS = parse_line(in_file.readline())[0]
print "ORDERS:", ORDERS

orders = list()
for i in xrange(ORDERS):
    row, col = parse_line(in_file.readline())
    num_items = parse_line(in_file.readline())[0]
    kinds = parse_line(in_file.readline())

    order = Order(i, row, col, KINDS)
    for kind in kinds:
        order.add_item(kind, 1)

    orders.append(order)

drones = list()
for i in xrange(DRONES):
    drones.append(Drone(i, warehouses[0].y, warehouses[0].x, MAX_LOAD, KINDS))

try:
    for turn in xrange(TURNS):
        random.shuffle(drones)
        for drone in drones:
            drone.step(turn, orders)

        out = open(sys.argv[1] + ".out", "wb")
        out.write("{}\n".format(len(history)))
        for h in history:
            out.write(h + "\n")
        out.close()

        for i, order in enumerate(orders[:]):
            if sum(order.items) == 0:
                orders.remove(order)

        if turn % 100 == 0:
            print "TURN:", turn
            print len(orders), sum([1 if drone.waits > 0 else 0 for drone in drones])

        if len(orders) <= 0:
            break

except KeyboardInterrupt:
    pass

 

Thanks again to my team, here their contacts!

mail

See you soon and if you want, leave a comment on how your #hashcode2016 went!

12729301_10206716077713709_1231853888550121293_n

Google HashCode 2016, Practice Round [w/ Code]

Hello everyone!

Alvise here, I wanted to post online my solution for the practice round of the 2016 Google HashCode challenge (consider leaving a feedback as a comment) !

I didn’t develop any particular solution, mine is very straight forward, so nothing fancy sorry! I will upload my solution but I strongly recommend you try your own heading over at the challenge page:

https://hashcode.withgoogle.com/

The problem was controlling a robot with few instructions (paint square or straight lines or erasing a single cell) to paint a given input image with the fewest instructions to the robot. The fewer instructions the better.
My first approach was like:

  1. Fit squares, from large to small
  2. Fit lines, large to small (same dimension for horizontal and vertical)
  3. Fit remaining single cells

And then I improved like:

  1. Fit squares, from large to medium
  2. Fit lines, large to medium (same dimension for horizontal and vertical)
  3. Fit special squares (squares 3×3 with a single missing value, resulting in 2 commands to be drawn: a paint_square + erase_cell)
  4. Fit squares, from medium to small
  5. Fit lines, medium to small (same dimension for horizontal and vertical)
  6. Fit remaining single cells

The fitting of squares and lines is done by using a 2d convolutional linear filter: kernels are represented by squares, columns or rows, with each cell of the kernal having a vlua of cell_v = 1 / kernel_size. Applied to the whole image (with border wrapping constant, meaning 0 when the filter look for data outside the image) the resulting filtered image has values equal to 1 (255 using unsigned char values) only where those filers finds filling squares or lines.

Here the resulting scores:

results

And the output of the program at execution time:

run

The code is available here:

// Alvise Memo, 10/2/2016. Submission for the Google Hashcode challenge 2016.
// Practice problem.
// The 13 is my birthday!!

#include <iostream>
#include <fstream>
#include <string>

// OpenCV include, for 2d filtering, debug visualization and loading.
#include <opencv2\opencv.hpp>

using namespace std;
using namespace cv;

// Loading input .in files into OpenCV Mat objects, straight-forward
Mat load(string s)
{
 string line;
 ifstream myfile(s);
 Mat r;
 bool first = true;
 if (myfile.is_open())
 {
 int n, m;
 myfile >> n >> m;
 r = Mat::zeros(n, m, CV_8UC1);

 getline(myfile, line);

 int row = 0;
 while (getline(myfile, line))
 {
 const char * c = line.c_str();

 for (int i = 0; i < m; i++)
 {
 if (c[i] == '#')
 {
 r.at<unsigned char>(row, i) = 255;
 }
 }
 row++;

 }
 myfile.close();
 }
 return r;
}

// This function look in the image for SQUARES of dimension starting from the biggest possible (determined
// by the size of the input image) UP TO a maximum dimension of N (not included!)
int fitSquaresUpTo(Mat & m, Mat & m2, int n, vector<string> & comm)
{
 int max_s = (m.rows < m.cols ? m.rows : m.cols);
 max_s += max_s % 2 - 1;

 if (max_s > 25)
 max_s = 25;


 int ret = 0;

 for (int i = max_s; i > n; i -= 2)
 {
 int k_size = i;
 Mat kernel = Mat::ones(i, i, CV_32FC1);
 kernel /= i * i;

 Mat out;
 filter2D(m, out, -1, kernel, Point(-1, -1), 0, BORDER_CONSTANT);



 double max = 255;
 int pos[2];
 while (max == 255)
 {
 minMaxIdx(out, NULL, &max, NULL, pos);

 if (max == 255)
 {
 Rect r = Rect(pos[1] - i / 2, pos[0] - i / 2, i, i);
 rectangle(out, r, Scalar(0), -1);
 rectangle(m, r, Scalar(0), -1);
 rectangle(m2, r, Scalar(0, 0, 255), -1);
 comm.push_back("PAINT_SQUARE " + to_string(pos[0]) + " " + to_string(pos[1]) + " " + to_string((i - 1) / 2));
 ret++;
 }
 }
 }

 //cout << ret << endl;

 return ret;
}

// This function look in the image for SQUARES of dimension starting from N and
// going down to the minimum value of 3 (less than 3 would be a SINGLE cell/pixel)
int fitSquaresDownFrom(Mat & m, Mat & m2, int n, vector<string> & comm)
{
 int max_s = n;
 max_s += max_s % 2 - 1;

 if (max_s > 25)
 max_s = 25;


 int ret = 0;

 for (int i = max_s; i > 2; i -= 2)
 {
 int k_size = i;
 Mat kernel = Mat::ones(i, i, CV_32FC1);
 kernel /= i * i;

 Mat out;
 filter2D(m, out, -1, kernel, Point(-1, -1), 0, BORDER_CONSTANT);



 double max = 255;
 int pos[2];
 while (max == 255)
 {
 minMaxIdx(out, NULL, &max, NULL, pos);

 if (max == 255)
 {
 Rect r = Rect(pos[1] - i / 2, pos[0] - i / 2, i, i);
 rectangle(out, r, Scalar(0), -1);
 rectangle(m, r, Scalar(0), -1);
 rectangle(m2, r, Scalar(0, 0, 255), -1);
 comm.push_back("PAINT_SQUARE " + to_string(pos[0]) + " " + to_string(pos[1]) + " " + to_string((i - 1) / 2));
 ret++;
 }
 }
 }

 //cout << ret << endl;

 return ret;
}

// Special function: looks for squares of dimension 3x3, with a SINGLE missing cell:
// it then adds two commands: one for drawing the full square and one for removing the missing cell.
int fitSquaresSpecial(Mat & m, Mat & m2, vector<string> & comm)
{
 int ret = 0;
 int i = 3;

 Mat kernel = Mat::ones(i, i, CV_32FC1);
 kernel /= i * i;

 Mat out;
 filter2D(m, out, -1, kernel, Point(-1, -1), 0, BORDER_CONSTANT);

 double max = 255;
 int pos[2];
 while (max > 225)
 {
 minMaxIdx(out, NULL, &max, NULL, pos);

 if (max > 225) // cubo da 9 meno spicchio
 {
 // Find missing cell position, LAZY!
 Point missing(-1, -1);
 for (int x = 0; x < 3; x++)
 {
 for (int y = 0; y < 3; y++)
 {
 if (m.at<unsigned char>(pos[0] - i / 2 + y, pos[1] - i / 2 + x) == 0)
 {
 assert(missing.x == -1);
 missing.x = pos[1] - i / 2 + x;
 missing.y = pos[0] - i / 2 + y;
 }
 }
 }
 if (missing.x == -1)
 {
 Rect r = Rect(pos[1] - i / 2, pos[0] - i / 2, i, i);
 rectangle(out, r, Scalar(0), -1);
 rectangle(m, r, Scalar(0), -1);
 rectangle(m2, r, Scalar(0, 0, 255), -1);
 comm.push_back("PAINT_SQUARE " + to_string(pos[0]) + " " + to_string(pos[1]) + " " + to_string((i - 1) / 2));
 ret++;
 }
 else
 {
 Rect r = Rect(pos[1] - i / 2, pos[0] - i / 2, i, i);
 rectangle(out, r, Scalar(0), -1);
 rectangle(m, r, Scalar(0), -1);
 rectangle(m2, r, Scalar(0, 0, 125), -1);
 rectangle(m2, Rect(missing.x, missing.y, 1, 1), Scalar(125, 125, 125));
 comm.push_back("PAINT_SQUARE " + to_string(pos[0]) + " " + to_string(pos[1]) + " " + to_string((i - 1) / 2));
 comm.push_back("ERASE_CELL " + to_string(missing.y) + " " + to_string(missing.x));
 ret += 2;
 }
 }
 }


 //cout << ret << endl;

 return ret;
}

// Same as before, fits horizontal and vertical lines instead of squares. Look fitSquaresUpTo
int fitLinesUpTo(Mat & m, Mat & m2, int n, vector<string> & comm)
{
 int max_s = (m.rows < m.cols ? m.cols : m.rows);
 max_s += max_s % 2 - 1;

 int ret = 0;

 for (int i = max_s; i >= n; i--)
 {
 {
 Mat kernel = Mat::ones(1, i, CV_32FC1);
 kernel /= i;

 Mat out;
 filter2D(m, out, -1, kernel, Point(-1, -1), 0, BORDER_CONSTANT);

 double max = 255;
 int pos[2];
 while (max == 255)
 {
 minMaxIdx(out, NULL, &max, NULL, pos);

 if (max == 255)
 {
 Point r1 = Point(pos[1] - i / 2, pos[0]);
 Point r2 = Point(pos[1] + i / 2 - (1 - i % 2), pos[0]);

 line(out, r1, r2, Scalar(0));
 line(m, r1, r2, Scalar(0));
 line(m2, r1, r2, Scalar(0, 125, 125));
 comm.push_back("PAINT_LINE " + to_string(r1.y) + " " + to_string(r1.x) + " " + to_string(r2.y) + " " + to_string(r2.x));
 ret++;
 }
 }
 }
 if (i < m.rows)
 {
 Mat kernel = Mat::ones(i, 1, CV_32FC1);
 kernel /= i;

 Mat out;
 filter2D(m, out, -1, kernel, Point(-1, -1), 0, BORDER_CONSTANT);

 double max = 255;
 int pos[2];
 while (max == 255)
 {
 minMaxIdx(out, NULL, &max, NULL, pos);

 if (max == 255)
 {
 Point r1 = Point(pos[1], pos[0] - i / 2);
 Point r2 = Point(pos[1], pos[0] + i / 2 - (1 - i % 2));

 line(out, r1, r2, Scalar(0));
 line(m, r1, r2, Scalar(0));
 line(m2, r1, r2, Scalar(125, 125, 0));
 comm.push_back("PAINT_LINE " + to_string(r1.y) + " " + to_string(r1.x) + " " + to_string(r2.y) + " " + to_string(r2.x));
 ret++;
 }
 }
 }
 }


 //cout << ret << endl;

 return ret;
}

// Same as before, fits horizontal and vertical lines instead of squares. Look fitSquaresDownFrom
int fitLinesDownFrom(Mat & m, Mat & m2, int n, vector<string> & comm)
{
 int max_s = n;
 max_s += max_s % 2 - 1;

 int ret = 0;

 for (int i = max_s; i > 1; i--)
 {
 {
 Mat kernel = Mat::ones(1, i, CV_32FC1);
 kernel /= i;

 Mat out;
 filter2D(m, out, -1, kernel, Point(-1, -1), 0, BORDER_CONSTANT);

 double max = 255;
 int pos[2];
 while (max == 255)
 {
 minMaxIdx(out, NULL, &max, NULL, pos);

 if (max == 255)
 {
 Point r1 = Point(pos[1] - i / 2, pos[0]);
 Point r2 = Point(pos[1] + i / 2 - (1 - i % 2), pos[0]);

 line(out, r1, r2, Scalar(0));
 line(m, r1, r2, Scalar(0));
 line(m2, r1, r2, Scalar(0, 125, 125));
 comm.push_back("PAINT_LINE " + to_string(r1.y) + " " + to_string(r1.x) + " " + to_string(r2.y) + " " + to_string(r2.x));
 ret++;
 }
 }
 }
 if (i < m.rows)
 {
 Mat kernel = Mat::ones(i, 1, CV_32FC1);
 kernel /= i;

 Mat out;
 filter2D(m, out, -1, kernel, Point(-1, -1), 0, BORDER_CONSTANT);

 double max = 255;
 int pos[2];
 while (max == 255)
 {
 minMaxIdx(out, NULL, &max, NULL, pos);

 if (max == 255)
 {
 Point r1 = Point(pos[1], pos[0] - i / 2);
 Point r2 = Point(pos[1], pos[0] + i / 2 - (1 - i % 2));

 line(out, r1, r2, Scalar(0));
 line(m, r1, r2, Scalar(0));
 line(m2, r1, r2, Scalar(125, 125, 0));
 comm.push_back("PAINT_LINE " + to_string(r1.y) + " " + to_string(r1.x) + " " + to_string(r2.y) + " " + to_string(r2.x));
 ret++;
 }
 }
 }
 }


 //cout << ret << endl;

 return ret;
}

// Look in the image for a SINGLE cell, a square with dimension of 1x1
int fitSingles(Mat & m, Mat & m2, vector<string> & comm)
{
 int ret = 0;

 for (int x = 0; x < m.cols; x++)
 {
 for (int y = 0; y < m.rows; y++)
 {
 if (m.at<unsigned char>(y, x) == 255)
 {
 m.at<unsigned char>(y, x) = 0;
 m2.at<Vec3b>(y, x) = Vec3b(255, 0, 0);
 comm.push_back("PAINT_SQUARE " + to_string(y) + " " + to_string(x) + " 0");
 ret++;
 }
 }
 }
 return ret;
}

// Analyze script version 1
int analyzeVersion1(std::string name)
{
 Mat m_c = load(name);
 cv::imwrite(name + "input.png", m_c);
 
 Mat m_r = Mat::zeros(m_c.size(), CV_8UC3);
 vector<string> commands;

 int points = m_c.rows * m_c.cols;

 points -= fitSquaresUpTo(m_c, m_r, 2, commands);
 points -= fitLinesUpTo(m_c, m_r, 2, commands);
 points -= fitSingles(m_c, m_r, commands);

 cout << points << endl;

 
 cv::imwrite(name + "solved.png", m_r);

 ofstream fileOut;
 fileOut.open(name + "comm.txt");
 fileOut << commands.size() << "\n";
 for each (string s in commands)
 fileOut << s << "\n";
 fileOut.close();

 return points;
}

// Analyze script version 2
int analyzeVersion2(std::string name, int i, int j)
{
 Mat m_c = load(name);
 cv::imwrite(name + "input.png", m_c);
 
 Mat m_r = Mat::zeros(m_c.size(), CV_8UC3);
 vector<string> commands;

 int points = m_c.rows * m_c.cols;

 points -= fitSquaresUpTo(m_c, m_r, i, commands);
 points -= fitLinesUpTo(m_c, m_r, j, commands);

 points -= fitSquaresSpecial(m_c, m_r, commands);

 points -= fitSquaresDownFrom(m_c, m_r, i, commands);
 points -= fitLinesDownFrom(m_c, m_r, j, commands);

 points -= fitSingles(m_c, m_r, commands);

 cout << points << endl;

 cv::imwrite(name + "solved.png", m_r);

 ofstream fileOut;
 fileOut.open(name + "comm.txt");
 fileOut << commands.size() << "\n";
 for each (string s in commands)
 fileOut << s << "\n";
 fileOut.close();

 return points;
}

// Entry of the software
int main(int argc, char ** argv)
{
 std::cout << "Practice test." << std::endl;
 std::cout << "The 13/02 is my birthday!!" << std::endl;

 int total = 0;

 std::cout << "Analyze learn_and_teach.in" << std::endl;
 total += analyzeVersion2("learn_and_teach.in", 9, 4);

 std::cout << "Analyze logo.in" << std::endl;
 total += analyzeVersion1("logo.in");

 std::cout << "Analyze right_angle.in" << std::endl;
 total += analyzeVersion2("right_angle.in", 5, 3);

 cout << total << endl;

 std::getchar();
}

 

Output (Red: square, Senape:horizontal line, Acqua:vertical line, Blue: single square, Grey:erased cell)

right_angle.ininput right_angle.insolved

learn_and_teach.ininput learn_and_teach.insolved

logo.ininputlogo.insolved

 

Head Tracking 3D with OpenCV

Hello, in this post I want to present a small project developed with OpenCV for a simple head tracking with a normal webcam.
By head tracking I mean that the software tracks the user head movement in the image taken by a camera connected to the computer (could be a webcam or other sources). The tracking is actually a very fast and continuous search for faces. With the acquired position of the tracked head, a 3D render is called and is used to draw on the monitor a ‘fake’ window to give the tracked user the illusion of a proper projection.

The tracking and filtering of the data is done with OpenCV, the rendering with OpenGL/FreeGLUT/GLEW.

The software workflow is as follow:

  1. Initialization, where the Kalman filters used for smoothing the head tracking position, loading the 3D model rendered and the rest of the environment
  2. From this moment on, the 3D loop is activated, and does not communicate with the tracking side at all, except for the “SetCam” function, which specify where the user head is, and update the viewing frustum accordingly.
  3. The tracking loop instead opens the webcam stream, and starts tracking the first face found.
    • When a face is found, a second classification occours in the described area to look for two eyes, and to compute a synthetic distance of the user from the camera.
    • A kalman filter is then applied both to the x-y values, and also to the z value found before
    • Finally, values are communicated to the render for updating the projection matrix.

Here a video of the result:

Download link to the binaries, to try this on your computer (Windows only sorry):

Download binaries

And to download the full visual studio project (VS2012) with all included libraries:

Download sources

In the code I will provide, a lot of simplification and optimization can be done, but for my purpose this was good. For any question please feel free to drop me an email.