Face Recognition in the Dark ∗
†Equinox Corporation ‡Equinox Corporation
{andrea,diego}@equinoxsensors.com
Abstract

Previous research has established thermal infrared imagery of faces as a valid biometric and has shown high recognition performance in a wide range of scenarios. However, all these results have been obtained using eye locations that were either manually marked, or automatically detected in a coregistered visible image, making the realistic use of thermal infrared imagery alone impossible. In this paper we present the results of an eye detector on thermal infrared imagery and we analyze its impact on recognition performance. Our experiments show that although eyes cannot be detected as reliably in thermal images as in visible ones, some face recognition algorithms can still achieve adequate performance.

Introduction

Over the last few years, there has been a surge of interest in face recognition using thermal infrared imagery. While the volume of literature on the subject is notably smaller than that related to visible face recognition, there is nonetheless a steady stream of research [1, 2, 3, 4, 5]. Although they mostly relied on databases limited in size and variability, these papers have established that thermal imagery of human faces constitutes a valid biometric signature. Thermal imagery has shown superior performance over visible imagery with a variety of algorithms [1, 6]. More recently, time-lapse recognition results were reported in [4, 7]. Results in an operational scenario are presented in [...]

A necessary step toward automated face recognition, in any modality, is the detection of faces and facial features. Face detection in the thermal infrared has been reported in [9]. The most common facial features required by face recognition algorithms are the locations of both eyes. These are normally used to align the detected faces to a standard template prior to more complex feature extraction and comparison. To date, thermal infrared face recognition algorithms have relied on eye locations that were either manually marked by a human operator [3, 6, 1, 4, 7], or automatically detected in a coregistered visible image [8]. The results of such studies cannot be extrapolated to situations such as real-time face recognition at night, where a coregistered visible image may be unavailable or heavily [...]

In this paper, we present results of a detection algorithm applied to thermal infrared eye images. We compare these results with ground-truth data as well as with results obtained by a visible eye detector and we observe that although the localization error is larger in the thermal infrared than in the visible, the distance between the detected and the actual eye center location stays within 15% of the eye size in both modalities.

Previous research [4] has shown that face recognition performance with thermal imagery is much more sensitive to eye location errors than its visible counterpart. As reported in that study, correct recognition rates using the PCA algorithm with Mahalanobis angle distance drop significantly when the eye locations are randomly perturbed in a 3x3 pixel window centered at the manually-located eye center. The reported performance drop is considerably larger than that suffered when perturbing eye locations and performing recognition with visible imagery. The authors therefore conclude that face recognition from thermal imagery is inherently more sensitive to registration errors. That conclusion is not supported by our study.

We use visible and thermal infrared eye detection algorithms to geometrically normalize face images before applying two recognition algorithms, PCA using Mahalanobis angle distance and the Equinox algorithm. We analyze the impact of eye localization errors on recognition performance in both visible and thermal infrared images of varying difficulty, indoors and outdoors. We observe that when the eye detection error increases, recognition performance decreases more abruptly in the case of the weaker PCA algorithm and stays within reasonable bounds for the better performing Equinox algorithm. Contrary to [4], our finding is that visible and thermal infrared performance decrease by approximately the same amount. This may be due to the increased difficulty of the visible image set with respect to the [...]

∗ This research was supported in part by the DARPA Human Identification at a Distance (HID) program, contract # DARPA/AFOSR F49620-01-C-0008.
¹ Without the use of external illumination.

The Eye Detection Algorithms

In order to detect eyes in thermal images, we rely on the face location detected using the face detection and tracking algorithm in [9, 10]. We then look for the eye locations in the upper half of the face area using a slightly modified version of the object detector provided in the Intel Open Computer Vision Library [11].

Before detection we apply an automatic gain control algorithm to the search area. Although LWIR images are 12 bit, the temperature of different areas of a human face has a range of only a few degrees and thus is represented by at most 6 bits. We improve the contrast in the eye region by mapping the pixels in the interest area to an 8 bit interval, [...]

The detection algorithm is based on the rapid object detection scheme using a boosted cascade of simple feature classifiers introduced in [12] and extended in [13]. The OpenCV version of the algorithm [14] extends the haar-like features by an efficient set of 45 degree rotated features and uses small decision trees instead of stumps as weak classifiers. Since we know that there is one and only one eye on the left and right halves of the face, we force the algorithm to return the best guess regarding its location. Figure 1 shows an example of face and eyes detected in a thermal infrared image.

Figure 1: Automatic detection of the face and eyes in a thermal infrared image.

The drawback of the algorithm, and of eye detection in thermal infrared in general, is that it fails to detect the eye center locations for subjects wearing glasses. Glasses are opaque in the thermal infrared spectrum and therefore show up black in thermal images, blocking the view of the eyes (see Figure 2). In these images the glasses can be easily segmented and the eye center location can be inferred from the shape of the lens. Unfortunately, the errors incurred by such inference are rather large. For the experiments reported in this paper, only images of subjects without glasses were used. Proper normalization of thermal images of subjects wearing glasses is an area of active research, published results on which are forthcoming.

Figure 2: Thermal infrared image of a person wearing glasses.

We do not use the OpenCV object detection algorithm for eye detection in visible images. While this method does work reasonably well, we can obtain better localization results with the algorithm outlined below. This is simply because we can take advantage of clear structure within the eye region and model it explicitly, rather than depend on a generic object detector. We simply search for the center of the pupil of the open eye. The initial search area relies again on the position of the face as returned by a face detector [9]. Within this region, we look for a dark circle surrounded by lighter background using an operator similar to the Hough transform widely used for detection in the iris recognition community [15]:

\[
\max_{(r,\,x_0,\,y_0)} \left| \, G_\sigma(r) * \frac{\partial}{\partial r} \oint_{r,\,x_0,\,y_0} \frac{I(x,y)}{2\pi r} \, ds \, \right|
\]

This operator searches over the image domain (x, y) for the maximum in the smoothed partial derivative with respect to increasing radius r, of the normalized contour integral of I(x, y) along a circular arc ds of radius r and center coordinates (x0, y0). The symbol ∗ denotes convolution and Gσ(r) is a smoothing function such as a Gaussian of scale σ.
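As a rough illustration of the circular operator described in this section for visible images, the following Python sketch evaluates a discrete version of it by brute force. It is not the authors' implementation. The function names `circle_mean` and `find_pupil`, the number of angular samples, the exhaustive search over candidate centers, and the truncated Gaussian kernel are all assumptions made for this example. The contour integral is approximated by the mean intensity over sampled circle points, the radial derivative by finite differences, and the smoothing function by a short normalized Gaussian.

```python
import numpy as np


def circle_mean(img, x0, y0, r, n_samples=64):
    """Mean intensity on a circle: a discrete stand-in for the
    normalized contour integral of I(x, y) at radius r around (x0, y0)."""
    t = np.linspace(0.0, 2.0 * np.pi, n_samples, endpoint=False)
    xs = np.rint(x0 + r * np.cos(t)).astype(int)
    ys = np.rint(y0 + r * np.sin(t)).astype(int)
    return float(img[ys, xs].mean())


def find_pupil(img, radii, sigma=1.0):
    """Brute-force search for the circle (x0, y0, r) that maximizes the
    smoothed radial derivative of the circular mean intensity.

    Hypothetical helper written for this sketch. `radii` must be an
    increasing sequence with at least three values.
    Returns (x0, y0, r, score), or None if the image is too small.
    """
    img = np.asarray(img, dtype=float)
    radii = np.asarray(radii, dtype=float)
    h, w = img.shape
    # Keep every sampled circle fully inside the image.
    margin = int(np.ceil(radii.max())) + 1

    # Normalized Gaussian kernel over the radius axis, truncated so it is
    # never longer than the list of radii.
    half = max(1, min(int(3 * sigma), (len(radii) - 1) // 2))
    offsets = np.arange(-half, half + 1)
    kernel = np.exp(-0.5 * (offsets / sigma) ** 2)
    kernel /= kernel.sum()

    best = None
    for y0 in range(margin, h - margin):
        for x0 in range(margin, w - margin):
            means = np.array([circle_mean(img, x0, y0, r) for r in radii])
            deriv = np.gradient(means, radii)          # d/dr of the circular mean
            smooth = np.convolve(deriv, kernel, mode="same")
            i = int(np.argmax(np.abs(smooth)))
            score = float(abs(smooth[i]))
            if best is None or score > best[3]:
                best = (x0, y0, float(radii[i]), score)
    return best
```

In practice the candidate centers would be limited to the eye search area derived from the detected face rather than scanned over the whole image, which also keeps the brute-force search affordable.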
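The automatic gain control step applied to thermal images earlier in this section can be pictured with a simple linear stretch. The sketch below is only one plausible reading: the text does not specify the exact mapping, so the min-max stretch, the function name `stretch_to_8bit`, and the `(top, bottom, left, right)` region format are assumptions made for illustration.

```python
import numpy as np


def stretch_to_8bit(image12, roi):
    """Linearly map the pixel values inside `roi` to the range 0..255.

    Assumed min-max stretch, not necessarily the mapping used in the paper.
    `roi` is (top, bottom, left, right) in pixel indices, end-exclusive.
    """
    top, bottom, left, right = roi
    patch = np.asarray(image12, dtype=np.float64)[top:bottom, left:right]
    lo, hi = patch.min(), patch.max()
    if hi == lo:
        # Flat region: nothing to stretch.
        return np.zeros(patch.shape, dtype=np.uint8)
    scaled = (patch - lo) * (255.0 / (hi - lo))
    return np.rint(scaled).astype(np.uint8)
```

A face region whose values span only about 6 bits of the 12 bit range is spread over the full 8 bit interval by this mapping, which raises the contrast available to the detector.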
Experimental Results and Discussion

In order to validate our thermal eye detection algorithm, we performed two types of experiments. First we compared the eye locations to those obtained manually on a set of images. Then we used the eye locations to geometrically normalize the face images before applying two face recognition algorithms: PCA with Mahalanobis angle distance and the Equinox algorithm. We compared the recognition performance in visible and thermal infrared using eye detection [...]

Comparison to Ground Truth Data

For this experiment we used 3732 images of 207 subjects not wearing glasses, collected during several indoor sessions. We used the FBI mugshot standard light arrangement. The subjects were volunteers, none of whom was visibly agitated or perspiring (on their face, at least) during [...]

We used an uncooled sensor capable of acquiring coregistered visible and longwave thermal infrared (LWIR) video. The format consists of 240 × 320 pixel image pairs, coregistered to within 1/3 pixel, where the visible image (from a Pulnix 6701 camera, sensitive to approximately 0.9µ) has 8 bits of grayscale resolution and the LWIR has 12 bits. The LWIR microbolometer (Indigo Merlin) is sensitive through the range 8µ-12µ, with a noise-equivalent-differential-temperature (NEdT) of 100mK. Thermal images were radiometrically calibrated in order to compensate for non-uniformities in the microbolometer array. Figure 3 shows sample images from this set.

Figure 3: Sample images used for ground truth data comparison

For each coregistered image pair, the left and right pupils were semi-automatically located by a human operator. These locations were used as ground truth.

After detecting the faces and eyes in both the visible and thermal infrared images, we removed the outliers from our set. These are images where the detected eye coordinates were at least 10 pixels away from the ground-truth location. Since the images are geometrically normalized using the detected eye locations, outliers can easily be detected by passing the normalized image through a face/non-face classifier. Also, note that the face detector yields expected eye locations, which can be used to validate the feasibility of finer eye localization. Outliers amounted to 436 images (12%) in the visible domain and 79 images (2%) in the LWIR domain.

Table 1 shows the mean absolute error and the standard deviation of the error in the x and y coordinates for the left and right eye, for detection in the visible domain, while Table 2 shows the equivalent quantities for the LWIR domain.

While the number of outliers is much larger in the visible than in LWIR, the means and standard deviations of the visible errors stay below 1 pixel.² The means of the absolute LWIR errors go up to 2.8 pixels, a 4.7 times increase over visible, and the standard deviations go up to 1.75, a 1.77 times increase over visible. We have to keep in mind though that at the resolution of our images the average size of an eye is 20 pixels wide by 15 pixels high. So although the error increase from visible to LWIR is large, LWIR values still stay within 15% of the eye size, quite a reasonable bound. We will see below how this error increase affects [...]

Table 1: Means and standard deviations of visible eye detection errors

Table 2: Means and standard deviations of IR eye detection errors

² Obviously, removing the outliers reduces the standard deviation, so to some extent the lower variance is due to the large number of outliers.

Face Recognition Performance

The imagery used for this experiment was collected during eight separate day-long sessions spanning a two week
period. A total of 385 subjects participated in the collection. Four of the sessions were held indoors in a room with no windows and carefully controlled illumination. Subjects were imaged against a plain background some seven feet from the cameras, and illuminated by a combination of overhead fluorescent lighting and two photographic lights with umbrella-type diffusers positioned symmetrically on both sides of the cameras and about six feet up from the floor. Three of the four indoor sessions were held in different rooms. The remaining four sessions were held outdoors at two different locations. During the four outdoor sessions, the weather included sun, partial clouds and moderate rain. All illumination was natural; no lights or reflectors were added. Subjects were always shaded by the side of a building, but were imaged against an unconstrained natural background which included moving vehicles, trees and pedestrians. Even during periods of rain, subjects were imaged outside and uncovered, in an earnest attempt to simulate true [...]

For all sessions, subjects were cooperative, standing about seven feet from the cameras, and looking directly at them when so requested. On half of the sessions (both indoors and outdoors), subjects were asked to speak while being imaged, in order to introduce some variation in facial expression into the data. For each subject and session, a four second video clip was collected at ten frames per second in two simultaneous imaging modalities.

We used the same sensor as in the previous experiment, so the image format consists again of 240×320 pixel image pairs, coregistered to within 1/3 pixel, where the visible image has 8 bits of grayscale resolution and the LWIR has 12 bits. Thermal images were radiometrically calibrated. Example visible images can be seen in Figure 4.

Figure 4: Example visible images of a subject from indoor [...]

For each individual, the earliest available video sequence in each modality is used for gallery images and all subsequent sequences in future sessions are used for probe images. Images of subjects wearing glasses were removed [...]

The training set for all algorithms was completely disjoint from gallery and probe images, in time, space and subjects. That is, the training set was collected at an earlier time, in a different location and used a disjoint set of subjects. This ensures that the results reported below are indicative of real world performance. Since the data collection involved video data in both modalities, we evaluated recognition performance using 40 frame video sequences as input. The distance from a probe sequence to an individual in the gallery was defined to be the smallest distance between any frame in the sequence and any image of that individual in the gallery. Classification was based on nearest neighbors with respect to this distance.

We divided our test data into three categories: indoor gallery and indoor probe set consisting of 190 subjects, outdoor gallery and outdoor probe set consisting of 151 subjects, and indoor gallery and outdoor probe set consisting [...]

Eye detection for the gallery images was performed in the visible domain, a likely scenario for an access control system where users are enrolled under good visibility conditions. Eye detection for the probe images was performed in the visible as well as in the LWIR domain.

For this experiment we did not remove the outliers after eye detection as described in the previous section. Outliers result in face images that are incorrectly normalized and thus their distance to all gallery individuals is large. Since the distance to an individual from the gallery is the smallest distance between any frame in a 40 frame sequence and that individual, frames with incorrect eye locations are far from [...]

We performed recognition experiments on our three data sets using the PCA algorithm with Mahalanobis angle distance and the Equinox algorithm in the visible and LWIR domains using eye detection results from the visible and LWIR domains. For completeness, we also recorded the performance obtained in the visible domain using eyes detected in the LWIR domain. Although this is not a realistic scenario (if visible imagery is available we might as well detect the eyes there) the results show the sensitivity of visible imagery to error in eye location.

Top match recognition performances are shown in Tables 3, 4, 5, 6. Recognition performance with LWIR eye locations is followed in parentheses by the percentage of the corresponding performance with visible eye locations that this represents. PCA performs very poorly on difficult data sets (outdoor probes) and the performance decreases even more when the eyes are detected in LWIR. The decrease in performance is about the same in both modalities (performance with LWIR eye locations is about 70% of the performance with visible eye locations). This is in contrast with the observation in [4], but is probably due to the difficulty of the data set as well as a lower error in the eye center [...]

The Equinox algorithm performs much better than PCA in general. The decrease in performance when using LWIR
eyes is not as steep as in the case of PCA (about 90% of the visible eyes performance in both modalities).

Table 3: Performance of PCA algorithm with eyes detected in the visible domain

Table 4: Performance of PCA algorithm with eyes detected in the LWIR domain. In parentheses, percentage of corresponding performance with eyes detected in the visible domain

Table 5: Performance of Equinox algorithm with eyes detected in the visible domain

Table 6: Performance of Equinox algorithm with eyes detected in the LWIR domain. In parentheses, percentage of corresponding performance with eyes detected in the visible domain

Conclusion

Thermal infrared face recognition algorithms have so far relied on eye center locations that were detected either manually or automatically in a coregistered visible image. In an attempt to solve the problem of real-time night-time face recognition we performed eye detection on thermal infrared images of faces and used the detected eye center locations to geometrically normalize the images prior to applying face recognition algorithms.

We presented the results of applying a generic object detector to the problem of eye detection in thermal infrared images of faces. As expected, the problem is more difficult than its visible counterpart. The experiments we performed on over 3000 images with ground truth data available show that the error in eye center location is much higher in the thermal infrared than in the visible domain, but it still stays [...]

We analyzed the impact of eye locations detected in the visible and thermal infrared domains on two face recognition algorithms: PCA with Mahalanobis angle distance and the Equinox algorithm. [...] 385 subjects divided into three gallery/probe set pairs, indoor/indoor, outdoor/outdoor and indoor/outdoor. We observed that while recognition performance drops for both algorithms, the drop is more significant for the already poorly performing PCA algorithm. For the Equinox algorithm performance drops only 10% when the eyes are detected in LWIR. Night-time thermal infrared only recognition performance stays comparable to day-time visible performance.

Notably, our experiments show that the decay in performance due to poor eye localization is comparable across [...]

Based on our results we believe that, using the right algorithm, thermal infrared face recognition is a viable biometric not only when visible light is available, but also in the dark.

References
[1] D. A. Socolinsky and A. Selinger, “A comparative analysis of face recognition performance with visible and thermal infrared imagery,” in Proceedings ICPR, Quebec, Canada, August 2002.
[2] D. A. Socolinsky and A. Selinger, “Face recognition with visible and thermal infrared imagery,” Computer Vision and Image Understanding, July - August 2003.
[3] Joseph Wilder, P. Jonathon Phillips, Cunhong Jiang, [...] Infra-Red Imagery for Face Recognition,” in Proceedings of 2nd International Conference on Automatic Face & Gesture Recognition, Killington, VT, 1996, [...]
[4] X. Chen, P. Flynn, and K. Bowyer, “Visible-light and infrared face recognition,” in Proceedings of the Workshop on Multimodal User Authentication, Santa Barbara, [...]
[5] [...] thermal signatures for face recognition,” Biometrics Consortium Conference, Arlington, VA, September 2003.
[6] D. Socolinsky, L. Wolff, J. Neuheisel, and C. Eveland, “Illumination Invariant Face Recognition Using Thermal Infrared Imagery,” in Proceedings CVPR, Kauai, Dec. 2001.
[7] X. Chen, P. Flynn, and K. Bowyer, “PCA-based face recognition in infrared imagery: Baseline and comparative studies,” in International Workshop on Analysis and Modeling of Faces and Gestures, Nice, France, October 2003.
[8] D. Socolinsky and A. Selinger, “Thermal face recognition [...] CVPR, Washington, DC, June 2004, to appear.
[9] C. Eveland, Utilizing Visible and Thermal Infrared Video for the Fast Detection and Tracking of Faces, Ph.D. thesis, University of Rochester, 2003.
[10] [...] D. Marchette, and J.G. DeVinney, “A boosted CCCD classifier for fast face detection,” Computing Science and Statistics, vol. 35, 2003.
[11] [...] http://sourceforge.net/projects/opencvlibrary/.
[12] P. Viola and M. Jones, “Rapid object detection using a boosted cascade of simple features,” in Proceedings of IEEE CVPR, Kauai, HI, December 2001.
[13] R. Lienhart and J. Maydt, “An extended set of haar-like features for rapid object detection,” in Proceedings ICIP 2002, 2002, vol. 1, pp. 900–903.
[14] R. Lienhart, A. Kuranov, and V. Pisarevsky, “Empirical analysis of detection cascades of boosted classifiers for rapid object detection,” Tech. Rep., Microprocessor Research Lab, Intel Labs, Intel Corporation, 2002.
[15] J. Daugman, “How iris recognition works,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 14, no. 1, January 2004.