230 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 12, NO. 2, FEBRUARY 2003
Identification of Bitmap Compression History:
JPEG Detection and Quantizer Estimation
Zhigang Fan and Ricardo L. de Queiroz, Senior Member, IEEE
Abstract—Sometimes image processing units inherit images
in raster bitmap format only, so that processing is to be carried
out without knowledge of past operations that may compromise
image quality (e.g., compression). To carry out further processing, it is useful
to not only know whether the image has been previously JPEG
compressed, but to learn what quantization table was used. This
is the case, for example, if one wants to remove JPEG artifacts
or for JPEG re-compression. In this paper, a fast and efficient
method is provided to determine whether an image has been
previously JPEG compressed. After detecting a compression
signature, we estimate compression parameters. Specifically, we
develop a method for the maximum likelihood estimation of
JPEG quantization steps. The quantizer estimation method is
very robust: only sporadically is an estimated quantizer step
size off, and when it is, it is off by one value.
Index Terms—Artifact removal, image history, JPEG compres-
sion, quantizer estimation.
I. INTRODUCTION
IN SEVERAL situations, images are processed as bitmaps
without any knowledge of prior processing. It typically happens when the image processing unit inherits the image data as a
bitmap without any side information. For example, imaging-related
drivers in operating systems may receive a bitmap image
with instructions for rendering it at a particular size and position,
but without further information. The image may have been pro-
cessed and perhaps compressed. It might even contain severe
compression artifacts. All these facts are unknown to the image
processing unit. If one wants to ensure that the image is rendered
in good conditions, it is desirable to understand the artifacts the
image might have, i.e., it is desirable to know a bit of the image’s
“history.” Particularly, we are interested in detecting whether
the image has ever been compressed using the JPEG standard
[1]. Furthermore, we would like to know what quantizer tables
were used. This is quite important for a few applications. For ex-
ample, in removing artifacts, most of the artifact removal algo-
rithms [2]–[9] require the knowledge of the quantization table.
In another application, if one knows the quantizer table, it is pos-
sible to avoid further distortion when recompressing the image.
The situation is illustrated in Fig. 1.
Manuscript received September 19, 2002; revised October 1, 2002. The as-
sociate editor coordinating the review of this manuscript and approving it for
publication was I. Pitas.
Z. Fan is with Xerox Corporation, Webster, NY 14580 USA (e-mail:
zfan@crt.xerox.com).
R. L. de Queiroz was with Xerox Corporation, Webster, NY 14580 USA. He
is now with the Electrical Engineering Department, University of Brazil, Brazil
(e-mail: queiroz@ieee.org).
Digital Object Identifier 10.1109/TIP.2002.807361
Fig. 1. Image processing module receives the bitmap image and processes it.
In order to remove possible artifacts, it first has to determine whether the image
has been compressed in the past and estimate its quantizer table. The information
is, then, used to remove possible compression artifacts. We are only concerned
with the “compression detection” and “quantizer estimation” operations.
In Fig. 1, the image received by the processing module is a
bitmap. We first detect if the image has been compressed. In this
paper we are only dealing with JPEG [1] compression of mono-
chrome images, hence the only useful compression parameter
that affects the image quality is the quantizer table, i.e., the quan-
tizer step sizes applied to the discrete cosine transform (DCT)
coefficients. The quantizer table is estimated from the uncom-
pressed raster data and this information can be made available
to an artifact removal routine. Frequently, algorithms to remove
JPEG artifacts use the quantizer information to estimate the
amount of distortion caused by quantization so that over-blur-
ring can be avoided. There are several alternatives for JPEG arti-
fact removal. The method recommended in [1] as well as those
in [2]–[9], serve as good examples. Particularly, we favor the
implementation of [3], [4], but we are not concerned with arti-
fact removal techniques here. We are only concerned with the
steps of detecting a previous JPEG compression and estimating
the quantizers used.
In this paper, a method for the maximum likelihood estima-
tion (MLE) [10] of JPEG quantization tables is presented, where
the only information available is a bitmap of the decoded image.
The method is used in conjunction with a very simple and effi-
cient method to identify if an image has been previously JPEG
compressed. In our experiments, we used the popular IJG JPEG
implementation [11], so that, throughout the paper, we iden-
tify quantizer tables with so-called “quality factors” (QF), with
which the reader may already be familiar. A QF is just a reference
number ranging from 1 to 100 [11], where QF 100
means that all quantizer steps [1] are unity, thus yielding
the best quality JPEG can possibly achieve.
1057-7149/03$17.00 © 2003 IEEE
Fig. 2. For each block the numbers Z′ = |A − B − C + D| and Z″ =
|E − F − G + H| are computed, i.e., involving the same pixel pattern but
spanning or not spanning multiple blocks.
A simple method to detect previous JPEG compression is pre-
sented in Section II. A suitable likelihood function for the esti-
mation of the quantizer steps is discussed in Section III, while
the MLE method is presented in Section IV. Reports of exper-
imental results are presented in Section V and the conclusions
of this paper are presented in Section VI.
II. DETECTING JPEG COMPRESSION
The first step in the identification of the image’s compression
history is to determine whether the image has been compressed at all. We
are just concerned with JPEG compression and artifacts derived
from block based image coding. We look for blocking signa-
tures in the image as an evidence of JPEG compression. Sim-
ilar ideas have been used before. For example, the method in
[5] examines harmonics of the Fourier transform over blocks
of 32 × 32 pixels in order to estimate the “blockiness” present
in the image. Another method [6] estimates blocking artifacts
by simply comparing the gradient at the border pixels with in-
terpolated (projected) gradient values. In both cases, detecting
blocking artifacts is necessary since they are concerned with re-
moving the artifacts. In our case, we do not carry out this step. We
are only interested in a binary decision on whether or not the
image has been compressed. Hence, algorithms that are both
simpler and more robust can be applied.
Even “light” compression may leave small but consistent
discontinuities across block boundaries. The proposed method
detects images compressed with QF as high as 95. The idea is
very simple: it assumes that if there is no compression the pixel
differences across blocks should be similar to those within
blocks. Assume the block grid is known. We then compute a
sample of differences within a block and spanning across a
block boundary, as illustrated in Fig. 2. For each block
we compute

Z′ = |A − B − C + D|,   Z″ = |E − F − G + H|   (1)

where A through H are the values of the pixels in the positions
depicted in Fig. 2. Next, one computes the normalized
histograms H_I(n) and H_II(n) of the Z′ and Z″, respectively.
The blocking signature measure that we use is the
energy of the difference between the histograms, i.e.,

K = Σ_n |H_I(n) − H_II(n)|.   (2)
K can be compared to a threshold or given as a confidence parameter.
Fig. 3 serves to illustrate the method. Fig. 3(a) shows
histograms H_I(n) and H_II(n) for a typical image without compression,
while Fig. 3(b) shows the same after the image underwent
compression with QF 75 (recommended for excellent
image quality [1]). The absolute histogram differences
|H_I(n) − H_II(n)| are shown in Fig. 3(c). In fact, K is related
to the area under the curves in Fig. 3(c).
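As a sketch (not the authors' code), the detector can be implemented in a few lines, assuming an 8 × 8 grid aligned at the image origin and the corner pixel pattern of Fig. 2; the exact pixel offsets used inside the block are illustrative:

```python
from collections import Counter

def blocking_signature(img, B=8):
    """Blocking-signature measure K for a grayscale image (2-D list).

    Z' (z_in) uses a 2x2 pixel pattern inside one block; Z'' (z_cross)
    uses the same pattern straddling a block corner, as in Fig. 2.
    The grid is assumed aligned at (0, 0); offsets are illustrative.
    """
    h, w = len(img), len(img[0])
    z_in, z_cross = [], []
    for by in range(0, h - B, B):
        for bx in range(0, w - B, B):
            # Within-block 2x2 pattern near the block center.
            a = img[by + 3][bx + 3]; b = img[by + 3][bx + 4]
            c = img[by + 4][bx + 3]; d = img[by + 4][bx + 4]
            z_in.append(abs(a - b - c + d))
            # Same pattern straddling the block corner at (by+B, bx+B).
            e = img[by + B - 1][bx + B - 1]; f = img[by + B - 1][bx + B]
            g = img[by + B][bx + B - 1]; k = img[by + B][bx + B]
            z_cross.append(abs(e - f - g + k))

    def hist(vals):
        n = len(vals)
        return {v: c / n for v, c in Counter(vals).items()}

    h1, h2 = hist(z_in), hist(z_cross)
    support = set(h1) | set(h2)
    # K: L1 distance between the two normalized histograms.
    return sum(abs(h1.get(v, 0.0) - h2.get(v, 0.0)) for v in support)
```

On an uncompressed smooth image the two histograms coincide and K is near zero; strong blocking drives K toward its maximum of 2.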
Commonly, the grid origin is the same as the image origin
point. This is generally the case if there is no image cropping
or pasting. In case one wants to find the grid, let x(i, j) be
the image pixels; then the grid origin can be chosen as the pair
(i, j), 0 ≤ i, j ≤ 7, that maximizes E(i, j), where

E(i, j) = Σ_m Σ_n { |2x(8m+i, 8n+j) − x(8m+i−1, 8n+j) − x(8m+i+1, 8n+j)|
          + |2x(8m+i, 8n+j) − x(8m+i, 8n+j−1) − x(8m+i, 8n+j+1)| }.   (3)

In other words, the grid should be aligned with the positions
where the horizontal and vertical neighbor differences, in a periodic
displacement, are at their maximum. If there is no compression
and, therefore, no blocking, all E(i, j) in (3) should be
similar and the grid origin will be picked randomly.
There are alternatives to simplify the grid location proce-
dure, for example by selecting a few image rows and columns in
order to perform the summation. There are many ways to detect
blocking; the proposed computation of and is
just one instantiation.
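A direct, unoptimized sketch of the grid search of (3), assuming 8 × 8 blocks and second differences evaluated at the candidate grid positions (the brute-force scan over all 64 offsets is ours):

```python
def grid_origin(img, B=8):
    """Locate the JPEG block-grid origin as the offset (i, j) that
    maximizes the periodic sum E(i, j) of horizontal and vertical
    second differences, a sketch of (3). img is a 2-D list."""
    h, w = len(img), len(img[0])
    best, best_off = -1.0, (0, 0)
    for i in range(B):
        for j in range(B):
            e = 0.0
            for m in range(1, (h - 2) // B):
                for n in range(1, (w - 2) // B):
                    y, x = B * m + i, B * n + j
                    # Second differences: large across a blocking edge.
                    e += abs(2 * img[y][x] - img[y - 1][x] - img[y + 1][x])
                    e += abs(2 * img[y][x] - img[y][x - 1] - img[y][x + 1])
            if e > best:
                best, best_off = e, (i, j)
    return best_off
```

In practice only a few rows and columns need to enter the sums, as noted above; the full scan here is for clarity.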
III. LIKELIHOOD FUNCTION FOR QUANTIZER STEP ESTIMATION
Given that the image has been detected as being previously
compressed using JPEG, we would like to estimate the quan-
tizer table used. In this section, we analyze the probability dis-
tribution function of the DCT coefficients and formulate a likeli-
hood function. We assume that the elements of the quantization
matrix are integers, as this is almost always true for transform
coders and is certainly true for JPEG.
JPEG compression is typically performed in three steps: dis-
crete cosine transform (DCT), quantization, and entropy coding.
At the decoding side, the processes are reversed. The data are en-
tropy decoded, dequantized, and inverse transformed (IDCT).
It was reported that the DCT coefficients typically have a
Gaussian distribution for DC component and Laplacian distri-
butions for AC components [1], [12]. In the quantization step,
the DCT coefficients are discretized. They are recovered in the
dequantization step as the multiples of quantization intervals.
Specifically, if Y(m, n) denotes the (m, n)th component of a
dequantized JPEG block in the DCT domain, it can be expressed
as Y(m, n) = k·q(m, n), where q(m, n) is the (m, n)th entry of the quantization
table, and k is an integer. Fig. 4 shows a typical discrete
histogram of Y(m, n) for all the blocks in an image. Nonzero
entries occur only at the multiples of q(m, n). The envelope of
the histogram is roughly Gaussian for the DC component and
Laplacian for the AC components.
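The multiples-of-q structure is easy to reproduce: quantizing and then dequantizing any set of coefficients leaves values only on the lattice of multiples of the step. A toy illustration with synthetic Laplacian-like data (the generator and step value are ours):

```python
import random

def dequantized(coeffs, q):
    """JPEG-style scalar quantization followed by dequantization:
    every output value is an exact multiple of the step q."""
    return [q * round(c / q) for c in coeffs]

random.seed(1)
# Laplacian-like AC coefficients: difference of two exponentials.
ac = [random.expovariate(0.1) - random.expovariate(0.1) for _ in range(1000)]
deq = dequantized(ac, 6)   # histogram mass lands only at multiples of 6
```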
Fig. 3. Histogram of neighbor differences comparing region I (within block)
and II (across blocks). (a) No compression is applied to image. (b) Moderate
compression is applied to image (IJG code with quality factor 75). (c) Difference
in histograms (I and II) for both cases.
Once the histogram of Y(m, n) is established, the estimation
of q(m, n) is fairly straightforward. However, Y(m, n) only
exists as an intermediate result. It is typically discarded after

Fig. 4. Histogram of |Y(0, 1)| for image Lena (q(m, n) = 6), which is
formed by spaced peaks. The envelope, which has an exponential decay, is only
included as a reference.

decompression. Theoretically, Y(m, n) can be recalculated
from the decoded image block, since the IDCT is reversible.
Nevertheless, in reality, the DCT of the decoded image block usually
generates Ỹ(m, n), which is not exactly Y(m, n), but an
approximated version of it.
There are mainly two sources of errors, both of them
introduced during the IDCT calculation. First, the pixel values,
typically integers, are rounded from real numbers. Second, any
number greater than 255 or smaller than 0 for a pixel value,
which is normally limited to 8 bits, is truncated to 255 or 0, re-
spectively. The truncation errors can be very significant, partic-
ularly at low bit-rates. Furthermore, they are difficult to model.
Fortunately, they occur only in a small percentage of blocks and
these blocks can be detected. We will discuss the treatment of
these blocks in Section IV. In this section, we assume all the
blocks are not truncated, and we will focus on rounding errors.
If we assume the rounding error for each pixel is an independent,
identically distributed random variable with a uniform distribution
in the range of [−0.5, 0.5], then the Gaussian distribution
is a natural candidate for modeling Ỹ(m, n) − Y(m, n), according
to the Central Limit Theorem. The mean and the variance of
the distribution can be calculated as 0 and 1/12, respectively.
With the exception of uniform blocks, which will be discussed
later, our simulations have shown that the Gaussian model
is fairly reasonable. Although the data distribution has shorter
tails than the Gaussian distribution, it fits the model very well
when the deviation is small. At the tail part, we can show that
|Ỹ(m, n) − Y(m, n)| is limited. The reason is fairly simple.
As the rounding error for each pixel does not exceed 0.5, the
total rounding error is bounded by

|Ỹ(m, n) − Y(m, n)| ≤ (1/8) C(m) C(n) Σ_{i=0..7} Σ_{j=0..7}
    |cos((2i+1)mπ/16)| · |cos((2j+1)nπ/16)|   (4)

where C(m) is 1/√2 for m = 0 and is 1 otherwise. The right-hand
side of (4) can be simplified as

b(m) b(n)   (5)

where

b(m) = 2                                                            for m = 0, 4
b(m) = √2 [cos(π/8) + cos(3π/8)]                                    for m = 2, 6
b(m) = (1/√2) [cos(π/16) + cos(3π/16) + cos(5π/16) + cos(7π/16)]    for m odd.
   (6)
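The per-dimension factors b(m) can be checked numerically, assuming the standard JPEG 8-point DCT normalization (C(0) = 1/√2, C(m) = 1 otherwise); this is an illustrative computation, not code from the paper:

```python
import math

def rounding_bound(m):
    """Factor b(m): worst-case gain of the 0.5-per-pixel rounding error
    along one dimension of the 8-point DCT, assuming the JPEG
    normalization C(0) = 1/sqrt(2) and C(m) = 1 otherwise.
    The total error |Y~ - Y| is bounded by b(m) * b(n)."""
    c = 1.0 / math.sqrt(2.0) if m == 0 else 1.0
    s = sum(abs(math.cos((2 * i + 1) * m * math.pi / 16.0))
            for i in range(8))
    return c * s / (2.0 * math.sqrt(2.0))
```

The largest factor is 2 (at m = 0 and m = 4), so under these assumptions |Ỹ − Y| never exceeds b(m)b(n) ≤ 4.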
Fig. 5. Histogram of |Ỹ(0, 1)| for image Lena (q(m, n) = 6). As in the
case of Fig. 4, the decaying envelope is included only as a reference.
Consequently, we assume Ỹ(m, n) − Y(m, n) has a modified
Gaussian distribution. It has a Gaussian shape in the range
of [−b(m)b(n), b(m)b(n)], and is zero outside that range. Specifically,

g(x) = A exp(−6x²)   for |x| ≤ b(m)b(n)
g(x) = 0             elsewhere
   (7)

where A is a normalizing constant.
For a block with uniform intensity, where Y(m, n) is nonzero
only for the DC term, the rounding errors for all the pixels in
the block have the same value, and are highly correlated. As a
result, the Central Limit Theorem and the Gaussian model are
no longer valid. In fact, in these blocks, Ỹ(m, n) − Y(m, n) has a zero
value for all AC coefficients and a uniform distribution for the
DC. As will be explained in Section IV, the uniform blocks
are not considered in the estimation process. We assume in
the following discussion that all the blocks are nonuniform.
Excluding the uniform blocks and the blocks with truncation,
the distribution of Ỹ(m, n) is shown in Fig. 5. It
is a blurred version of Fig. 4. The discrete lines in Fig. 4
become bumps of Gaussian shape in Fig. 5. These bumps
remain separated if the quantization interval is not too small,
i.e., q(m, n) ≥ 2b(m)b(n). They may touch each other if
q(m, n) < 2b(m)b(n). The probability function of Ỹ
for a given q can be determined as (for simplicity, we will
drop the index (m, n) for the rest of this section and for the first
part of the next section)

p(Ỹ) = Σ_k g(Ỹ − kq) f(kq)   (8)

where

f(kq) = Prob{Y = kq}   (9)

and the function g is given in (7). As g is zero beyond the
range ±b(m)b(n), there is only one nonzero term in the summation
in (8) for q ≥ 2b(m)b(n). For a small q,
there will be a few terms. The summation is then over all integers
k such that |Ỹ − kq| ≤ b(m)b(n). The first factor in (8) characterizes
the bumps. The second factor, which represents the envelope, has
roughly a Gaussian (for DC) or a Laplacian (for AC) distribution.
Based on the above analysis, if we assume the statistics of Ỹ
are independent for each image block, the likelihood function
can be established as

L(q) = Π_i p(Ỹ_i)   (10)

where the index i refers to the ith block and the distribution of
Ỹ_i is given in (8). The MLE of q can then be formulated as

q̂ = arg max_q Σ_i log p(Ỹ_i).   (11)

There are two terms in the optimization. The first term fits the
data to the multiples of q. The second one, which matches
the overall DCT distribution, can be further calculated; since
k_i q is approximately Ỹ_i, the exponent of the Laplacian envelope
contributes only a constant, and the q dependence is left in the
normalization of f:

Σ_i log f(k_i q) = Cons + N log[(1 − e^{−λq}) / (1 + e^{−λq})]   (12)

where λ is the rate of the Laplacian envelope and N is the total
number of blocks used in estimation. Consequently, (11) becomes

q̂ = arg max_q { Σ_i log g(Ỹ_i − k_i q)
      + N log[(1 − e^{−λq}) / (1 + e^{−λq})] }.   (13)

From another point of view, the second term in (13) provides
a penalty for a smaller q. Suppose q′ = q/l, where l is a
positive integer. It is always true that the first term in (13) is
no smaller for q′ than it is for q. In other words, it
is biased toward a smaller q. This bias is compensated by the
second term.
We have so far assumed that Ỹ is a real number. Practically,
the outputs of many DCT routines are integers. Also, as will
be shown in the next section, approximating Ỹ with an integer
results in simpler calculations. If we denote Ŷ as the integer
rounded from Ỹ, the probability function for Ŷ can be
obtained as

P(Ŷ) = ∫ from Ŷ−0.5 to Ŷ+0.5 of p(y) dy.   (14)

The likelihood maximization becomes

q̂ = arg max_q Σ_i log P(Ŷ_i).   (15)
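A much-simplified sketch of the integer-domain maximization (15): the bump term follows the truncated Gaussian of (7), the envelope is a Laplacian with an assumed rate lam, and the normalization term penalizes implausibly small steps. The function names, the value of lam, and the per-sample scoring are illustrative, not from the paper:

```python
import math
from collections import Counter

def log_likelihood(ys, q, bb, lam=0.05):
    """Simplified log-likelihood in the spirit of (15) for rounded DCT
    coefficients ys and candidate step q. Each coefficient is matched
    to its nearest multiple k*q; the bump term is the truncated
    Gaussian of (7) (variance 1/12, support about [-bb, bb]); the
    envelope is a discrete Laplacian with assumed rate lam, whose
    normalization log(tanh(lam*q/2)) penalizes small q."""
    penalty = math.log(math.tanh(lam * q / 2.0))
    ll = 0.0
    for y, cnt in Counter(ys).items():
        k = round(y / q)
        r = y - k * q                    # residual inside the bump
        if abs(r) > bb + 0.5:
            ll += cnt * -1e9             # outside the admissible support
        else:
            ll += cnt * (-6.0 * r * r - lam * abs(k) * q)
        ll += cnt * penalty
    return ll

def estimate_q(ys, candidates, bb=2.0):
    """Pick the candidate step maximizing the log-likelihood."""
    return max(candidates, key=lambda q: log_likelihood(ys, q, bb))
```

With data concentrated near multiples of 6, the penalty term keeps the estimate from collapsing to the trivial divisors 1, 2, or 3.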
IV. MLE ALGORITHM
In this section, we describe the algorithm that computes the
MLE. First, we detect two kinds of image blocks:
TABLE I
VALUES OF K FOR SEVERAL IMAGES AND QUALITY FACTORS USING THE PROPOSED ALGORITHM. BY THRESHOLDING THE VALUES AT 0.25 ONE CAN
DETECT JPEG COMPRESSED IMAGES WITH SOME CONFIDENCE. VALUES OF K > 0.25 ARE HIGHLIGHTED. NOTE THAT WITH THIS
THRESHOLD, DETECTION IS POSSIBLE WITH QUALITY FACTORS AS HIGH AS 90
uniform blocks and those with truncation. For each image block,
we find the maximum and the minimum values of the block. If
the maximum value is 255 or the minimum value is 0, the block
may contain truncated pixels. If the maximum value is equal to
the minimum value, the block is uniform. Both kinds of blocks
are excluded from further processing.
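The screening step described above can be sketched as follows (an illustrative helper, assuming 8-bit grayscale input as a 2-D list):

```python
def usable_blocks(img, B=8):
    """Screen 8x8 blocks before the MLE: drop uniform blocks
    (max == min: rounding errors fully correlated) and possibly
    truncated blocks (a pixel at 0 or 255: clipping unmodeled).
    Returns the (row, col) origins of the blocks kept."""
    h, w = len(img), len(img[0])
    blocks = []
    for by in range(0, h - B + 1, B):
        for bx in range(0, w - B + 1, B):
            px = [img[by + y][bx + x] for y in range(B) for x in range(B)]
            lo, hi = min(px), max(px)
            if hi == lo or lo == 0 or hi == 255:
                continue                 # excluded from estimation
            blocks.append((by, bx))
    return blocks
```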
The data in the remaining blocks are used for the MLE.
Neither the likelihood defined in (13) nor its integer approximation
(15) can be optimized analytically. However, the numerical
complexity can be significantly reduced by maximizing
the integer version (15).
It can be easily shown that the probability function given in
(14) depends only on the residue r = Ŷ mod q. Since r can
assume only q distinct values, the probability function for each
value of r can be pre-calculated, and we only have to count the
number of occurrences when r assumes each of the q different values.
To further reduce the computation, not all possible q values
are tested in the maximization of (15). If a histogram is built for Ŷ,
peaks can be found at the multiples of q. Normally, the highest
peak outside the main lobe (0 and its vicinity) corresponds to q
or one of its multiples. Based on this observation, we restrict
the search to p, p − 1, p + 1, and their integer fractions,
where p is the location of the highest peak outside the main lobe (p > b(m)b(n)).
For example, if p is found to be 10, the optimization in (15) will be
calculated for q = 1, 2, 3, 5, 9, 10, and 11.
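The candidate set can be generated mechanically; with p = 10 this reproduces the example set {1, 2, 3, 5, 9, 10, 11} above (the helper name is ours):

```python
def candidate_steps(p):
    """Candidate quantizer steps from the highest histogram peak p
    outside the main lobe: p, p - 1, p + 1, and every integer
    fraction of those (i.e., base // d for each divisor d of base)."""
    cands = set()
    for base in (p - 1, p, p + 1):
        for d in range(1, base + 1):
            if base % d == 0:
                cands.add(base // d)
    return sorted(cands)
```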
For a particular coefficient (m, n), it is possible that no peak
is detected outside the main lobe. This occurs when |Y(m, n)|
is small for all the blocks in the image. Typically, the histogram
decays rapidly to zero without showing any periodic structure.
The data then contain little or no information about q(m, n). The
estimation fails in that case, and q(m, n) is declared to be
“undetermined.”
This estimation algorithm can also serve as another method to
determine if the image has been previously JPEG compressed.
If all the quantization levels are estimated to be 1, it is a good
indication that the image has not been JPEG compressed.
It is fairly common that the quantization table used is a scaled
version of a “standard” table, for example the JPEG baseline
table. In other words,

q(m, n) = s · q0(m, n)   (16)
Fig. 6. Performance evaluation. The percentage of correct estimations is very
high and the percentage of errors is very low. Undetermined coefficients are
those which did not exist in the image (likely eliminated by quantization),
preventing any quantizer estimation. Seven images were tested, so that for each
QF, 448 step sizes are estimated.
where q0(m, n) is the (m, n)th entry of the standard quantization
table and s is the scaling factor. In this case, we only need
to estimate s. This can be performed by maximizing the joint
likelihood, which is the product of formula (15) over all the entries
(m, n). However, a simpler but equally powerful method
exists. We can first estimate, using (15), a few “reliable” entries,
those whose q0(m, n) are not too small. The factor s is then
simply obtained from these entries as

s = q(m, n) / q0(m, n).   (17)
If a conflict occurs, i.e., different entries give different estimates
of s, a joint likelihood optimization may follow. Otherwise, (17)
serves as the final estimation, and the quantization matrix can be
retrieved using (16).
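A sketch of this fallback logic; the threshold min_std and the table values in the example are hypothetical:

```python
def estimate_scale(q_hat, q_std, min_std=8):
    """Recover the scaling factor s of (16) via (17), using only
    'reliable' entries: those whose standard-table step q_std is not
    too small (min_std is an assumed threshold). Entries estimated
    as 'undetermined' are passed as None. Returns s, or None on a
    conflict, in which case a joint likelihood optimization follows."""
    ratios = set()
    for qh, qs in zip(q_hat, q_std):
        if qh is not None and qs >= min_std:
            ratios.add(qh / qs)
    if len(ratios) == 1:
        return ratios.pop()
    return None

# Hypothetical example: standard steps scaled by s = 0.5;
# the second entry was estimated as "undetermined".
std = [16, 11, 10, 16, 24, 40]
est = [8, None, 5, 8, 12, 20]
```

Real tables are usually rounded after scaling, so in practice the ratios would be binned rather than compared exactly.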
V. EXPERIMENTAL RESULTS
The detection algorithm was tested using various images and
QF’s and results are shown in Table I. For a given threshold
found empir