Comparing Industry-Leading Anti-Spam Services
Results from Twelve Months of Testing
Joel Snyder
Opus One®
April, 2016

INTRODUCTION

The following analysis summarizes the spam catch and false positive rates of the leading anti-spam vendors. Compiled by Opus One, an independent research firm, this report provides data to
objectively compare the market’s most popular anti-spam solutions.

Working with Cisco, we identified six enterprise-focused vendors to compare to Cisco’s Email
Security Solution (Appliance and Cloud-Based): Barracuda Networks, Intel Security (formerly
McAfee), Microsoft Office 365, Proofpoint, Sophos, and Symantec. The only vendor mentioned by
name in the rest of this report is Cisco. The remaining vendor names have been anonymized.

TEST METHODOLOGY

To ensure consistency and reliability, Opus One operated within the following parameters during
the 12-month analysis from April 2015 to March 2016:

• Approximately 10,000 messages were selected at random for testing each month, with a
  total of 145,709 messages in the final evaluation set
• Messages were drawn from actual corporate production mail streams
• Messages were received live and tested with less than a one-second delay
• Tested products were acquired directly from the vendor or through normal distribution
  channels and were under active support contracts. Cloud-based solutions were only
  used when an appliance-based solution was not available. Tested products were “up to
  date” with current released software and signature updates, and all settings were
  reviewed by each vendor’s own technical support team
• Messages were hand classified as “spam” or “not spam” to ensure data validity
• Each of the tested products included the vendor-recommended or integrated reputation
  service in the results

Table of Contents
Introduction and Test Methodology
Test Results
Spam Catch Rate Results
False Positive Results
Summary
Appendix

The test results reported here are taken from Opus One’s
continuing anti-spam testing program. With over ten years of
monthly results, Opus One is uniquely positioned to provide
objective efficacy reporting across all major anti-spam
products. While testing occurred in North America, message
sources were global. See the appendix at the conclusion of
this report for further test methodology details and definitions
of terms.

TEST RESULTS

Cisco’s email security solution demonstrated the most accurate rate of detection and the best
spam capture rate. The results are remarkable given the tradeoff between spam capture and
false positive rates. For example, a vendor can catch 100% of spam if they block every message
but then the false positive rate would also be 100%, which is obviously unacceptable.
In our testing, there were months when other vendors outperformed Cisco in catching spam, but
this was always accompanied by a jump in their false positive rates. We saw only a single month
in the last 12 months where another product matched Cisco Email Security on both spam catch rate
and false positive rate: in June 2015, Vendor E had a 0.1% better catch rate and the same false
positive rate as Cisco Email Security.
The results showing false positive rate and spam catch rate are summarized in the graph below.
The data points shown are averages across the entire year, scaled to fit.

[Figure: Comparative Anti-Spam Efficacy. Vertical axis: Spam Accuracy Rate (1-FP rate), from Worst to Best. Horizontal axis: Spam Capture Rate, increasing toward Best. Vendors plotted: Cisco, Vendor A, Vendor B, Vendor C, Vendor D, Vendor E, Vendor F.]

SPAM CATCH RATE RESULTS

The spam catch rate has a direct impact on end-users’ satisfaction and productivity. With the high
daily global volume of spam, even the slightest reduction in catch rates can have a major adverse
effect. Cisco’s own anti-spam engine had the highest catch rate averaged across the year-long
period covered by this report. The table below compares other vendors to Cisco by showing how
much spam they missed compared to Cisco. For example, a user protected by Vendor C would
have more than twice as much spam in their inbox as a user protected by Cisco.

Vendor      Missed Spam Rate Relative to Leader
Cisco       n/a
Vendor B    130%
Vendor F    132%
Vendor E    141%
Vendor A    163%
Vendor C    179%
Vendor D    209%
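The report does not spell out how the "relative to leader" figure is derived. As a minimal sketch, assuming it is the percentage by which a vendor's missed spam rate exceeds the leader's (a reading that fits the Vendor C example, but still an assumption), the arithmetic could look like this. The function name and the sample catch rates are hypothetical and are not data from the test.

```python
def missed_spam_relative_to_leader(vendor_catch_rate: float,
                                   leader_catch_rate: float) -> float:
    """Return how much more spam a vendor misses than the leader, in percent.

    Assumption: "relative to leader" means the percentage excess of the
    vendor's missed spam rate (1 - catch rate) over the leader's.
    """
    vendor_missed = 1.0 - vendor_catch_rate
    leader_missed = 1.0 - leader_catch_rate
    if leader_missed <= 0:
        raise ValueError("leader missed spam rate must be positive")
    return (vendor_missed - leader_missed) / leader_missed * 100.0


# Hypothetical catch rates, chosen only to illustrate the arithmetic.
example = missed_spam_relative_to_leader(0.98, 0.99)
print(round(example, 1))
```

Under this reading, a small gap in catch rate becomes a large relative gap in missed spam, because the comparison is made on the much smaller missed spam rate rather than on the catch rate itself.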


Average annual spam catch rate results by vendor over the testing period are graphed below.
The average for the year, as well as worst and best monthly performance values, are shown.

[Figure: Anti-Spam Catch Rate. Average, Best, and Worst Values for 12 months. Vertical axis: catch rate, from 92.50% to 100.00%. Vendors shown: Cisco, A, B, C, D, E, F.]

FALSE POSITIVE RESULTS
Because of the mission-critical nature of email, it is essential that an enterprise’s anti-spam
solution deliver a low false positive rate. Messages incorrectly quarantined or blocked pose a
serious loss of time and productivity for system administrators and end-users. In some cases,
false positives also have a negative financial impact on the organization. The relative results over
the year-long period ending March 2016 are as follows:

Vendor      False Positive Rate Relative to Leader
Cisco       n/a
Vendor A    143%
Vendor E    180%
Vendor C    188%
Vendor F    358%
Vendor D    1577%
Vendor B    4705%

The false positive performance of each vendor is shown in the graph below. The average for the
entire year, along with worst and best performance are graphed. In this graph, higher numbers
show higher accuracy and are better.

[Figure: Anti-Spam False Positive Accuracy. Average, Best and Worst Values for 12 months. Vertical axis: accuracy, from 98.00% to 100.00%. Vendors shown: Cisco, A, E, C, F, D, B.]

SUMMARY

Our testing shows that Cisco offers an industry-leading solution for blocking unwanted spam in
enterprises and organizations around the world. Given the essential role of email in the
operations of modern enterprises, spam poses a serious threat to their success. When a spam
message finds its way into a user’s inbox or a legitimate message is incorrectly identified as spam
and quarantined, there is an immediate impact on productivity. While performance of the
solutions evaluated in this analysis may vary by only a few percentage points, it’s important to
recognize that this difference can translate into hundreds, if not thousands, of unwanted and
potentially problematic messages infiltrating a network. With outstanding performance in catching
spam and avoiding false positives, Cisco’s anti-spam technology should be on the short list for
anyone considering a new email security gateway.
Over the years, much ground has been gained in the battle against spam. Nevertheless, the
number of threat messages continues to rise, demanding increasingly sophisticated and capable
defense systems. The productivity of the global marketplace demands it.

ABOUT OPUS ONE

Opus One is an information technology consultancy with experience in the areas of messaging,
security, and networking. Opus One has provided objective testing results for publication and
private use since 1983.
This document is copyright © 2016 Opus One, Inc.

APPENDIX

DEFINITION OF TERMS

Spam is unsolicited commercial bulk email. We consider messages to be “spam” if there is no
business or personal relationship between sender and receiver and the messages are obviously
bulk in nature. Mail messages that may not have been solicited, but which show a clear business or
personal relationship between sender and receiver, or are obviously a one-to-one message, even
if unsolicited and unwanted, are not considered “spam.”
Spam catch rate measures how well the spam filter catches spam. We have used the
commonly accepted definition of sensitivity, which is the number of spam messages caught
divided by the total number of spam messages received. The missed spam rate is one minus the
spam catch rate.
False positive rate measures the number of legitimate emails misclassified as spam. Different
vendors and testing services define false positive rate in different ways, typically either specificity
or positive predictive value. In this report, false positive rate is defined using positive predictive
value as (1 – ((messages marked as spam – false positives)/(total messages marked as spam))).
The spam accuracy rate is one minus the false positive rate.
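The definitions above can be restated compactly. In the notation below (the symbols are introduced here for illustration and do not appear in the original report), $TP$ is the number of spam messages marked as spam, $FN$ the number of spam messages not marked as spam, and $FP$ the number of legitimate messages marked as spam, so that the total number of messages marked as spam is $TP + FP$:

```latex
\begin{align*}
\text{spam catch rate}     &= \frac{TP}{TP + FN} \\
\text{missed spam rate}    &= 1 - \frac{TP}{TP + FN} = \frac{FN}{TP + FN} \\
\text{false positive rate} &= 1 - \frac{(TP + FP) - FP}{TP + FP} = \frac{FP}{TP + FP} \\
\text{spam accuracy rate}  &= 1 - \frac{FP}{TP + FP} = \frac{TP}{TP + FP}
\end{align*}
```

Written this way, the false positive rate used in this report is the share of messages marked as spam that were actually legitimate, not the share of legitimate messages that were marked as spam.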

TESTING METHODOLOGY

Anti-spam products were evaluated by installing them in a production mail stream environment.
The test simultaneously feeds the same production stream to each product, recording the verdict
(typically “spam,” “not spam,” or “suspected spam”) for later comparison.
Each product tested was acquired directly from the vendor or through normal distribution
channels. Each product tested was under an active support contract, and was believed to be up-to-date with publicly released software and signature updates.
Where multiple versions were available from a vendor, the technical support team for each
vendor was consulted to determine the recommended platform and version for use. To minimize
confusion, products were not upgraded during the test cycle. Each product was connected to the
Internet to retrieve signature and software updates as often as recommended by the vendor. If
vendor technical support teams recommended a shorter update cycle, this recommendation was
implemented.
All systems were able to connect to the Internet for updates and DNS lookups. A firewall was
placed between each product and the Internet to block inbound connections, while outbound
connections were completely unrestricted on all ports.
Each product was configured based on the product manufacturer’s recommended settings. In
cases where obviously inappropriate settings were included by default, these settings were
changed to support the production mail stream. “Maximum message size” -- to accommodate
messages of varying sizes -- was the most commonly changed setting.
The tests drew on the real .COM corporate message stream because this message stream
contains no artificial content and best represents the normal enterprise stream. No spurious spam
or non-spam content was injected into the stream. No artificial methods to attract spam were
employed.
Because products were not receiving email directly from the Internet, the reputation service of
each product had to be individually configured to support the multi-hop configuration. In cases
where products were unable to handle a multi-hop configuration with reputation service, the
reputation service results were gathered at the edge of the network and then re-combined with
the anti-spam results after the test was completed.
Once the messages were received, Opus One manually read through every single message,
classifying it as “spam,” “not spam,” or “unknown” according to the definitions above. All mailing
lists which have legitimate subscriptions were considered “not spam,” irrespective of the content
of any individual message.
Messages were classified as “unknown” if they could not be definitively categorized as “spam” or
“not spam” based on content, or if they were so malformed that it could not be determined that
they were spam, viruses, or corrupt software. All "unknown" messages were deleted from the
data set, and do not factor into the result statistics. The total number of “unknown” messages in
the sample was small, typically less than 0.1% of the total sample size.
Once the manual qualification of messages was completed, all results were placed in an SQL
database. Queries were then run to create false positive and false negative (missed spam) lists.
False positives and false negatives for each product were evaluated and any errors in the original
manual classification were fixed. Once the data sets were determined to be within acceptable
error rates, the databases were reloaded and the queries recreated.
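The query step described above can be sketched as follows. This is a minimal illustration using Python's built-in sqlite3 module, not the database or schema Opus One actually used: the table name, column names, product name, and sample rows are all hypothetical.

```python
import sqlite3

# Hypothetical schema: one row per (message, product) pair, holding the
# hand classification and the product's verdict.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE results (
        message_id  INTEGER,
        product     TEXT,
        human_label TEXT,  -- 'spam' or 'not spam' (hand classified)
        verdict     TEXT   -- 'spam' or 'not spam' (product verdict)
    )
    """
)

# Hypothetical sample rows, for illustration only.
rows = [
    (1, "Product X", "spam", "spam"),
    (2, "Product X", "spam", "not spam"),      # missed spam (false negative)
    (3, "Product X", "not spam", "spam"),      # false positive
    (4, "Product X", "not spam", "not spam"),
]
conn.executemany("INSERT INTO results VALUES (?, ?, ?, ?)", rows)


def select_ids(product, human_label, verdict):
    """Return ids of messages with the given hand label and product verdict."""
    cursor = conn.execute(
        "SELECT message_id FROM results "
        "WHERE product = ? AND human_label = ? AND verdict = ? "
        "ORDER BY message_id",
        (product, human_label, verdict),
    )
    return [row[0] for row in cursor]


# Lists to be reviewed by hand for classification errors.
false_positives = select_ids("Product X", "not spam", "spam")
false_negatives = select_ids("Product X", "spam", "not spam")
print(false_positives, false_negatives)
```

Producing the two lists per product keeps the manual review focused on the messages where a product and the hand classification disagree, which is where an error in the original classification is most likely to surface.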
Each anti-spam engine provides a verdict on messages. While this is often internally represented
as a number, the verdict in most products is reduced to a categorization of each message as
being “spam” or “not spam.” In many anti-spam products, a third category is included, typically
called “suspected spam.”
In this test, products were configured at the factory-default settings, where possible, to have three
verdicts (spam, suspected spam, and not spam). Where products have three verdicts, suspected
spam is considered to be spam. As a result, suspected spam was included in the catch rate and
false positive rate calculations. The one exception to this is Vendor E; in this product, “suspected
spam” is actually marketing mail and not considered spam.
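The verdict handling described above amounts to collapsing three verdicts into two before scoring. A minimal sketch, assuming verdicts arrive as plain strings (the function name, the string values, and the set name are hypothetical):

```python
# Vendors whose "suspected spam" verdict is treated as not spam.
# The report names Vendor E as the one exception.
SUSPECT_IS_NOT_SPAM = {"Vendor E"}


def normalize_verdict(vendor: str, verdict: str) -> str:
    """Collapse a three-way verdict into 'spam' or 'not spam'."""
    verdict = verdict.strip().lower()
    if verdict in ("suspect spam", "suspected spam"):
        return "not spam" if vendor in SUSPECT_IS_NOT_SPAM else "spam"
    if verdict in ("spam", "not spam"):
        return verdict
    raise ValueError(f"unexpected verdict: {verdict!r}")


print(normalize_verdict("Vendor A", "suspected spam"))
print(normalize_verdict("Vendor E", "suspected spam"))
```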
Catch rate refers to the number of spam messages caught out of the total number of spam
messages received. When spam is not caught, it is called a false negative.
• False negative means the test said “this was not spam,” and it was.
• False positive means the test said “this was spam,” and it wasn’t.

Spam catch rate is calculated and compared using sensitivity; false positive rate is calculated
and compared using one minus the positive predictive value.
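To make the two calculations concrete, the sketch below scores a list of (hand label, product verdict) pairs, dropping "unknown" messages first as described earlier. The function name and the sample data are hypothetical; the formulas follow the definitions given in this appendix.

```python
def score(pairs):
    """Return (catch_rate, false_positive_rate) for (hand_label, verdict) pairs.

    Hand labels are 'spam', 'not spam', or 'unknown'; verdicts are 'spam'
    or 'not spam'. Messages hand-labelled 'unknown' are excluded.
    """
    kept = [(label, verdict) for label, verdict in pairs if label != "unknown"]
    caught = sum(1 for label, verdict in kept
                 if label == "spam" and verdict == "spam")
    missed = sum(1 for label, verdict in kept
                 if label == "spam" and verdict == "not spam")
    false_pos = sum(1 for label, verdict in kept
                    if label == "not spam" and verdict == "spam")
    marked_spam = caught + false_pos
    if caught + missed == 0 or marked_spam == 0:
        raise ValueError("need at least one spam message and one spam verdict")
    catch_rate = caught / (caught + missed)   # sensitivity
    fp_rate = false_pos / marked_spam         # 1 - positive predictive value
    return catch_rate, fp_rate


# Hypothetical sample: 8 spam caught, 2 spam missed, 1 false positive,
# 9 legitimate messages passed, 1 unknown message excluded.
sample = (
    [("spam", "spam")] * 8
    + [("spam", "not spam")] * 2
    + [("not spam", "spam")] * 1
    + [("not spam", "not spam")] * 9
    + [("unknown", "spam")] * 1
)
catch_rate, fp_rate = score(sample)
print(catch_rate, fp_rate)
```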
