Palo Alto’s new firewall ran 10 times faster than the model we tested in 2008 and came close to its rated capacity of 20Gbps in firewall-only mode, according to our exclusive Clear Choice testing.
Of course, there is always a tradeoff between security and performance. In the case of Palo Alto’s PA-5060, it all depends on what features you turn on and off.
Palo Alto has shaken up the firewall market with its “application aware” feature, and we found that this next-generation capability carries no performance penalty. The PA-5060 does application-layer inspection by default.
On the other hand – and this is a pretty big caveat – UTM rates were nowhere near the device’s stated 20Gbps limit. Performance was far lower with any UTM feature enabled than when the PA-5060 operated in firewall-only mode.
Regardless of which UTM features we enabled – intrusion prevention, antispyware, antivirus, or any combination of these – results were essentially the same as if we’d turned on just one such feature. Simply put, there’s no extra performance cost, beyond the initial sharp drop in rates, for layering on multiple types of traffic inspection.
Rates also fell when the device handled SSL traffic. And when decrypting SSL traffic, the system’s four 10-gigabit Ethernet interfaces ran at rates that would make Fast Ethernet aficionados smile.
Some of this is to be expected. All security devices slow down when handling SSL traffic, and we’ve seen far bigger drops, in percentage terms, on other devices when enabling UTM features.
Overall, we’d characterize the PA-5060 as a capable performer. While it offers many unique application-inspection capabilities, it doesn’t quite do away with the perennial question about security-vs.-performance tradeoffs.
Web Metrics
Forwarding rate was the primary metric in our tests. We used both mixed and static HTTP loads to measure rates under various configurations, along with separate tests to assess performance for SSL traffic. We also verified the PA-5060’s TCP connection capacity and connection setup rate.
The forwarding rate tests clearly show that the PA-5060, which can be equipped with up to four 10-Gbit/s interfaces, runs at least 10 times faster than earlier Palo Alto models.
In a test involving heavy Web traffic with a mix of content types and object sizes, the PA-5060 moved data at around 17Gbps when configured as a firewall.
That’s a bit under the system’s 20Gbps rated capacity, which isn’t altogether surprising since such data-sheet numbers often are obtained using best-case conditions such as a single large object requested over and over.
In contrast, the traffic load we used involved a mix of text, images and binary content of various sizes – just the sort of Web traffic often seen on enterprise networks. The 17Gbps rate we saw in testing is probably a more meaningful predictor of performance on production networks.
The mixed traffic load offered here is identical to the one Network World’s Joel Snyder used in his 2008 review of Palo Alto’s PA-4020 firewall. In that test, the PA-4020 topped out at around 1.6Gbps (vs. a rated capacity of 2.0Gbps).
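The comparison with the 2008 test comes down to simple arithmetic. The short Python sketch below works through it using only the throughput figures quoted above; the function and variable names are ours, chosen for illustration, and are not part of any test tool.

```python
# Throughput figures quoted in this article, in Gbps.
results = {
    "PA-5060": {"measured": 17.0, "rated": 20.0},
    "PA-4020": {"measured": 1.6, "rated": 2.0},
}


def percent_of_rated(measured, rated):
    """Measured throughput as a percentage of the data-sheet rating."""
    return 100.0 * measured / rated


def speedup(new, old):
    """How many times faster the newer device moved the same load."""
    return new / old


for name, r in results.items():
    pct = percent_of_rated(r["measured"], r["rated"])
    print(f"{name}: {pct:.0f}% of rated capacity")

ratio = speedup(results["PA-5060"]["measured"], results["PA-4020"]["measured"])
print(f"PA-5060 vs. PA-4020 on the same mixed load: {ratio:.1f}x")
```

By this arithmetic, both devices land within about 15 to 20 percent of their data-sheet ratings on the mixed load, and the newer box is a little more than 10 times faster.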
UTM’s performance penalty
As with most other security devices, rates fall sharply if various UTM functions – such as antispyware, antivirus, and intrusion prevention capabilities – are enabled. Again using the same mixed Web load, we saw rates drop from 17Gbps to around 5.3G to 5.4Gbps.
The good news is that rates held steady regardless of the number of UTM functions in use. So, it doesn’t matter whether the PA-5060 does antispyware, antivirus, intrusion prevention, or any combination of these.
One way of boosting forwarding rates is to disable server response inspection, which checks traffic flowing from servers to clients. Disabling this feature caused rates to more than double, to 13.7Gbps. This setting is mainly useful when the firewall sits in front of data centers or other server farms. Enterprise network managers deploying firewalls to protect clients will want to keep server response inspection enabled (which is the default setting).
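To put the UTM penalty in percentage terms, the sketch below works from the rates reported in this section. Using the midpoint of the 5.3G to 5.4Gbps range is our simplifying assumption, and the names are illustrative only.

```python
# Rates reported for the mixed Web load, in Gbps.
FIREWALL_ONLY = 17.0
UTM_RANGE = (5.3, 5.4)        # any combination of UTM features
NO_SERVER_RESPONSE = 13.7     # UTM on, server response inspection disabled


def percent_drop(before, after):
    """Reduction from `before` to `after`, as a percentage of `before`."""
    return 100.0 * (before - after) / before


# Assumption: take the midpoint of the reported UTM range.
utm_mid = sum(UTM_RANGE) / 2
gain = NO_SERVER_RESPONSE / utm_mid

print(f"UTM enabled: {percent_drop(FIREWALL_ONLY, utm_mid):.1f}% below firewall-only")
print(f"Disabling server response inspection: {gain:.2f}x the UTM rate")
print(f"Remaining gap to firewall-only: "
      f"{percent_drop(FIREWALL_ONLY, NO_SERVER_RESPONSE):.1f}%")
```

On these numbers, turning on UTM costs roughly two-thirds of firewall-only throughput, and disabling server response inspection recovers most, but not all, of it.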
Speed Bump: SSL Handling
SSL encryption is compute-intensive. Even with dedicated silicon for the task, the PA-5060, like virtually all other high-end firewalls, is a far slower performer when handling SSL traffic.
The PA-5060 generally moved traffic at around 7.5G to 7.6Gbps in every test case. We initially suspected that the nearly identical rates were caused by some limit in our test gear. But back-to-back tests of the Spirent Avalanche equipment without the PA-5060 in line moved traffic at around 8.6Gbps, faster than the firewall. So the test gear wasn’t the bottleneck. (See our test methodology.)
Rates for SSL traffic are higher than those for cleartext traffic, except in the firewall-only test case. This suggests the PA-5060 does less inspection of SSL traffic by default. Palo Alto’s engineers confirmed this, but only for the particular traffic generated by Spirent Avalanche; in this case, the PA-5060 simply classified the traffic as type “SSL” and did no further inspection. Palo Alto says there are cases where the PA-5060 can detect certain attacks hidden in SSL traffic, but we did not attempt to verify that claim.
The PA-5060 does support decryption of SSL traffic for deeper inspection, but that feature comes with a heavy performance cost. When doing SSL decryption, rates fell to 986Mbps when the PA-5060 acted as a firewall, and just 108Mbps with all UTM features enabled.
Both numbers are a long way off from the 17-Gbps rates we saw in the cleartext tests, or even the 7.5-Gbps rates in the SSL tests without decryption. If higher-speed decryption of SSL is required, network managers might consider a purpose-built appliance such as those from Netronome and other vendors.
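For a sense of scale, the following sketch compares the decryption rates with the two baselines mentioned above. The figures come from our results; the names, and the use of 7.5Gbps as the SSL baseline, are our choices for illustration.

```python
# Baselines from the mixed-load tests, in Mbps.
CLEARTEXT_FIREWALL_MBPS = 17_000   # cleartext, firewall-only
SSL_NO_DECRYPT_MBPS = 7_500        # SSL without decryption (low end of range)
FAST_ETHERNET_MBPS = 100           # nominal Fast Ethernet line rate

# Rates measured with SSL decryption enabled, in Mbps.
decrypt_rates_mbps = {
    "firewall only": 986,
    "all UTM features": 108,
}


def slowdown(baseline, rate):
    """How many times slower `rate` is than `baseline`."""
    return baseline / rate


for mode, rate in decrypt_rates_mbps.items():
    vs_clear = slowdown(CLEARTEXT_FIREWALL_MBPS, rate)
    vs_ssl = slowdown(SSL_NO_DECRYPT_MBPS, rate)
    print(f"SSL decryption, {mode}: {rate} Mbps "
          f"({vs_clear:.0f}x slower than cleartext, "
          f"{vs_ssl:.0f}x slower than SSL without decryption, "
          f"{rate / FAST_ETHERNET_MBPS:.2f}x Fast Ethernet)")
```

With all UTM features on, the decryption rate works out to barely more than the line rate of a single Fast Ethernet link.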
Static Object Handling
A traffic load that mixes object sizes offers one approximation of what enterprise Web traffic might look like, but it’s certainly not applicable in all situations. We also ran separate tests with fixed object sizes: One with 10-kbyte objects, since this is close to the average object size as observed in many studies of Web logs, and another with 512-kbyte objects, since this large size would better describe maximum firewall rates.
Of course, no production network carries Web traffic where every request is for 10- or 512-kbyte objects, but modeling some allegedly “real world” condition wasn’t the goal here. The tests with static object sizes had a simpler goal: To describe the limits of firewall performance when handling average and large Web objects.
Not surprisingly, the PA-5060 turned in its single fastest result, nearly 18.7Gbps, when configured as a firewall and presented with 512-kbyte objects. With average 10-kbyte objects, rates were a bit slower, around 16.3Gbps.
Enabling UTM features produced a result similar to that seen with the mixed-object loads: Rates were substantially lower than in the firewall-only tests, but very consistent regardless of which combination of antispyware, antivirus and intrusion prevention we used. Here again, the PA-5060 moved large objects faster than average-sized objects after we’d enabled UTM features, though by a smaller margin than in the firewall-only tests. With UTM features turned on, the PA-5060 moved large objects only about 1Gbps faster (around 6.2G to 6.3Gbps) than average-size objects (around 5.2Gbps).
The PA-5060 also moved SSL traffic at lower rates than cleartext traffic in firewall-only mode when static objects were involved, especially in tests with large objects. This is an expected result, since more bytes means more work for the device’s encryption engine. In most SSL test cases, rates were around 10.5G to 11Gbps with average-size objects and around 8.8Gbps with large objects.
Also, traffic rates for SSL were around the same regardless of which features we enabled or disabled on the firewall. As in the mixed-object tests, the PA-5060 didn’t try any further inspection after classifying the Spirent traffic as SSL.
Decrypting SSL traffic carried a heavy performance cost, even higher than in the mixed-object tests. With SSL decryption enabled, rates fell as low as 100Mbps when we offered large objects to the PA-5060. Note that we used the weaker RC4-MD5 cipher; if anything, rates would likely be lower still with a stronger cipher such as AES256-SHA1.
TCP Connection Testing
While traffic rates are undoubtedly useful in characterizing firewall performance, they’re not the only metrics that matter. We also conducted separate tests to determine how many concurrent connections the PA-5060 could handle, and how quickly it could set up and tear down those connections.
In the TCP connection capacity tests, we configured Spirent Avalanche to build up successively larger connection counts by having each existing connection make one new HTTP request every 60 seconds. The largest number of concurrent connections the PA-5060 handled without errors was 3,620,979. While 3.6 million is a huge number, it’s also less than the device’s rated capacity of 4 million. After testing concluded, Palo Alto said it had identified a bug in the software version we tested, and that a version scheduled for release by press time would allow the firewall to handle 4 million concurrent connections. We did not test the new software.
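The capacity figures can be unpacked a little further. The sketch below is a back-of-the-envelope calculation, not part of the test methodology: it assumes every open connection issues exactly one request per 60-second interval at steady state, and all names are illustrative.

```python
MAX_ERROR_FREE_CONNECTIONS = 3_620_979   # highest error-free count observed
RATED_CONNECTIONS = 4_000_000            # data-sheet capacity
REQUEST_INTERVAL_S = 60                  # one HTTP request per connection


def percent_of_rated(measured, rated):
    """Measured capacity as a percentage of the data-sheet rating."""
    return 100.0 * measured / rated


def steady_state_requests_per_second(connections, interval_s):
    """Aggregate request rate if every connection sends one request
    per interval (a simplifying assumption)."""
    return connections / interval_s


shortfall = RATED_CONNECTIONS - MAX_ERROR_FREE_CONNECTIONS
pct = percent_of_rated(MAX_ERROR_FREE_CONNECTIONS, RATED_CONNECTIONS)
rps = steady_state_requests_per_second(MAX_ERROR_FREE_CONNECTIONS,
                                       REQUEST_INTERVAL_S)

print(f"Reached {pct:.1f}% of rated capacity "
      f"({shortfall:,} connections short)")
print(f"Approximate request load at that point: {rps:,.0f} requests/s")
```

Under that assumption, the device was fielding on the order of 60,000 HTTP requests per second while holding more than 3.6 million connections open.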
In a related test, we also examined the maximum rate at which the firewall would set up and tear down new connections. Here, we configured Spirent Avalanche to use HTTP version 1.0, forcing each HTTP request to set up a new TCP connection. Under this load, the PA-5060 sustained 44,120 connections per second error-free when using all four of the device’s 10-gigabit Ethernet interfaces. In tests involving two interfaces and an earlier version of the Palo Alto software, we observed error-free rates of nearly 47,000 connections per second. Either rate is very high and will probably be more than sufficient for the majority of enterprise users.
While there’s room for improvement in the PA-5060’s performance, especially when it comes to UTM performance and SSL decryption, we’re encouraged by these results. The PA-5060 is already far faster than the PA-4020 tested earlier, and it’s still one of the few firewalls with true application-layer inspection capabilities. With some optimizations to UTM and SSL performance, it may do away with security/performance tradeoffs once and for all.