Last updated at Wed, 27 Sep 2017 15:29:35 GMT
The Metasploit Framework has had performance issues at startup for a long time. It is not uncommon for initial loading of our 600 modules to take upwards of 30 seconds (or worse on older hardware). Previously, I attributed the slow startup time to the massive amount of ruby code and the staggering number of objects that had to be instantiated. After all, msf is far and away the biggest ruby project on the planet. HD has done an excellent job with the new module format to reduce that initial overhead, but algorithmic complexity evidently has more to do with the problem. A couple of weeks ago, Yoann Guillot sent us a simple patch that changed a few uses of += on String objects to <<. The performance change was profound.
It turns out that a << b realloc()s a to be big enough to hold both buffers, then appends b to the end of a, an O(n) operation. a += b, on the other hand, allocates a new buffer big enough to hold both a and b, then copies each of them into it. By itself, this will always be a little slower than <<, but it is still O(n).
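One way to observe the difference described above is to check object identity before and after each operation. This is only a small sketch: the variable names are made up for illustration, and String.new is used so the strings are mutable regardless of frozen-string settings.

```ruby
# << appends to the receiver, so the variable still names the same object.
a = String.new("A")      # String.new gives an unfrozen string we may mutate
before_append = a
a << "B"
append_kept_object = a.equal?(before_append)   # same object, which now holds "AB"

# += is shorthand for c = c + "D": + returns a new String, which is assigned to c.
c = String.new("C")
before_plus = c
c += "D"
plus_kept_object = c.equal?(before_plus)       # different object; before_plus still holds "C"

puts "<< kept the same object: #{append_kept_object}"
puts "+= kept the same object: #{plus_kept_object}"
```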
However...
framework3 $ time ruby -e 'a = "A"; 100000.times { a << "A" }'
real 0m0.338s
user 0m0.312s
sys 0m0.024s
framework3 $ time ruby -e 'a = "A"; 100000.times { a += "A" }'
real 0m15.462s
user 0m15.321s
sys 0m0.068s
Put += in a loop, as in the above example, and it becomes an O(n^2) operation because you have to copy "A", then "AA", then "AAA", and so on. Now that we've seen the underlying problem, let's see what kind of difference the patch makes.
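To put a rough number on that quadratic growth, the sketch below adds up how many bytes the += loop from the benchmark would have to write into fresh buffers. It is a simplified model, not a measurement: it assumes every += copies the whole result once and ignores allocator and garbage collector overhead. The helper name bytes_copied_by_plus_equals is hypothetical.

```ruby
# Simplified model: each += builds a new string containing the old contents
# plus one more byte, so the cost of one step is the length of the result.
def bytes_copied_by_plus_equals(iterations)
  length = 1          # the string starts out as "A"
  copied = 0
  iterations.times do
    length += 1       # the new buffer holds the old string plus one byte
    copied += length  # every byte of the result is written into the new buffer
  end
  copied
end

# Same loop count as the benchmark above.
puts bytes_copied_by_plus_equals(100_000)   # prints 5000150000
```

Under the same model, the << version only appends 100,000 bytes in total, which is in line with the gap between the two timings.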
Before the patch:
framework3 $ rm ~/.msf3/modcache && time (echo exit | ./msfconsole >/dev/null)
real 0m45.428s
user 0m43.679s
sys 0m0.912s
After the patch:
framework3 $ rm ~/.msf3/modcache && time (echo exit | ./msfconsole >/dev/null)
real 0m15.970s
user 0m15.213s
sys 0m0.548s
As you can see, startup time is still not blazingly fast on old hardware (all of the benchmarks were performed on a 1.4GHz Pentium M), but it is now much more tolerable.
Before you run off and change every instance of += to << in your ruby code, it's important to note that the two don't perform the same operation. Because ruby variables hold references to objects, << modifies the string in place, so the change shows up through every variable that points to it, while += builds a new string and leaves any other references untouched.
framework3 $ irb
>> a = "A"
=> "A"
>> b = a
=> "A"
>> a << "B"
=> "AB"
>> b
=> "AB"
>> c = "C"
=> "C"
>> d = c
=> "C"
>> c += "D"
=> "CD"
>> d
=> "C"
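If you want the speed of << without changing a string that other code may still be holding, one possible approach is to append to a private copy. This is only a sketch: append_all is a hypothetical helper, not something from the Metasploit patch.

```ruby
# Hypothetical helper: builds a string with << without touching the caller's object.
def append_all(prefix, parts)
  result = prefix.dup                    # private, unfrozen copy of the caller's string
  parts.each { |part| result << part }   # in-place appends only affect the copy
  result
end

original = "A"
built = append_all(original, ["B", "C"])

puts original   # prints A
puts built      # prints ABC
```

The single dup costs one copy up front, after which each append works on the copy rather than rebuilding the whole string.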
This is really the first time in my ruby experience where I've had to think about the underlying implementation. Up until now, the ruby interpreter was a magical box from which dazzling lights and impressive fireballs of code sprang to life. Now I see the man behind the curtain around every corner.