Herb Sutter likes to point out how Console.WriteLine() calls a virtual method - ToString() - to illustrate the reliance on virtual methods in .NET vs direct method calls (through templates) in C++. He's done it on his blog, and more recently at Lang.NEXT.
What's being implied here of course is that .NET is generally slower than C++. While Sutter might have a point, text input/output is a poor example, because C++'s standard streams are slower than you can imagine. Much, much slower than .NET's. Raymond Chen and Rico Mariani famously competed at writing a fast English/Chinese dictionary in C++ vs C#; the C# version was several times faster until Raymond Chen ultimately scrapped all standard iostreams and wrote his own.
Out of curiosity, I decided to write a simple benchmark and see for myself. The operation would be this:
- read a chunky text file as a list or array of lines
- reverse each line
- write the results to the original file.
That, repeated multiple times and benchmarked.
The approach for .NET: File.ReadAllLines(), a for loop over the result, Array.Reverse, and File.WriteAllLines. That's the way I would write this procedure normally; it's as simple as possible, and I can't think of an obvious way of making it more efficient.
The approach for C++: C++ has std::reverse, which is the equivalent of Array.Reverse (but more generic); however, it has no equivalent of File.ReadAllLines/WriteAllLines. I searched a bit online to see what was considered the standard approach, and eventually settled on opening an ifstream, reading the file line by line into a vector of strings using std::getline, and writing it out to an ofstream using operator <<.
If an obvious, faster approach exists for C++, I'm all ears. Note that storing the lines in a vector here is a requirement of the test, not an implementation detail (an array could be used, but then the number of lines would need to be known in advance, and I don't see an easy way of doing that). Of course, the test could be done line by line, avoiding the creation of multiple strings and a container - but the goal of the test is precisely to benchmark the performance of all these things. Admittedly, it benchmarks more than just I/O - but I don't do this every day, so I might as well benchmark a few things together.
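For what it's worth, if you really wanted to size a container up front, one workaround would be an extra pass over the file just to count newlines. This is only a sketch of the idea (readAllLines is my own hypothetical helper, not part of the benchmark), and the extra pass may well cost more than the vector's amortized growth saves:

#include <algorithm>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Sketch: pre-count newlines so the vector can reserve capacity up front.
// Costs an extra pass over the file, so it may or may not pay off.
std::vector<std::string> readAllLines(const char* path)
{
    std::ifstream in(path);
    auto lineCount = std::count(std::istreambuf_iterator<char>(in),
                                std::istreambuf_iterator<char>(), '\n');
    in.clear();   // counting exhausted the stream; clear EOF before seeking
    in.seekg(0);

    std::vector<std::string> lines;
    lines.reserve(static_cast<size_t>(lineCount) + 1);
    std::string line;
    while (std::getline(in, line))
        lines.push_back(line);
    return lines;
}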
My first implementation was in C++/CLI, and for 20 iterations over a random ~175KB JSON file I had, the results were:
Standard C++ streams: 340 ms
.NET System.IO: 58 ms
Unsure whether these results were skewed towards .NET or C++ by compiling with the /clr switch, I created separate C++ and C# projects instead. The results:
C++: 134 ms
C#: 57 ms
Note that this is, of course, in Release mode and without a debugger attached. Just for fun, let's see what happens in the typical "F5" scenario: debugging the Debug build.
C++: 13653 ms (that's 13.6 seconds, yes!)
C#: 59 ms
This is consistent with other benchmarks I did in the past: C# is relatively unaffected by being built in Debug mode and having a debugger attached; C++ presents extreme discrepancies in that regard.
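A plausible explanation - this is my guess, not something I've profiled - is that much of the C++ debug penalty comes from MSVC's checked iterators and debug heap rather than from missing optimizations alone. If you want to experiment, MSVC lets you dial the iterator checks down per translation unit:

// MSVC-specific experiment: this must be defined before including any
// standard header, and with the same value in every translation unit
// you link together (mismatches cause link errors).
#define _ITERATOR_DEBUG_LEVEL 0   // default is 2 in Debug builds, 0 in Release

#include <string>
#include <vector>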
So there you have it. C# is twice as fast as C++ at reversing lines of text in a text file (and in debug it's 231 times as fast). So much for avoiding virtual method calls, eh?
I'm aware that a faster C++ version is possible. The C++ version I wrote is pure STL and very straightforward. It is of course possible to use faster libraries, or to write your own, but the purpose of the test is just to compare the standard facilities of both languages.
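Just to sketch what "writing your own" might look like (this snippet is illustrative only - I haven't benchmarked it): slurp the whole file with C stdio in one read, then split it on newlines in memory, bypassing iostreams entirely.

#include <cstdio>
#include <string>
#include <vector>

// Sketch: read the whole file in one C-stdio call, then split on '\n'.
// Bypasses iostreams entirely; error handling kept to a minimum.
std::vector<std::string> slurpLines(const char* path)
{
    std::vector<std::string> lines;
    FILE* f = std::fopen(path, "rb");
    if (!f) return lines;

    std::fseek(f, 0, SEEK_END);
    long size = std::ftell(f);
    std::fseek(f, 0, SEEK_SET);
    if (size <= 0) { std::fclose(f); return lines; }

    std::string buffer(static_cast<size_t>(size), '\0');
    std::fread(&buffer[0], 1, buffer.size(), f);
    std::fclose(f);

    size_t start = 0;
    while (start < buffer.size()) {
        size_t end = buffer.find('\n', start);
        if (end == std::string::npos) end = buffer.size();
        size_t len = end - start;
        if (len > 0 && buffer[end - 1] == '\r') --len;  // strip CR from CRLF
        lines.emplace_back(buffer, start, len);
        start = end + 1;
    }
    return lines;
}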
Update (2/5/2012): Thanks to Brandon Live for pointing out some inconsistencies between the two versions: notably, C++ counted time differently, which resulted in a very slight advantage for C#, and C++ had no "warm-up" call. I've updated the numbers and code, but the difference is small enough not to affect any of the conclusions: C# is comfortably twice as fast stand-alone and 200+ times as fast when debugged (and yes, that includes with "native debugging" on).
C# version:
using System;
using System.Diagnostics;
using System.IO;

namespace PerfTestCSharp
{
    class Program
    {
        static void Main()
        {
            // Call once for warm-up
            CSharpPerformOperation();

            long csharpTime = 0;
            for (int i = 0; i < 20; ++i)
            {
                var sw = Stopwatch.StartNew();
                CSharpPerformOperation();
                csharpTime += sw.ElapsedMilliseconds;
            }
            Console.WriteLine("");
            Console.WriteLine("C# time: {0} ms", csharpTime);
            Console.ReadKey();
        }

        static void CSharpPerformOperation()
        {
            var lines = File.ReadAllLines("text.txt");
            for (int i = 0; i < lines.Length; ++i)
            {
                var charArr = lines[i].ToCharArray();
                Array.Reverse(charArr);
                lines[i] = new string(charArr);
            }
            File.WriteAllLines("text.txt", lines);
        }
    }
}
C++ version:
#include <vector>
#include <fstream>
#include <string>
#include <algorithm>
#include <iostream>
#include <windows.h>
#include <sstream>

using namespace std;

// implementation of a high-precision counter from
// http://stackoverflow.com/questions/1739259/how-to-use-queryperformancecounter
double PCFreq = 0.0;
__int64 CounterStart = 0;

void StartCounter()
{
    LARGE_INTEGER li;
    if (!QueryPerformanceFrequency(&li))
        cout << "QueryPerformanceFrequency failed!\n";
    PCFreq = double(li.QuadPart) / 1000.0;
    QueryPerformanceCounter(&li);
    CounterStart = li.QuadPart;
}

double GetCounter()
{
    LARGE_INTEGER li;
    QueryPerformanceCounter(&li);
    return double(li.QuadPart - CounterStart) / PCFreq;
}

void CPPPerformOperation()
{
    vector<string> lines;
    ifstream inFile("text.txt");
    string line;
    while (getline(inFile, line))
    {
        lines.push_back(line);
    }
    for (size_t i = 0; i < lines.size(); ++i)
    {
        reverse(begin(lines[i]), end(lines[i]));
    }
    ofstream outFile("text.txt");
    for (auto it = begin(lines); it != end(lines); ++it)
    {
        outFile << *it << "\n";
    }
}

int main()
{
    // call once for warm-up...
    CPPPerformOperation();

    double totalTime = 0;
    for (int i = 0; i < 20; ++i)
    {
        StartCounter();
        CPPPerformOperation();
        totalTime += GetCounter();
    }
    cout << "CPP time : " << totalTime << " milliseconds.";
    system("pause");
}
I think what Herb was really getting at with the virtual dispatch is that with C++ templates you can implement something that avoids it.
Obviously you could have a class that implements a virtual ToString just like in C# and Java, but there is also the possibility of a faster implementation (such as boost::lexical_cast or boost::spirit). Here, instead of having to go through the virtual dispatch (which could inhibit inlining), the compiler knows which conversion method needs to be called based on the class's type, and can potentially inline it.
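To make that concrete, here's a minimal sketch of my own contrasting the two dispatch styles:

#include <iostream>
#include <sstream>
#include <string>

// Dynamic dispatch: the C#/Java style, through a virtual call.
struct Printable {
    virtual std::string ToString() const = 0;
    virtual ~Printable() {}
};

struct Point : Printable {
    int x, y;
    Point(int x, int y) : x(x), y(y) {}
    std::string ToString() const {
        std::ostringstream oss;
        oss << "(" << x << ", " << y << ")";
        return oss.str();
    }
};

void printDynamic(const Printable& p) {
    std::cout << p.ToString() << "\n";  // virtual call, hard to inline
}

// Static dispatch: the template style. The compiler knows the concrete
// type of T at instantiation time and can inline the whole conversion.
// (A stringstream is used here only for brevity; see the caveat below.)
template <typename T>
void printStatic(const T& value) {
    std::ostringstream oss;
    oss << value;                       // overload resolved at compile time
    std::cout << oss.str() << "\n";
}

int main() {
    printDynamic(Point(1, 2));  // goes through the vtable
    printStatic(42);            // resolved and inlinable at compile time
}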
As a note, lexical_cast had a lot of performance problems in the past, solely because it was based on stringstream (which is part of the crap that is iostreams). Later Boost versions have spent time optimising the conversion routines to eliminate this dependency.
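For reference, lexical_cast usage looks like this (assuming Boost is available):

#include <boost/lexical_cast.hpp>
#include <string>

int main()
{
    // The conversion is selected statically for each pair of types;
    // no virtual dispatch is involved.
    std::string s = boost::lexical_cast<std::string>(3.14);
    int n = boost::lexical_cast<int>("42");  // throws bad_lexical_cast on failure
    return (n == 42 && !s.empty()) ? 0 : 1;
}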
Anyway, there are two problems with his comment: the first is that you get different behaviour when dealing with inheritance, and the second is that this is a micro-optimisation you generally don't care about.