For a long time I had thought of C++ as being faster than JavaScript. Today, however, I wrote a benchmark script to compare the speed of floating-point calculations in the two languages, and the result is amazing!

JavaScript is almost 4 times faster than C++!

I had both languages do the same job on my i5-430M laptop: performing a = a + b 100,000,000 times. C++ takes about 410 ms, while JavaScript takes only about 120 ms.

I really have no idea why JavaScript runs so fast in this case. Can anyone explain that?

The JavaScript code (run with Node.js) is:

(function() {
    var a = 3.1415926, b = 2.718;
    var i, j, d1, d2;
    for(j=0; j<10; j++) {
        d1 = new Date();
        for(i=0; i<100000000; i++) {
            a = a + b;
        }
        d2 = new Date();
        console.log("Time Cost:" + (d2.getTime() - d1.getTime()) + "ms");
    }
    console.log("a = " + a);
})();

And the C++ code (compiled with g++) is:

#include <stdio.h>
#include <ctime>

int main() {
    double a = 3.1415926, b = 2.718;
    int i, j;
    clock_t start, end;
    for(j=0; j<10; j++) {
        start = clock();
        for(i=0; i<100000000; i++) {
            a = a + b;
        }
        end = clock();
        printf("Time Cost: %dms\n", (end - start) * 1000 / CLOCKS_PER_SEC);
    }
    printf("a = %lf\n", a);
    return 0;
}
Please add -O3 -ffast-math and see what happens with the C++ timings. – Jesse Good Jun 11 '13 at 3:34

"For long time, I always think that c++ should be faster than javascript." You do understand that JavaScript engines are usually implemented in C++? – jamylak Jun 11 '13 at 3:40

With optimization turned on, the C++ version is showing about 90 ms (though that obviously varies with the processor). – Jerry Coffin Jun 11 '13 at 3:40

@user2189264: Who cares? Testing optimization with optimization turned off makes no sense. – Jerry Coffin Jun 11 '13 at 3:46

@user2189264: See the expanded answer below. I've tested on both a current Intel processor and an old AMD that's quite slow by current standards. Both give substantially better results than you're seeing. – Jerry Coffin Jun 11 '13 at 4:22
Accepted answer, from paxdiablo:

I may have some bad news for you if you're on a Linux system (one that complies with POSIX, at least in this situation). The clock() call returns the number of clock ticks consumed by the program, scaled by CLOCKS_PER_SEC, which POSIX requires to be 1,000,000.

That means that, if you're on such a system, your C timings are in microseconds while your JavaScript timings (as per the JS docs for Date) are in milliseconds. So, rather than JavaScript being four times faster, C++ is actually about 250 times faster.
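As an aside, one way to take the units question out of play entirely is to measure wall-clock milliseconds directly, which is what the JavaScript version's Date arithmetic reports. A minimal sketch of the same inner loop using C++11's <chrono> (standard library only, nothing system-specific):

#include <cstdio>
#include <chrono>

int main() {
    double a = 3.1415926, b = 2.718;
    // steady_clock measures elapsed wall time, like Date in the JS version.
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < 100000000; i++) {
        a = a + b;
    }
    auto end = std::chrono::steady_clock::now();
    auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count();
    std::printf("Time Cost: %lldms\n", (long long)ms);
    std::printf("a = %lf\n", a);
    return 0;
}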

Now, it may be that you're on a system where CLOCKS_PER_SEC is something other than a million. To see whether clock() is scaled by the same value on your system, you can run the following program:

#include <stdio.h>
#include <time.h>
#include <stdlib.h>

#define MILLION * 1000000

/* Print n with comma separators, e.g. 30001946 becomes 30,001,946. */
static void commaOut (int n, char c) {
    if (n < 1000) {
        printf ("%d%c", n, c);
        return;
    }

    commaOut (n / 1000, ',');
    printf ("%03d%c", n % 1000, c);
}

int main (void) {
    int i;

    system("date");
    clock_t start = clock();
    clock_t end = start;

    /* Spin until clock() has advanced by 30 million ticks. If the two
       dates printed are about thirty seconds apart, the scaling factor
       is one million. */
    while (end - start < 30 MILLION) {
        for (i = 10 MILLION; i > 0; i--) {};
        end = clock();
    }

    system("date");
    commaOut (end - start, '\n');

    return 0;
}

The output on my box is:

Tuesday 17 November  11:53:01 AWST 2015
Tuesday 17 November  11:53:31 AWST 2015
30,001,946

showing that the scaling factor is one million. If you run that program, or investigate CLOCKS_PER_SEC, and the scaling factor turns out not to be one million, you need to look at some other things.
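If you just want to inspect the scaling factor directly, printing the macro is enough; a minimal check along these lines:

#include <stdio.h>
#include <time.h>

int main(void) {
    /* POSIX requires this to be one million; other systems may differ. */
    printf("CLOCKS_PER_SEC = %ld\n", (long)CLOCKS_PER_SEC);
    return 0;
}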


The first step is to ensure your code is actually being optimised by the compiler. That means, for example, passing -O2 or -O3 to gcc (e.g. g++ -O3 -o bench bench.cpp, where bench.cpp is whatever you've named the source file).

On my system with unoptimised code, I see:

Time Cost: 320ms
Time Cost: 300ms
Time Cost: 300ms
Time Cost: 300ms
Time Cost: 300ms
Time Cost: 300ms
Time Cost: 300ms
Time Cost: 300ms
Time Cost: 300ms
Time Cost: 300ms
a = 2717999973.760710

and it's three times as fast with -O2, albeit with a slightly different answer (off by only about one millionth of a percent):

Time Cost: 140ms
Time Cost: 110ms
Time Cost: 100ms
Time Cost: 100ms
Time Cost: 100ms
Time Cost: 100ms
Time Cost: 100ms
Time Cost: 100ms
Time Cost: 100ms
Time Cost: 100ms
a = 2718000003.159864
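The slightly different answer is expected, by the way: the optimiser is free to regroup the floating-point additions, and floating-point addition is not associative, so different groupings can round differently. A tiny illustration (values chosen to make the effect obvious):

#include <cstdio>

int main() {
    double x = 1e16, y = -1e16, z = 1.0;
    // With 53-bit doubles, 1e16 + 1.0 rounds back to 1e16, so the
    // grouping of the additions changes the final result.
    std::printf("(x + y) + z = %f\n", (x + y) + z);  // prints 1.000000
    std::printf("x + (y + z) = %f\n", x + (y + z));  // prints 0.000000
    return 0;
}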

That would bring the two situations back on a par with each other, which is what I'd expect, since JavaScript is no longer the interpreted beast of the old days, when each token was re-interpreted every time it was seen.

Modern JavaScript engines (V8, Rhino, etc.) compile the code to an intermediate form (or even to machine code), which can allow performance roughly on a par with compiled languages like C.

But, to be honest, you don't tend to choose JavaScript or C++ for speed; you choose them for their areas of strength. There aren't many C compilers floating around inside browsers, and I've not noticed many operating systems (or embedded apps) written in JavaScript.

Man. You saved us! – Mark Garcia Jun 11 '13 at 3:47

I don't think that's the case: 400 ms is long enough to feel, and the C++ output really does appear more slowly than the JavaScript output. – streaver91 Jun 11 '13 at 3:52

I mean that the time cost in my original scripts is printed to the screen after each big loop (10 big loops altogether). The time for each loop is 400 ms for C++ and 100 ms for JavaScript, and that's long enough for me to feel the difference. – streaver91 Jun 11 '13 at 4:05

@user2189264, don't feel, measure! Feeling may be good for forming a hypothesis, but it's no good for evaluating one :-) In any case, timing the program from outside includes things beyond what you're measuring (such as the afore-mentioned process startup/shutdown). – paxdiablo Jun 11 '13 at 4:06

I bet clock() returns milliseconds on my computer. I changed the inner loop to 1,000,000,000, ten times the original value; now each inner loop takes 4 seconds of wall time, and the printed time for each is about 4100 ms. So either the print takes 4 seconds and the loop takes 4.1 ms, or the loop really takes 4100 ms. The print can't take that long, because in the previous case each loop, including the printing, took only about half a second. – streaver91 Jun 11 '13 at 4:20

Another answer, from Jerry Coffin:

Doing a quick test with optimization turned on, I got results of about 150 ms for an ancient AMD 64 X2 processor, and about 90 ms for a reasonably recent Intel i7 processor.

Then I did a little more to give some idea of one reason you might want to use C++. I unrolled four iterations of the loop, to get this:

#include <stdio.h>
#include <ctime>

int main() {
    double a = 3.1415926, b = 2.718;
    int i, j;
    clock_t start, end;
    for(j=0; j<10; j++) {
        // Four independent accumulators let the CPU overlap the additions
        // instead of waiting for each one to complete before the next starts.
        double c = 0.0, d = 0.0, e = 0.0;
        start = clock();
        for(i=0; i<100000000; i+=4) {
            a += b;
            c += b;
            d += b;
            e += b;
        }
        a += c + d + e;   // fold the partial sums back into a
        end = clock();
        printf("Time Cost: %fms\n", (1000.0 * (end - start))/CLOCKS_PER_SEC);
    }
    printf("a = %lf\n", a);
    return 0;
}

This let the C++ code run in about 44 ms on the AMD (I forgot to run this version on the Intel). Then I turned on the compiler's auto-vectorizer (-Qpar with VC++; with g++, -O3 turns on auto-vectorization). That reduced the time a little further, to about 40 ms on the AMD and 30 ms on the Intel.

Bottom line: if you want to use C++, you really need to learn how to use the compiler. If you want to get really good results, you probably also want to learn how to write better code.

I should add: I didn't test a version of the JavaScript with the loop unrolled. Doing so might provide a similar (or at least some) speed improvement in JS as well. Personally, I think making the code fast is a lot more interesting than comparing JavaScript to C++.

If you want code like this to run fast, unroll the loop (at least in C++).

Since the subject of parallel computing arose, I thought I'd add another version using OpenMP. While I was at it, I cleaned up the code a little bit, so I could keep track of what was going on. I also changed the timing code a bit, to display the overall time instead of the time for each execution of the inner loop. The resulting code looked like this:

#include <stdio.h>
#include <ctime>

int main() {
    double total = 0.0;
    double inc = 2.718;
    clock_t start, end;
    start = clock();

    // Split the outer loop across threads. Each thread gets its own copy
    // of inc (firstprivate) and its own partial total, which are summed
    // when the parallel region ends (reduction). The loop indices are
    // declared inside the loops so every thread gets private copies.
    #pragma omp parallel for reduction(+:total) firstprivate(inc)
    for(int j=0; j<10; j++) {
        double a=0.0, b=0.0, c=0.0, d=0.0;
        for(int i=0; i<100000000; i+=4) {
            a += inc;
            b += inc;
            c += inc;
            d += inc;
        }
        total += a + b + c + d;
    }
    end = clock();
    printf("Time Cost: %fms\n", (1000.0 * (end - start))/CLOCKS_PER_SEC);

    printf("a = %lf\n", total);
    return 0;
}

The primary addition here is the following (admittedly somewhat arcane) line:

#pragma omp parallel for reduction(+:total) firstprivate(inc)

This tells the compiler to execute the outer loop across multiple threads, giving each thread a separate copy of inc, and to add the threads' individual values of total together after the parallel section.
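For intuition, reduction(+:total) behaves roughly as though each thread kept its own private partial sum which is merged once at the end. A hand-rolled sketch of that idea (illustrative only; the j * 2.718 body is a stand-in for the real inner loop):

#include <cstdio>
#include <omp.h>

int main() {
    double total = 0.0;
    #pragma omp parallel
    {
        double local = 0.0;              // one private accumulator per thread
        #pragma omp for
        for (int j = 0; j < 10; j++) {
            local += j * 2.718;          // stand-in for the inner loop's work
        }
        #pragma omp critical
        total += local;                  // merge the partials one thread at a time
    }
    std::printf("total = %f\n", total);  // same answer (up to rounding) for any thread count
    return 0;
}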

The result is about what you'd expect. If we don't enable OpenMP with the compiler's -openmp flag, the reported time is about 10 times what we saw for individual executions previously (409 ms for the AMD, 323 ms for the Intel). With OpenMP turned on, the times drop to 217 ms for the AMD and 100 ms for the Intel.

So, on the Intel the original version took 90ms for one iteration of the outer loop. With this version we're getting just slightly longer (100 ms) for all 10 iterations of the outer loop -- an improvement in speed of about 9:1. On a machine with more cores, we could expect even more improvement (OpenMP will normally take advantage of all available cores automatically, though you can manually tune the number of threads if you want).
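If you do want to tune the thread count manually, the OpenMP runtime exposes it directly; a small sketch (omp_set_num_threads and friends are standard OpenMP library calls; compile with -fopenmp under g++):

#include <cstdio>
#include <omp.h>

int main() {
    printf("max threads by default: %d\n", omp_get_max_threads());
    omp_set_num_threads(4);          // cap subsequent parallel regions at 4 threads
    #pragma omp parallel
    {
        #pragma omp single           // report once rather than once per thread
        printf("parallel region running with %d threads\n", omp_get_num_threads());
    }
    return 0;
}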

Cool! Parallel computing without explicit parallel code. – streaver91 Jun 11 '13 at 4:23

@user2189264: Yes and no -- it's still executing on a single core. With a little more work (some OpenMP directives, for example) we could have it execute on multiple cores as well, effectively multiplying the speed again. All I've done so far, though, is let it make better use of the resources of a single core (instruction-level parallelism, not thread-level parallelism). – Jerry Coffin Jun 11 '13 at 4:32
