Wednesday, July 6, 2022

[FIXED] When is it safe to capture a lambda inside another lambda by reference?

Issue

Suppose you have the following program:

#include <functional>
#include <iostream>

static std::function<int(int)> pack_a_lambda( std::function<int(int)> to_be_packed ) {
    return [=]( int value ) {
        return to_be_packed( value * 4 );
    };
}

int main() {
    auto f = pack_a_lambda( []( int value ) {
        return value * 2;
    } );

    int result = f( 2 );

    std::cout << result << std::endl; // should print 16
    return 0;
}

I haven't run the exact code above, because I tested the original in Google Test and then edited it slightly into this form. The function pack_a_lambda takes its argument by value as a std::function, so I believe the temporary lambda is copied into it. Then, when we create the new lambda, we capture the parameter to_be_packed by value again. It works, and it seems to me it should be safe.

Now suppose we capture that lambda by reference instead:

static std::function<int(int)> pack_a_lambda( std::function<int(int)> to_be_packed ) {
    return [&]( int value ) {
        return to_be_packed( value * 4 );
    };
}

In my specific use case, the resulting lambda executes four times faster. I couldn't reproduce this difference in the simplified example above, though; here, capturing the lambda by reference even seems to make it ever-so-slightly slower. Either way, there is some performance difference.

But is it safe? The argument to_be_packed is a copy, but it's still a temporary, right? That should make it unsafe, but I'm not sure. Neither my UB sanitizer nor my AddressSanitizer complains, though I concede that doesn't prove anything. If I pass to_be_packed by reference...

static std::function<int(int)> pack_a_lambda( const std::function<int(int)> &to_be_packed ) {
    return [&]( int value ) {
        return to_be_packed( value * 4 );
    };
}

...the AddressSanitizer complains, which is not surprising, because the lambda I pass into the function is also a temporary. So that leaves example two: Is it safe or not, and what are possible reasons it might be faster to execute in some cases?


Solution

static std::function<int(int)> pack_a_lambda( std::function<int(int)> to_be_packed ) {
    return [&]( int value ) {
        return to_be_packed( value * 4 );
    };
}

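is undefined behavior: the returned lambda holds a reference to the local parameter to_be_packed, which is destroyed when pack_a_lambda returns, so you effectively "return" a reference to a local variable. Capturing by value is the safe way here.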
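If the extra copy made by the by-value capture is a concern, one option is to move the parameter into the closure with a C++14 init-capture. This is only a minimal sketch and not part of the original answer; the names pack_a_lambda_moved, packed and run_moved_example are made up for illustration.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical variant of pack_a_lambda: the by-value parameter is moved
// into the closure, so the closure owns the callable and nothing dangles.
static std::function<int(int)> pack_a_lambda_moved( std::function<int(int)> to_be_packed ) {
    return [packed = std::move( to_be_packed )]( int value ) {
        return packed( value * 4 );
    };
}

// Usage: the temporary lambda is converted to the std::function parameter,
// which is then moved (not copied) into the returned closure.
static int run_moved_example() {
    auto f = pack_a_lambda_moved( []( int value ) {
        return value * 2;
    } );
    int result = f( 2 ); // (2 * 4) * 2
    assert( result == 16 );
    return result;
}
```

The closure still owns its callable, so it stays valid after pack_a_lambda_moved returns. Whether this makes any measurable difference depends on the use case.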

static std::function<int(int)> pack_a_lambda(const std::function<int(int)>& to_be_packed ) {
    return [&]( int value ) {
        return to_be_packed( value * 4 );
    };
}

might be correct, but you have to ensure that the lifetime of the passed argument is longer than that of the returned std::function.

auto func = std::function([]( int value ) {
    return value * 2;
});
auto f = pack_a_lambda(func); // OK: func outlives f
// auto f2 = pack_a_lambda([](int){ return 42;}); // KO: a temporary std::function is created and destroyed at the end of the statement

Because a temporary can bind to a const reference, it is safer in that case to delete the rvalue overload:

static std::function<int(int)> pack_a_lambda(std::function<int(int)>&&) = delete;
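
Put together, the overload set might look like the sketch below. It assumes C++17. The detection trait can_pack and the function run_deleted_overload_example are hypothetical names, added only to show at compile time which calls are accepted.

```cpp
#include <cassert>
#include <functional>
#include <type_traits>
#include <utility>

// Accepts only lvalues: the caller must keep the std::function alive for
// as long as the returned std::function is used.
static std::function<int(int)> pack_a_lambda( const std::function<int(int)>& to_be_packed ) {
    return [&]( int value ) {
        return to_be_packed( value * 4 );
    };
}

// Rvalue std::function arguments select this overload, which turns the
// dangling-reference bug into a compile error.
static std::function<int(int)> pack_a_lambda( std::function<int(int)>&& ) = delete;

// Hypothetical detection trait: true if pack_a_lambda( T ) compiles.
template <typename T, typename = void>
struct can_pack : std::false_type {};

template <typename T>
struct can_pack<T, std::void_t<decltype( pack_a_lambda( std::declval<T>() ) )>> : std::true_type {};

static_assert( can_pack<std::function<int(int)>&>::value, "an lvalue is accepted" );
static_assert( !can_pack<std::function<int(int)>>::value, "an rvalue is rejected" );

static int run_deleted_overload_example() {
    std::function<int(int)> func = []( int value ) { return value * 2; };
    auto f = pack_a_lambda( func ); // OK: func outlives f
    int result = f( 2 );            // (2 * 4) * 2
    assert( result == 16 );
    return result;
}
```

The deleted overload only guards against passing a temporary. The caller is still responsible for keeping func alive while f is in use.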


Answered By - Jarod42
Answer Checked By - David Marino (PHPFixing Volunteer)
