Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CIR][CIRGen] Fixes function calls with return values and cleanup stage #1214

Merged
merged 6 commits into from
Dec 17, 2024

Conversation

gitoleg
Copy link
Collaborator

@gitoleg gitoleg commented Dec 6, 2024

The Problem

Let's take a look at the following code:

struct A {
~A() {}
};

int foo() { return 42; }
void bar() {
  A a;
  int b = foo(); 
}

The call to foo guarded by the synthetic tryOp looks approximately like the following:

cir.try synthetic cleanup {
   %2 = cir.call exception @_Z3foov() : () -> !s32i cleanup {
      cir.call @_ZN1AD1Ev(%0) : (!cir.ptr<!ty_A>) -> () extra(#fn_attr1)   // call to destructor of 'A'
      cir.yield
    } 
    cir.yield
} catch [#cir.unwind {
    cir.resume 
}] 
cir.store %2, %1: !s32i, !cir.ptr<!s32i>  // CIR verification error

The result of the foo call is in the try region - and is not accessible from the outside, so the code generation fails with
operand #0 does not dominate its use .

Solution

So we have several options how to handle this properly.

  1. We may intpoduce a new operation here, like TryCall but probably more high level one, e.g. introduce the InvokeOp.
  2. Also, we may add the result to TryOp.
  3. The fast fix that is implemented in this PR is a temporary alloca where we store the call result right in the try region. And the result of the whole emitCall is a load from the temp alloca.

So this PR is both the request for changes and an open discussion as well - how to handle this properly. So far I choose the third approach. If it's ok - I will need to create one more PR with a similar fix for the aggregated results or update this one.

@bcardosolopes
Copy link
Member

(cc @smeenai)

@bcardosolopes
Copy link
Member

bcardosolopes commented Dec 6, 2024

Thanks for working on this, we are in the middle of redesigning exceptions and cleanups, so good timing.

The fact that we try to use one cir.try to handle both actual try/catch and the synthetic (spurious) ones (which are basically invokes) is complicating things. Part of the idea here was also to eventually have a pass which allow us to merge a bunch of those synthetic calls into just one, with many calls inside - as you noticed that would require using extra allocas though.

I'd prefer a mix between (1) and (2): a call operation that has a high level invoke semantics (i.e. it has an attached catch):

%2 = cir.call exception @_Z3foov() : () -> !s32i cleanup {
      cir.call @_ZN1AD1Ev(%0) : (!cir.ptr<!ty_A>) -> () extra(#fn_attr1)   // call to destructor of 'A'
      cir.yield
} catch [#cir.unwind {
    // This probably don't need to be a catch, nor we need this region, but at least an attribute to convey
    // we later need a landinpad
    cir.resume 
}] 
cir.store %2, %1: !s32i, !cir.ptr<!s32i>  // CIR verification error

We lose the "merging" capabilities I mentioned earlier, but perhaps we can look at it later in the pipeline. If we go this route, another problem would be how to integrate try/catches, perhaps those could be enforced not to have a catch blocks? Thoughts?

@bcardosolopes
Copy link
Member

(cc @ChuanqiXu9 which asked another question recently)

Copy link
Collaborator

@smeenai smeenai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for fixing this!

In #1123, we've been discussing a potential alternative representation of cleanups, which might help here. For the current way cleanups are setup, is there any way to have the store to the b alloca occur directly inside the cir.try_synthetic, instead of going through tmp?

clang/lib/CIR/CodeGen/CIRGenCall.cpp Outdated Show resolved Hide resolved
@gitoleg
Copy link
Collaborator Author

gitoleg commented Dec 9, 2024

@smeenai

For the current way cleanups are setup, is there any way to have the store to the b alloca occur directly inside the cir.try_synthetic, instead of going through tmp?

At the first glance I don't see such way - at the moment emitCall is called, the ReturnValue slot is not set - and we need to create this workaround with temp location.

@gitoleg
Copy link
Collaborator Author

gitoleg commented Dec 9, 2024

@bcardosolopes

We lose the "merging" capabilities I mentioned earlier, but perhaps we can look at it later in the pipeline. If we go this route, another problem would be how to integrate try/catches, perhaps those could be enforced not to have a catch blocks? Thoughts?

What is the problem with try/catches? I'm not sure I understand.
In general, I like the idea of calls with a cleanup region and the 'need landing pad' attribute. Not sure though how we can merge the cleanups though - we need to prove somehow that all the cleanups in the list of calls are equal to each other. From this point of view single synthetic try looks more suitable.

we are in the middle of redesigning exceptions and cleanups, so good timing.

So is it better to close the PR for now? There are several other C++ things I wanted to take a look at. But I can continue here as well - so up to you to decide, what is better.

@bcardosolopes
Copy link
Member

What is the problem with try/catches? I'm not sure I understand.

It's not a problem, it's more about the fact that automatic variables inside try/catch are already allocated within the cir.try scope, meaning we don't have the same problem as the synthetic version does.

we need to prove somehow that all the cleanups in the list of calls are equal to each other

This is more of a future opt, in order to lower multiple cleanups region to the same basic block, I was thinking about adding an operation to tag identical landing pads (CIRGen knows exactly which ones are the same, so we could leverage that). I haven't fleshed out all the details, but this is my brain dump.

From this point of view single synthetic try looks more suitable.

This is definetely how we probably want to see it out of CIRGen, it was more about a future opt!

So is it better to close the PR for now? There are several other C++ things I wanted to take a look at.

This PR is good (see below), but because I'm working on making cleanups closer to OG, I had to give it a break on all multiple inheritance stuff, so that route is also open if you want to look at those.

But I can continue here as well - so up to you to decide, what is better.

This is a good change and will fix real problems we are all facing. I'm focusing first on general cleanups (not necessarily exception related) so I think you should go ahead with this approach - when (if) we change the design of this later we can rely on the testcases you added and make sure we don't regress. If possible, it would probably be nice if this PR can get rid of the extra alloca/load/store during FlattenCFG, but more like bonus point.

Copy link
Member

@ChuanqiXu9 ChuanqiXu9 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I tried this in Spec2017 and it fixes a problem. Are you doing similar things?

Copy link

github-actions bot commented Dec 10, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@gitoleg gitoleg force-pushed the free-call-with-dtor branch from 892b92e to c78799e Compare December 10, 2024 13:53
@gitoleg
Copy link
Collaborator Author

gitoleg commented Dec 10, 2024

LGTM. I tried this in Spec2017 and it fixes a problem. Are you doing similar things?

Not the same but similar!)

If possible, it would probably be nice if this PR can get rid of the extra alloca/load/store during FlattenCFG, but more like bonus point.

Well, I would glad to get the extra bonus point) but looks like there are around 10 tests that fail if I eliminate the temp alloca. Of course, I can fix them as well but I doubt it's what do you need.
Alternatively, I can remove exactly temp allocas, i.e. such ones whose names contain 'tmp' substring.

Copy link
Member

@bcardosolopes bcardosolopes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM after some nits.

clang/lib/CIR/CodeGen/CIRGenCall.cpp Outdated Show resolved Hide resolved
clang/lib/CIR/CodeGen/CIRGenCall.cpp Outdated Show resolved Hide resolved
@bcardosolopes
Copy link
Member

Well, I would glad to get the extra bonus point) but looks like there are around 10 tests that fail if I eliminate the temp alloca.

You just introduced this alloca, that's surprising.

I can remove exactly temp allocas, i.e. such ones whose names contain 'tmp' substring.

Oh, I'm actually only worried about the ones you just introduced, not anything new. It can come in another PR though, since it's a small cleanup / improvement.

Thank you!

@gitoleg
Copy link
Collaborator Author

gitoleg commented Dec 13, 2024

@bcardosolopes done!

@bcardosolopes bcardosolopes merged commit 3d50a02 into llvm:main Dec 17, 2024
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants