[Draft] [OpLib] Create a Dialect for Representing Operator Libraries #7771

Draft: wants to merge 7 commits into main

andrewb1999
Contributor

Opening a draft PR to start getting comments while I finish up the verifier implementation and testing.

The goal of this PR is to introduce a new OpLib dialect that represents a library of operators in CIRCT. The initial intended use is to unify the representation of operators for SCFToCalyx and LoopScheduleToCalyx. Instead of each pass having its own mapping from high-level operations to Calyx primitives, a pass will first insert the operator library, and both lowering passes will then consult that library to determine lowerings. We can also use this information in AffineToLoopSchedule to build the scheduling problem without hardcoding latency and delay information.
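The shared lookup described above can be sketched in plain C++ as follows. This is only an illustration under stated assumptions, not the implementation in this PR: `OperatorInfo`, `OperatorLibrary`, and `makeExampleLibrary` are hypothetical names, and the real dialect stores this information in IR rather than in a map. The values are taken from the example library in this description.

```cpp
#include <map>
#include <optional>
#include <string>
#include <utility>

// Hypothetical record for one library operator. The fields mirror the
// latency / incDelay / outDelay attributes shown in the PR example.
struct OperatorInfo {
  std::string name;  // operator symbol, e.g. "fmult"
  unsigned latency;  // latency value as written in the library
  double incDelay;   // incoming delay in ns
  double outDelay;   // outgoing delay in ns
};

// Hypothetical in-memory library keyed by (dialect, operation name).
// A lowering pass or a scheduler would query this one table instead of
// carrying its own hardcoded mapping.
class OperatorLibrary {
public:
  void add(const std::string &dialect, const std::string &op,
           OperatorInfo info) {
    table.insert_or_assign(std::make_pair(dialect, op), std::move(info));
  }

  // Returns the operator registered for this operation, if any.
  std::optional<OperatorInfo> lookup(const std::string &dialect,
                                     const std::string &op) const {
    auto it = table.find(std::make_pair(dialect, op));
    if (it == table.end())
      return std::nullopt;
    return it->second;
  }

private:
  std::map<std::pair<std::string, std::string>, OperatorInfo> table;
};

// Builds the two operators from the example in this description.
OperatorLibrary makeExampleLibrary() {
  OperatorLibrary lib;
  lib.add("arith", "mulf", OperatorInfo{"fmult", 4, 0.5, 0.5});
  lib.add("arith", "addi", OperatorInfo{"addi", 0, 0.2, 0.2});
  return lib;
}
```

With a table like this, a scheduler could call `lookup("arith", "mulf")` and read the latency from the result, and an operation with no library entry would come back as `std::nullopt` so the caller can report it.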

After this initial use, OpLib could be extended to support lowerings to dialects other than Calyx and to let users add their own custom operators. I am also interested in eventually supporting variable-bitwidth operators whose delay is a function of bitwidth.

Example:

hw.module.extern @ext_fmult(in %clk : i1 {calyx.clk}, in %left : i32 {calyx.data}, in %right : i32 {calyx.data}, in %ce : i1, out result : i32 {calyx.stable, calyx.data}) attributes {filename = "fmult.sv", verilogName = "fmult"}

oplib.library @lib0 {
  oplib.operator @fmult [
    latency<4>,
    incDelay<0.5>,
    outDelay<0.5>
  ] {
    oplib.target @target0(%l: f32, %r: f32) -> f32 {
      %o = oplib.operation "mulf" in "arith"(%l, %r : f32, f32) : f32
      oplib.output %o : f32
    }
    oplib.calyx_match(@target0 : (f32, f32) -> f32) produce {
      %clk, %left, %right, %ce, %result = calyx.primitive @fmult of @ext_fmult : i1, i32, i32, i1, i32
      oplib.yield clk(%clk : i1), ce(%ce : i1), ins(%left, %right : i32, i32), outs(%result : i32)
    }
  }

  oplib.operator @addi [
    latency<0>,
    incDelay<0.2>,
    outDelay<0.2>
  ] {
    oplib.target @target0(%l: i32, %r: i32) -> i32 {
      %o = oplib.operation "addi" in "arith"(%l, %r : i32, i32) : i32
      oplib.output %o : i32
    }
    oplib.calyx_match(@target0 : (i32, i32) -> i32) produce {
      %left, %right, %out = calyx.std_add @add : i32, i32, i32
      oplib.yield ins(%left, %right : i32, i32), outs(%out : i32)
    }
  }
}

Zeroing in on the @fmult operator, we see it has a latency of 4, an incoming delay of 0.5 ns, and an outgoing delay of 0.5 ns. The oplib.target defines a match target; in this example we match against arith.mulf operations. The oplib.calyx_match defines which Calyx primitive to produce if the target is matched; in this case we produce a custom Calyx primitive for an external floating-point multiplier. oplib.calyx_match verifies that the bitwidths of the target types match the bitwidths of the yielded ins and outs types.
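The bitwidth check on oplib.calyx_match can be illustrated with a small standalone C++ sketch. It is an assumption about the shape of the check, not the verifier in this PR: `TypeInfo`, `widthsMatch`, and `verifyCalyxMatch` are hypothetical names, and a real verifier would work on MLIR types and return a LogicalResult.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an MLIR type; only the bit width is modeled.
struct TypeInfo {
  unsigned bitwidth;
};

// True when both lists have the same length and pairwise-equal bit widths.
bool widthsMatch(const std::vector<TypeInfo> &lhs,
                 const std::vector<TypeInfo> &rhs) {
  if (lhs.size() != rhs.size())
    return false;
  for (std::size_t i = 0; i < lhs.size(); ++i)
    if (lhs[i].bitwidth != rhs[i].bitwidth)
      return false;
  return true;
}

// Hypothetical check: target arguments are compared against the yielded
// ins, and target results against the yielded outs. Only widths are
// compared, so an f32 target type is compatible with an i32 port.
bool verifyCalyxMatch(const std::vector<TypeInfo> &targetArgs,
                      const std::vector<TypeInfo> &targetResults,
                      const std::vector<TypeInfo> &yieldedIns,
                      const std::vector<TypeInfo> &yieldedOuts) {
  return widthsMatch(targetArgs, yieldedIns) &&
         widthsMatch(targetResults, yieldedOuts);
}
```

For @fmult, the target signature (f32, f32) -> f32 and the yielded ins (i32, i32) and outs (i32) are all 32 bits wide, so a check of this kind would accept the match even though the element types differ.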

Let me know if there are any questions or comments about how this works. I will also implement the use of operators in SCFToCalyx before finalizing this PR (but won't yet remove the hardcoded operators).

@cgyurgyik
Member

This is awesome!!

return success();
}
}];
// let assemblyFormat = [{
Contributor

nit: commented-out code.

attr = builder.getArrayAttr(alreadyParsed);
return success();
}
return {};
Contributor

nit for clarity: failure()

Contributor

@darthscsi left a comment

First, please write a rationale doc for the dialect. The semi-standard protocol is to first commit a rationale doc along with the minimal dialect .td file.

Second, and working without a rationale here, so guessing from the code: this seems to have significant overlap with both the pattern rewriting infrastructure and PDLL. If the goal is to store mappings and information about them, an alternative path may be to use TableGen to generate the mapping directly, rather than creating a dialect that then encodes the mapping. TableGen has a number of backends and was originally designed to build large mapping tables.

@darthscsi dismissed their stale review on December 13, 2024, 23:36

Dropping to comment after another pass.
