Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue running the tutorial #523

Closed
pradyotprakash opened this issue Oct 9, 2017 · 4 comments
Closed

Issue running the tutorial #523

pradyotprakash opened this issue Oct 9, 2017 · 4 comments

Comments

@pradyotprakash
Copy link

pradyotprakash commented Oct 9, 2017

I installed tvm as per the instructions in the github repo. When I try to run the tutorial provided here, I get an error when trying to execute the statement,

fadd_cuda = tvm.build(s, [A, B, C], "cuda", target_host="llvm", name="myadd")

I get the following error.

<ipython-input-5-a63326b9397b> in <module>()
      5 s[C].bind(tx, tvm.thread_axis("threadIdx.x"))
      6 
----> 7 fadd_cuda = tvm.build(s, [A, B, C], "cuda", target_host="llvm", name="myadd")

/home/pradyot/.local/lib/python2.7/site-packages/tvm-0.1.0-py2.7-linux-x86_64.egg/tvm/build_module.pyc in build(sch, args, target, target_host, name, binds)
    366     fhost = [ir_pass.LowerIntrin(x, target_host) for x in fhost]
    367     fhost = [ir_pass.CombineContextCall(x) for x in fhost]
--> 368     mhost = codegen.build_module(fhost, target_host)
    369 
    370     if fdevice:

/home/pradyot/.local/lib/python2.7/site-packages/tvm-0.1.0-py2.7-linux-x86_64.egg/tvm/codegen.pyc in build_module(lowered_func, target)
     18         The corressponding module.
     19     """
---> 20     return _Build(lowered_func, target)
     21 
     22 _init_api("tvm.codegen")

/home/pradyot/.local/lib/python2.7/site-packages/tvm-0.1.0-py2.7-linux-x86_64.egg/tvm/_ffi/function.pyc in my_api_func(*args)
    253         The result of the API function call.
    254         """
--> 255         return flocal(*args)
    256     return my_api_func
    257 

/home/pradyot/.local/lib/python2.7/site-packages/tvm-0.1.0-py2.7-linux-x86_64.egg/tvm/_ffi/_ctypes/function.pyc in __call__(self, *args)
    181         check_call(_LIB.TVMFuncCall(
    182             self.handle, values, tcodes, ctypes.c_int(num_args),
--> 183             ctypes.byref(ret_val), ctypes.byref(ret_tcode)))
    184         _ = temp_args
    185         _ = args

/home/pradyot/.local/lib/python2.7/site-packages/tvm-0.1.0-py2.7-linux-x86_64.egg/tvm/_ffi/base.pyc in check_call(ret)
     60     """
     61     if ret != 0:
---> 62         raise TVMError(py_str(_LIB.TVMGetLastError()))
     63 
     64 

TVMError: [20:21:14] src/codegen/codegen.cc:27: Check failed: bf != nullptr Target llvm is not enabled

Stack trace returned 10 entries:
[bt] (0) /home/pradyot/.local/lib/python2.7/site-packages/tvm-0.1.0-py2.7-linux-x86_64.egg/tvm/libtvm.so(_ZN3tvm7codegen5BuildERKNS_5ArrayINS_11LoweredFuncEvEERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0xf99) [0x7fbf49afbc39]
[bt] (1) /home/pradyot/.local/lib/python2.7/site-packages/tvm-0.1.0-py2.7-linux-x86_64.egg/tvm/libtvm.so(+0x316979) [0x7fbf49ab6979]
[bt] (2) /home/pradyot/.local/lib/python2.7/site-packages/tvm-0.1.0-py2.7-linux-x86_64.egg/tvm/libtvm.so(TVMFuncCall+0x5e) [0x7fbf49cb440e]
[bt] (3) /usr/lib/x86_64-linux-gnu/libffi.so.6(ffi_call_unix64+0x4c) [0x7fbf75e85e40]
[bt] (4) /usr/lib/x86_64-linux-gnu/libffi.so.6(ffi_call+0x2eb) [0x7fbf75e858ab]
[bt] (5) /usr/lib/python2.7/lib-dynload/_ctypes.x86_64-linux-gnu.so(_ctypes_callproc+0x48f) [0x7fbf7609d3df]
[bt] (6) /usr/lib/python2.7/lib-dynload/_ctypes.x86_64-linux-gnu.so(+0x11d82) [0x7fbf760a1d82]
[bt] (7) /usr/bin/python(PyObject_Call+0x43) [0x4b0cb3]
[bt] (8) /usr/bin/python(PyEval_EvalFrameEx+0x5faf) [0x4c9faf]
[bt] (9) /usr/bin/python(PyEval_EvalCodeEx+0x255) [0x4c2765]

The stack trace tells me that llvm is not enabled. However, the installation instructions mention that llvm is not needed for CUDA targets. Anyway, I re-build tvm using llvm 4.0.1 but I still get a similar error. What am I doing wrong?

@tqchen
Copy link
Member

tqchen commented Oct 9, 2017

You want to do make clean then make when build with llvm, because some of the pre-defined macros are not traced by makefile

@pradyotprakash
Copy link
Author

Yes, I deleted the initial folder before trying this again. Not sure why I am getting this :/
In any case, the basic example should work without llvm, right?

@tqchen
Copy link
Member

tqchen commented Oct 9, 2017

This is mainly because the target_host is force set to "llvm". If we do not set the target_host, it will redirect to "stackvm" that do not rely on LLVM.

However, we do need llvm for some advanced features, like storing the code into shared library.

@tqchen
Copy link
Member

tqchen commented Oct 9, 2017

you want to make sure that LLVM_CONFIG is set properly, and after you rebuild, you reinstalled the library.

@tqchen tqchen closed this as completed Oct 11, 2017
zxybazh added a commit to zxybazh/tvm that referenced this issue Jan 7, 2022
[Meta Schedule][M3c] Schedule Rules, Mutator & Postprocs (apache#485)

[Meta Schedule][M3c] PostOrderApply (apache#486)

Fix Post Order Apply (apache#490)

[MetaSchedule] Relay Integration (apache#489)

[M3c][Meta Schedule] Add Trace Correctness Test for PostOrderApply (apache#492)

Fix replay trace. (apache#493)

[M3c][Meta Schedule] Implement the Replay Func class. (apache#495)

[PR] Test script for meta-schedule task extraction. Interface to load… (apache#494)

[Meta Schedule Refactor] Get child blocks (apache#500)

Read-at && Write-at (apache#497)

[M3c][Meta Schedule] Measure Callbacks (apache#498)

[Bug] Fix Infinite Loop Caused When Calling Methods Not Overrided In PyClass (apache#496)

[MetaSchedule] Sample-Perfect-Tile (apache#501)

[MetaSchedule] TE Workloads (apache#502)

[TensorIR] GetProducer, GetConsumer (apache#506)

[MetaScheduleRefactor] Annotate&Unannotate (apache#505)

[MetaSchedule] Multi-Level-Tiling & Auto-Inline (apache#503)

[Tests] Add unittests for auto-inline and multi-level-tiling (apache#508)

[Meta Schedule] Minor Fixes (apache#507)

[MetaSchedule] Rewrite Cooperative-Fetching / Unbound-Block / Reduction-Block (apache#509)

[MetaSchedule] Rewrite Parallel-Vectorize-Unroll / Verify-GPU / Disallow-Dynamic-Loops (apache#499)

[Meta Schedule] Add Helper Function & Minor Modification (apache#512)

[MetaSchedule] Test for Rewrite Parallel-Vectorize-Unroll  (apache#513)

[Meta Schedule] Feature Extractor & Cost Model (apache#510)

Blockize & Tensorize (apache#514)

Layout Rewriting: Suggest-Index-Map (apache#520)

[MetaSchedule] Parallel-Vectorize-Unroll & Random-Compute-Location (apache#516)

[Meta Schedule] Per-Store-Feature (apache#521)

Add traced schedule for blockize & tensorize (apache#526)

[Meta Schedule] Add XGBoost Model & Random Model (apache#519)

User-Interface: Tune-TIR (apache#525)

User-Interface: Tune-TE (apache#527)

[Minor] More logging on python (apache#528)

Get CUDA tuning working (apache#529)

[MetaSchedule] TensorRT BYOC (apache#518)

[BugFix] LocalBuilder API (apache#531)

[Meta Schedule] Add Cost Model Update Measure Callback (apache#530)

[Bugfix] BuilderInput with default params (apache#532)

[MetaSchedule] Mutator-Tile-Size, Mutate-Parallel, Mutate-Unroll (apache#534)

[Meta Schedule] Evolutionary Search (apache#522)

[BugFix] Remove duplicated definition of MakeMultinomialSampler (apache#535)

[Meta Schedule] Fix some bugs (apache#537)

Initiate Experiments for CPU Performance Alignment with Ansor (apache#538)

[Meta Schedule] Tweak experiment scripts (apache#539)

[Meta Schedule] Initiate experiments on CUDA (apache#540)

[TIR][Schedule] Buffer transform (apache#523)

Auto Tensor Core (apache#524)

Working on Evo Search (apache#542)

[Meta Schedule] Add Replay Tuning Interface (apache#543)

Evolutionary Search on CPU (apache#544)

Misc improvement over the error message (apache#545)

[TIR][Schedule] Software pipelining (apache#533)

[Meta Schedule Refactor] fixing unit tests (apache#547)

[MetaSchedule] Mutator-Compute-Location (apache#548)

Misc Improvement of Evolutionary Search (apache#549)

Hotfix for software pipeline (apache#552)

Misc Improvement (apache#550)

Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn>
Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>
Co-authored-by: Hongyi Jin <3231950289@qq.com>
Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com>
Co-authored-by: Junru Shao <junrushao1994@gmail.com>
Co-authored-by: Wuwei Lin <wuwei@apache.org>
Co-authored-by: Sunghyun Park <49998730+sunggg@users.noreply.github.com>
Co-authored-by: Xiyou Zhou <xiyou@octoml.ai>
zxybazh pushed a commit to zxybazh/tvm that referenced this issue Jan 7, 2022
[Meta Schedule][M3c] Schedule Rules, Mutator & Postprocs (apache#485)

[Meta Schedule][M3c] PostOrderApply (apache#486)

Fix Post Order Apply (apache#490)

[MetaSchedule] Relay Integration (apache#489)

[M3c][Meta Schedule] Add Trace Correctness Test for PostOrderApply (apache#492)

Fix replay trace. (apache#493)

[M3c][Meta Schedule] Implement the Replay Func class. (apache#495)

[PR] Test script for meta-schedule task extraction. Interface to load… (apache#494)

[Meta Schedule Refactor] Get child blocks (apache#500)

Read-at && Write-at (apache#497)

[M3c][Meta Schedule] Measure Callbacks (apache#498)

[Bug] Fix Infinite Loop Caused When Calling Methods Not Overrided In PyClass (apache#496)

[MetaSchedule] Sample-Perfect-Tile (apache#501)

[MetaSchedule] TE Workloads (apache#502)

[TensorIR] GetProducer, GetConsumer (apache#506)

[MetaScheduleRefactor] Annotate&Unannotate (apache#505)

[MetaSchedule] Multi-Level-Tiling & Auto-Inline (apache#503)

[Tests] Add unittests for auto-inline and multi-level-tiling (apache#508)

[Meta Schedule] Minor Fixes (apache#507)

[MetaSchedule] Rewrite Cooperative-Fetching / Unbound-Block / Reduction-Block (apache#509)

[MetaSchedule] Rewrite Parallel-Vectorize-Unroll / Verify-GPU / Disallow-Dynamic-Loops (apache#499)

[Meta Schedule] Add Helper Function & Minor Modification (apache#512)

[MetaSchedule] Test for Rewrite Parallel-Vectorize-Unroll  (apache#513)

[Meta Schedule] Feature Extractor & Cost Model (apache#510)

Blockize & Tensorize (apache#514)

Layout Rewriting: Suggest-Index-Map (apache#520)

[MetaSchedule] Parallel-Vectorize-Unroll & Random-Compute-Location (apache#516)

[Meta Schedule] Per-Store-Feature (apache#521)

Add traced schedule for blockize & tensorize (apache#526)

[Meta Schedule] Add XGBoost Model & Random Model (apache#519)

User-Interface: Tune-TIR (apache#525)

User-Interface: Tune-TE (apache#527)

[Minor] More logging on python (apache#528)

Get CUDA tuning working (apache#529)

[MetaSchedule] TensorRT BYOC (apache#518)

[BugFix] LocalBuilder API (apache#531)

[Meta Schedule] Add Cost Model Update Measure Callback (apache#530)

[Bugfix] BuilderInput with default params (apache#532)

[MetaSchedule] Mutator-Tile-Size, Mutate-Parallel, Mutate-Unroll (apache#534)

[Meta Schedule] Evolutionary Search (apache#522)

[BugFix] Remove duplicated definition of MakeMultinomialSampler (apache#535)

[Meta Schedule] Fix some bugs (apache#537)

Initiate Experiments for CPU Performance Alignment with Ansor (apache#538)

[Meta Schedule] Tweak experiment scripts (apache#539)

[Meta Schedule] Initiate experiments on CUDA (apache#540)

[TIR][Schedule] Buffer transform (apache#523)

Auto Tensor Core (apache#524)

Working on Evo Search (apache#542)

[Meta Schedule] Add Replay Tuning Interface (apache#543)

Evolutionary Search on CPU (apache#544)

Misc improvement over the error message (apache#545)

[TIR][Schedule] Software pipelining (apache#533)

[Meta Schedule Refactor] fixing unit tests (apache#547)

[MetaSchedule] Mutator-Compute-Location (apache#548)

Misc Improvement of Evolutionary Search (apache#549)

Hotfix for software pipeline (apache#552)

Misc Improvement (apache#550)

Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn>
Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>
Co-authored-by: Hongyi Jin <3231950289@qq.com>
Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com>
Co-authored-by: Junru Shao <junrushao1994@gmail.com>
Co-authored-by: Wuwei Lin <wuwei@apache.org>
Co-authored-by: Sunghyun Park <49998730+sunggg@users.noreply.github.com>
Co-authored-by: Xiyou Zhou <xiyou@octoml.ai>

Squashed commit

[Meta Schedule][M3c] Schedule Rules, Mutator & Postprocs (apache#485)

[Meta Schedule][M3c] PostOrderApply (apache#486)

Fix Post Order Apply (apache#490)

[MetaSchedule] Relay Integration (apache#489)

[M3c][Meta Schedule] Add Trace Correctness Test for PostOrderApply (apache#492)

Fix replay trace. (apache#493)

[M3c][Meta Schedule] Implement the Replay Func class. (apache#495)

[PR] Test script for meta-schedule task extraction. Interface to load… (apache#494)

[Meta Schedule Refactor] Get child blocks (apache#500)

Read-at && Write-at (apache#497)

[M3c][Meta Schedule] Measure Callbacks (apache#498)

[Bug] Fix Infinite Loop Caused When Calling Methods Not Overrided In PyClass (apache#496)

[MetaSchedule] Sample-Perfect-Tile (apache#501)

[MetaSchedule] TE Workloads (apache#502)

[TensorIR] GetProducer, GetConsumer (apache#506)

[MetaScheduleRefactor] Annotate&Unannotate (apache#505)

[MetaSchedule] Multi-Level-Tiling & Auto-Inline (apache#503)

[Tests] Add unittests for auto-inline and multi-level-tiling (apache#508)

[Meta Schedule] Minor Fixes (apache#507)

[MetaSchedule] Rewrite Cooperative-Fetching / Unbound-Block / Reduction-Block (apache#509)

[MetaSchedule] Rewrite Parallel-Vectorize-Unroll / Verify-GPU / Disallow-Dynamic-Loops (apache#499)

[Meta Schedule] Add Helper Function & Minor Modification (apache#512)

[MetaSchedule] Test for Rewrite Parallel-Vectorize-Unroll  (apache#513)

[Meta Schedule] Feature Extractor & Cost Model (apache#510)

Blockize & Tensorize (apache#514)

Layout Rewriting: Suggest-Index-Map (apache#520)

[MetaSchedule] Parallel-Vectorize-Unroll & Random-Compute-Location (apache#516)

[Meta Schedule] Per-Store-Feature (apache#521)

Add traced schedule for blockize & tensorize (apache#526)

[Meta Schedule] Add XGBoost Model & Random Model (apache#519)

User-Interface: Tune-TIR (apache#525)

User-Interface: Tune-TE (apache#527)

[Minor] More logging on python (apache#528)

Get CUDA tuning working (apache#529)

[MetaSchedule] TensorRT BYOC (apache#518)

[BugFix] LocalBuilder API (apache#531)

[Meta Schedule] Add Cost Model Update Measure Callback (apache#530)

[Bugfix] BuilderInput with default params (apache#532)

[MetaSchedule] Mutator-Tile-Size, Mutate-Parallel, Mutate-Unroll (apache#534)

[Meta Schedule] Evolutionary Search (apache#522)

[BugFix] Remove duplicated definition of MakeMultinomialSampler (apache#535)

[Meta Schedule] Fix some bugs (apache#537)

Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn>
Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>
Co-authored-by: Hongyi Jin <3231950289@qq.com>
Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com>
Co-authored-by: Junru Shao <junrushao1994@gmail.com>
Co-authored-by: Wuwei Lin <wuwei@apache.org>
Co-authored-by: Sunghyun Park <49998730+sunggg@users.noreply.github.com>
Co-authored-by: Xiyou Zhou <xiyou@octoml.ai>

Initiate Experiments for CPU Performance Alignment with Ansor (apache#538)

* ...

* update

* update

* print

* more

[Meta Schedule] Tweak experiment scripts (apache#539)

[Meta Schedule] Initiate experiments on CUDA (apache#540)

* [Meta Schedule] Initiate experiments on CUDA

* ...

* fix boolean printing

Auto Tensor Core (apache#524)

Working on Evo Search (apache#542)

Squashed commit

[Meta Schedule][M3c] Schedule Rules, Mutator & Postprocs (apache#485)

[Meta Schedule][M3c] PostOrderApply (apache#486)

Fix Post Order Apply (apache#490)

[MetaSchedule] Relay Integration (apache#489)

[M3c][Meta Schedule] Add Trace Correctness Test for PostOrderApply (apache#492)

Fix replay trace. (apache#493)

[M3c][Meta Schedule] Implement the Replay Func class. (apache#495)

[PR] Test script for meta-schedule task extraction. Interface to load… (apache#494)

[Meta Schedule Refactor] Get child blocks (apache#500)

Read-at && Write-at (apache#497)

[M3c][Meta Schedule] Measure Callbacks (apache#498)

[Bug] Fix Infinite Loop Caused When Calling Methods Not Overrided In PyClass (apache#496)

[MetaSchedule] Sample-Perfect-Tile (apache#501)

[MetaSchedule] TE Workloads (apache#502)

[TensorIR] GetProducer, GetConsumer (apache#506)

[MetaScheduleRefactor] Annotate&Unannotate (apache#505)

[MetaSchedule] Multi-Level-Tiling & Auto-Inline (apache#503)

[Tests] Add unittests for auto-inline and multi-level-tiling (apache#508)

[Meta Schedule] Minor Fixes (apache#507)

[MetaSchedule] Rewrite Cooperative-Fetching / Unbound-Block / Reduction-Block (apache#509)

[MetaSchedule] Rewrite Parallel-Vectorize-Unroll / Verify-GPU / Disallow-Dynamic-Loops (apache#499)

[Meta Schedule] Add Helper Function & Minor Modification (apache#512)

[MetaSchedule] Test for Rewrite Parallel-Vectorize-Unroll  (apache#513)

[Meta Schedule] Feature Extractor & Cost Model (apache#510)

Blockize & Tensorize (apache#514)

Layout Rewriting: Suggest-Index-Map (apache#520)

Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn>
Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>
Co-authored-by: Hongyi Jin <3231950289@qq.com>
Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com>
Co-authored-by: Junru Shao <junrushao1994@gmail.com>
Co-authored-by: Wuwei Lin <wuwei@apache.org>
Co-authored-by: Sunghyun Park <49998730+sunggg@users.noreply.github.com>

[MetaSchedule] Parallel-Vectorize-Unroll & Random-Compute-Location (apache#516)

* parallel vectorize unroll & random compute location

* rebased

[Meta Schedule] Per-Store-Feature (apache#521)

[Meta Schedule] Add XGBoost Model & Random Model (apache#519)

* Squashed commit

[Meta Schedule][M3c] Schedule Rules, Mutator & Postprocs (apache#485)

[Meta Schedule][M3c] PostOrderApply (apache#486)

Fix Post Order Apply (apache#490)

[MetaSchedule] Relay Integration (apache#489)

[M3c][Meta Schedule] Add Trace Correctness Test for PostOrderApply (apache#492)

Fix replay trace. (apache#493)

[M3c][Meta Schedule] Implement the Replay Func class. (apache#495)

[PR] Test script for meta-schedule task extraction. Interface to load… (apache#494)

[Meta Schedule Refactor] Get child blocks (apache#500)

Read-at && Write-at (apache#497)

[M3c][Meta Schedule] Measure Callbacks (apache#498)

[Bug] Fix Infinite Loop Caused When Calling Methods Not Overrided In PyClass (apache#496)

[MetaSchedule] Sample-Perfect-Tile (apache#501)

[MetaSchedule] TE Workloads (apache#502)

Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn>
Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>
Co-authored-by: Hongyi Jin <3231950289@qq.com>
Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com>
Co-authored-by: Junru Shao <junrushao1994@gmail.com>
Co-authored-by: Wuwei Lin <wuwei@apache.org>
Co-authored-by: Sunghyun Park <49998730+sunggg@users.noreply.github.com>

[TensorIR] GetProducer, GetConsumer (apache#506)

[MetaScheduleRefactor] Annotate&Unannotate (apache#505)

* annotate

* annotate

* lint

* test

* fix

* fix

* fix

[MetaSchedule] Rewrite Cooperative-Fetching / Unbound-Block / Reduction-Block (apache#509)

Fix sttr func & schedule naming.

Fix schedule -> sch.

Add feature extractor.

Fix init.

Add cost model.

Remove unused include.

[MetaSchedule] Rewrite Parallel-Vectorize-Unroll / Verify-GPU / Disallow-Dynamic-Loops (apache#499)

* wip

fix

* revoke change to gallery

* split postprocessors to separate files

* rename attrs

* minor

* minor tweak on utils.h

* refactor disallow-dynamic-loop

* refactor verify_gpu_code

* succesfully give up refactoring parallelize-vectorize-unroll

* python structuring

* unittests

Co-authored-by: Junru Shao <junrushao1994@gmail.com>

Fix issues.

Fix init.

Finish random model part.

Finish xgb model.

Minor fix.

Rebase.

Add init.

Await refactor of callback.

Update a bit on the test case.

Move impos.

Minor fix.

More fixes.

Remove unused import.

Fix per store feature test.

Update model save / load.

* Fix model save / load with tar.

* Fix issues.

* Remove dup.

Co-authored-by: Junru Shao <junrushao1994@gmail.com>

User-Interface: Tune-TIR (apache#525)

* User-Interface: Tune-TIR

* fix fix fix

User-Interface: Tune-TE (apache#527)

* fix a lot of issues

* Add tune-te

Get CUDA tuning working (apache#529)

[Meta Schedule] Evolutionary Search (apache#522)

* Checkpoint.

Fix cost model comment.

Finish evolutionary seaarch.

Remove  extra code.

Fix compile.

Add comments.

Add python part.

Ad test.

Update other files & comments.

* Squashed commit

[Meta Schedule][M3c] Schedule Rules, Mutator & Postprocs (apache#485)

[Meta Schedule][M3c] PostOrderApply (apache#486)

Fix Post Order Apply (apache#490)

[MetaSchedule] Relay Integration (apache#489)

[M3c][Meta Schedule] Add Trace Correctness Test for PostOrderApply (apache#492)

Fix replay trace. (apache#493)

[M3c][Meta Schedule] Implement the Replay Func class. (apache#495)

[PR] Test script for meta-schedule task extraction. Interface to load… (apache#494)

[Meta Schedule Refactor] Get child blocks (apache#500)

Read-at && Write-at (apache#497)

[M3c][Meta Schedule] Measure Callbacks (apache#498)

[Bug] Fix Infinite Loop Caused When Calling Methods Not Overrided In PyClass (apache#496)

[MetaSchedule] Sample-Perfect-Tile (apache#501)

[MetaSchedule] TE Workloads (apache#502)

Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn>
Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>
Co-authored-by: Hongyi Jin <3231950289@qq.com>
Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com>
Co-authored-by: Junru Shao <junrushao1994@gmail.com>
Co-authored-by: Wuwei Lin <wuwei@apache.org>
Co-authored-by: Sunghyun Park <49998730+sunggg@users.noreply.github.com>

* [TensorIR] GetProducer, GetConsumer (apache#506)

* [MetaScheduleRefactor] Annotate&Unannotate (apache#505)

* annotate

* annotate

* lint

* test

* fix

* fix

* fix

* [MetaSchedule] Rewrite Cooperative-Fetching / Unbound-Block / Reduction-Block (apache#509)

* Blockize & Tensorize (apache#514)

* Blockize & Tensorize

* Update tensor intrin

* Fix blockized & Recalculate affine flags

* Cleanup utils.cc

* Add test cases of blockize

* Re-enable affine flag checking

* Checkpoint.

Fix cost model comment.

Finish evolutionary seaarch.

Remove  extra code.

Fix compile.

Add comments.

Add python part.

Ad test.

Update other files & comments.

Fix random seed bug.

Minor fix.

Fix num-cores.

Add docs.

Check point.

Add max_fail_cnt.

Minor fix.

Minor fix.

Segfault.

Fix pointers to trace.

Test fix.

Remove measure callbacks.

Refactor a bit.

Split function.

Adjust variable name.

Minor fixes.

Add mutator probs to TuneContext.

Add token.

Fix loops.

Remove include.

Add has workload for database.

Add check.

Add concurrent bitmask.

* Fix TuneContext.

* Fix haash & stuff.

* Modifyy shash.

* Remove trace field.

* Minor fix.

* Fix cbmask.

* Fix numbers.

Co-authored-by: Junru Shao <junrushao1994@gmail.com>
Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn>
Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>
Co-authored-by: Hongyi Jin <3231950289@qq.com>
Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com>
Co-authored-by: Wuwei Lin <wuwei@apache.org>
Co-authored-by: Sunghyun Park <49998730+sunggg@users.noreply.github.com>

[BugFix] Remove duplicated definition of MakeMultinomialSampler (apache#535)

Tune relay.

Further add interface.

Remove unused import

Fix rebase.

Add task name dispatch.

Add task deduplication.

Rename extract_task to extract_task_from_relay

Remove duplicate function def.

Minor fix.
junrushao added a commit to junrushao/tvm that referenced this issue Jan 27, 2022
[Meta Schedule][M3c] Schedule Rules, Mutator & Postprocs (apache#485)

[Meta Schedule][M3c] PostOrderApply (apache#486)

Fix Post Order Apply (apache#490)

[MetaSchedule] Relay Integration (apache#489)

[M3c][Meta Schedule] Add Trace Correctness Test for PostOrderApply (apache#492)

Fix replay trace. (apache#493)

[M3c][Meta Schedule] Implement the Replay Func class. (apache#495)

[PR] Test script for meta-schedule task extraction. Interface to load… (apache#494)

[Meta Schedule Refactor] Get child blocks (apache#500)

Read-at && Write-at (apache#497)

[M3c][Meta Schedule] Measure Callbacks (apache#498)

[Bug] Fix Infinite Loop Caused When Calling Methods Not Overrided In PyClass (apache#496)

[MetaSchedule] Sample-Perfect-Tile (apache#501)

[MetaSchedule] TE Workloads (apache#502)

[TensorIR] GetProducer, GetConsumer (apache#506)

[MetaScheduleRefactor] Annotate&Unannotate (apache#505)

[MetaSchedule] Multi-Level-Tiling & Auto-Inline (apache#503)

[Tests] Add unittests for auto-inline and multi-level-tiling (apache#508)

[Meta Schedule] Minor Fixes (apache#507)

[MetaSchedule] Rewrite Cooperative-Fetching / Unbound-Block / Reduction-Block (apache#509)

[MetaSchedule] Rewrite Parallel-Vectorize-Unroll / Verify-GPU / Disallow-Dynamic-Loops (apache#499)

[Meta Schedule] Add Helper Function & Minor Modification (apache#512)

[MetaSchedule] Test for Rewrite Parallel-Vectorize-Unroll  (apache#513)

[Meta Schedule] Feature Extractor & Cost Model (apache#510)

Blockize & Tensorize (apache#514)

Layout Rewriting: Suggest-Index-Map (apache#520)

[MetaSchedule] Parallel-Vectorize-Unroll & Random-Compute-Location (apache#516)

[Meta Schedule] Per-Store-Feature (apache#521)

Add traced schedule for blockize & tensorize (apache#526)

[Meta Schedule] Add XGBoost Model & Random Model (apache#519)

User-Interface: Tune-TIR (apache#525)

User-Interface: Tune-TE (apache#527)

[Minor] More logging on python (apache#528)

Get CUDA tuning working (apache#529)

[MetaSchedule] TensorRT BYOC (apache#518)

[BugFix] LocalBuilder API (apache#531)

[Meta Schedule] Add Cost Model Update Measure Callback (apache#530)

[Bugfix] BuilderInput with default params (apache#532)

[MetaSchedule] Mutator-Tile-Size, Mutate-Parallel, Mutate-Unroll (apache#534)

[Meta Schedule] Evolutionary Search (apache#522)

[BugFix] Remove duplicated definition of MakeMultinomialSampler (apache#535)

[Meta Schedule] Fix some bugs (apache#537)

Initiate Experiments for CPU Performance Alignment with Ansor (apache#538)

[Meta Schedule] Tweak experiment scripts (apache#539)

[Meta Schedule] Initiate experiments on CUDA (apache#540)

[TIR][Schedule] Buffer transform (apache#523)

Auto Tensor Core (apache#524)

Working on Evo Search (apache#542)

[Meta Schedule] Add Replay Tuning Interface (apache#543)

Evolutionary Search on CPU (apache#544)

Misc improvement over the error message (apache#545)

[TIR][Schedule] Software pipelining (apache#533)

[Meta Schedule Refactor] fixing unit tests (apache#547)

[MetaSchedule] Mutator-Compute-Location (apache#548)

Misc Improvement of Evolutionary Search (apache#549)

Hotfix for software pipeline (apache#552)

Misc Improvement (apache#550)

[Cherry-Pick][TensorIR] Primitive "SetScope" (apache#9738) (apache#555)

Rule RFactor (apache#551)

[MemHammer] Rewrite Rules (apache#554)

[MetaSchedule] Schedule Rule: Cross-Thread Reduction (apache#556)

[MetaSchedule] Performance Alignment - NRM and SFM (CUDA) (apache#559)

[MetaSchedule] Perf Alignment - NRM on CUDA (apache#560)

[TIR] Reorder the block iters of the blocks generated by RFactor (apache#561)

Removing 2 unit tests for software pipelining (apache#562)

[MemHammer] Lower Pass + Unittests (apache#557)

Perf Align: Remove Auto-inline before Multi-level-tiling (apache#564)

Fix Sketch Generation Unittests (apache#565)

speed up VerifyGpuCode (apache#568)

[Performance Align] fixing codegen problems (apache#569)

[Meta schedule] improve search space (#1)

Hot fix for bound predicate (#3)

[Meta Schedule] Update Tune Relay (#4)

[Performance Align] fixing codegen problems (#5)

[PerfAlign] NRM & SFM on Raspi Aligned (#6)

[BugFix] Apply bound predicate directly to loops when possible (#12)

[BugFix] Fix CrossThreadReduction on CUDA (#13)

[MetaSchedule] Enable BertTuning with MetaScheduler (#11)

[Minor][MemHammer] Minor tweaks in code review (#14)

[Meta Schedule] Add customizable search space to PostOrderApply. (#16)

Fix cooperative fetching (#17)

Fixes for codegen (#18)

[Hotfix] A unittest (#19)

Fix for GRP sketch gen (#21)

Add threadIdx filtering in Multi-Level-Tiling and Verify-GPU-Code (#20)

[BugFix][TIR] Fix cross-thread reduction when single reduction loop with predicate (apache#10016) (#22)

[MemHammer][Refactor] Code Review (#15)

[Meta Schedule] Add Winograd Test for Customizable Search Space (#24)

Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn>
Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>
Co-authored-by: Hongyi Jin <3231950289@qq.com>
Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com>
Co-authored-by: Junru Shao <junrushao1994@gmail.com>
Co-authored-by: Wuwei Lin <wuwei@apache.org>
Co-authored-by: Sunghyun Park <49998730+sunggg@users.noreply.github.com>
Co-authored-by: Xiyou Zhou <xiyou@octoml.ai>
Hzfengsy added a commit to Hzfengsy/tvm that referenced this issue Feb 2, 2022
[Meta Schedule][M3c] Schedule Rules, Mutator & Postprocs (apache#485)

[Meta Schedule][M3c] PostOrderApply (apache#486)

Fix Post Order Apply (apache#490)

[MetaSchedule] Relay Integration (apache#489)

[M3c][Meta Schedule] Add Trace Correctness Test for PostOrderApply (apache#492)

Fix replay trace. (apache#493)

[M3c][Meta Schedule] Implement the Replay Func class. (apache#495)

[PR] Test script for meta-schedule task extraction. Interface to load… (apache#494)

[Meta Schedule Refactor] Get child blocks (apache#500)

Read-at && Write-at (apache#497)

[M3c][Meta Schedule] Measure Callbacks (apache#498)

[Bug] Fix Infinite Loop Caused When Calling Methods Not Overrided In PyClass (apache#496)

[MetaSchedule] Sample-Perfect-Tile (apache#501)

[MetaSchedule] TE Workloads (apache#502)

[TensorIR] GetProducer, GetConsumer (apache#506)

[MetaScheduleRefactor] Annotate&Unannotate (apache#505)

[MetaSchedule] Multi-Level-Tiling & Auto-Inline (apache#503)

[Tests] Add unittests for auto-inline and multi-level-tiling (apache#508)

[Meta Schedule] Minor Fixes (apache#507)

[MetaSchedule] Rewrite Cooperative-Fetching / Unbound-Block / Reduction-Block (apache#509)

[MetaSchedule] Rewrite Parallel-Vectorize-Unroll / Verify-GPU / Disallow-Dynamic-Loops (apache#499)

[Meta Schedule] Add Helper Function & Minor Modification (apache#512)

[MetaSchedule] Test for Rewrite Parallel-Vectorize-Unroll  (apache#513)

[Meta Schedule] Feature Extractor & Cost Model (apache#510)

Blockize & Tensorize (apache#514)

Layout Rewriting: Suggest-Index-Map (apache#520)

[MetaSchedule] Parallel-Vectorize-Unroll & Random-Compute-Location (apache#516)

[Meta Schedule] Per-Store-Feature (apache#521)

Add traced schedule for blockize & tensorize (apache#526)

[Meta Schedule] Add XGBoost Model & Random Model (apache#519)

User-Interface: Tune-TIR (apache#525)

User-Interface: Tune-TE (apache#527)

[Minor] More logging on python (apache#528)

Get CUDA tuning working (apache#529)

[MetaSchedule] TensorRT BYOC (apache#518)

[BugFix] LocalBuilder API (apache#531)

[Meta Schedule] Add Cost Model Update Measure Callback (apache#530)

[Bugfix] BuilderInput with default params (apache#532)

[MetaSchedule] Mutator-Tile-Size, Mutate-Parallel, Mutate-Unroll (apache#534)

[Meta Schedule] Evolutionary Search (apache#522)

[BugFix] Remove duplicated definition of MakeMultinomialSampler (apache#535)

[Meta Schedule] Fix some bugs (apache#537)

Initiate Experiments for CPU Performance Alignment with Ansor (apache#538)

[Meta Schedule] Tweak experiment scripts (apache#539)

[Meta Schedule] Initiate experiments on CUDA (apache#540)

[TIR][Schedule] Buffer transform (apache#523)

Auto Tensor Core (apache#524)

Working on Evo Search (apache#542)

[Meta Schedule] Add Replay Tuning Interface (apache#543)

Evolutionary Search on CPU (apache#544)

Misc improvement over the error message (apache#545)

[TIR][Schedule] Software pipelining (apache#533)

[Meta Schedule Refactor] fixing unit tests (apache#547)

[MetaSchedule] Mutator-Compute-Location (apache#548)

Misc Improvement of Evolutionary Search (apache#549)

Hotfix for software pipeline (apache#552)

Misc Improvement (apache#550)

[Cherry-Pick][TensorIR] Primitive "SetScope" (apache#9738) (apache#555)

Rule RFactor (apache#551)

[MemHammer] Rewrite Rules (apache#554)

[MetaSchedule] Schedule Rule: Cross-Thread Reduction (apache#556)

[MetaSchedule] Performance Alignment - NRM and SFM (CUDA) (apache#559)

[MetaSchedule] Perf Alignment - NRM on CUDA (apache#560)

[TIR] Reorder the block iters of the blocks generated by RFactor (apache#561)

Removing 2 unit tests for software pipelining (apache#562)

[MemHammer] Lower Pass + Unittests (apache#557)

Perf Align: Remove Auto-inline before Multi-level-tiling (apache#564)

Fix Sketch Generation Unittests (apache#565)

speed up VerifyGpuCode (apache#568)

[Performance Align] fixing codegen problems (apache#569)

[Meta schedule] improve search space (#1)

Hot fix for bound predicate (apache#3)

[Meta Schedule] Update Tune Relay (apache#4)

[Performance Align] fixing codegen problems (apache#5)

[PerfAlign] NRM & SFM on Raspi Aligned (apache#6)

[BugFix] Apply bound predicate directly to loops when possible (apache#12)

[BugFix] Fix CrossThreadReduction on CUDA (apache#13)

[MetaSchedule] Enable BertTuning with MetaScheduler (apache#11)

[Minor][MemHammer] Minor tweaks in code review (apache#14)

[Meta Schedule] Add customizable search space to PostOrderApply. (apache#16)

Fix cooperative fetching (apache#17)

Fixes for codegen (apache#18)

[Hotfix] A unittest (apache#19)

Fix for GRP sketch gen (apache#21)

Add threadIdx filtering in Multi-Level-Tiling and Verify-GPU-Code (apache#20)

[BugFix][TIR] Fix cross-thread reduction when single reduction loop with predicate (apache#10016) (apache#22)

[MemHammer][Refactor] Code Review (apache#15)

[Meta Schedule] Add Winograd Test for Customizable Search Space (apache#24)

Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn>
Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>
Co-authored-by: Hongyi Jin <3231950289@qq.com>
Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com>
Co-authored-by: Junru Shao <junrushao1994@gmail.com>
Co-authored-by: Wuwei Lin <wuwei@apache.org>
Co-authored-by: Sunghyun Park <49998730+sunggg@users.noreply.github.com>
Co-authored-by: Xiyou Zhou <xiyou@octoml.ai>
Hzfengsy pushed a commit to Hzfengsy/tvm that referenced this issue Feb 12, 2022
[Meta Schedule][M3c] Schedule Rules, Mutator & Postprocs (apache#485)

[Meta Schedule][M3c] PostOrderApply (apache#486)

Fix Post Order Apply (apache#490)

[MetaSchedule] Relay Integration (apache#489)

[M3c][Meta Schedule] Add Trace Correctness Test for PostOrderApply (apache#492)

Fix replay trace. (apache#493)

[M3c][Meta Schedule] Implement the Replay Func class. (apache#495)

[PR] Test script for meta-schedule task extraction. Interface to load… (apache#494)

[Meta Schedule Refactor] Get child blocks (apache#500)

Read-at && Write-at (apache#497)

[M3c][Meta Schedule] Measure Callbacks (apache#498)

[Bug] Fix Infinite Loop Caused When Calling Methods Not Overrided In PyClass (apache#496)

[MetaSchedule] Sample-Perfect-Tile (apache#501)

[MetaSchedule] TE Workloads (apache#502)

[TensorIR] GetProducer, GetConsumer (apache#506)

[MetaScheduleRefactor] Annotate&Unannotate (apache#505)

[MetaSchedule] Multi-Level-Tiling & Auto-Inline (apache#503)

[Tests] Add unittests for auto-inline and multi-level-tiling (apache#508)

[Meta Schedule] Minor Fixes (apache#507)

[MetaSchedule] Rewrite Cooperative-Fetching / Unbound-Block / Reduction-Block (apache#509)

[MetaSchedule] Rewrite Parallel-Vectorize-Unroll / Verify-GPU / Disallow-Dynamic-Loops (apache#499)

[Meta Schedule] Add Helper Function & Minor Modification (apache#512)

[MetaSchedule] Test for Rewrite Parallel-Vectorize-Unroll  (apache#513)

[Meta Schedule] Feature Extractor & Cost Model (apache#510)

Blockize & Tensorize (apache#514)

Layout Rewriting: Suggest-Index-Map (apache#520)

[MetaSchedule] Parallel-Vectorize-Unroll & Random-Compute-Location (apache#516)

[Meta Schedule] Per-Store-Feature (apache#521)

Add traced schedule for blockize & tensorize (apache#526)

[Meta Schedule] Add XGBoost Model & Random Model (apache#519)

User-Interface: Tune-TIR (apache#525)

User-Interface: Tune-TE (apache#527)

[Minor] More logging on python (apache#528)

Get CUDA tuning working (apache#529)

[MetaSchedule] TensorRT BYOC (apache#518)

[BugFix] LocalBuilder API (apache#531)

[Meta Schedule] Add Cost Model Update Measure Callback (apache#530)

[Bugfix] BuilderInput with default params (apache#532)

[MetaSchedule] Mutator-Tile-Size, Mutate-Parallel, Mutate-Unroll (apache#534)

[Meta Schedule] Evolutionary Search (apache#522)

[BugFix] Remove duplicated definition of MakeMultinomialSampler (apache#535)

[Meta Schedule] Fix some bugs (apache#537)

Initiate Experiments for CPU Performance Alignment with Ansor (apache#538)

[Meta Schedule] Tweak experiment scripts (apache#539)

[Meta Schedule] Initiate experiments on CUDA (apache#540)

[TIR][Schedule] Buffer transform (apache#523)

Auto Tensor Core (apache#524)

Working on Evo Search (apache#542)

[Meta Schedule] Add Replay Tuning Interface (apache#543)

Evolutionary Search on CPU (apache#544)

Misc improvement over the error message (apache#545)

[TIR][Schedule] Software pipelining (apache#533)

[Meta Schedule Refactor] fixing unit tests (apache#547)

[MetaSchedule] Mutator-Compute-Location (apache#548)

Misc Improvement of Evolutionary Search (apache#549)

Hotfix for software pipeline (apache#552)

Misc Improvement (apache#550)

[Cherry-Pick][TensorIR] Primitive "SetScope" (apache#9738) (apache#555)

Rule RFactor (apache#551)

[MemHammer] Rewrite Rules (apache#554)

[MetaSchedule] Schedule Rule: Cross-Thread Reduction (apache#556)

[MetaSchedule] Performance Alignment - NRM and SFM (CUDA) (apache#559)

[MetaSchedule] Perf Alignment - NRM on CUDA (apache#560)

[TIR] Reorder the block iters of the blocks generated by RFactor (apache#561)

Removing 2 unit tests for software pipelining (apache#562)

[MemHammer] Lower Pass + Unittests (apache#557)

Perf Align: Remove Auto-inline before Multi-level-tiling (apache#564)

Fix Sketch Generation Unittests (apache#565)

speed up VerifyGpuCode (apache#568)

[Performance Align] fixing codegen problems (apache#569)

[Meta schedule] improve search space (#1)

Hot fix for bound predicate (apache#3)

[Meta Schedule] Update Tune Relay (apache#4)

[Performance Align] fixing codegen problems (apache#5)

[PerfAlign] NRM & SFM on Raspi Aligned (apache#6)

[BugFix] Apply bound predicate directly to loops when possible (apache#12)

[BugFix] Fix CrossThreadReduction on CUDA (apache#13)

[MetaSchedule] Enable BertTuning with MetaScheduler (apache#11)

[Minor][MemHammer] Minor tweaks in code review (apache#14)

[Meta Schedule] Add customizable search space to PostOrderApply. (apache#16)

Fix cooperative fetching (apache#17)

Fixes for codegen (apache#18)

[Hotfix] A unittest (apache#19)

Fix for GRP sketch gen (apache#21)

Add threadIdx filtering in Multi-Level-Tiling and Verify-GPU-Code (apache#20)

[BugFix][TIR] Fix cross-thread reduction when single reduction loop with predicate (apache#10016) (apache#22)

[MemHammer][Refactor] Code Review (apache#15)

[Meta Schedule] Add Winograd Test for Customizable Search Space (apache#24)

Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn>
Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>
Co-authored-by: Hongyi Jin <3231950289@qq.com>
Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com>
Co-authored-by: Junru Shao <junrushao1994@gmail.com>
Co-authored-by: Wuwei Lin <wuwei@apache.org>
Co-authored-by: Sunghyun Park <49998730+sunggg@users.noreply.github.com>
Co-authored-by: Xiyou Zhou <xiyou@octoml.ai>

fix

some fixes

fix test
Hzfengsy pushed a commit to Hzfengsy/tvm that referenced this issue Feb 19, 2022
[Meta Schedule][M3c] Schedule Rules, Mutator & Postprocs (apache#485)

[Meta Schedule][M3c] PostOrderApply (apache#486)

Fix Post Order Apply (apache#490)

[MetaSchedule] Relay Integration (apache#489)

[M3c][Meta Schedule] Add Trace Correctness Test for PostOrderApply (apache#492)

Fix replay trace. (apache#493)

[M3c][Meta Schedule] Implement the Replay Func class. (apache#495)

[PR] Test script for meta-schedule task extraction. Interface to load… (apache#494)

[Meta Schedule Refactor] Get child blocks (apache#500)

Read-at && Write-at (apache#497)

[M3c][Meta Schedule] Measure Callbacks (apache#498)

[Bug] Fix Infinite Loop Caused When Calling Methods Not Overrided In PyClass (apache#496)

[MetaSchedule] Sample-Perfect-Tile (apache#501)

[MetaSchedule] TE Workloads (apache#502)

[TensorIR] GetProducer, GetConsumer (apache#506)

[MetaScheduleRefactor] Annotate&Unannotate (apache#505)

[MetaSchedule] Multi-Level-Tiling & Auto-Inline (apache#503)

[Tests] Add unittests for auto-inline and multi-level-tiling (apache#508)

[Meta Schedule] Minor Fixes (apache#507)

[MetaSchedule] Rewrite Cooperative-Fetching / Unbound-Block / Reduction-Block (apache#509)

[MetaSchedule] Rewrite Parallel-Vectorize-Unroll / Verify-GPU / Disallow-Dynamic-Loops (apache#499)

[Meta Schedule] Add Helper Function & Minor Modification (apache#512)

[MetaSchedule] Test for Rewrite Parallel-Vectorize-Unroll  (apache#513)

[Meta Schedule] Feature Extractor & Cost Model (apache#510)

Blockize & Tensorize (apache#514)

Layout Rewriting: Suggest-Index-Map (apache#520)

[MetaSchedule] Parallel-Vectorize-Unroll & Random-Compute-Location (apache#516)

[Meta Schedule] Per-Store-Feature (apache#521)

Add traced schedule for blockize & tensorize (apache#526)

[Meta Schedule] Add XGBoost Model & Random Model (apache#519)

User-Interface: Tune-TIR (apache#525)

User-Interface: Tune-TE (apache#527)

[Minor] More logging on python (apache#528)

Get CUDA tuning working (apache#529)

[MetaSchedule] TensorRT BYOC (apache#518)

[BugFix] LocalBuilder API (apache#531)

[Meta Schedule] Add Cost Model Update Measure Callback (apache#530)

[Bugfix] BuilderInput with default params (apache#532)

[MetaSchedule] Mutator-Tile-Size, Mutate-Parallel, Mutate-Unroll (apache#534)

[Meta Schedule] Evolutionary Search (apache#522)

[BugFix] Remove duplicated definition of MakeMultinomialSampler (apache#535)

[Meta Schedule] Fix some bugs (apache#537)

Initiate Experiments for CPU Performance Alignment with Ansor (apache#538)

[Meta Schedule] Tweak experiment scripts (apache#539)

[Meta Schedule] Initiate experiments on CUDA (apache#540)

[TIR][Schedule] Buffer transform (apache#523)

Auto Tensor Core (apache#524)

Working on Evo Search (apache#542)

[Meta Schedule] Add Replay Tuning Interface (apache#543)

Evolutionary Search on CPU (apache#544)

Misc improvement over the error message (apache#545)

[TIR][Schedule] Software pipelining (apache#533)

[Meta Schedule Refactor] fixing unit tests (apache#547)

[MetaSchedule] Mutator-Compute-Location (apache#548)

Misc Improvement of Evolutionary Search (apache#549)

Hotfix for software pipeline (apache#552)

Misc Improvement (apache#550)

[Cherry-Pick][TensorIR] Primitive "SetScope" (apache#9738) (apache#555)

Rule RFactor (apache#551)

[MemHammer] Rewrite Rules (apache#554)

[MetaSchedule] Schedule Rule: Cross-Thread Reduction (apache#556)

[MetaSchedule] Performance Alignment - NRM and SFM (CUDA) (apache#559)

[MetaSchedule] Perf Alignment - NRM on CUDA (apache#560)

[TIR] Reorder the block iters of the blocks generated by RFactor (apache#561)

Removing 2 unit tests for software pipelining (apache#562)

[MemHammer] Lower Pass + Unittests (apache#557)

Perf Align: Remove Auto-inline before Multi-level-tiling (apache#564)

Fix Sketch Generation Unittests (apache#565)

speed up VerifyGpuCode (apache#568)

[Performance Align] fixing codegen problems (apache#569)

[Meta schedule] improve search space (#1)

Hot fix for bound predicate (apache#3)

[Meta Schedule] Update Tune Relay (apache#4)

[Performance Align] fixing codegen problems (apache#5)

[PerfAlign] NRM & SFM on Raspi Aligned (apache#6)

[BugFix] Apply bound predicate directly to loops when possible (apache#12)

[BugFix] Fix CrossThreadReduction on CUDA (apache#13)

[MetaSchedule] Enable BertTuning with MetaScheduler (apache#11)

[Minor][MemHammer] Minor tweaks in code review (apache#14)

[Meta Schedule] Add customizable search space to PostOrderApply. (apache#16)

Fix cooperative fetching (apache#17)

Fixes for codegen (apache#18)

[Hotfix] A unittest (apache#19)

Fix for GRP sketch gen (apache#21)

Add threadIdx filtering in Multi-Level-Tiling and Verify-GPU-Code (apache#20)

[BugFix][TIR] Fix cross-thread reduction when single reduction loop with predicate (apache#10016) (apache#22)

[MemHammer][Refactor] Code Review (apache#15)

[Meta Schedule] Add Winograd Test for Customizable Search Space (apache#24)

Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn>
Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>
Co-authored-by: Hongyi Jin <3231950289@qq.com>
Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com>
Co-authored-by: Junru Shao <junrushao1994@gmail.com>
Co-authored-by: Wuwei Lin <wuwei@apache.org>
Co-authored-by: Sunghyun Park <49998730+sunggg@users.noreply.github.com>
Co-authored-by: Xiyou Zhou <xiyou@octoml.ai>

fix

some fixes

fix test
Hzfengsy pushed a commit to Hzfengsy/tvm that referenced this issue Feb 19, 2022
[Meta Schedule][M3c] Schedule Rules, Mutator & Postprocs (apache#485)

[Meta Schedule][M3c] PostOrderApply (apache#486)

Fix Post Order Apply (apache#490)

[MetaSchedule] Relay Integration (apache#489)

[M3c][Meta Schedule] Add Trace Correctness Test for PostOrderApply (apache#492)

Fix replay trace. (apache#493)

[M3c][Meta Schedule] Implement the Replay Func class. (apache#495)

[PR] Test script for meta-schedule task extraction. Interface to load… (apache#494)

[Meta Schedule Refactor] Get child blocks (apache#500)

Read-at && Write-at (apache#497)

[M3c][Meta Schedule] Measure Callbacks (apache#498)

[Bug] Fix Infinite Loop Caused When Calling Methods Not Overrided In PyClass (apache#496)

[MetaSchedule] Sample-Perfect-Tile (apache#501)

[MetaSchedule] TE Workloads (apache#502)

[TensorIR] GetProducer, GetConsumer (apache#506)

[MetaScheduleRefactor] Annotate&Unannotate (apache#505)

[MetaSchedule] Multi-Level-Tiling & Auto-Inline (apache#503)

[Tests] Add unittests for auto-inline and multi-level-tiling (apache#508)

[Meta Schedule] Minor Fixes (apache#507)

[MetaSchedule] Rewrite Cooperative-Fetching / Unbound-Block / Reduction-Block (apache#509)

[MetaSchedule] Rewrite Parallel-Vectorize-Unroll / Verify-GPU / Disallow-Dynamic-Loops (apache#499)

[Meta Schedule] Add Helper Function & Minor Modification (apache#512)

[MetaSchedule] Test for Rewrite Parallel-Vectorize-Unroll  (apache#513)

[Meta Schedule] Feature Extractor & Cost Model (apache#510)

Blockize & Tensorize (apache#514)

Layout Rewriting: Suggest-Index-Map (apache#520)

[MetaSchedule] Parallel-Vectorize-Unroll & Random-Compute-Location (apache#516)

[Meta Schedule] Per-Store-Feature (apache#521)

Add traced schedule for blockize & tensorize (apache#526)

[Meta Schedule] Add XGBoost Model & Random Model (apache#519)

User-Interface: Tune-TIR (apache#525)

User-Interface: Tune-TE (apache#527)

[Minor] More logging on python (apache#528)

Get CUDA tuning working (apache#529)

[MetaSchedule] TensorRT BYOC (apache#518)

[BugFix] LocalBuilder API (apache#531)

[Meta Schedule] Add Cost Model Update Measure Callback (apache#530)

[Bugfix] BuilderInput with default params (apache#532)

[MetaSchedule] Mutator-Tile-Size, Mutate-Parallel, Mutate-Unroll (apache#534)

[Meta Schedule] Evolutionary Search (apache#522)

[BugFix] Remove duplicated definition of MakeMultinomialSampler (apache#535)

[Meta Schedule] Fix some bugs (apache#537)

Initiate Experiments for CPU Performance Alignment with Ansor (apache#538)

[Meta Schedule] Tweak experiment scripts (apache#539)

[Meta Schedule] Initiate experiments on CUDA (apache#540)

[TIR][Schedule] Buffer transform (apache#523)

Auto Tensor Core (apache#524)

Working on Evo Search (apache#542)

[Meta Schedule] Add Replay Tuning Interface (apache#543)

Evolutionary Search on CPU (apache#544)

Misc improvement over the error message (apache#545)

[TIR][Schedule] Software pipelining (apache#533)

[Meta Schedule Refactor] fixing unit tests (apache#547)

[MetaSchedule] Mutator-Compute-Location (apache#548)

Misc Improvement of Evolutionary Search (apache#549)

Hotfix for software pipeline (apache#552)

Misc Improvement (apache#550)

[Cherry-Pick][TensorIR] Primitive "SetScope" (apache#9738) (apache#555)

Rule RFactor (apache#551)

[MemHammer] Rewrite Rules (apache#554)

[MetaSchedule] Schedule Rule: Cross-Thread Reduction (apache#556)

[MetaSchedule] Performance Alignment - NRM and SFM (CUDA) (apache#559)

[MetaSchedule] Perf Alignment - NRM on CUDA (apache#560)

[TIR] Reorder the block iters of the blocks generated by RFactor (apache#561)

Removing 2 unit tests for software pipelining (apache#562)

[MemHammer] Lower Pass + Unittests (apache#557)

Perf Align: Remove Auto-inline before Multi-level-tiling (apache#564)

Fix Sketch Generation Unittests (apache#565)

speed up VerifyGpuCode (apache#568)

[Performance Align] fixing codegen problems (apache#569)

[Meta schedule] improve search space (#1)

Hot fix for bound predicate (apache#3)

[Meta Schedule] Update Tune Relay (apache#4)

[Performance Align] fixing codegen problems (apache#5)

[PerfAlign] NRM & SFM on Raspi Aligned (apache#6)

[BugFix] Apply bound predicate directly to loops when possible (apache#12)

[BugFix] Fix CrossThreadReduction on CUDA (apache#13)

[MetaSchedule] Enable BertTuning with MetaScheduler (apache#11)

[Minor][MemHammer] Minor tweaks in code review (apache#14)

[Meta Schedule] Add customizable search space to PostOrderApply. (apache#16)

Fix cooperative fetching (apache#17)

Fixes for codegen (apache#18)

[Hotfix] A unittest (apache#19)

Fix for GRP sketch gen (apache#21)

Add threadIdx filtering in Multi-Level-Tiling and Verify-GPU-Code (apache#20)

[BugFix][TIR] Fix cross-thread reduction when single reduction loop with predicate (apache#10016) (apache#22)

[MemHammer][Refactor] Code Review (apache#15)

[Meta Schedule] Add Winograd Test for Customizable Search Space (apache#24)

Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn>
Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>
Co-authored-by: Hongyi Jin <3231950289@qq.com>
Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com>
Co-authored-by: Junru Shao <junrushao1994@gmail.com>
Co-authored-by: Wuwei Lin <wuwei@apache.org>
Co-authored-by: Sunghyun Park <49998730+sunggg@users.noreply.github.com>
Co-authored-by: Xiyou Zhou <xiyou@octoml.ai>

fix

some fixes

fix test
junrushao added a commit to junrushao/tvm that referenced this issue Feb 20, 2022
[Meta Schedule][M3c] Schedule Rules, Mutator & Postprocs (apache#485)

[Meta Schedule][M3c] PostOrderApply (apache#486)

Fix Post Order Apply (apache#490)

[MetaSchedule] Relay Integration (apache#489)

[M3c][Meta Schedule] Add Trace Correctness Test for PostOrderApply (apache#492)

Fix replay trace. (apache#493)

[M3c][Meta Schedule] Implement the Replay Func class. (apache#495)

[PR] Test script for meta-schedule task extraction. Interface to load… (apache#494)

[Meta Schedule Refactor] Get child blocks (apache#500)

Read-at && Write-at (apache#497)

[M3c][Meta Schedule] Measure Callbacks (apache#498)

[Bug] Fix Infinite Loop Caused When Calling Methods Not Overrided In PyClass (apache#496)

[MetaSchedule] Sample-Perfect-Tile (apache#501)

[MetaSchedule] TE Workloads (apache#502)

[TensorIR] GetProducer, GetConsumer (apache#506)

[MetaScheduleRefactor] Annotate&Unannotate (apache#505)

[MetaSchedule] Multi-Level-Tiling & Auto-Inline (apache#503)

[Tests] Add unittests for auto-inline and multi-level-tiling (apache#508)

[Meta Schedule] Minor Fixes (apache#507)

[MetaSchedule] Rewrite Cooperative-Fetching / Unbound-Block / Reduction-Block (apache#509)

[MetaSchedule] Rewrite Parallel-Vectorize-Unroll / Verify-GPU / Disallow-Dynamic-Loops (apache#499)

[Meta Schedule] Add Helper Function & Minor Modification (apache#512)

[MetaSchedule] Test for Rewrite Parallel-Vectorize-Unroll  (apache#513)

[Meta Schedule] Feature Extractor & Cost Model (apache#510)

Blockize & Tensorize (apache#514)

Layout Rewriting: Suggest-Index-Map (apache#520)

[MetaSchedule] Parallel-Vectorize-Unroll & Random-Compute-Location (apache#516)

[Meta Schedule] Per-Store-Feature (apache#521)

Add traced schedule for blockize & tensorize (apache#526)

[Meta Schedule] Add XGBoost Model & Random Model (apache#519)

User-Interface: Tune-TIR (apache#525)

User-Interface: Tune-TE (apache#527)

[Minor] More logging on python (apache#528)

Get CUDA tuning working (apache#529)

[MetaSchedule] TensorRT BYOC (apache#518)

[BugFix] LocalBuilder API (apache#531)

[Meta Schedule] Add Cost Model Update Measure Callback (apache#530)

[Bugfix] BuilderInput with default params (apache#532)

[MetaSchedule] Mutator-Tile-Size, Mutate-Parallel, Mutate-Unroll (apache#534)

[Meta Schedule] Evolutionary Search (apache#522)

[BugFix] Remove duplicated definition of MakeMultinomialSampler (apache#535)

[Meta Schedule] Fix some bugs (apache#537)

Initiate Experiments for CPU Performance Alignment with Ansor (apache#538)

[Meta Schedule] Tweak experiment scripts (apache#539)

[Meta Schedule] Initiate experiments on CUDA (apache#540)

[TIR][Schedule] Buffer transform (apache#523)

Auto Tensor Core (apache#524)

Working on Evo Search (apache#542)

[Meta Schedule] Add Replay Tuning Interface (apache#543)

Evolutionary Search on CPU (apache#544)

Misc improvement over the error message (apache#545)

[TIR][Schedule] Software pipelining (apache#533)

[Meta Schedule Refactor] fixing unit tests (apache#547)

[MetaSchedule] Mutator-Compute-Location (apache#548)

Misc Improvement of Evolutionary Search (apache#549)

Hotfix for software pipeline (apache#552)

Misc Improvement (apache#550)

[Cherry-Pick][TensorIR] Primitive "SetScope" (apache#9738) (apache#555)

Rule RFactor (apache#551)

[MemHammer] Rewrite Rules (apache#554)

[MetaSchedule] Schedule Rule: Cross-Thread Reduction (apache#556)

[MetaSchedule] Performance Alignment - NRM and SFM (CUDA) (apache#559)

[MetaSchedule] Perf Alignment - NRM on CUDA (apache#560)

[TIR] Reorder the block iters of the blocks generated by RFactor (apache#561)

Removing 2 unit tests for software pipelining (apache#562)

[MemHammer] Lower Pass + Unittests (apache#557)

Perf Align: Remove Auto-inline before Multi-level-tiling (apache#564)

Fix Sketch Generation Unittests (apache#565)

speed up VerifyGpuCode (apache#568)

[Performance Align] fixing codegen problems (apache#569)

[Meta schedule] improve search space (#1)

Hot fix for bound predicate (#3)

[Meta Schedule] Update Tune Relay (#4)

[Performance Align] fixing codegen problems (#5)

[PerfAlign] NRM & SFM on Raspi Aligned (#6)

[BugFix] Apply bound predicate directly to loops when possible (#12)

[BugFix] Fix CrossThreadReduction on CUDA (#13)

[MetaSchedule] Enable BertTuning with MetaScheduler (#11)

[Minor][MemHammer] Minor tweaks in code review (#14)

[Meta Schedule] Add customizable search space to PostOrderApply. (#16)

Fix cooperative fetching (#17)

Fixes for codegen (#18)

[Hotfix] A unittest (#19)

Fix for GRP sketch gen (#21)

Add threadIdx filtering in Multi-Level-Tiling and Verify-GPU-Code (#20)

[BugFix][TIR] Fix cross-thread reduction when single reduction loop with predicate (apache#10016) (#22)

[MemHammer][Refactor] Code Review (#15)

[Meta Schedule] Add Winograd Test for Customizable Search Space (#24)

Import & Cache Mechanism (#26)

[BugFix] Fix Winograd Test Script (#25)

Add task extraction & caching (#27)

A few fixes for task extraction (#28)

Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn>
Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>
Co-authored-by: Hongyi Jin <3231950289@qq.com>
Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com>
Co-authored-by: Junru Shao <junrushao1994@gmail.com>
Co-authored-by: Wuwei Lin <wuwei@apache.org>
Co-authored-by: Sunghyun Park <49998730+sunggg@users.noreply.github.com>
Co-authored-by: Xiyou Zhou <xiyou@octoml.ai>
junrushao added a commit to junrushao/tvm that referenced this issue Feb 20, 2022
[Meta Schedule][M3c] Schedule Rules, Mutator & Postprocs (apache#485)

[Meta Schedule][M3c] PostOrderApply (apache#486)

Fix Post Order Apply (apache#490)

[MetaSchedule] Relay Integration (apache#489)

[M3c][Meta Schedule] Add Trace Correctness Test for PostOrderApply (apache#492)

Fix replay trace. (apache#493)

[M3c][Meta Schedule] Implement the Replay Func class. (apache#495)

[PR] Test script for meta-schedule task extraction. Interface to load… (apache#494)

[Meta Schedule Refactor] Get child blocks (apache#500)

Read-at && Write-at (apache#497)

[M3c][Meta Schedule] Measure Callbacks (apache#498)

[Bug] Fix Infinite Loop Caused When Calling Methods Not Overrided In PyClass (apache#496)

[MetaSchedule] Sample-Perfect-Tile (apache#501)

[MetaSchedule] TE Workloads (apache#502)

[TensorIR] GetProducer, GetConsumer (apache#506)

[MetaScheduleRefactor] Annotate&Unannotate (apache#505)

[MetaSchedule] Multi-Level-Tiling & Auto-Inline (apache#503)

[Tests] Add unittests for auto-inline and multi-level-tiling (apache#508)

[Meta Schedule] Minor Fixes (apache#507)

[MetaSchedule] Rewrite Cooperative-Fetching / Unbound-Block / Reduction-Block (apache#509)

[MetaSchedule] Rewrite Parallel-Vectorize-Unroll / Verify-GPU / Disallow-Dynamic-Loops (apache#499)

[Meta Schedule] Add Helper Function & Minor Modification (apache#512)

[MetaSchedule] Test for Rewrite Parallel-Vectorize-Unroll  (apache#513)

[Meta Schedule] Feature Extractor & Cost Model (apache#510)

Blockize & Tensorize (apache#514)

Layout Rewriting: Suggest-Index-Map (apache#520)

[MetaSchedule] Parallel-Vectorize-Unroll & Random-Compute-Location (apache#516)

[Meta Schedule] Per-Store-Feature (apache#521)

Add traced schedule for blockize & tensorize (apache#526)

[Meta Schedule] Add XGBoost Model & Random Model (apache#519)

User-Interface: Tune-TIR (apache#525)

User-Interface: Tune-TE (apache#527)

[Minor] More logging on python (apache#528)

Get CUDA tuning working (apache#529)

[MetaSchedule] TensorRT BYOC (apache#518)

[BugFix] LocalBuilder API (apache#531)

[Meta Schedule] Add Cost Model Update Measure Callback (apache#530)

[Bugfix] BuilderInput with default params (apache#532)

[MetaSchedule] Mutator-Tile-Size, Mutate-Parallel, Mutate-Unroll (apache#534)

[Meta Schedule] Evolutionary Search (apache#522)

[BugFix] Remove duplicated definition of MakeMultinomialSampler (apache#535)

[Meta Schedule] Fix some bugs (apache#537)

Initiate Experiments for CPU Performance Alignment with Ansor (apache#538)

[Meta Schedule] Tweak experiment scripts (apache#539)

[Meta Schedule] Initiate experiments on CUDA (apache#540)

[TIR][Schedule] Buffer transform (apache#523)

Auto Tensor Core (apache#524)

Working on Evo Search (apache#542)

[Meta Schedule] Add Replay Tuning Interface (apache#543)

Evolutionary Search on CPU (apache#544)

Misc improvement over the error message (apache#545)

[TIR][Schedule] Software pipelining (apache#533)

[Meta Schedule Refactor] fixing unit tests (apache#547)

[MetaSchedule] Mutator-Compute-Location (apache#548)

Misc Improvement of Evolutionary Search (apache#549)

Hotfix for software pipeline (apache#552)

Misc Improvement (apache#550)

[Cherry-Pick][TensorIR] Primitive "SetScope" (apache#9738) (apache#555)

Rule RFactor (apache#551)

[MemHammer] Rewrite Rules (apache#554)

[MetaSchedule] Schedule Rule: Cross-Thread Reduction (apache#556)

[MetaSchedule] Performance Alignment - NRM and SFM (CUDA) (apache#559)

[MetaSchedule] Perf Alignment - NRM on CUDA (apache#560)

[TIR] Reorder the block iters of the blocks generated by RFactor (apache#561)

Removing 2 unit tests for software pipelining (apache#562)

[MemHammer] Lower Pass + Unittests (apache#557)

Perf Align: Remove Auto-inline before Multi-level-tiling (apache#564)

Fix Sketch Generation Unittests (apache#565)

speed up VerifyGpuCode (apache#568)

[Performance Align] fixing codegen problems (apache#569)

[Meta schedule] improve search space (#1)

Hot fix for bound predicate (#3)

[Meta Schedule] Update Tune Relay (#4)

[Performance Align] fixing codegen problems (#5)

[PerfAlign] NRM & SFM on Raspi Aligned (#6)

[BugFix] Apply bound predicate directly to loops when possible (#12)

[BugFix] Fix CrossThreadReduction on CUDA (#13)

[MetaSchedule] Enable BertTuning with MetaScheduler (#11)

[Minor][MemHammer] Minor tweaks in code review (#14)

[Meta Schedule] Add customizable search space to PostOrderApply. (#16)

Fix cooperative fetching (#17)

Fixes for codegen (#18)

[Hotfix] A unittest (#19)

Fix for GRP sketch gen (#21)

Add threadIdx filtering in Multi-Level-Tiling and Verify-GPU-Code (#20)

[BugFix][TIR] Fix cross-thread reduction when single reduction loop with predicate (apache#10016) (#22)

[MemHammer][Refactor] Code Review (#15)

[Meta Schedule] Add Winograd Test for Customizable Search Space (#24)

Import & Cache Mechanism (#26)

[BugFix] Fix Winograd Test Script (#25)

Add task extraction & caching (#27)

A few fixes for task extraction (#28)

Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn>
Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>
Co-authored-by: Hongyi Jin <3231950289@qq.com>
Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com>
Co-authored-by: Junru Shao <junrushao1994@gmail.com>
Co-authored-by: Wuwei Lin <wuwei@apache.org>
Co-authored-by: Sunghyun Park <49998730+sunggg@users.noreply.github.com>
Co-authored-by: Xiyou Zhou <xiyou@octoml.ai>
junrushao added a commit to junrushao/tvm that referenced this issue Feb 20, 2022
[Meta Schedule][M3c] Schedule Rules, Mutator & Postprocs (apache#485)

[Meta Schedule][M3c] PostOrderApply (apache#486)

Fix Post Order Apply (apache#490)

[MetaSchedule] Relay Integration (apache#489)

[M3c][Meta Schedule] Add Trace Correctness Test for PostOrderApply (apache#492)

Fix replay trace. (apache#493)

[M3c][Meta Schedule] Implement the Replay Func class. (apache#495)

[PR] Test script for meta-schedule task extraction. Interface to load… (apache#494)

[Meta Schedule Refactor] Get child blocks (apache#500)

Read-at && Write-at (apache#497)

[M3c][Meta Schedule] Measure Callbacks (apache#498)

[Bug] Fix Infinite Loop Caused When Calling Methods Not Overrided In PyClass (apache#496)

[MetaSchedule] Sample-Perfect-Tile (apache#501)

[MetaSchedule] TE Workloads (apache#502)

[TensorIR] GetProducer, GetConsumer (apache#506)

[MetaScheduleRefactor] Annotate&Unannotate (apache#505)

[MetaSchedule] Multi-Level-Tiling & Auto-Inline (apache#503)

[Tests] Add unittests for auto-inline and multi-level-tiling (apache#508)

[Meta Schedule] Minor Fixes (apache#507)

[MetaSchedule] Rewrite Cooperative-Fetching / Unbound-Block / Reduction-Block (apache#509)

[MetaSchedule] Rewrite Parallel-Vectorize-Unroll / Verify-GPU / Disallow-Dynamic-Loops (apache#499)

[Meta Schedule] Add Helper Function & Minor Modification (apache#512)

[MetaSchedule] Test for Rewrite Parallel-Vectorize-Unroll  (apache#513)

[Meta Schedule] Feature Extractor & Cost Model (apache#510)

Blockize & Tensorize (apache#514)

Layout Rewriting: Suggest-Index-Map (apache#520)

[MetaSchedule] Parallel-Vectorize-Unroll & Random-Compute-Location (apache#516)

[Meta Schedule] Per-Store-Feature (apache#521)

Add traced schedule for blockize & tensorize (apache#526)

[Meta Schedule] Add XGBoost Model & Random Model (apache#519)

User-Interface: Tune-TIR (apache#525)

User-Interface: Tune-TE (apache#527)

[Minor] More logging on python (apache#528)

Get CUDA tuning working (apache#529)

[MetaSchedule] TensorRT BYOC (apache#518)

[BugFix] LocalBuilder API (apache#531)

[Meta Schedule] Add Cost Model Update Measure Callback (apache#530)

[Bugfix] BuilderInput with default params (apache#532)

[MetaSchedule] Mutator-Tile-Size, Mutate-Parallel, Mutate-Unroll (apache#534)

[Meta Schedule] Evolutionary Search (apache#522)

[BugFix] Remove duplicated definition of MakeMultinomialSampler (apache#535)

[Meta Schedule] Fix some bugs (apache#537)

Initiate Experiments for CPU Performance Alignment with Ansor (apache#538)

[Meta Schedule] Tweak experiment scripts (apache#539)

[Meta Schedule] Initiate experiments on CUDA (apache#540)

[TIR][Schedule] Buffer transform (apache#523)

Auto Tensor Core (apache#524)

Working on Evo Search (apache#542)

[Meta Schedule] Add Replay Tuning Interface (apache#543)

Evolutionary Search on CPU (apache#544)

Misc improvement over the error message (apache#545)

[TIR][Schedule] Software pipelining (apache#533)

[Meta Schedule Refactor] fixing unit tests (apache#547)

[MetaSchedule] Mutator-Compute-Location (apache#548)

Misc Improvement of Evolutionary Search (apache#549)

Hotfix for software pipeline (apache#552)

Misc Improvement (apache#550)

[Cherry-Pick][TensorIR] Primitive "SetScope" (apache#9738) (apache#555)

Rule RFactor (apache#551)

[MemHammer] Rewrite Rules (apache#554)

[MetaSchedule] Schedule Rule: Cross-Thread Reduction (apache#556)

[MetaSchedule] Performance Alignment - NRM and SFM (CUDA) (apache#559)

[MetaSchedule] Perf Alignment - NRM on CUDA (apache#560)

[TIR] Reorder the block iters of the blocks generated by RFactor (apache#561)

Removing 2 unit tests for software pipelining (apache#562)

[MemHammer] Lower Pass + Unittests (apache#557)

Perf Align: Remove Auto-inline before Multi-level-tiling (apache#564)

Fix Sketch Generation Unittests (apache#565)

speed up VerifyGpuCode (apache#568)

[Performance Align] fixing codegen problems (apache#569)

[Meta schedule] improve search space (#1)

Hot fix for bound predicate (#3)

[Meta Schedule] Update Tune Relay (#4)

[Performance Align] fixing codegen problems (#5)

[PerfAlign] NRM & SFM on Raspi Aligned (#6)

[BugFix] Apply bound predicate directly to loops when possible (#12)

[BugFix] Fix CrossThreadReduction on CUDA (#13)

[MetaSchedule] Enable BertTuning with MetaScheduler (#11)

[Minor][MemHammer] Minor tweaks in code review (#14)

[Meta Schedule] Add customizable search space to PostOrderApply. (#16)

Fix cooperative fetching (#17)

Fixes for codegen (#18)

[Hotfix] A unittest (#19)

Fix for GRP sketch gen (#21)

Add threadIdx filtering in Multi-Level-Tiling and Verify-GPU-Code (#20)

[BugFix][TIR] Fix cross-thread reduction when single reduction loop with predicate (apache#10016) (#22)

[MemHammer][Refactor] Code Review (#15)

[Meta Schedule] Add Winograd Test for Customizable Search Space (#24)

Import & Cache Mechanism (#26)

[BugFix] Fix Winograd Test Script (#25)

Add task extraction & caching (#27)

A few fixes for task extraction (#28)

Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn>
Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>
Co-authored-by: Hongyi Jin <3231950289@qq.com>
Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com>
Co-authored-by: Junru Shao <junrushao1994@gmail.com>
Co-authored-by: Wuwei Lin <wuwei@apache.org>
Co-authored-by: Sunghyun Park <49998730+sunggg@users.noreply.github.com>
Co-authored-by: Xiyou Zhou <xiyou@octoml.ai>
zxybazh added a commit to zxybazh/tvm that referenced this issue Feb 21, 2022
[Meta Schedule][M3c] Schedule Rules, Mutator & Postprocs (apache#485)

[Meta Schedule][M3c] PostOrderApply (apache#486)

Fix Post Order Apply (apache#490)

[MetaSchedule] Relay Integration (apache#489)

[M3c][Meta Schedule] Add Trace Correctness Test for PostOrderApply (apache#492)

Fix replay trace. (apache#493)

[M3c][Meta Schedule] Implement the Replay Func class. (apache#495)

[PR] Test script for meta-schedule task extraction. Interface to load… (apache#494)

[Meta Schedule Refactor] Get child blocks (apache#500)

Read-at && Write-at (apache#497)

[M3c][Meta Schedule] Measure Callbacks (apache#498)

[Bug] Fix Infinite Loop Caused When Calling Methods Not Overrided In PyClass (apache#496)

[MetaSchedule] Sample-Perfect-Tile (apache#501)

[MetaSchedule] TE Workloads (apache#502)

[TensorIR] GetProducer, GetConsumer (apache#506)

[MetaScheduleRefactor] Annotate&Unannotate (apache#505)

[MetaSchedule] Multi-Level-Tiling & Auto-Inline (apache#503)

[Tests] Add unittests for auto-inline and multi-level-tiling (apache#508)

[Meta Schedule] Minor Fixes (apache#507)

[MetaSchedule] Rewrite Cooperative-Fetching / Unbound-Block / Reduction-Block (apache#509)

[MetaSchedule] Rewrite Parallel-Vectorize-Unroll / Verify-GPU / Disallow-Dynamic-Loops (apache#499)

[Meta Schedule] Add Helper Function & Minor Modification (apache#512)

[MetaSchedule] Test for Rewrite Parallel-Vectorize-Unroll  (apache#513)

[Meta Schedule] Feature Extractor & Cost Model (apache#510)

Blockize & Tensorize (apache#514)

Layout Rewriting: Suggest-Index-Map (apache#520)

[MetaSchedule] Parallel-Vectorize-Unroll & Random-Compute-Location (apache#516)

[Meta Schedule] Per-Store-Feature (apache#521)

Add traced schedule for blockize & tensorize (apache#526)

[Meta Schedule] Add XGBoost Model & Random Model (apache#519)

User-Interface: Tune-TIR (apache#525)

User-Interface: Tune-TE (apache#527)

[Minor] More logging on python (apache#528)

Get CUDA tuning working (apache#529)

[MetaSchedule] TensorRT BYOC (apache#518)

[BugFix] LocalBuilder API (apache#531)

[Meta Schedule] Add Cost Model Update Measure Callback (apache#530)

[Bugfix] BuilderInput with default params (apache#532)

[MetaSchedule] Mutator-Tile-Size, Mutate-Parallel, Mutate-Unroll (apache#534)

[Meta Schedule] Evolutionary Search (apache#522)

[BugFix] Remove duplicated definition of MakeMultinomialSampler (apache#535)

[Meta Schedule] Fix some bugs (apache#537)

Initiate Experiments for CPU Performance Alignment with Ansor (apache#538)

[Meta Schedule] Tweak experiment scripts (apache#539)

[Meta Schedule] Initiate experiments on CUDA (apache#540)

[TIR][Schedule] Buffer transform (apache#523)

Auto Tensor Core (apache#524)

Working on Evo Search (apache#542)

[Meta Schedule] Add Replay Tuning Interface (apache#543)

Evolutionary Search on CPU (apache#544)

Misc improvement over the error message (apache#545)

[TIR][Schedule] Software pipelining (apache#533)

[Meta Schedule Refactor] fixing unit tests (apache#547)

[MetaSchedule] Mutator-Compute-Location (apache#548)

Misc Improvement of Evolutionary Search (apache#549)

Hotfix for software pipeline (apache#552)

Misc Improvement (apache#550)

[Cherry-Pick][TensorIR] Primitive "SetScope" (apache#9738) (apache#555)

Rule RFactor (apache#551)

[MemHammer] Rewrite Rules (apache#554)

[MetaSchedule] Schedule Rule: Cross-Thread Reduction (apache#556)

[MetaSchedule] Performance Alignment - NRM and SFM (CUDA) (apache#559)

[MetaSchedule] Perf Alignment - NRM on CUDA (apache#560)

[TIR] Reorder the block iters of the blocks generated by RFactor (apache#561)

Removing 2 unit tests for software pipelining (apache#562)

[MemHammer] Lower Pass + Unittests (apache#557)

Perf Align: Remove Auto-inline before Multi-level-tiling (apache#564)

Fix Sketch Generation Unittests (apache#565)

speed up VerifyGpuCode (apache#568)

[Performance Align] fixing codegen problems (apache#569)

[Meta schedule] improve search space (apache#1)

Hot fix for bound predicate (apache#3)

[Meta Schedule] Update Tune Relay (apache#4)

[Performance Align] fixing codegen problems (apache#5)

[PerfAlign] NRM & SFM on Raspi Aligned (apache#6)

[BugFix] Apply bound predicate directly to loops when possible (apache#12)

[BugFix] Fix CrossThreadReduction on CUDA (apache#13)

[MetaSchedule] Enable BertTuning with MetaScheduler (apache#11)

[Minor][MemHammer] Minor tweaks in code review (apache#14)

[Meta Schedule] Add customizable search space to PostOrderApply. (apache#16)

Fix cooperative fetching (apache#17)

Fixes for codegen (apache#18)

[Hotfix] A unittest (apache#19)

Fix for GRP sketch gen (apache#21)

Add threadIdx filtering in Multi-Level-Tiling and Verify-GPU-Code (apache#20)

[BugFix][TIR] Fix cross-thread reduction when single reduction loop with predicate (apache#10016) (apache#22)

[MemHammer][Refactor] Code Review (apache#15)

[Meta Schedule] Add Winograd Test for Customizable Search Space (apache#24)

Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn>
Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>
Co-authored-by: Hongyi Jin <3231950289@qq.com>
Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com>
Co-authored-by: Junru Shao <junrushao1994@gmail.com>
Co-authored-by: Wuwei Lin <wuwei@apache.org>
Co-authored-by: Sunghyun Park <49998730+sunggg@users.noreply.github.com>
Co-authored-by: Xiyou Zhou <xiyou@octoml.ai>
zxybazh added a commit to zxybazh/tvm that referenced this issue Feb 21, 2022
[Meta Schedule][M3c] Schedule Rules, Mutator & Postprocs (apache#485)

[Meta Schedule][M3c] PostOrderApply (apache#486)

Fix Post Order Apply (apache#490)

[MetaSchedule] Relay Integration (apache#489)

[M3c][Meta Schedule] Add Trace Correctness Test for PostOrderApply (apache#492)

Fix replay trace. (apache#493)

[M3c][Meta Schedule] Implement the Replay Func class. (apache#495)

[PR] Test script for meta-schedule task extraction. Interface to load… (apache#494)

[Meta Schedule Refactor] Get child blocks (apache#500)

Read-at && Write-at (apache#497)

[M3c][Meta Schedule] Measure Callbacks (apache#498)

[Bug] Fix Infinite Loop Caused When Calling Methods Not Overrided In PyClass (apache#496)

[MetaSchedule] Sample-Perfect-Tile (apache#501)

[MetaSchedule] TE Workloads (apache#502)

[TensorIR] GetProducer, GetConsumer (apache#506)

[MetaScheduleRefactor] Annotate&Unannotate (apache#505)

[MetaSchedule] Multi-Level-Tiling & Auto-Inline (apache#503)

[Tests] Add unittests for auto-inline and multi-level-tiling (apache#508)

[Meta Schedule] Minor Fixes (apache#507)

[MetaSchedule] Rewrite Cooperative-Fetching / Unbound-Block / Reduction-Block (apache#509)

[MetaSchedule] Rewrite Parallel-Vectorize-Unroll / Verify-GPU / Disallow-Dynamic-Loops (apache#499)

[Meta Schedule] Add Helper Function & Minor Modification (apache#512)

[MetaSchedule] Test for Rewrite Parallel-Vectorize-Unroll  (apache#513)

[Meta Schedule] Feature Extractor & Cost Model (apache#510)

Blockize & Tensorize (apache#514)

Layout Rewriting: Suggest-Index-Map (apache#520)

[MetaSchedule] Parallel-Vectorize-Unroll & Random-Compute-Location (apache#516)

[Meta Schedule] Per-Store-Feature (apache#521)

Add traced schedule for blockize & tensorize (apache#526)

[Meta Schedule] Add XGBoost Model & Random Model (apache#519)

User-Interface: Tune-TIR (apache#525)

User-Interface: Tune-TE (apache#527)

[Minor] More logging on python (apache#528)

Get CUDA tuning working (apache#529)

[MetaSchedule] TensorRT BYOC (apache#518)

[BugFix] LocalBuilder API (apache#531)

[Meta Schedule] Add Cost Model Update Measure Callback (apache#530)

[Bugfix] BuilderInput with default params (apache#532)

[MetaSchedule] Mutator-Tile-Size, Mutate-Parallel, Mutate-Unroll (apache#534)

[Meta Schedule] Evolutionary Search (apache#522)

[BugFix] Remove duplicated definition of MakeMultinomialSampler (apache#535)

[Meta Schedule] Fix some bugs (apache#537)

Initiate Experiments for CPU Performance Alignment with Ansor (apache#538)

[Meta Schedule] Tweak experiment scripts (apache#539)

[Meta Schedule] Initiate experiments on CUDA (apache#540)

[TIR][Schedule] Buffer transform (apache#523)

Auto Tensor Core (apache#524)

Working on Evo Search (apache#542)

[Meta Schedule] Add Replay Tuning Interface (apache#543)

Evolutionary Search on CPU (apache#544)

Misc improvement over the error message (apache#545)

[TIR][Schedule] Software pipelining (apache#533)

[Meta Schedule Refactor] fixing unit tests (apache#547)

[MetaSchedule] Mutator-Compute-Location (apache#548)

Misc Improvement of Evolutionary Search (apache#549)

Hotfix for software pipeline (apache#552)

Misc Improvement (apache#550)

[Cherry-Pick][TensorIR] Primitive "SetScope" (apache#9738) (apache#555)

Rule RFactor (apache#551)

[MemHammer] Rewrite Rules (apache#554)

[MetaSchedule] Schedule Rule: Cross-Thread Reduction (apache#556)

[MetaSchedule] Performance Alignment - NRM and SFM (CUDA) (apache#559)

[MetaSchedule] Perf Alignment - NRM on CUDA (apache#560)

[TIR] Reorder the block iters of the blocks generated by RFactor (apache#561)

Removing 2 unit tests for software pipelining (apache#562)

[MemHammer] Lower Pass + Unittests (apache#557)

Perf Align: Remove Auto-inline before Multi-level-tiling (apache#564)

Fix Sketch Generation Unittests (apache#565)

speed up VerifyGpuCode (apache#568)

[Performance Align] fixing codegen problems (apache#569)

[Meta schedule] improve search space (apache#1)

Hot fix for bound predicate (apache#3)

[Meta Schedule] Update Tune Relay (apache#4)

[Performance Align] fixing codegen problems (apache#5)

[PerfAlign] NRM & SFM on Raspi Aligned (apache#6)

[BugFix] Apply bound predicate directly to loops when possible (apache#12)

[BugFix] Fix CrossThreadReduction on CUDA (apache#13)

[MetaSchedule] Enable BertTuning with MetaScheduler (apache#11)

[Minor][MemHammer] Minor tweaks in code review (apache#14)

[Meta Schedule] Add customizable search space to PostOrderApply. (apache#16)

Fix cooperative fetching (apache#17)

Fixes for codegen (apache#18)

[Hotfix] A unittest (apache#19)

Fix for GRP sketch gen (apache#21)

Add threadIdx filtering in Multi-Level-Tiling and Verify-GPU-Code (apache#20)

[BugFix][TIR] Fix cross-thread reduction when single reduction loop with predicate (apache#10016) (apache#22)

[MemHammer][Refactor] Code Review (apache#15)

[Meta Schedule] Add Winograd Test for Customizable Search Space (apache#24)

Import & Cache Mechanism (apache#26)

[BugFix] Fix Winograd Test Script (apache#25)

Add task extraction & caching (apache#27)

A few fixes for task extraction (apache#28)

Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn>
Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>
Co-authored-by: Hongyi Jin <3231950289@qq.com>
Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com>
Co-authored-by: Junru Shao <junrushao1994@gmail.com>
Co-authored-by: Wuwei Lin <wuwei@apache.org>
Co-authored-by: Sunghyun Park <49998730+sunggg@users.noreply.github.com>
Co-authored-by: Xiyou Zhou <xiyou@octoml.ai>
zxybazh added a commit to zxybazh/tvm that referenced this issue Feb 21, 2022
[Meta Schedule][M3c] Schedule Rules, Mutator & Postprocs (apache#485)

[Meta Schedule][M3c] PostOrderApply (apache#486)

Fix Post Order Apply (apache#490)

[MetaSchedule] Relay Integration (apache#489)

[M3c][Meta Schedule] Add Trace Correctness Test for PostOrderApply (apache#492)

Fix replay trace. (apache#493)

[M3c][Meta Schedule] Implement the Replay Func class. (apache#495)

[PR] Test script for meta-schedule task extraction. Interface to load… (apache#494)

[Meta Schedule Refactor] Get child blocks (apache#500)

Read-at && Write-at (apache#497)

[M3c][Meta Schedule] Measure Callbacks (apache#498)

[Bug] Fix Infinite Loop Caused When Calling Methods Not Overrided In PyClass (apache#496)

[MetaSchedule] Sample-Perfect-Tile (apache#501)

[MetaSchedule] TE Workloads (apache#502)

[TensorIR] GetProducer, GetConsumer (apache#506)

[MetaScheduleRefactor] Annotate&Unannotate (apache#505)

[MetaSchedule] Multi-Level-Tiling & Auto-Inline (apache#503)

[Tests] Add unittests for auto-inline and multi-level-tiling (apache#508)

[Meta Schedule] Minor Fixes (apache#507)

[MetaSchedule] Rewrite Cooperative-Fetching / Unbound-Block / Reduction-Block (apache#509)

[MetaSchedule] Rewrite Parallel-Vectorize-Unroll / Verify-GPU / Disallow-Dynamic-Loops (apache#499)

[Meta Schedule] Add Helper Function & Minor Modification (apache#512)

[MetaSchedule] Test for Rewrite Parallel-Vectorize-Unroll  (apache#513)

[Meta Schedule] Feature Extractor & Cost Model (apache#510)

Blockize & Tensorize (apache#514)

Layout Rewriting: Suggest-Index-Map (apache#520)

[MetaSchedule] Parallel-Vectorize-Unroll & Random-Compute-Location (apache#516)

[Meta Schedule] Per-Store-Feature (apache#521)

Add traced schedule for blockize & tensorize (apache#526)

[Meta Schedule] Add XGBoost Model & Random Model (apache#519)

User-Interface: Tune-TIR (apache#525)

User-Interface: Tune-TE (apache#527)

[Minor] More logging on python (apache#528)

Get CUDA tuning working (apache#529)

[MetaSchedule] TensorRT BYOC (apache#518)

[BugFix] LocalBuilder API (apache#531)

[Meta Schedule] Add Cost Model Update Measure Callback (apache#530)

[Bugfix] BuilderInput with default params (apache#532)

[MetaSchedule] Mutator-Tile-Size, Mutate-Parallel, Mutate-Unroll (apache#534)

[Meta Schedule] Evolutionary Search (apache#522)

[BugFix] Remove duplicated definition of MakeMultinomialSampler (apache#535)

[Meta Schedule] Fix some bugs (apache#537)

Initiate Experiments for CPU Performance Alignment with Ansor (apache#538)

[Meta Schedule] Tweak experiment scripts (apache#539)

[Meta Schedule] Initiate experiments on CUDA (apache#540)

[TIR][Schedule] Buffer transform (apache#523)

Auto Tensor Core (apache#524)

Working on Evo Search (apache#542)

[Meta Schedule] Add Replay Tuning Interface (apache#543)

Evolutionary Search on CPU (apache#544)

Misc improvement over the error message (apache#545)

[TIR][Schedule] Software pipelining (apache#533)

[Meta Schedule Refactor] fixing unit tests (apache#547)

[MetaSchedule] Mutator-Compute-Location (apache#548)

Misc Improvement of Evolutionary Search (apache#549)

Hotfix for software pipeline (apache#552)

Misc Improvement (apache#550)

[Cherry-Pick][TensorIR] Primitive "SetScope" (apache#9738) (apache#555)

Rule RFactor (apache#551)

[MemHammer] Rewrite Rules (apache#554)

[MetaSchedule] Schedule Rule: Cross-Thread Reduction (apache#556)

[MetaSchedule] Performance Alignment - NRM and SFM (CUDA) (apache#559)

[MetaSchedule] Perf Alignment - NRM on CUDA (apache#560)

[TIR] Reorder the block iters of the blocks generated by RFactor (apache#561)

Removing 2 unit tests for software pipelining (apache#562)

[MemHammer] Lower Pass + Unittests (apache#557)

Perf Align: Remove Auto-inline before Multi-level-tiling (apache#564)

Fix Sketch Generation Unittests (apache#565)

speed up VerifyGpuCode (apache#568)

[Performance Align] fixing codegen problems (apache#569)

[Meta schedule] improve search space (apache#1)

Hot fix for bound predicate (apache#3)

[Meta Schedule] Update Tune Relay (apache#4)

[Performance Align] fixing codegen problems (apache#5)

[PerfAlign] NRM & SFM on Raspi Aligned (apache#6)

[BugFix] Apply bound predicate directly to loops when possible (apache#12)

[BugFix] Fix CrossThreadReduction on CUDA (apache#13)

[MetaSchedule] Enable BertTuning with MetaScheduler (apache#11)

[Minor][MemHammer] Minor tweaks in code review (apache#14)

[Meta Schedule] Add customizable search space to PostOrderApply. (apache#16)

Fix cooperative fetching (apache#17)

Fixes for codegen (apache#18)

[Hotfix] A unittest (apache#19)

Fix for GRP sketch gen (apache#21)

Add threadIdx filtering in Multi-Level-Tiling and Verify-GPU-Code (apache#20)

[BugFix][TIR] Fix cross-thread reduction when single reduction loop with predicate (apache#10016) (apache#22)

[MemHammer][Refactor] Code Review (apache#15)

[Meta Schedule] Add Winograd Test for Customizable Search Space (apache#24)

Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn>
Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>
Co-authored-by: Hongyi Jin <3231950289@qq.com>
Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com>
Co-authored-by: Junru Shao <junrushao1994@gmail.com>
Co-authored-by: Wuwei Lin <wuwei@apache.org>
Co-authored-by: Sunghyun Park <49998730+sunggg@users.noreply.github.com>
Co-authored-by: Xiyou Zhou <xiyou@octoml.ai>
zxybazh added a commit to zxybazh/tvm that referenced this issue Feb 22, 2022
[Meta Schedule][M3c] Schedule Rules, Mutator & Postprocs (apache#485)

[Meta Schedule][M3c] PostOrderApply (apache#486)

Fix Post Order Apply (apache#490)

[MetaSchedule] Relay Integration (apache#489)

[M3c][Meta Schedule] Add Trace Correctness Test for PostOrderApply (apache#492)

Fix replay trace. (apache#493)

[M3c][Meta Schedule] Implement the Replay Func class. (apache#495)

[PR] Test script for meta-schedule task extraction. Interface to load… (apache#494)

[Meta Schedule Refactor] Get child blocks (apache#500)

Read-at && Write-at (apache#497)

[M3c][Meta Schedule] Measure Callbacks (apache#498)

[Bug] Fix Infinite Loop Caused When Calling Methods Not Overrided In PyClass (apache#496)

[MetaSchedule] Sample-Perfect-Tile (apache#501)

[MetaSchedule] TE Workloads (apache#502)

[TensorIR] GetProducer, GetConsumer (apache#506)

[MetaScheduleRefactor] Annotate&Unannotate (apache#505)

[MetaSchedule] Multi-Level-Tiling & Auto-Inline (apache#503)

[Tests] Add unittests for auto-inline and multi-level-tiling (apache#508)

[Meta Schedule] Minor Fixes (apache#507)

[MetaSchedule] Rewrite Cooperative-Fetching / Unbound-Block / Reduction-Block (apache#509)

[MetaSchedule] Rewrite Parallel-Vectorize-Unroll / Verify-GPU / Disallow-Dynamic-Loops (apache#499)

[Meta Schedule] Add Helper Function & Minor Modification (apache#512)

[MetaSchedule] Test for Rewrite Parallel-Vectorize-Unroll  (apache#513)

[Meta Schedule] Feature Extractor & Cost Model (apache#510)

Blockize & Tensorize (apache#514)

Layout Rewriting: Suggest-Index-Map (apache#520)

[MetaSchedule] Parallel-Vectorize-Unroll & Random-Compute-Location (apache#516)

[Meta Schedule] Per-Store-Feature (apache#521)

Add traced schedule for blockize & tensorize (apache#526)

[Meta Schedule] Add XGBoost Model & Random Model (apache#519)

User-Interface: Tune-TIR (apache#525)

User-Interface: Tune-TE (apache#527)

[Minor] More logging on python (apache#528)

Get CUDA tuning working (apache#529)

[MetaSchedule] TensorRT BYOC (apache#518)

[BugFix] LocalBuilder API (apache#531)

[Meta Schedule] Add Cost Model Update Measure Callback (apache#530)

[Bugfix] BuilderInput with default params (apache#532)

[MetaSchedule] Mutator-Tile-Size, Mutate-Parallel, Mutate-Unroll (apache#534)

[Meta Schedule] Evolutionary Search (apache#522)

[BugFix] Remove duplicated definition of MakeMultinomialSampler (apache#535)

[Meta Schedule] Fix some bugs (apache#537)

Initiate Experiments for CPU Performance Alignment with Ansor (apache#538)

[Meta Schedule] Tweak experiment scripts (apache#539)

[Meta Schedule] Initiate experiments on CUDA (apache#540)

[TIR][Schedule] Buffer transform (apache#523)

Auto Tensor Core (apache#524)

Working on Evo Search (apache#542)

[Meta Schedule] Add Replay Tuning Interface (apache#543)

Evolutionary Search on CPU (apache#544)

Misc improvement over the error message (apache#545)

[TIR][Schedule] Software pipelining (apache#533)

[Meta Schedule Refactor] fixing unit tests (apache#547)

[MetaSchedule] Mutator-Compute-Location (apache#548)

Misc Improvement of Evolutionary Search (apache#549)

Hotfix for software pipeline (apache#552)

Misc Improvement (apache#550)

[Cherry-Pick][TensorIR] Primitive "SetScope" (apache#9738) (apache#555)

Rule RFactor (apache#551)

[MemHammer] Rewrite Rules (apache#554)

[MetaSchedule] Schedule Rule: Cross-Thread Reduction (apache#556)

[MetaSchedule] Performance Alignment - NRM and SFM (CUDA) (apache#559)

[MetaSchedule] Perf Alignment - NRM on CUDA (apache#560)

[TIR] Reorder the block iters of the blocks generated by RFactor (apache#561)

Removing 2 unit tests for software pipelining (apache#562)

[MemHammer] Lower Pass + Unittests (apache#557)

Perf Align: Remove Auto-inline before Multi-level-tiling (apache#564)

Fix Sketch Generation Unittests (apache#565)

speed up VerifyGpuCode (apache#568)

[Performance Align] fixing codegen problems (apache#569)

[Meta schedule] improve search space (apache#1)

Hot fix for bound predicate (apache#3)

[Meta Schedule] Update Tune Relay (apache#4)

[Performance Align] fixing codegen problems (apache#5)

[PerfAlign] NRM & SFM on Raspi Aligned (apache#6)

[BugFix] Apply bound predicate directly to loops when possible (apache#12)

[BugFix] Fix CrossThreadReduction on CUDA (apache#13)

[MetaSchedule] Enable BertTuning with MetaScheduler (apache#11)

[Minor][MemHammer] Minor tweaks in code review (apache#14)

[Meta Schedule] Add customizable search space to PostOrderApply. (apache#16)

Fix cooperative fetching (apache#17)

Fixes for codegen (apache#18)

[Hotfix] A unittest (apache#19)

Fix for GRP sketch gen (apache#21)

Add threadIdx filtering in Multi-Level-Tiling and Verify-GPU-Code (apache#20)

[BugFix][TIR] Fix cross-thread reduction when single reduction loop with predicate (apache#10016) (apache#22)

[MemHammer][Refactor] Code Review (apache#15)

[Meta Schedule] Add Winograd Test for Customizable Search Space (apache#24)

Import & Cache Mechanism (apache#26)

[BugFix] Fix Winograd Test Script (apache#25)

Add task extraction & caching (apache#27)

A few fixes for task extraction (apache#28)

Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn>
Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>
Co-authored-by: Hongyi Jin <3231950289@qq.com>
Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com>
Co-authored-by: Junru Shao <junrushao1994@gmail.com>
Co-authored-by: Wuwei Lin <wuwei@apache.org>
Co-authored-by: Sunghyun Park <49998730+sunggg@users.noreply.github.com>
Co-authored-by: Xiyou Zhou <xiyou@octoml.ai>
zxybazh added a commit to zxybazh/tvm that referenced this issue Feb 22, 2022
[Meta Schedule][M3c] Schedule Rules, Mutator & Postprocs (apache#485)

[Meta Schedule][M3c] PostOrderApply (apache#486)

Fix Post Order Apply (apache#490)

[MetaSchedule] Relay Integration (apache#489)

[M3c][Meta Schedule] Add Trace Correctness Test for PostOrderApply (apache#492)

Fix replay trace. (apache#493)

[M3c][Meta Schedule] Implement the Replay Func class. (apache#495)

[PR] Test script for meta-schedule task extraction. Interface to load… (apache#494)

[Meta Schedule Refactor] Get child blocks (apache#500)

Read-at && Write-at (apache#497)

[M3c][Meta Schedule] Measure Callbacks (apache#498)

[Bug] Fix Infinite Loop Caused When Calling Methods Not Overrided In PyClass (apache#496)

[MetaSchedule] Sample-Perfect-Tile (apache#501)

[MetaSchedule] TE Workloads (apache#502)

[TensorIR] GetProducer, GetConsumer (apache#506)

[MetaScheduleRefactor] Annotate&Unannotate (apache#505)

[MetaSchedule] Multi-Level-Tiling & Auto-Inline (apache#503)

[Tests] Add unittests for auto-inline and multi-level-tiling (apache#508)

[Meta Schedule] Minor Fixes (apache#507)

[MetaSchedule] Rewrite Cooperative-Fetching / Unbound-Block / Reduction-Block (apache#509)

[MetaSchedule] Rewrite Parallel-Vectorize-Unroll / Verify-GPU / Disallow-Dynamic-Loops (apache#499)

[Meta Schedule] Add Helper Function & Minor Modification (apache#512)

[MetaSchedule] Test for Rewrite Parallel-Vectorize-Unroll  (apache#513)

[Meta Schedule] Feature Extractor & Cost Model (apache#510)

Blockize & Tensorize (apache#514)

Layout Rewriting: Suggest-Index-Map (apache#520)

[MetaSchedule] Parallel-Vectorize-Unroll & Random-Compute-Location (apache#516)

[Meta Schedule] Per-Store-Feature (apache#521)

Add traced schedule for blockize & tensorize (apache#526)

[Meta Schedule] Add XGBoost Model & Random Model (apache#519)

User-Interface: Tune-TIR (apache#525)

User-Interface: Tune-TE (apache#527)

[Minor] More logging on python (apache#528)

Get CUDA tuning working (apache#529)

[MetaSchedule] TensorRT BYOC (apache#518)

[BugFix] LocalBuilder API (apache#531)

[Meta Schedule] Add Cost Model Update Measure Callback (apache#530)

[Bugfix] BuilderInput with default params (apache#532)

[MetaSchedule] Mutator-Tile-Size, Mutate-Parallel, Mutate-Unroll (apache#534)

[Meta Schedule] Evolutionary Search (apache#522)

[BugFix] Remove duplicated definition of MakeMultinomialSampler (apache#535)

[Meta Schedule] Fix some bugs (apache#537)

Initiate Experiments for CPU Performance Alignment with Ansor (apache#538)

[Meta Schedule] Tweak experiment scripts (apache#539)

[Meta Schedule] Initiate experiments on CUDA (apache#540)

[TIR][Schedule] Buffer transform (apache#523)

Auto Tensor Core (apache#524)

Working on Evo Search (apache#542)

[Meta Schedule] Add Replay Tuning Interface (apache#543)

Evolutionary Search on CPU (apache#544)

Misc improvement over the error message (apache#545)

[TIR][Schedule] Software pipelining (apache#533)

[Meta Schedule Refactor] fixing unit tests (apache#547)

[MetaSchedule] Mutator-Compute-Location (apache#548)

Misc Improvement of Evolutionary Search (apache#549)

Hotfix for software pipeline (apache#552)

Misc Improvement (apache#550)

[Cherry-Pick][TensorIR] Primitive "SetScope" (apache#9738) (apache#555)

Rule RFactor (apache#551)

[MemHammer] Rewrite Rules (apache#554)

[MetaSchedule] Schedule Rule: Cross-Thread Reduction (apache#556)

[MetaSchedule] Performance Alignment - NRM and SFM (CUDA) (apache#559)

[MetaSchedule] Perf Alignment - NRM on CUDA (apache#560)

[TIR] Reorder the block iters of the blocks generated by RFactor (apache#561)

Removing 2 unit tests for software pipelining (apache#562)

[MemHammer] Lower Pass + Unittests (apache#557)

Perf Align: Remove Auto-inline before Multi-level-tiling (apache#564)

Fix Sketch Generation Unittests (apache#565)

speed up VerifyGpuCode (apache#568)

[Performance Align] fixing codegen problems (apache#569)

[Meta schedule] improve search space (apache#1)

Hot fix for bound predicate (apache#3)

[Meta Schedule] Update Tune Relay (apache#4)

[Performance Align] fixing codegen problems (apache#5)

[PerfAlign] NRM & SFM on Raspi Aligned (apache#6)

[BugFix] Apply bound predicate directly to loops when possible (apache#12)

[BugFix] Fix CrossThreadReduction on CUDA (apache#13)

[MetaSchedule] Enable BertTuning with MetaScheduler (apache#11)

[Minor][MemHammer] Minor tweaks in code review (apache#14)

[Meta Schedule] Add customizable search space to PostOrderApply. (apache#16)

Fix cooperative fetching (apache#17)

Fixes for codegen (apache#18)

[Hotfix] A unittest (apache#19)

Fix for GRP sketch gen (apache#21)

Add threadIdx filtering in Multi-Level-Tiling and Verify-GPU-Code (apache#20)

[BugFix][TIR] Fix cross-thread reduction when single reduction loop with predicate (apache#10016) (apache#22)

[MemHammer][Refactor] Code Review (apache#15)

[Meta Schedule] Add Winograd Test for Customizable Search Space (apache#24)

Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn>
Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>
Co-authored-by: Hongyi Jin <3231950289@qq.com>
Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com>
Co-authored-by: Junru Shao <junrushao1994@gmail.com>
Co-authored-by: Wuwei Lin <wuwei@apache.org>
Co-authored-by: Sunghyun Park <49998730+sunggg@users.noreply.github.com>
Co-authored-by: Xiyou Zhou <xiyou@octoml.ai>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants