Infinite compile time

Postby ODerin » Sat Feb 25, 2006 3:41 pm

Hello,

When compiling the imgpipe benchmark for a two-cluster machine with 16 registers, the compilation never finishes (specifically, it keeps compiling jpeg/jcmaster.c). We stopped it manually after one day. With 32 or more registers there is no problem, and compilation ends successfully. Below you can find the fmmdump and the compilation command.

The reason may be that, for this particular code, it is possible at some point to fill the whole issue width with operations that could together use issuewidth*3 = 24 registers. Some more registers may be needed for intercluster copy operations. In the end, we may need more than 32 registers in total. Still, by inserting extra loads and stores, or by giving up on 100% usage of the issue width, the compiler should be able to compile it for an even smaller number of registers.

Should we wait longer for the compiler to finish? Is there a way to derive a safe number of registers that guarantees there won't be such infinite compile times? For example, for the given configuration, issuewidth*3 doesn't work as a guess for this safe number.

This is the compilation command that doesn't end.
/opt/vex/FC4/bin/cc -O3 -H3 -prefetch -DVEX_RESTRICT -DJAMMED -width 2 -fmm=auto.mm -fmmdump   -c -o jpeg/jcmaster.o jpeg/jcmaster.c

This is the fmmdump of the mentioned configuration:
RES: IssueWidth 8
RES: MemLoad 8
RES: MemStore 8
RES: MemPft 1
RES: IssueWidth.0 4
RES: Alu.0 4
RES: Mpy.0 2
RES: CopySrc.0 1
RES: CopyDst.0 1
RES: Memory.0 1
RES: IssueWidth.1 4
RES: Alu.1 4
RES: Mpy.1 2
RES: CopySrc.1 1
RES: CopyDst.1 1
RES: Memory.1 1
DEL: AluR.0 0
DEL: Alu.0 0
DEL: CmpBr.0 1
DEL: CmpGr.0 0
DEL: Select.0 0
DEL: Multiply.0 1
DEL: Load.0 2
DEL: LoadLr.0 3
DEL: Store.0 0
DEL: Pft.0 0
DEL: Asm1L.0 0
DEL: Asm2L.0 0
DEL: Asm3L.0 0
DEL: Asm4L.0 0
DEL: Asm1H.0 1
DEL: Asm2H.0 1
DEL: Asm3H.0 1
DEL: Asm4H.0 1
DEL: CpGrGR.0 1
DEL: CpGrBr.0 1
DEL: CpBrGr.0 0
DEL: CpGrLr.0 2
DEL: CpLrGr.0 0
DEL: Spill.0 0
DEL: Restore.0 2
DEL: RestoreLr.0 3
DEL: AluR.1 0
DEL: Alu.1 0
DEL: CmpBr.1 1
DEL: CmpGr.1 0
DEL: Select.1 0
DEL: Multiply.1 1
DEL: Load.1 2
DEL: LoadLr.1 3
DEL: Store.1 0
DEL: Pft.1 0
DEL: Asm1L.1 0
DEL: Asm2L.1 0
DEL: Asm3L.1 0
DEL: Asm4L.1 0
DEL: Asm1H.1 1
DEL: Asm2H.1 1
DEL: Asm3H.1 1
DEL: Asm4H.1 1
DEL: CpGrGR.1 1
DEL: CpGrBr.1 1
DEL: CpBrGr.1 0
DEL: CpGrLr.1 2
DEL: CpLrGr.1 0
REG: $r0 16
REG: $b0 8
REG: $b1 8
REG: $r1 16
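
The issuewidth*3 estimate from the post can be checked mechanically against a dump like the one above. The following is only a rough sketch: the function names are hypothetical, it assumes the general-purpose register files are the REG entries named $r<cluster>, and it assumes three register operands per operation as in the post. It is not part of the VEX tools.

```python
import re

def parse_fmmdump(text):
    """Collect RES and REG entries from an fmmdump listing into two dicts."""
    res, reg = {}, {}
    for line in text.splitlines():
        m = re.match(r"\s*(RES|REG):\s+(\S+)\s+(\d+)\s*$", line)
        if not m:
            continue  # DEL lines and blank lines are ignored in this sketch
        kind, name, value = m.group(1), m.group(2), int(m.group(3))
        (res if kind == "RES" else reg)[name] = value
    return res, reg

def register_estimate(text, operands_per_op=3):
    """Return (estimated register demand, total general-purpose registers)."""
    res, reg = parse_fmmdump(text)
    # Assumption: general-purpose register files are the entries named $r<n>.
    total_gprs = sum(n for name, n in reg.items() if name.startswith("$r"))
    # Heuristic from the post: every issued operation may touch 3 registers.
    demand = res["IssueWidth"] * operands_per_op
    return demand, total_gprs

# A few lines copied from the dump above.
SAMPLE = """\
RES: IssueWidth 8
RES: IssueWidth.0 4
RES: IssueWidth.1 4
REG: $r0 16
REG: $b0 8
REG: $b1 8
REG: $r1 16
"""

demand, total = register_estimate(SAMPLE)
print(demand, total)  # prints: 24 32
```

For this configuration the estimate (24) is below the total register count (32), yet the compilation still does not finish, which is consistent with the observation that issuewidth*3 is not a safe bound.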



Thank you,

Onur
ODerin
 
Posts: 2
Joined: Mon Feb 06, 2006 6:55 pm
Location: ALaRI

Re: Infinite compile time

Postby frb » Sun Mar 05, 2006 10:39 pm

Your analysis is probably correct. What happens is that you're defining a machine that is a bit out of balance, and the compiler chokes. The first thing I would try is reducing the unrolling factor. You're using -H3, which is pretty aggressive unrolling (I don't remember the exact amount, but probably a lot). Try -H1 (or omit the -Hx flag), and if you have manual unrolling pragmas, reduce them until compilation succeeds. If it still doesn't work after that, let me know and I'll take a closer look.

-- Paolo
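
One way to automate the suggestion above is to try progressively less aggressive unrolling flags under a time limit. This is a minimal sketch, not a VEX feature: the function name, the BASE_CMD and UNROLL_FLAGS constants, and the choice of time limit are all assumptions made for illustration. The only compiler flags used are the ones that appear in this thread.

```python
import subprocess

def compile_with_fallback(base_cmd, unroll_flags, timeout_s, run=subprocess.run):
    """Try each unroll flag in order; return the first command that
    compiles successfully within timeout_s seconds, or None."""
    for flag in unroll_flags:
        # Insert the flag right after the compiler name; "" means no -Hx flag.
        cmd = [base_cmd[0]] + ([flag] if flag else []) + base_cmd[1:]
        try:
            result = run(cmd, timeout=timeout_s)
        except subprocess.TimeoutExpired:
            continue  # treat a timeout as "never finishes" and back off
        if result.returncode == 0:
            return cmd
    return None

# Command line from the first post, with -H3 and -fmmdump left out.
BASE_CMD = [
    "/opt/vex/FC4/bin/cc", "-O3", "-prefetch", "-DVEX_RESTRICT", "-DJAMMED",
    "-width", "2", "-fmm=auto.mm", "-c", "-o", "jpeg/jcmaster.o",
    "jpeg/jcmaster.c",
]
# Most aggressive first, then -H1, then no -Hx flag at all.
UNROLL_FLAGS = ["-H3", "-H1", ""]
```

Calling `compile_with_fallback(BASE_CMD, UNROLL_FLAGS, timeout_s=600)` would return the first command line that finishes within ten minutes (an arbitrary limit chosen here), or `None` if none does. Manual unrolling pragmas in the source are not touched by this sketch and would still need to be reduced by hand.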
Last edited by frb on Mon Mar 06, 2006 9:21 am, edited 1 time in total.
frb
 
Posts: 62
Joined: Thu Nov 12, 2009 3:44 pm

Re: Infinite compile time

Postby ODerin » Sun Mar 05, 2006 11:15 pm

It compiles when we relax the -Hx and -Ox flags. But this is something of a compromise on code quality.


Re: Infinite compile time

Postby frb » Mon Mar 06, 2006 9:20 am

ODerin wrote: It compiles when we relax the -Hx and -Ox flags. But this is something of a compromise on code quality.


Not really. It doesn't make much sense to unroll the code, say, 32 times when unrolling it 4 times exposes enough ILP to take full advantage of the machine. Unfortunately, there's no magic formula that will tell you how much to unroll, which is why compilers have pragmas, flags, etc. to help you guide the unrolling. Relaxing unrolling aggressiveness is the right approach in the experiment you're running.

-- Paolo