flag_operations_are_free
Differences
This shows you the differences between two versions of the page.
| flag_operations_are_free [2026/01/11 16:21] – appledog | flag_operations_are_free [2026/01/11 16:30] (current) – appledog | ||
|---|---|---|---|
| Line 15: | Line 15: | ||
| Here's LDAL-1: | Here's LDAL-1: | ||
| - | < | + | < |
| ; Test program: 1 million LDAL [$1000] operations | ; Test program: 1 million LDAL [$1000] operations | ||
| ; Uses CD (32-bit "count down" counter register) | ; Uses CD (32-bit "count down" counter register) | ||
| Line 67: | Line 67: | ||
| * OVERFLOW_FLAG = value === 0x8000; | * OVERFLOW_FLAG = value === 0x8000; | ||
| - | That is a significant amount of flags. Having this make no impact whatsoever was surprising, so I removed the IF statements blocking these flags on DEC. This produces LDAL-2b, which surprised me by getting again the exact same 2.1 MIPS. So, over 2 million if statements wasn't moving the needle? | + | That is a significant amount of code to remove, but ONE compare op was killing it. Having this make no impact whatsoever was surprising, so I removed the IF statements blocking these flags on DEC. This produces LDAL-2b, which surprised me by getting again the exact same 2.1 MIPS. So, over 2 million if statements |
| - | + | ||
| - | I replaced the flag fences and I created LDAL-3; this time, I had only 100,000 execution cycles, but 10 copies of LDAL. My heart leapt when I saw the score; 7.55 MIPS! This meant that LDAL was executing much faster than the other instructions. I immediately created LDAL-4 which had 1,000 lines of LDAL and loaded CD with 1 million. The goal was simple: execute 1 billion LDAL instructions and time the result. The results were spectacular. 78 MIPS. I did try with CMP,0 and SEF mode, and it was slower (73 MIPS). The immediate conclusion is that SEF mode was useless. CMP was dragging everything down. But I didn't know why. | + | |
| - | + | ||
| - | For the record, I created versions which used LDA and LDAB | + | |
| + | I replaced the flag fences and created LDAL-3; this time, I had 100,000 runs of 10 LDAL operations. My heart leapt for joy when I saw the score: 7.55 MIPS! This meant that LDAL was executing much faster than the other instructions. I immediately created LDAL-4, which had 1,000 lines of LDAL and loaded CD with 1 million. The goal was simple: execute 1 billion LDAL instructions and time the result. The results were spectacular: 78 MIPS. I did try with CMP,0 and SEF mode, and it was slower (73 MIPS). The immediate conclusion was that SEF mode was useless. CMP was dragging everything down. But I didn't know why. | |
| - | 78 MIPS With SEF & CMP CD, 0 | + | I experimented with some other LD instructions. It turned out that LDBLX and LDAB were extremely slow, just as slow as CMP. I once again tested CMP with and without SEF/CLF just to confirm: yes, one CMP operation was many times slower than millions of by-the-way flag checks. Adding a CMP lowered the MIPS to 73, but removing it got us over 78. |
| - | 73 MIPS With CLF & no CMP | + | |
| + | The final conclusion was that my memory system was not optimized. One of the major issues was that I was creating an array in WebAssembly on every register access. I moved that out of the loop and saw MIPS return to normal. In fact, it was better than normal: for normal load and store operations I was at 55 MIPS. | |
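
The fix described above, moving a per-access allocation out of the hot path, can be sketched in TypeScript roughly as follows. This is only an illustration under stated assumptions, not the emulator's actual code: the eight-register layout and the names `registerBuffer`, `readRegisterAllocating`, `readRegisterHoisted`, `mips` and `benchmark` are all hypothetical.

```typescript
// Hypothetical register file: 8 registers of 32 bits each in one buffer.
// (The real emulator's layout is not shown in the source; this is assumed.)
const REGISTER_COUNT = 8;
const registerBuffer = new ArrayBuffer(REGISTER_COUNT * 4);

// Slow variant: wraps the buffer in a fresh typed-array view on every access.
function readRegisterAllocating(index: number): number {
  const view = new Uint32Array(registerBuffer); // new object per call
  return view[index] ?? 0; // out-of-range reads fall back to 0
}

// Hoisted variant: the view is created once, outside the hot loop.
const registerView = new Uint32Array(registerBuffer);
function readRegisterHoisted(index: number): number {
  return registerView[index] ?? 0;
}

// Convert an instruction count and an elapsed time into MIPS.
function mips(instructions: number, elapsedMs: number): number {
  return instructions / (elapsedMs / 1000) / 1_000_000;
}

// Time `iterations` register reads and report the rate in MIPS.
function benchmark(read: (index: number) => number, iterations: number): number {
  const start = Date.now();
  let sink = 0; // accumulate results so the loop body is not dead code
  for (let i = 0; i < iterations; i++) {
    sink += read(i & (REGISTER_COUNT - 1));
  }
  const elapsedMs = Date.now() - start;
  if (sink < 0) throw new Error("unreachable: register values are unsigned");
  return mips(iterations, elapsedMs);
}
```

Calling `benchmark(readRegisterAllocating, 1_000_000)` and `benchmark(readRegisterHoisted, 1_000_000)` side by side gives a rough comparison of the two variants; the absolute numbers depend on the engine and machine, and very short runs can finish within the resolution of `Date.now()`.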
flag_operations_are_free.txt · Last modified: by appledog
