* Add fast_timer_t that is 16-bit or 32-bit based on architecture
A 16-bit timer will overflow sooner but be faster to compare on AVR.
* Avoid 8-bit timer overflows in debounce algorithms
Count down remaining elapsed time instead of trying to do 8-bit timer
comparisons.
Add a "none" implementation that is automatically used if DEBOUNCE is
0 otherwise it will break the _pk/_pr count down.
* Avoid unnecessary polling of the entire matrix in sym_eager_pk
The matrix only needs to be updated when a debounce timer expires.
* Avoid unnecessary polling of the entire matrix in sym_eager_pr
The matrix only needs to be updated when a debounce timer expires.
The use of the "needed_update" variable is trying to do what
"matrix_need_update" was added to fix but didn't work because it only
applied when all keys finished debouncing.
* Fix sym_defer_g timing inconsistency compared to other debounce algorithms
DEBOUNCE=5 should process the key after 5ms, not 6ms
* Add debounce tests
* tmk_core/common: Fixing TIMER_DIFF macro to calculate difference correctly after the timer wraps.
Let's go through an example, using the following macro:
If the first timer read is 0xe4 and the second one is 0x32, the timer wrapped.
If the timer would have had more bits, it's new value would have been 0x132,
and the correct difference in time is 0x132 - 0xe4 = 0x4e
old code TIMER_DIFF_8(0x32, 0xe4) = 0xff - 0xe4 + 0x32 = 0x4d, which is wrong.
new code TIMER_DIFF_8(0x32, 0xe4) = 0xff + 1 - 0xe4 + 0x32 = 0x4e, which is correct.
This also gives a chance for a smart compiler to optimize the code using normal
integer overflow.
For example on AVR, the following C code:
uint8_t __attribute__ ((noinline)) test(uint8_t current_timer, uint8_t start_timer)
{
return TIMER_DIFF_8(current_timer, start_timer);
}
With the original code, it gets translated to the following list of instructions:
00004c6e <test>:
4c6e: 98 2f mov r25, r24
4c70: 86 1b sub r24, r22
4c72: 96 17 cp r25, r22
4c74: 08 f4 brcc .+2 ; 0x4c78 <test+0xa>
4c76: 81 50 subi r24, 0x01 ; 1
4c78: 08 95 ret
But with this commit, it gets translated to a single instruction:
00004c40 <test>:
4c40: 86 1b sub r24, r22
4c42: 08 95 ret
This unfortunately doesn't always work so nicely, for example the following C code:
int __attribute__ ((noinline)) test(uint8_t current_timer, uint8_t start_timer)
{
return TIMER_DIFF_8(current_timer, start_timer);
}
(Note: return type changed to int)
With the original code it gets translated to:
00004c6e <test>:
4c6e: 28 2f mov r18, r24
4c70: 30 e0 ldi r19, 0x00 ; 0
4c72: 46 2f mov r20, r22
4c74: 50 e0 ldi r21, 0x00 ; 0
4c76: 86 17 cp r24, r22
4c78: 20 f0 brcs .+8 ; 0x4c82 <test+0x14>
4c7a: c9 01 movw r24, r18
4c7c: 84 1b sub r24, r20
4c7e: 95 0b sbc r25, r21
4c80: 08 95 ret
4c82: c9 01 movw r24, r18
4c84: 84 1b sub r24, r20
4c86: 95 0b sbc r25, r21
4c88: 81 50 subi r24, 0x01 ; 1
4c8a: 9f 4f sbci r25, 0xFF ; 255
4c8c: 08 95 ret
Wth this commit it gets translated to:
00004c40 <test>:
4c40: 28 2f mov r18, r24
4c42: 30 e0 ldi r19, 0x00 ; 0
4c44: 46 2f mov r20, r22
4c46: 50 e0 ldi r21, 0x00 ; 0
4c48: 86 17 cp r24, r22
4c4a: 20 f0 brcs .+8 ; 0x4c54 <test+0x14>
4c4c: c9 01 movw r24, r18
4c4e: 84 1b sub r24, r20
4c50: 95 0b sbc r25, r21
4c52: 08 95 ret
4c54: c9 01 movw r24, r18
4c56: 84 1b sub r24, r20
4c58: 95 0b sbc r25, r21
4c5a: 93 95 inc r25
4c5c: 08 95 ret
There is not much performance improvement in this case, however at least with this
commit it functions correctly.
Note: The following commit will improve compiler output for the latter example.
* tmk_core/common: Improve code generation for TIMER_DIFF* macros
Because of integer promotion the compiler is having a hard time generating
efficient code to calculate TIMER_DIFF* macros in some situations.
In the below example, the return value is "int", and this is causing the
trouble.
Example C code:
int __attribute__ ((noinline)) test(uint8_t current_timer, uint8_t start_timer)
{
return TIMER_DIFF_8(current_timer, start_timer);
}
BEFORE: (with -Os)
00004c40 <test>:
4c40: 28 2f mov r18, r24
4c42: 30 e0 ldi r19, 0x00 ; 0
4c44: 46 2f mov r20, r22
4c46: 50 e0 ldi r21, 0x00 ; 0
4c48: 86 17 cp r24, r22
4c4a: 20 f0 brcs .+8 ; 0x4c54 <test+0x14>
4c4c: c9 01 movw r24, r18
4c4e: 84 1b sub r24, r20
4c50: 95 0b sbc r25, r21
4c52: 08 95 ret
4c54: c9 01 movw r24, r18
4c56: 84 1b sub r24, r20
4c58: 95 0b sbc r25, r21
4c5a: 93 95 inc r25
4c5c: 08 95 ret
AFTER: (with -Os)
00004c40 <test>:
4c40: 86 1b sub r24, r22
4c42: 90 e0 ldi r25, 0x00 ; 0
4c44: 08 95 ret
Note: the example is showing -Os but improvements can be seen at all optimization levels,
including -O0. We never use -O0, but I tested it to make sure that no extra code is
generated in that case.OA
* quantum/debounce: Fix custom wrapping timers in eager_pr and eager_pk debounce algorithms
Please see the below simulated sequence of events:
Column A is the 16-bit value returned by read_timer();
Column B is the value returned by custom_wrap_timer_read();
Column C is the original code: (timer_read() % MAX_DEBOUNCE)
A, B, C
65530, 19, 30
65531, 20, 31
65532, 21, 32
65533, 22, 33
65534, 23, 34
65535, 24, 35
0 25, 0
1, 26, 1
2, 27, 2
3, 28, 3
4, 29, 4
5, 30, 5
read_timer() wraps about every 1.09 seconds, and so debouncing might
fail at these times without this commit.
* quantum/debounce/eager_pr and eager_pk: modifications for code readability according to code review.
* quantum/debounce/eager_pr and eager_pk: modifications for code readability according to code review. (2)