[Mspgcc-users] Instruction reordering

Discussion:

Wayne Uroda

2015-11-27 11:32:08 UTC

Hello,

I have a question about instruction reordering in the compiler.

In this blog
http://preshing.com/20120625/memory-ordering-at-compile-time/
It says that a compiler barrier should be used where writes to memory must not be reordered.

My MSP430 code is all bare metal, foreground-background code and I am not using any type of RTOS, so it could be considered lock-free.

Anyway, a new employee at my work pointed out that the compiler is free to reorder my memory writes, even those marked volatile. I said that I had never actually seen mspgcc do this, but now I am curious: will mspgcc reorder memory writes, e.g. to global variables?
Is this dependent on the msp430 side of gcc (the backend), or more on the AST/RTL side?

Or to put it another way, should I be reviewing my shared (between background and ISR) variables and placing compiler barriers where variables must be stored in an exact order?

I am using a very old version of mspgcc (3.2.3). I think it was the last stable Windows binary release before the CPUX instructions started making their way in. Sorry, I don't know the exact release; maybe from 2008/2009?

Thanks
- Wayne

------------------------------------------------------------------------------

David Brown

2015-11-27 13:19:25 UTC

Post by Wayne Uroda
Hello,
I have a question about instruction reordering in the compiler.
In this blog
http://preshing.com/20120625/memory-ordering-at-compile-time/ It says
that a compiler barrier should be used where writes to memory must
not be reordered.

Looks like good information.

Post by Wayne Uroda
My MSP430 code is all bare metal, foreground-background code and I am
not using any type of RTOS, so it could be considered lock-free.
Anyway, a new employee at my work pointed out that the compiler is
free to reorder my memory writes, even those marked volatile. I said
that I had never actually seen mspgcc do this, but now I am curious:
will mspgcc reorder memory writes, e.g. to global variables? Is this
dependent on the msp430 side of gcc (the backend), or more on the
AST/RTL side?

The compiler can re-order any reads or writes it likes, except volatile
accesses. The order of volatile accesses will always match the source
code. But the compiler can move non-volatile reads and writes back and
forth between volatile accesses.

So if you have non-volatile variables a, b, c, and volatile variables u,
v, and source code like this:

a = 1;
u = 100;
b = 2;
v = 200;
c = 3;

Then the compiler can generate code like this:

a = 1;
c = 3;
u = 100;
v = 200;
b = 2;

But it may not generate code like this:

a = 1;
v = 200;
b = 2;
u = 100;
c = 3;

(The same applies to reads.)

The volatile accesses must be exact, in the same number and the same
order with respect to other volatile accesses. But the non-volatile
accesses can be moved around as much as the compiler wants.

Note that variable accesses can also be moved around with respect to
function calls, /if/ the compiler knows the function does not make
volatile accesses.

The compiler can also eliminate "dead" stores, or unused reads. Given:

a = 1;
u = 100;
a = 2;
v = 200;
a = 3;

The compiler can generate:

u = 100;
v = 200;
a = 3;

or

a = 3;
u = 100;
v = 200;

Post by Wayne Uroda
Or to put it another way, should I be reviewing my shared (between
background and ISR) variables and placing compiler barriers where
variables must be stored in an exact order?

Yes, these must be handled carefully. In particular, volatile accesses
give no indication about how non-volatile accesses are handled.
Sometimes people write code like this:

int data[4];
volatile bool newData;

void update(int x) {
    data[0] = data[1];
    data[1] = data[2];
    data[2] = data[3];
    data[3] = x;
    newData = true;
}

And they think that making the newData flag a volatile is sufficient.
It is /not/ sufficient - the compiler can set newData before doing
anything with the data[] array. And then your interrupt code will work
fine in all your testing, and hit a race condition when the customer is
watching.

You have to make each access to "data" here volatile, or you need a
memory barrier between "data[3] = x;" and "newData = true;". A memory
barrier tells the compiler that all memory writes before the barrier
need to be completed, no writes after the barrier may be started, and
any data read before the barrier is now invalid.

And note that even then, nothing in memory barriers or volatile access
will ensure that the reads or writes are atomic.
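
A minimal sketch of the barrier variant of that fix, assuming a GCC-compatible compiler. consume() is a hypothetical reader added only for illustration; it is not part of the original code. The barrier only constrains the order in which the compiler emits the accesses. It does not make the copy of data[] atomic, so an interrupt can still land in the middle of it.

```c
#include <stdbool.h>

int data[4];
volatile bool newData;

/* Compiler-only barrier: emits no instruction, but the compiler may not
   move memory accesses across it. */
static inline void compilerBarrier(void) {
    __asm__ volatile("" ::: "memory");
}

void update(int x) {
    data[0] = data[1];
    data[1] = data[2];
    data[2] = data[3];
    data[3] = x;
    compilerBarrier();      /* all data[] stores are emitted before the flag */
    newData = true;
}

/* Hypothetical consumer: copies data[] only after seeing the flag. */
bool consume(int out[4]) {
    if (!newData) {
        return false;
    }
    compilerBarrier();      /* do not read data[] before checking the flag */
    for (int i = 0; i < 4; i++) {
        out[i] = data[i];
    }
    compilerBarrier();      /* finish the copy before clearing the flag */
    newData = false;
    return true;
}
```

If the producer can interrupt the consumer (or the other way round), the copy still needs protection of its own, for example by disabling interrupts around it.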

Some useful macros/functions:

#define volatileAccess(v) (*((volatile typeof((v)) *) &(v)))

static inline void compilerBarrier(void) {
    asm volatile("" ::: "memory");
}
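
As a sketch of the other option (making each access to "data" volatile), the macro can be applied element by element. This assumes GCC. __typeof__ is used instead of typeof so that it also compiles under strict -std= modes, and updateVolatile() is a name made up for this example.

```c
#include <stdbool.h>

/* Access an ordinary object as if it had been declared volatile.
   The outer parentheses keep the macro safe inside larger expressions. */
#define volatileAccess(v) (*((volatile __typeof__((v)) *) &(v)))

int data[4];
volatile bool newData;

void updateVolatile(int x) {
    /* Every access below is a volatile access, so the compiler must keep
       them in source order relative to each other and to newData. */
    volatileAccess(data[0]) = volatileAccess(data[1]);
    volatileAccess(data[1]) = volatileAccess(data[2]);
    volatileAccess(data[2]) = volatileAccess(data[3]);
    volatileAccess(data[3]) = x;
    newData = true;
}
```

Compared with the barrier, this only pins down the accesses that are wrapped, so the compiler stays free to optimise unrelated variables around them.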

Post by Wayne Uroda
I am using a very old version of mspgcc (3.2.3). I think it was the
last stable Windows binary release before the CPUX instructions
started making their way in. Sorry, I don't know the exact release;
maybe from 2008/2009?

That was probably the final release before the mspgcc 4 project was
started. And now there is a new port, msp430-elf, from Red Hat and TI,
which is the one to move to for future work.

gcc 3.2.3 did not optimise as aggressively as newer gcc, so it will do a
lot less re-arrangement of code. In general, the compiler will
re-arrange the order of writes if it has good reason to do so - if the
result is smaller and/or faster. If it makes no difference, then it
will not re-order the writes.

------------------------------------------------------------------------------

Wayne Uroda

2015-11-28 03:04:15 UTC

Thanks very much David.

I already disable interrupts when I need an atomic operation, such as a write to a 32-bit variable or a read-modify-write on a 16-bit variable.
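
Roughly, that pattern looks like the sketch below. irqSaveAndDisable() and irqRestore() are hypothetical stand-ins that only track a flag so the example runs on a PC; on a real MSP430 they would be replaced by the toolchain's interrupt disable/enable intrinsics. The shared variables are volatile so the compiler cannot move their accesses outside the protected region.

```c
#include <stdint.h>

/* Hypothetical stand-in for the CPU's global interrupt enable bit. */
static volatile int irqEnabled = 1;

/* Hypothetical helper: remember the current state, then disable. */
static int irqSaveAndDisable(void) {
    int previous = irqEnabled;
    irqEnabled = 0;
    return previous;
}

/* Hypothetical helper: restore whatever state was saved. */
static void irqRestore(int previous) {
    irqEnabled = previous;
}

volatile uint32_t sharedCounter;   /* needs two 16-bit stores on MSP430 */
volatile uint16_t sharedFlags;

void writeShared32(uint32_t value) {
    int state = irqSaveAndDisable();
    sharedCounter = value;         /* an ISR cannot see a half-written value */
    irqRestore(state);
}

void setSharedFlags(uint16_t mask) {
    int state = irqSaveAndDisable();
    sharedFlags |= mask;           /* read-modify-write, protected as a unit */
    irqRestore(state);
}
```

Saving and restoring the previous state, rather than unconditionally re-enabling, keeps the helpers safe to call from code that already has interrupts disabled.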

Am I right to assume that simple writes to 16-bit variables are always atomic on MSP430? E.g. for done = 0xabcd, the CPU doesn't do two byte writes or anything dumb like that: the memory will be X (don't care) before the write and 0xabcd after. Or should I disable interrupts for all writes to memory shared with an ISR?

My colleague asserts that the compiler is even free to reorder volatile accesses, like

volatile int a;
volatile int b;
int c;

int main()
{
    a = 5;
    b = 0;
    c = 1;
    return 0;
}

You are saying that the compiler will always write a before b, even without a memory barrier. Is that right?

But in the above, the write to c can happen anywhere, as far as I understand.

Lastly, this compiler (3.2.3) is only used on our legacy projects. More recently we are using the last LTS of mspgcc before TI/RH started their port.
I've not seriously tried to use the new RH compiler, and my company is largely moving away from MSP430 towards ARM Cortex now.

Thanks
- Wayne

------------------------------------------------------------------------------
_______________________________________________
Mspgcc-users mailing list
https://lists.sourceforge.net/lists/listinfo/mspgcc-users

------------------------------------------------------------------------------

Peter Bigot

2015-11-28 12:35:38 UTC

Your colleague is mistaken regarding re-ordering of access to
volatile-qualified objects. David's description is correct and nicely
detailed. In the example a will always be written before b unless the
compiler is broken.

(I should point out that mspgcc in the 3.2.3 era in fact probably *was*
broken with respect to correct ordering of some volatile operations,
because of a hack intended to improve peripheral register access
performance. There were errors in volatile behavior when I started
maintenance that didn't get cleaned up until sometime around August 2011.
There's probably traffic in the list archive around that time that explains
the problem.)

MSP430 writes of 16-bit values in any version of mspgcc I released would be
atomic except if they occurred to an object declared with the "packed"
attribute, in which case the byte values would be written separately to
prevent alignment problems. (There were bugs in this usage too.)
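
To illustrate why "packed" changes things, here is a small sketch assuming GCC's attribute syntax. The struct and function names are made up for the example. With the attribute, the 16-bit member can land on an odd offset, so a single aligned word store is no longer possible.

```c
#include <stddef.h>
#include <stdint.h>

/* Ordinary layout: the compiler pads so that value is word-aligned. */
struct PlainRecord {
    uint8_t  tag;
    uint16_t value;
};

/* Packed layout: no padding, so value starts right after tag. */
struct __attribute__((packed)) PackedRecord {
    uint8_t  tag;
    uint16_t value;
};

size_t plainValueOffset(void) {
    return offsetof(struct PlainRecord, value);
}

size_t packedValueOffset(void) {
    return offsetof(struct PackedRecord, value);
}
```

A store to the value member of PackedRecord may therefore be split into byte writes, and an ISR could observe it half-updated unless interrupts are disabled around the write.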

I also use ARM Cortex devices these days, and am very pleased to see that
TI is moving the MSP432 to using CMSIS-Core--compatible peripheral
structure definitions as of March 2016
(http://processors.wiki.ti.com/index.php/MSP432_CMSIS_Update?DCMP=epd-mcu-msp-gen&HQS=MSP432CMSIS).
If the Tiva and CC3K folks were that adaptable I might still use TI chips.

Peter

------------------------------------------------------------------------------


David Brown

2015-11-28 14:32:30 UTC

It is always nice to get confirmation, especially from the person that
actually maintained and re-wrote the code in question!

As Peter points out, discussing the behaviour of the compiler is always
on the assumption that the compiler does not have bugs there. The later
gcc 4 versions that Peter put together had fewer bugs and more features
than the gcc 3 msp430 port. We use them on a couple of projects (we
have not moved to the new TI/Red Hat port), and I have been very happy
with the tools.

We too are moving more and more to Cortex M devices (Freescale Kinetis
mostly). But when working with high temperature electronics, the msp430
can still be a good choice - there is a 150C qualified msp430, with
partial qualification (excluding some peripherals) at 175C. There are
no other microcontrollers of any sort, AFAIK, that are comparable in
size and power consumption that work at high temperatures. One day, I
am sure someone (probably TI!) will release an M3/M4 for high
temperature - but until then, we will use the msp430 for such
specialised projects.

Best regards,

David

------------------------------------------------------------------------------
