			       =====================================
			       FUJITSU FR-V KERNEL ATOMIC OPERATIONS
			       =====================================

On the FR-V CPUs, there is only one atomic Read-Modify-Write operation: the SWAP/SWAPI
instruction. Unfortunately, this alone can't be used to implement the following operations:

 (*) Atomic add to memory

 (*) Atomic subtract from memory

 (*) Atomic bit modification (set, clear or invert)

 (*) Atomic compare and exchange

On such CPUs, the standard way of emulating such operations in uniprocessor mode is to disable
interrupts, but on the FR-V CPUs, modifying the PSR takes a lot of clock cycles, and it has to be
done twice. This means the CPU runs for a relatively long time with interrupts disabled,
potentially having a great effect on interrupt latency.
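
For comparison, the conventional interrupt-disabling emulation can be sketched in portable C as
below. This is only an illustration: the sim_* names and the boolean standing in for the PSR
interrupt-enable state are hypothetical, not kernel interfaces. The two PSR modifications the text
refers to correspond to the save/disable call and the restore call.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the interrupt-enable state held in the PSR. */
static bool sim_irqs_enabled = true;

/* First PSR modification: remember the old state and disable interrupts. */
static bool sim_irq_save_and_disable(void)
{
	bool was_enabled = sim_irqs_enabled;

	sim_irqs_enabled = false;
	return was_enabled;
}

/* Second PSR modification: put the saved state back. */
static void sim_irq_restore(bool was_enabled)
{
	sim_irqs_enabled = was_enabled;
}

/* Uniprocessor-only emulation of an atomic add: nothing can interrupt the
 * read-modify-write while interrupts are off. */
static int irq_off_add_return(int i, int *counter)
{
	bool flags = sim_irq_save_and_disable();
	int val = *counter + i;

	*counter = val;
	sim_irq_restore(flags);
	return val;
}
```

Everything between the two calls runs with interrupts off, which is the latency cost the new
algorithm is designed to avoid.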


=============
NEW ALGORITHM
=============

To get around this, the following algorithm has been implemented. It operates in a way similar to
the LL/SC instruction pairs supported on a number of platforms.

 (*) The CCCR.CC3 register is reserved within the kernel to act as an atomic modify abort flag.

 (*) In the exception prologues run on kernel->kernel entry, CCCR.CC3 is set to 0 (Undefined
     state).

 (*) All atomic operations can then be broken down into the following algorithm:

     (1) Set ICC3.Z to true and set CC3 to True (ORCC/CKEQ/ORCR).

     (2) Load the value currently in the memory to be modified into a register.

     (3) Make changes to the value.

     (4) If CC3 is still True, simultaneously and atomically (by VLIW packing):

	 (a) Store the modified value back to memory.

	 (b) Set ICC3.Z to false (CORCC on GR29 is sufficient for this - GR29 holds the current
	     task pointer in the kernel, and so is guaranteed to be non-zero).

     (5) If ICC3.Z is still true, go back to step (1).

This works in a non-SMP environment because any interrupt or other exception that happens between
steps (1) and (4) will set CC3 to the Undefined state, thus aborting the store in (4a) and leaving
the Z flag in ICC3 set, so that step (5) loops back to step (1).
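
The control flow of steps (1) to (5) can be modelled in plain C as a rough sketch. The model is
hypothetical throughout: two booleans stand in for CC3 and ICC3.Z, sim_irq_hook stands in for an
interrupt arriving between the load and the store, and sim_exception_prologue() does what the
kernel->kernel exception prologue is described as doing to the abort flag. It shows the retry
logic only, not the hardware behaviour.

```c
#include <stdbool.h>
#include <stddef.h>

static bool sim_cc3;		/* true = True state, false = Undefined (aborted) */
static bool sim_icc3_z;		/* true = store has not yet taken place */
static unsigned sim_retries;	/* how often step (5) looped back */

/* Hypothetical hook: called where an interrupt might land, if set. */
static void (*sim_irq_hook)(void);

/* Models the exception prologue clearing the abort flag. */
static void sim_exception_prologue(void)
{
	sim_cc3 = false;
}

static int sim_atomic_add_return(int i, int *counter)
{
	int val;

	do {
		sim_icc3_z = true;		/* (1) */
		sim_cc3 = true;			/* (1) */
		val = *counter;			/* (2) */
		if (sim_irq_hook)
			sim_irq_hook();		/* interrupt between (1) and (4) */
		val += i;			/* (3) */
		if (sim_cc3) {			/* (4) one VLIW packet on hardware */
			*counter = val;		/* (4a) */
			sim_icc3_z = false;	/* (4b) */
		} else {
			sim_retries++;
		}
	} while (sim_icc3_z);			/* (5) */

	return val;
}

/* Example interrupt: fires once and runs the prologue, aborting the store. */
static void sim_one_shot_irq(void)
{
	sim_irq_hook = NULL;
	sim_exception_prologue();
}
```

With sim_irq_hook pointing at sim_one_shot_irq, the first pass is aborted and the second pass
completes, so the operation still returns the correct sum after one retry.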


This algorithm suffers from two problems:

 (1) The condition CCCR.CC3 is cleared unconditionally by an exception, irrespective of whether or
     not any changes were made to the target memory location during that exception.

 (2) The branch from step (5) back to step (1) may have to happen more than once before the store
     manages to take place. In theory, this loop could cycle forever if interrupts keep arriving
     before each attempt can complete, but this is unlikely.


=======
EXAMPLE
=======

Taking an example from include/asm-frv/atomic.h:

	static inline int atomic_add_return(int i, atomic_t *v)
	{
		unsigned long val;

		asm("0:						\n"

It starts by setting ICC3.Z to true for later use, and also transforming that into CC3 being in the
True state.

		    "	orcc		gr0,gr0,gr0,icc3	\n"	<-- (1)
		    "	ckeq		icc3,cc7		\n"	<-- (1)

Then it does the load. Note that the final phase of step (1) is done at the same time as the
load. The VLIW packing ensures they are done simultaneously. The ".p" on the load must not be
removed without swapping the order of these two instructions.

		    "	ld.p		%M0,%1			\n"	<-- (2)
		    "	orcr		cc7,cc7,cc3		\n"	<-- (1)

Then the proposed modification is generated. Note that the old value can be retained if required
(such as in test_and_set_bit()).

		    "	add%I2		%1,%2,%1		\n"	<-- (3)

Then it attempts to store the value back, contingent on no exception having cleared CC3 since it
was set to True.

		    "	cst.p		%1,%M0		,cc3,#1	\n"	<-- (4a)

It simultaneously records the success or failure of the store in ICC3.Z.

		    "	corcc		gr29,gr29,gr0	,cc3,#1	\n"	<-- (4b)

Such that the branch can then be taken if the operation was aborted.

		    "	beq		icc3,#0,0b		\n"	<-- (5)
		    : "+U"(v->counter), "=&r"(val)
		    : "NPr"(i)
		    : "memory", "cc7", "cc3", "icc3"
		    );

		return val;
	}
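
For readers without FR-V hardware, the same load / modify / conditional-store / retry shape can be
sketched with the C11 <stdatomic.h> compare-and-exchange loop. This is a portable analogue, not
the FR-V mechanism: the retry is driven by the compare failing rather than by an abort flag, and
the cas_* function names are hypothetical. The second function shows the old value being retained,
as a test_and_set_bit()-style operation would need.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Analogue of atomic_add_return(): returns the new value. */
static int cas_add_return(atomic_int *v, int i)
{
	int old = atomic_load(v);		/* load */
	int val;

	do {
		val = old + i;			/* modify */
		/* Conditional store: on failure 'old' is refreshed and we retry. */
	} while (!atomic_compare_exchange_weak(v, &old, val));

	return val;
}

/* Analogue of a test-and-set-bit operation: returns whether the bit was
 * already set, which requires keeping the value that was loaded. */
static bool cas_test_and_set_bit(atomic_ulong *word, unsigned bit)
{
	unsigned long mask = 1UL << bit;
	unsigned long old = atomic_load(word);

	while (!atomic_compare_exchange_weak(word, &old, old | mask))
		;				/* 'old' now holds the current value */

	return (old & mask) != 0;
}
```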


=============
CONFIGURATION
=============

The atomic ops implementation can be made inline or out-of-line by changing the
CONFIG_FRV_OUTOFLINE_ATOMIC_OPS configuration variable. Making it out-of-line has a number of
advantages:

 - The resulting kernel image may be smaller
 - Debugging is easier as atomic ops can just be stepped over and they can be breakpointed

Keeping it inline also has a number of advantages:

 - The resulting kernel may be faster
   - no out-of-line function calls need to be made
   - the compiler doesn't have half its registers clobbered by making a call

The out-of-line implementations live in arch/frv/lib/atomic-ops.S.
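
As a rough illustration of how one configuration symbol can select between the two forms, a
header might be laid out as below. Only CONFIG_FRV_OUTOFLINE_ATOMIC_OPS comes from the text above;
the sketch_* names are hypothetical, and plain C stands in for the assembly body, so this is not
an atomic implementation.

```c
/* Hypothetical stand-in for atomic_t. */
typedef struct {
	int counter;
} sketch_atomic_t;

#ifdef CONFIG_FRV_OUTOFLINE_ATOMIC_OPS
/* Out-of-line: a single shared copy, defined in a separate object file. */
extern int sketch_atomic_add_return(int i, sketch_atomic_t *v);
#else
/* Inline: the body is expanded at every call site, so no call is made. */
static inline int sketch_atomic_add_return(int i, sketch_atomic_t *v)
{
	v->counter += i;	/* placeholder for the asm sequence */
	return v->counter;
}
#endif
```

Callers are written identically in both cases; only the build configuration decides whether a
call instruction is emitted.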