These ‘-m’ options are defined for the SH implementations:
-m1
-m2
-m2e
-m2a-nofpu
-m2a-single-only
-m2a-single
-m2a
-m3
-m3e
-m4-nofpu
-m4-single-only
-m4-single
-m4
-m4-100
-m4-100-nofpu
-m4-100-single
-m4-100-single-only
-m4-200
-m4-200-nofpu
-m4-200-single
-m4-200-single-only
-m4-300
-m4-300-nofpu
-m4-300-single
-m4-300-single-only
-m4-340

-m4-500
    Passes -isa=sh4-nofpu to the assembler.

-m4a-nofpu
-m4a-single-only
-m4a-single
-m4a

-m4al
    Same as -m4a-nofpu, except that it implicitly passes -dsp to the assembler. GCC doesn't generate any DSP instructions at the moment.

-m5-32media
-m5-32media-nofpu
-m5-64media
-m5-64media-nofpu
-m5-compact
-m5-compact-nofpu
-mb
-ml
-mdalign

-mrelax
    Uses the linker option -relax.

-mbigtable
    Use 32-bit offsets in switch tables. The default is to use 16-bit offsets.

-mbitops

-mfmovd
    Enable the use of the instruction fmovd. Check -mdalign for alignment constraints.

-mrenesas
-mno-renesas

-mnomacsave
    Mark the MAC register as call-clobbered, even if -mrenesas is given.

-mieee
-mno-ieee
    By default -mieee is implicitly enabled. If -ffinite-math-only is enabled, -mno-ieee is implicitly set, which results in faster floating-point greater-equal and less-equal comparisons. The implicit settings can be overridden by specifying either -mieee or -mno-ieee.

-minline-ic_invalidate
    Inline code to invalidate instruction cache entries after setting up nested function trampolines. This option has no effect if -musermode is in effect and the selected code generation option (e.g. -m4) does not allow the use of the icbi instruction. If the selected code generation option does not allow the use of the icbi instruction, and -musermode is not in effect, the inlined code manipulates the instruction cache address array directly with an associative write. This not only requires privileged mode at run time, but it also fails if the cache line had been mapped via the TLB and has become unmapped.

-misize
-mpadstruct

-matomic-model=model
    ‘none’
        This is the default if the target is not sh*-*-linux*.

    ‘soft-gusa’
        This option is enabled by default when the target is sh*-*-linux* and SH3* or SH4*. When the target is SH4A, this option also partially utilizes the hardware atomic instructions movli.l and movco.l to create more efficient code, unless ‘strict’ is specified.

    ‘soft-tcb’
        For this model the ‘gbr-offset=’ parameter has to be specified as well.

    ‘soft-imask’
        Generate software atomic sequences that temporarily disable interrupts by setting SR.IMASK = 1111. This model works only when the program runs in privileged mode and is only suitable for single-core systems. Additional support from the interrupt/exception handling code of the system is not required. This model is enabled by default when the target is sh*-*-linux* and SH1* or SH2*.

    ‘hard-llcs’
        Generate hardware atomic sequences using the movli.l and movco.l instructions only. This is only available on SH4A and is suitable for multi-core systems. Since the hardware instructions support only 32-bit atomic variables, access to 8-bit or 16-bit variables is emulated with 32-bit accesses. Code compiled with this option is also compatible with other software atomic model interrupt/exception handling systems if executed on an SH4A system. Additional support from the interrupt/exception handling code of the system is not required for this model.

    ‘gbr-offset=’
        This parameter specifies the offset in bytes of the variable in the thread control block structure that should be used by the generated atomic sequences when the ‘soft-tcb’ model has been selected. For other models this parameter is ignored. The specified value must be an integer multiple of four and in the range 0-1020.

    ‘strict’

-mtas
    Generate the tas.b opcode for __atomic_test_and_set. Notice that, depending on the particular hardware and software configuration, this can degrade overall performance due to the operand cache line flushes that are implied by the tas.b instruction. On multi-core SH4A processors the tas.b instruction must be used with caution since it can result in data corruption for certain cache configurations.

-mprefergot

-musermode
-mno-usermode
    -musermode also implies -mno-inline-ic_invalidate if the inlined code would not work in user mode. -musermode is the default when the target is sh*-*-linux*. If the target is SH1* or SH2*, -musermode has no effect, since there is no user mode.

-multcost=number

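As a rough illustration of the operation that -mtas affects, the following sketch builds a try-lock from the GCC built-ins __atomic_test_and_set and __atomic_clear. The names lock_flag, try_lock and unlock are hypothetical and are not part of GCC. How the built-in is implemented on a given SH target (a tas.b instruction, a software atomic sequence, or a library call) depends on the -mtas and -matomic-model settings; the C source stays the same.

```c
#include <stdbool.h>

/* Hypothetical lock flag for this sketch; false means "unlocked". */
static bool lock_flag;

/* Try once to take the lock; returns true if the caller now owns it. */
bool try_lock(void)
{
    /* The built-in sets the flag and returns its previous value. */
    return !__atomic_test_and_set(&lock_flag, __ATOMIC_ACQUIRE);
}

/* Release the lock by clearing the flag. */
void unlock(void)
{
    __atomic_clear(&lock_flag, __ATOMIC_RELEASE);
}
```

A caller would typically loop on try_lock until it succeeds and call unlock when done.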
-mdiv=strategy
    For SHmedia, strategy can be one of:

    ‘fp’

    ‘inv’

    ‘inv:minlat’
        A variant of ‘inv’ where, if no CSE or hoisting opportunities have been found, or if the entire operation has been hoisted to the same place, the last stages of the inverse calculation are intertwined with the final multiply to reduce the overall latency, at the expense of using a few more instructions, and thus offering fewer scheduling opportunities with other code.

    ‘call’
        Calls a library function that usually implements the ‘inv:minlat’ strategy. This gives high code density for m5-*media-nofpu compilations.

    ‘call2’

    ‘inv:call’
    ‘inv:call2’
    ‘inv:fp’
        Use the ‘inv’ algorithm for initial code generation, but if the code stays unoptimized, revert to the ‘call’, ‘call2’, or ‘fp’ strategies, respectively. Note that the potentially-trapping side effect of division by zero is carried by a separate instruction, so it is possible that all the integer instructions are hoisted out, but the marker for the side effect stays where it is. A recombination to floating-point operations or a call is not possible in that case.

    ‘inv20u’
    ‘inv20l’
        Variants of the ‘inv:minlat’ strategy. In the case that the inverse calculation is not separated from the multiply, they speed up division where the dividend fits into 20 bits (plus sign where applicable) by inserting a test to skip a number of operations in this case; this test slows down the case of larger dividends. ‘inv20u’ assumes the case of such a small dividend to be unlikely, and ‘inv20l’ assumes it to be likely.

    For targets other than SHmedia, strategy can be one of:
    ‘call-div1’
        Calls a library function that uses the single-step division instruction div1 to perform the operation. Division by zero calculates an unspecified result and does not trap. This is the default except for SH4, SH2A and SHcompact.

    ‘call-fp’
        Specifying this for targets that do not have a double-precision FPU defaults to call-div1.

    ‘call-table’
        Calls a library function that uses a lookup table for small divisors and the div1 instruction with case distinction for larger divisors. Division by zero calculates an unspecified result and does not trap. This is the default for SH4. Specifying this for targets that do not have dynamic shift instructions defaults to call-div1.

    When a division strategy has not been specified, the default strategy is selected based on the current target. For SH2A the default strategy is to use the divs and divu instructions instead of library function calls.
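To make the idea of step-by-step division more concrete, here is a minimal portable sketch of unsigned shift-and-subtract division that produces one quotient bit per loop iteration. It is only an analogy for a routine built from a one-step division instruction such as div1; it is not the code libgcc uses, and the function name udiv_step is hypothetical. Like the strategies described above, it does not trap on a zero divisor and simply returns a value that carries no meaning.

```c
#include <stdint.h>

/* Illustrative shift-and-subtract division: one quotient bit per iteration.
   Assumption: this models the concept only, not the actual library routine. */
uint32_t udiv_step(uint32_t dividend, uint32_t divisor)
{
    uint32_t quotient = 0;
    uint64_t remainder = 0;   /* wide enough to hold the shifted remainder */

    for (int bit = 31; bit >= 0; bit--) {
        /* Bring down the next dividend bit. */
        remainder = (remainder << 1) | ((dividend >> bit) & 1u);
        /* Subtract the divisor if it fits and record a 1 in the quotient. */
        if (remainder >= divisor) {
            remainder -= divisor;
            quotient |= (uint32_t)1 << bit;
        }
    }
    return quotient;
}
```

Calling udiv_step(100, 7) yields 14. With a divisor of 0 the loop still terminates and returns a value, which mirrors the "unspecified result, no trap" behavior in spirit only.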

-maccumulate-outgoing-args

-mdivsi3_libfunc=name
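    Set the name of the library function used for 32-bit signed division to name. This only affects the name used in the ‘call’ and ‘inv:call’ division strategies, and the compiler still expects the same sets of input/output/clobbered registers as if this option were not present.

-mfixed-range=register-range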

-mindexed-addressing
    The default is -mno-indexed-addressing.

-mgettrcost=number
    Set the cost assumed for the gettr instruction to number. The default is 2 if -mpt-fixed is in effect, 100 otherwise.

-mpt-fixed
    Assume pt* instructions won't trap. This generally generates better-scheduled code, but is unsafe on current hardware. The current architecture definition says that ptabs and ptrel trap when the target ANDed with 3 is 3. This has the unintentional effect of making it unsafe to schedule these instructions before a branch, or hoist them out of a loop. For example, __do_global_ctors, a part of libgcc that runs constructors at program startup, calls functions in a list which is delimited by −1. With the -mpt-fixed option, the ptabs is done before testing against −1. That means that all the constructors run a bit more quickly, but when the loop comes to the end of the list, the program crashes because ptabs loads −1 into a target register.

    Since this option is unsafe for any hardware implementing the current architecture specification, the default is -mno-pt-fixed. Unless specified explicitly with -mgettrcost, -mno-pt-fixed also implies -mgettrcost=100; this deters register allocation from using target registers for storing ordinary integers.
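The trap rule and the constructor-list example can be modeled in portable C. The sketch below is an illustration under stated assumptions: the names pt_target_traps, count_safe_targets, CTOR_LIST_END and example_ctor_list are hypothetical, and the list holds plain integers rather than real function addresses. It shows why the terminator must be tested before the entry is treated as a branch target: the terminator −1 has its two low bits set, so it satisfies the trap condition.

```c
#include <stddef.h>
#include <stdint.h>

/* Terminator of the constructor list, as described in the text (-1). */
#define CTOR_LIST_END ((uintptr_t)-1)

/* Models the quoted rule: a target traps when (target AND 3) == 3. */
int pt_target_traps(uintptr_t target)
{
    return (target & 3u) == 3u;
}

/* Hypothetical list contents for this sketch; not real function addresses. */
const uintptr_t example_ctor_list[] = { 0x1000, 0x2004, CTOR_LIST_END };

/* Walk the list in the safe order: compare against the terminator first,
   and only then look at the entry as a potential branch target.
   Returns the number of entries that would not trap. */
size_t count_safe_targets(const uintptr_t *list)
{
    size_t n = 0;
    while (*list != CTOR_LIST_END) {
        if (!pt_target_traps(*list))
            n++;
        list++;
    }
    return n;
}
```

Here pt_target_traps(CTOR_LIST_END) is true, which corresponds to the crash described for -mpt-fixed when the target is prepared before the comparison.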

-minvalid-symbols
    Assume symbols might be invalid. Ordinary function symbols generated by the compiler are always valid to load with movi/shori/ptabs or movi/shori/ptrel, but with assembler and/or linker tricks it is possible to generate symbols that cause ptabs or ptrel to trap. This option is only meaningful when -mno-pt-fixed is in effect. It prevents cross-basic-block CSE, hoisting and most scheduling of symbol loads. The default is -mno-invalid-symbols.

-mbranch-cost=num

-mzdcbranch
-mno-zdcbranch
    Assume (do not assume) that zero displacement conditional branch instructions bt and bf are fast. If -mzdcbranch is specified, the compiler tries to prefer zero displacement branch code sequences. This is enabled by default when generating code for SH4 and SH4A. It can be explicitly disabled by specifying -mno-zdcbranch.

-mfused-madd
-mno-fused-madd
    The machine-dependent -mfused-madd option is now mapped to the machine-independent -ffp-contract=fast option, and -mno-fused-madd is mapped to -ffp-contract=off.

-mfsca
-mno-fsca
    Allow or disallow the compiler to emit the fsca instruction for sine and cosine approximations. The option -mfsca must be used in combination with -funsafe-math-optimizations. It is enabled by default when generating code for SH4A. Using -mno-fsca disables sine and cosine approximations even if -funsafe-math-optimizations is in effect.

-mfsrra
-mno-fsrra
    Allow or disallow the compiler to emit the fsrra instruction for reciprocal square root approximations. The option -mfsrra must be used in combination with -funsafe-math-optimizations and -ffinite-math-only. It is enabled by default when generating code for SH4A. Using -mno-fsrra disables reciprocal square root approximations even if -funsafe-math-optimizations and -ffinite-math-only are in effect.

-mpretend-cmove

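Because -mfused-madd maps to -ffp-contract=fast, it can change numeric results: a fused multiply-add rounds once, while a separate multiply and add rounds twice. The following sketch illustrates the difference in portable C. The function names are hypothetical, and the single-rounding variant is emulated through double arithmetic, which matches a fused result for the operands used here but is not a general substitute for a real fused instruction.

```c
/* Multiply, round the product to float, then add: two roundings.
   The volatile store forces the intermediate rounding and keeps the
   compiler from contracting the expression into a fused operation. */
float madd_separate(float a, float b, float c)
{
    volatile float product = a * b;
    return product + c;
}

/* Emulates a single final rounding. The product of two floats is exact
   in double (24 + 24 significand bits fit into 53), so only the final
   conversions round. Assumption: illustration only, see lead-in. */
float madd_single_rounding(float a, float b, float c)
{
    double exact_product = (double)a * (double)b;
    return (float)(exact_product + (double)c);
}
```

With a = b = 1 + 2^-12 and c = -(1 + 2^-11), the exact product is 1 + 2^-11 + 2^-24. Rounding it to float drops the last term, so madd_separate returns 0, whereas madd_single_rounding returns 2^-24.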
© Free Software Foundation
Licensed under the GNU Free Documentation License, Version 1.3.
https://gcc.gnu.org/onlinedocs/gcc-4.9.3/gcc/SH-Options.html