Without any optimization switch, the compiler’s goal is to reduce the cost of compilation and to make debugging produce the expected results. This means that statements are independent: if you stop the program with a breakpoint between statements, you can then assign a new value to any variable or change the program counter to any other statement in the subprogram and get exactly the results you would expect from the source code. However, the generated programs are considerably larger and slower than when optimization is enabled.
Turning on optimization makes the compiler attempt to improve the performance and/or code size at the expense of compilation time and possibly the ability to debug the program.
You can pass the -O switch, with or without an operand (the permitted forms with an operand are -O0, -O1, -O2, -O3, -Os, -Oz, and -Og), to gcc to control the optimization level. If you pass multiple -O switches, with or without an operand, the last such switch is the one that’s used:
-O0
No optimization (the default); generates unoptimized code but has the fastest compilation time. Debugging is easiest with this switch.
Note that many other compilers do substantial optimization even if ‘no optimization’ is specified. With GCC, it is very unusual to use -O0 for production if execution time is of any concern, since -O0 means (almost) no optimization. You should keep this difference between GCC and other compilers in mind when doing performance comparisons.
-O1
Moderate optimization (same as -O without an operand); optimizes reasonably well but does not degrade compilation time significantly. You may not be able to see some variables in the debugger, and changing the value of some variables in the debugger may not have the effect you desire.
-O2
Extensive optimization; generates highly optimized code but has an increased compilation time. You may see significant impacts on your ability to display and modify variables in the debugger.
-O3
Full optimization; attempts more sophisticated transformations, in particular on loops, possibly at the cost of larger generated code. You may hardly be able to use the debugger at this optimization level.
-Os
Optimize for size (code and data) of the resulting binary rather than speed; based on the -O2 optimization level, but disables some of its transformations that often increase code size and performs further optimizations designed to reduce code size.
-Oz
Optimize aggressively for size (code and data) of resulting binary rather than speed; may increase the number of instructions executed if these instructions require fewer bytes to be encoded.
-Og
Optimize for debugging experience rather than speed; based on the -O1 optimization level, but attempts to eliminate all the negative effects of optimization on debugging.
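The rule stated before the list, that the last -O switch is the one used, can be checked from the shell. The following is only a sketch under stated assumptions: it assumes a gcc driver with the C front end is installed, and it relies on the predefined macro __OPTIMIZE__, which GCC defines in optimizing compilations. The helper name optimizing is hypothetical and not part of any tool.

```shell
# Hypothetical helper: prints "yes" if the given switches leave
# optimization enabled, "no" otherwise. It dumps the predefined macros
# (-dM -E) for an empty C input and looks for __OPTIMIZE__.
optimizing() {
    if gcc "$@" -dM -E -x c /dev/null | grep -q '__OPTIMIZE__ '; then
        echo yes
    else
        echo no
    fi
}

optimizing -O2 -O0   # -O0 comes last, so it is the one used
optimizing -O0 -O2   # -O2 comes last, so it is the one used
```

Checking a macro rather than inspecting generated code keeps the sketch independent of the target and of the source language being compiled.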
Higher optimization levels perform more global transformations on the program and apply more expensive analysis algorithms in order to generate faster and more compact code. The price in compilation time, and the resulting improvement in execution time, both depend on the particular application and the hardware environment. You should experiment to find the best level for your application.
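One simple way to start such an experiment is to compile the same source file at several levels and compare the results. The sketch below is illustrative only: compare_levels is a hypothetical helper, it reports object size and nothing else, and it leaves out -Oz and -Og because older gcc releases may not accept them. Compilation and execution times would need to be measured separately, for instance with the shell's time facility.

```shell
# Hypothetical helper: compile one source file at several -O levels and
# print the size in bytes of each resulting object file.
compare_levels() {
    src=$1
    tmpdir=$(mktemp -d) || return 1
    for level in -O0 -O1 -O2 -O3 -Os; do
        obj="$tmpdir/out$level.o"
        if gcc -c "$level" "$src" -o "$obj"; then
            printf '%s %s bytes\n' "$level" "$(wc -c < "$obj" | tr -d ' ')"
        else
            printf '%s failed\n' "$level"
        fi
    done
    rm -rf "$tmpdir"
}
```

It would be called as compare_levels followed by the path of any source file that the installed gcc can compile. Object size is only a rough proxy, so the level finally chosen should be confirmed by timing the real application.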
Since the precise set of optimizations done at each level will vary from release to release (and sometimes from target to target), it is best to think of the optimization settings in general terms.
See the ‘Options That Control Optimization’ section in Using the GNU Compiler Collection (GCC) for details about the -O settings and a number of -f switches that individually enable or disable specific optimizations.
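To see which optimizations a particular installation reports for a given level, gcc can be asked directly with -Q --help=optimizers. The sketch below assumes a GCC-based gcc driver (other drivers installed under the same name may not accept --help=optimizers), and level_diff is a hypothetical helper name.

```shell
# Hypothetical helper: show the lines mentioning "enabled" that differ
# between the optimizer reports for two levels, e.g. "level_diff -O1 -O2".
level_diff() {
    a=$(mktemp) || return 1
    b=$(mktemp) || return 1
    gcc -Q --help=optimizers "$1" > "$a"
    gcc -Q --help=optimizers "$2" > "$b"
    diff "$a" "$b" | grep enabled
    rm -f "$a" "$b"
}

level_diff -O2 -O3
```

The output depends on the release and the target, which is why no sample output is shown here.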
Unlike some other compilation systems, GCC has been tested extensively at all optimization levels. There are some bugs which appear only with optimization turned on, but there have also been bugs which show up only in ‘unoptimized’ code. Selecting a lower level of optimization does not improve the reliability of the code generator, which in practice is highly reliable at all optimization levels.
A note regarding the use of -O3: this optimization level should not be automatically preferred over -O2, since it often results in larger executables, which may run more slowly. See further discussion of this point in Inlining of Subprograms.