Spatially tiled architectures such as CGRAs are robust architectural choices for accelerating applications in the DSP, scientific computing, and embedded domains. In the embedded domain in particular, CGRAs offer a low-power alternative to FPGAs by providing coarse-grained, word-level computation resources in place of the FPGA's fine-grained, bit-level LUTs. This key difference yields a dramatic decrease in power and area, since much of the interconnect logic needed in FPGAs is effectively eliminated by the word-level granularity of the CGRA. The effectiveness of CGRAs in the embedded domain creates a market that calls for low-power computation, a requirement that extends from the top-level architecture down to every individual unit of the CGRA. Using various low-power design techniques and tools, we implement multiple methods to reduce the power consumed by the Functional Unit of the CGRA. In the process, we also document the toolsets we used to accomplish this goal. We show an average 4.24x reduction in the dynamic and static power consumption of the ALU, while still meeting all requirements set by the MOSAIC project.