This paper introduces a multi-rate hidden Markov model (multi-rate HMM) for multi-scale stochastic modeling of non-stationary processes. The multi-rate HMM decomposes the process variability into scale-based components and characterizes both the intra-scale temporal evolution of the process and the inter-scale interactions. Scales are organized hierarchically from coarser to finer, which allows long- and short-term context information to be represented efficiently at the same time. We give computationally efficient probabilistic inference and parameter estimation algorithms for the multi-rate HMM. We apply these models to the prediction of machining tool wear, which exhibits both long-range dependence and multi-scale dynamics. A multi-category tool-wear prediction system architecture is presented for modeling the progress of wear over multiple time scales during a tool's lifetime. Classification results on challenging titanium milling tasks show that multi-rate HMMs outperform conventional HMMs in both the accuracy and the confidence of their predictions.