Least mean squares (LMS) algorithms represent the simplest and most easily applied adaptive algorithms. The recursive least squares (RLS) algorithms on the other hand are known for their excellent performance and greater fidelity, but come with increased complexity and computational cost. In performance, RLS approaches the Kalman filter in adaptive filtering applications, at somewhat reduced required throughput in the signal processor. Compared to the LMS algorithm, the RLS approach offers faster convergence and smaller error with respect to the unknown system, at the expense of requiring more computations.
Note that the signal paths and identifications are the same whether the filter uses RLS or LMS. The difference lies in the adapting portion.
The LMS filters adapt their coefficients until the difference between the desired signal and the actual signal is minimal (least mean squares of the error signal). This is the state when the filter weights converge to optimal values, that is, they have converged close enough to the actual coefficients of the unknown system. This class of algorithms adapt based on the error at the current time. The RLS adaptive filter is an algorithm which recursively finds the filter coefficients that minimize a weighted linear least squares cost function relating to the input signals. These filters adapt based on the total error computed from the beginning.
The LMS filters use a gradient based approach to perform the adaptation. The initial weights are assumed to be small, in most cases, very close to zero. At each step, the filter weights are updated based on the gradient of the mean square error. If the gradient is positive, the filter weights are reduced, so that the error does not increase positively. If the gradient is negative, the filter weights are increased. The step size with which the weights change must be chosen appropriately. If the step size is very small, the algorithm converges very slowly. If the step size is very large, the algorithm converges very fast and the system might not be stable at the minimum error value. In order to have a stable system, the step size, μ, must be within the following limits:
$$0<\mu <\frac{2}{{\lambda}_{\mathrm{max}}},$$
where, λ_{max} is the largest eigen value of the input autocorrelation matrix.
The RLS filters minimize a cost function, C by appropriately selecting the filter coefficients, w(n), and updating the filter as the new data arrives. The cost function is given by the following equation:
$$C({w}_{n})={\displaystyle \sum _{i=0}^{n}{\lambda}^{ni}{e}^{2}(i)},$$
where,
w_{n} –– RLS adaptive filter coefficients.
e(i) –– Error between the desired signal, d, and the estimate of the desired signal, dest, at the current time index. The signal, dest is the output of the RLS filter, and hence implicitly depends on the current filter coefficients.
0 < λ ≤ 1 –– Forgetting factor which gives exponentially less weight to older samples. When λ = 1, all previous error is considered of equal weight in the total error. As λ approaches zero, the past errors play a smaller role in the total. For example, when λ = 0.1, the RLS algorithm multiplies an error value from 50 samples in the past by an attenuation factor of 0.1^{50} = 1 x 10^{−50}, considerably deemphasizing the influence of the past error on the current total error.
In cases where the error value might come from a spurious input data point or points, the forgetting factor lets the RLS algorithm reduce the value of older error data by multiplying the old data by the forgetting factor.
Here is a table that summarizes the key differences between the two types of algorithms:
LMS Algorithm  RLS Algorithm 

Simple and can be easily applied.  Increased complexity and computational cost. 
Takes longer to converge.  Faster convergence. 
Adaptation is based on gradient based approach which updates filter weights in a manner to converge to the optimum filter weights.  Adaptation is based on recursive approach that finds the filter coefficients which minimize a weighted linear least squares cost function relating to the input signals. 
Larger steady state error with respect to unknown system compared to the RLS algorithm.  Smaller steady state error with respect to unknown system. 
No account for the past data.  Accounts for past data from the beginning to the current data point. 
Objective is to minimize the current mean square error between the desired signal and the output.  Objective is to minimize the total weighted squared error between the desired signal and the output. 
No memory involved. Older error values play no role in the total error considered.  Has infinite memory. All error data is considered in the total error. Using the forgetting factor, the older data can be can be less emphasized compared to the newer data. Since 0 ≤ λ < 1, applying the factor is equivalent to weighting the older error. 
LMS based FIR adaptive filters in DSP System Toolbox™:
 RLS based FIR adaptive filters in DSP System Toolbox:

Within limits, you can use any of the adaptive filter algorithms to solve an adaptive filter problem by replacing the adaptive portion of the application with a new algorithm.
[1] Hayes, Monson H., Statistical Digital Signal Processing and Modeling, John Wiley & Sons, 1996, 493–552.
[2] Haykin, Simon, Adaptive Filter Theory, PrenticeHall, Inc., 1996.