1. In the simplest terms, a window is a function that is zero everywhere except within a specified range. The signal is multiplied by the window function (often in a sliding fashion, so that the whole signal is covered) to focus the analysis on only the section of the signal under the window. There are many different types of window, from the simple rectangular window to Gaussian windows and others that taper the signal much more strongly towards the edges of the windowed region.
2. I assume you mean the STFT. This is the Short-Time Fourier Transform, a sliding-window (or at least discretely stepped) version of the Fourier transform, usually computed with the FFT.
Applying the FFT to a whole signal gives you pure frequency-domain information, with no indication of where in time the different frequency components occur. The STFT instead breaks the signal into sections of a given length (32 samples, for example), often with an overlap (e.g. 16 samples) so that samples attenuated near the edges of one window are still well represented in the next, and returns a 2D time-frequency result rather than a 1D frequency result. The magnitude (usually squared) of this result is known as a spectrogram, which shows the joint time-frequency content of the signal, subject to the window and FFT-length choices you make. There is a trade-off between time resolution and frequency resolution here: a longer window gives finer frequency resolution but coarser time resolution, and a shorter window does the opposite. It isn't a magic way to get perfect resolution in both domains simultaneously, of course.
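To make both points concrete, here is a minimal sketch in Python with NumPy. It is an illustration under assumptions, not a reference implementation: the helper name `stft_sketch`, the Hann window, the 256 Hz sample rate and the two-tone test signal are all my own choices. Only the section length of 32 and the overlap of 16 come from the numbers above.

```python
import numpy as np

def stft_sketch(signal, frame_len=32, hop=16):
    """Hypothetical helper: FFT of overlapping, windowed sections of a signal."""
    # Hann window: tapers smoothly to zero at both ends of each section.
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slide the window along the signal in steps of `hop` samples.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # One spectrum per section: shape (n_frames, frame_len // 2 + 1).
    return np.fft.rfft(frames, axis=1)

fs = 256                          # assumed sample rate in Hz
t = np.arange(2 * fs) / fs        # two seconds of samples
# Test signal: 16 Hz tone for the first second, 64 Hz tone for the second.
signal = np.where(t < 1.0,
                  np.sin(2 * np.pi * 16 * t),
                  np.sin(2 * np.pi * 64 * t))

# Plain FFT of the whole signal: both tones show up, but not when they occur.
full_spectrum = np.abs(np.fft.rfft(signal))
full_freqs = np.fft.rfftfreq(len(signal), d=1 / fs)
top_two = np.sort(full_freqs[np.argsort(full_spectrum)[-2:]])

# STFT: one spectrum per section, so the change over time is visible.
stft = stft_sketch(signal, frame_len=32, hop=16)
spectrogram = np.abs(stft) ** 2
freqs = np.fft.rfftfreq(32, d=1 / fs)       # 8 Hz spacing with a 32-sample window
first_peak = freqs[np.argmax(spectrogram[0])]
last_peak = freqs[np.argmax(spectrogram[-1])]

print("Strongest frequencies in full FFT:", top_two)
print("Peak in first section:", first_peak, "Hz")
print("Peak in last section:", last_peak, "Hz")
```

With a 32-sample window at this sample rate the frequency bins are 8 Hz apart, while the full-length FFT has 0.5 Hz bins but no time axis at all, which is the resolution trade-off in miniature. In practice you would probably reach for a library routine such as `scipy.signal.stft` rather than rolling your own.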