Doing display capture seems like a poor workaround. The suggested method will fail for images of any practical size. A reduced-size text image would need to be generated and padded out to the correct geometry and text position ... and all for a crudely binarized copy of what used to be antialiased text. If you're going to do display capture, at least preserve the antialiasing.
textstring = 'LOOK AT ALL THESE ROCKS';
textcolor = [85 250 118];
inpict = imread('parkavenue.jpg');
hf = figure('units','normalized','position',[0 0 1 1]);
tt = text('units','pixels','position',[fontsize fontsize],'fontunits','pixels', ...
ttpos = round(tt.Extent); close(hf)
ttpos(2) = size(tpict,1)-ttpos(2);
tpict = tpict(ttpos(2)-ttpos(4):ttpos(2),ttpos(1):ttpos(1)+ttpos(3));
tpict = imcomplement(im2double(tpict));
error('text is too big to fit %dpx>%dpx',st(2),s(2))
error('text is too big to fit with given offset (%dpx+%dpx)>%dpx',os(2),st(2),s(2))
tpict = padarray(tpict,os,0,'pre');
tpict = padarray(tpict,s(1:2)-(os+st),0,'post');
cpict = ones(s(1:2)).*permute(textcolor,[1 3 2]);
outpict = double(inpict).*(1-tpict) + double(cpict).*tpict;
outpict = uint8(outpict);
The above code is not unbreakable, particularly due to the guesstimate for the required text matting size. It will also fail if the text block itself is too large to be rendered on screen. At least this demonstrates the concept and preserves the antialiasing because it uses linear blending.
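The padding and blending steps can be tried on their own, without any display capture. What follows is only a minimal sketch: it assumes a synthetic soft-edged bar standing in for the captured antialiased text and a flat gray array standing in for the photo. The variable names mirror the code above, but the sizes, the offset, and the mask values are made up for illustration, and the padding is done with plain indexing instead of padarray().

```matlab
% Sketch only: synthetic stand-ins for the captured text and the photo.
inpict = uint8(100*ones(40,60,3));      % assumed background, 40x60 RGB
textcolor = [85 250 118];               % same color tuple as above
os = [5 10];                            % assumed [row col] offset of the text block

% Fake "antialiased text": a bar with soft edges, values in [0 1]
tpict = repmat([0 0.5 1 1 1 0.5 0],[8 1]);

s = size(inpict);
st = size(tpict);
if any(os+st > s(1:2))
    error('text block does not fit at the given offset')
end

% Pad the small mask out to the image geometry using plain indexing
mask = zeros(s(1:2));
mask(os(1)+(1:st(1)),os(2)+(1:st(2))) = tpict;

% Linear blend: mask=0 keeps the background, mask=1 gives pure text color
cpict = repmat(permute(textcolor,[1 3 2]),s(1:2));
outpict = double(inpict).*(1-mask) + cpict.*mask;
outpict = uint8(outpict);
```

Pixels where the mask is fractional end up as a weighted mix of background and text color, which is what keeps the edges smooth. Thresholding the mask first would throw that information away.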
Is there something else?
I suppose it really depends on what the needs and expectations are. There are a number of text-to-image tools on the File Exchange. The result is a small image of text. Compositing that onto another image should be a simple task. Some may support antialiased output suitable for linear blending like the above example. Others support only binary output, which has different utility and can allow compositing with simple masking.
For example, using textim() from MIMT (binary masking):
textcolor = [250 100 220];
inpict = repmat(imread('coins.png'),[1 1 3]);
tt = textim(textstring,'ibm-vga-16x9');
m(os(1)+(1:st(1)),os(2)+(1:st(2))) = tt;
outpict(repmat(logical(m),[1 1 size(inpict,3)])) = 0;
outpict = outpict + uint8(m.*permute(textcolor,[1 3 2]));
MIMT has other tools that would simplify the positioning/compositing, but this example is generalized except for the textim() call.
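For anyone without MIMT installed, the masking step can be tried with a stand-in. This is a rough sketch, not the textim() workflow itself: it assumes a hand-made logical block in place of the text image, a flat gray input, and an arbitrary offset. It also assigns the color through logical indexing per channel, which is a slightly different route than zeroing the masked region and adding the colored mask.

```matlab
% Sketch only: a synthetic binary mask stands in for the text image.
inpict = uint8(50*ones(20,30));         % assumed single-channel input
inpict = repmat(inpict,[1 1 3]);        % expand to RGB, as above
textcolor = [250 100 220];
os = [4 6];                             % assumed [row col] offset

tt = false(5,7);
tt(2:4,2:6) = true;                     % fake 5x7 binary "glyph"

s = size(inpict);
st = size(tt);
if any(os+st > s(1:2))
    error('text block does not fit at the given offset')
end

% Place the small mask inside a full-size logical mask
m = false(s(1:2));
m(os(1)+(1:st(1)),os(2)+(1:st(2))) = tt;

% Masked pixels take the text color; everything else is left untouched
outpict = inpict;
for c = 1:s(3)
    chan = outpict(:,:,c);
    chan(m) = textcolor(c);
    outpict(:,:,c) = chan;
end
```

Since the mask is strictly binary, every pixel is either pure background or pure text color. There is no blending at the edges.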
What's on the FEX?
MIMT has both textim() and textblock(), which generate compact images of text in legacy hardware fonts (CP437-based).
text2im() by Tobias Kiessling is similar, but supports only a single font (the same default font used by MIMT textim()). It is also CP437-based.
text_to_image by Alec Jacobson is more flexible, but uses ImageMagick (an external dependency) and is consequently slower.
There are also others:
And there are slightly different approaches: