It would be helpful if you could share which textbook theory you are referring too. Directivity is often used interchangeablly with the term gain but the gain referred to by antenna engineers are quite different than the gain referred to array signal processing engineers.
The gain for arrays ignal processing engineers are measured by signal to noise ratio between what we can achieve using an array and what we can acheive using a single element. In this case, when you have 4 elements, the signal to noise ratio gain is always 4 along boresight, which translates to 6 dB. To compue this gain, we can use phased.ArrayGain.
The gain for antenna engineers, or directivity when we assume efficiency is 1, measures how focused the power is radiated between an array and an isotropic antenna. This is what directivity() computes and the result is actualy in dBi, where i stands for isotropic (although I've see dB used in literature and probably also makes it more confusing). The computation of directivity actually consider not only the number of elements, but also how they are placed in the array. Therefore, different geometry could lead to different result. In your case, a ULA is istropic in elevation cut along boresight yet a URA will have a pencil beam around the boresight. So it does look like the URA is more focused than ULA, thus a higher directivity seems reasonable.