Why does Gmail clip messages?
I dealt with this issue while working on two different projects. On the first one, I spent several days figuring out the solution and fixed it. A few years passed, I ran into a similar problem, and having totally forgotten the previous solution, I spent another few days on it before realizing that I had already solved it once. Anyway, this post is primarily a note to myself in case I run into the same problem again, but it may help other people.
PROBLEM
When my application sends email notifications, some of them appear clipped in Gmail. The notifications are formatted as HTML and translated into various languages, but not all translations cause this issue. For example, notifications translated into German, Spanish, Finnish, French, Italian, Portuguese, and Swedish would be clipped, while translations into Arabic, Czech, Hebrew, Dutch, Japanese, Russian, Vietnamese, and Chinese would not (there are more languages in both lists). The notifications were about the same size (very short, just a few sentences) and did not include any embedded media (just a small logo in an image reference tag). I checked multiple articles, and none of the commonly cited causes of message clipping applied to my notifications.
TROUBLESHOOTING
A while back, I had a similar issue caused by a copyright character. I use my own framework to generate email notification files from templates, and one of the third-party libraries in the framework converted HTML entities (such as &copy;) to their Unicode character equivalents (such as ©). At some point I added the capability to convert the Unicode characters back to HTML entities, so it should have taken care of the problem, but I noticed that in the actual message in Gmail, the copyright character was again in Unicode. I will get back to it later, but since all my templates underwent the same process and all contained the same copyright message, and some of them worked fine while others were clipped, it should not have been the issue (I don't know if Gmail fixed it or something else happened, but the Unicode copyright character I see in the email body now does not seem to cause clipping).
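For reference, the "convert Unicode characters back to HTML entities" step can be sketched roughly as below. This is a minimal Python illustration, not the code from my framework; the function name to_html_entities is made up for this example, and it relies only on the standard html.entities table.

```python
from html.entities import codepoint2name


def to_html_entities(text: str) -> str:
    """Replace non-ASCII characters with named or numeric HTML entities."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            out.append(ch)  # plain ASCII passes through unchanged
        elif cp in codepoint2name:
            out.append(f"&{codepoint2name[cp]};")  # named entity, e.g. copy
        else:
            out.append(f"&#{cp};")  # no named entity: use a numeric reference
    return "".join(out)


if __name__ == "__main__":
    print(to_html_entities("© Café"))
```

Note that a body converted this way contains only ASCII characters, so it says nothing by itself about which character set the sender will declare.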
After spending many hours trying to isolate which particular characters cause message clipping, I noticed that the issue mostly affects accented characters. I wondered if the problem was caused by character sets. I downloaded the email messages from Gmail and noticed a discrepancy in the Content-Type header. The translated messages that were not clipped had the content type set to text/html; charset=utf-8, while the clipped messages had it set to text/html; charset=iso-8859-1.
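If you want to run the same check on your own messages, a small script can list the declared character set of every text part in a downloaded message. This is a sketch using Python's standard email package; the function name declared_charsets is hypothetical, and it assumes you saved the original message from Gmail as raw bytes (for example, a .eml file).

```python
from email import message_from_bytes, policy


def declared_charsets(raw: bytes) -> list:
    """Return (content type, charset) pairs for every text part of a message."""
    msg = message_from_bytes(raw, policy=policy.default)
    found = []
    for part in msg.walk():  # yields the message itself and any nested parts
        if part.get_content_maintype() == "text":
            # get_content_charset() returns the lowercased charset or None
            found.append((part.get_content_type(), part.get_content_charset()))
    return found
```

Calling it with the bytes of a saved message (read with open(path, "rb")) shows at a glance whether a given translation went out as utf-8 or iso-8859-1.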
EXPLANATION
Here is what caused Gmail message clipping for me. We use SendGrid to send notifications, and their API (we use C#) does not allow specifying the content type character set. When we used the standard .NET messaging API, specifying the character set was trivial (and we did not have this problem), but for some reason, the SendGrid API does not support it. Instead, it tries to determine the most appropriate character set based on the message text. I do not know why they do it. Maybe it's a way to reduce message size or something else, but the bottom line is that there is no programmatic way to tell the API what character set to use when sending email messages. And I'm not sure about other clients, but Gmail is very particular about the content type it receives and the content of the message, so when it sees the ISO-8859-1 character set in the content type header, but notices accented characters, it clips the message, even if the message holds a single sentence (or word). I filed an issue with the SendGrid C# API about this (and another one for the HTML entities conversion), but something tells me that they will not fix it, so we need to find a solution.
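I do not know the exact logic SendGrid uses to pick the character set, but the split between the two language lists looks consistent with a simple rule: if every character in the body fits into ISO-8859-1, that is what gets declared. The sketch below only illustrates that assumption (it is not SendGrid's code, the function name fits_latin1 is made up, and the sample words are mine rather than text from the actual notifications).

```python
def fits_latin1(text: str) -> bool:
    """Return True if every character of text can be encoded as ISO-8859-1."""
    try:
        text.encode("iso-8859-1")
        return True
    except UnicodeEncodeError:
        return False


# Illustrative samples only, not the real notification text.
samples = {
    "German": "Grüße",
    "French": "Réinitialisé",
    "Czech": "Přihlášení",
    "Russian": "Привет",
}

if __name__ == "__main__":
    for language, word in samples.items():
        print(language, fits_latin1(word))
```

Under this assumption, translations whose accented characters all exist in ISO-8859-1 (German, French, and so on) would be the ones sent with that character set, while any body containing even one character outside it would be sent as UTF-8.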
SOLUTION
I'm not sure if there is a better way to address it, but the solution I picked was to include an invisible Unicode character in all message bodies to force SendGrid to set the content type character set to UTF-8. If you are having the same problem, add a line like this somewhere in the HTML email message body:
<!-- DO NOT REMOVE! KEEP THIS ELEMENT WITH THE ⌀ CHARACTER TO HELP SENDGRID USE UTF-8 ENCODING. --> <span style="color: transparent; user-select: none; font-size: 0; display: none; visibility: hidden;">⌀</span>
I am using an HTML element with an invisible character instead of an HTML comment because some frameworks may remove comments before sending an email message to reduce the message size. So far, it has solved my problem, and I do not see clipping in any of the three dozen translations that we support. Now, the most important part is not to forget to do this in the next project.
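To make this harder to forget, the marker can also be added programmatically right before a message is handed to the sender. Below is a minimal Python sketch of that idea; the names MARKER_CHAR, MARKER, and ensure_utf8_marker are hypothetical, and the approach (inserting the span just before the last closing body tag) is my assumption about a reasonable place for it, not something required by SendGrid or Gmail.

```python
import re

MARKER_CHAR = "⌀"  # the same character used in the snippet above
MARKER = (
    '<span style="color: transparent; user-select: none; font-size: 0; '
    'display: none; visibility: hidden;">' + MARKER_CHAR + "</span>"
)
_BODY_CLOSE = re.compile(r"</body\s*>", re.IGNORECASE)


def ensure_utf8_marker(html: str) -> str:
    """Return html with the hidden marker span added if it is missing."""
    if MARKER_CHAR in html:
        return html  # the marker character is already present
    matches = list(_BODY_CLOSE.finditer(html))
    if not matches:
        return html + MARKER  # no closing body tag: append at the end
    idx = matches[-1].start()  # insert just before the last </body>
    return html[:idx] + MARKER + html[idx:]
```

Running every outgoing body through a helper like this keeps the templates themselves clean and makes the fix independent of whoever edits the translations.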
