I am storing multiple domain names in a database, after converting each domain name to its IDNA version. What I need to know is the maximum length such an IDNA-converted domain name can have, so that I can define the database field's maximum length.
Now, I know that the maximum number of characters in a domain name (including any subdomains) is 255 characters.
Where I got lost:
That sounds easy at first, but... does this mean regular ASCII characters or international characters (think UTF-8 encoding)?
To give you an example: the domain "müller.de" has 9 characters if I ignore that "ü" is an international character that requires more bytes to be represented. The IDNA version of "müller.de" is "xn--mller-kva.de", which has 16 characters. This shows there is definitely a difference in maximum length depending on whether the name is IDNA-converted or not.
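To make the difference concrete, here is a minimal sketch that converts a name and compares both lengths. It assumes Python's built-in "idna" codec (which implements the older IDNA 2003 rules) is an acceptable stand-in for whatever converter is actually used; the variable names are only illustrative.

```python
# Sketch: compare the length of a Unicode domain name with its IDNA (ACE) form.
# Assumption: Python's built-in "idna" codec (IDNA 2003) is used for conversion.
unicode_name = "m\u00fcller.de"  # "müller.de", written with an escape to avoid encoding issues

# encode() returns ASCII bytes; decode them back to a str for display and counting
ace_name = unicode_name.encode("idna").decode("ascii")

print(unicode_name, len(unicode_name))  # müller.de 9
print(ace_name, len(ace_name))          # xn--mller-kva.de 16
```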
Depending on what kind of characters they mean, the 255-character maximum could apply to the international character version, the IDNA-converted version, or both.
And that is where I got a little lost... especially since I have to assume that not all domain names will be sane, and things like "öüßüöäéèê.example.äöüßüöäéèê-äöüßüöäéèê.test.äöüßüöäéèê.com" or even worse would not be surprising.
So, "guessing" and "hoping for the best" isn't an option. I have to know for sure...
The question is:
Given the known fact that the maximum number of characters in a domain name (including any subdomains) is 255 characters... what is the maximum length of an IDNA-converted domain name?
Or do they mean that the IDNA-converted version (Punycode) is also limited to 255 characters (which would mean that domain names with international/Unicode characters actually have shorter limits in their Unicode representation, as their IDNA-converted version has to respect the 255-character limit)?
OK, I think I found the answer myself, and this snippet I found (by searching the web) helped:
There were basically two different options for introducing internationalized domain names (IDN). The first was to make changes to the domain name system (DNS) that would allow Unicode characters to be used directly. It was felt that this was too drastic a step, and so the second option was chosen. This involved producing an algorithm to specify how a Unicode string should be converted into a permitted ASCII domain name. This ACE string (ACE stands for ASCII Compatible Encoding) is then entered into the DNS. The introduction of IDN means that, for the first time, the entry in the DNS is no longer identical to the domain name.
The answer is that the length to respect is the 255-character limit that DNS requires.
My suspicion was correct. With IDN, the domain name and the entry in the DNS are two different things. It is the maximum length of the DNS entry that counts.
The domain name "müller.de" has 9 characters, whereas the corresponding ACE (ASCII Compatible Encoding) string "xn--mller-kva.de" has 16 characters.
It is the ACE string that is used by DNS, and it is the ACE string that falls under the 255-character limit. This means that the maximum length of the Unicode (domain) version depends on the number of Unicode characters used and on whether, after IDNA conversion, the string still fits within the 255-character limit.
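The check described here (convert first, then measure) could be sketched like this. It again assumes Python's built-in "idna" codec; `fits_in_dns` and `MAX_DNS_NAME_LENGTH` are hypothetical names introduced for illustration. The limit is kept as a parameter because 255 is the figure used in this discussion, while the textual form of a name is often quoted as being limited to 253 characters, with each label limited to 63.

```python
# Sketch: test whether a Unicode domain name still fits after IDNA conversion.
# Assumptions: Python's built-in "idna" codec; the helper and constant names
# are hypothetical and not part of any library.
MAX_DNS_NAME_LENGTH = 255  # limit as discussed above; adjust if you prefer 253


def fits_in_dns(unicode_name: str, limit: int = MAX_DNS_NAME_LENGTH) -> bool:
    """Return True if the ACE form of unicode_name is at most `limit` characters."""
    try:
        ace = unicode_name.encode("idna")
    except UnicodeError:
        # The codec rejects empty labels and labels longer than 63 characters.
        return False
    return len(ace) <= limit


# A name that is within 255 characters as Unicode but too long once converted:
long_name = ".".join(["\u00fc" * 20] * 12) + ".com"

print(fits_in_dns("m\u00fcller.de"))
print(len(long_name), fits_in_dns(long_name))
```

The point of the second call is that counting Unicode characters alone is not enough: the conversion adds the "xn--" prefix and the encoded suffix to every non-ASCII label.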
Geez, the specs sure could have been a little clearer on things like this. Especially as international domain names have been around since somewhere around March 1, 2004. But I found the answer, and that's what counts.
Maybe this helps someone who has the same question.
The simple answer regarding my database field length is 255 CHARs.
The fact that I store domains in their IDNA-converted (Punycode/ACE string) version only confirms this maximum character limit.
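For the database side, a rough sketch of such a field might look as follows. It uses an in-memory SQLite database purely for illustration; the table name `domains`, the column name `ace_name`, and the helper `store_domain` are all hypothetical. SQLite does not enforce the declared VARCHAR length, so the sketch adds an explicit CHECK constraint as well.

```python
import sqlite3

# Sketch: store domains in their IDNA-converted (ACE) form in a 255-character field.
# Assumptions: SQLite as the database, Python's built-in "idna" codec, and
# hypothetical table/column/function names.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE domains ("
    " ace_name VARCHAR(255) NOT NULL UNIQUE"
    " CHECK (length(ace_name) <= 255))"
)


def store_domain(unicode_name: str) -> str:
    """Convert a domain name to its ACE form, store it, and return the stored value."""
    # Lower-case the result so that purely ASCII labels are stored consistently.
    ace_name = unicode_name.encode("idna").decode("ascii").lower()
    conn.execute("INSERT INTO domains (ace_name) VALUES (?)", (ace_name,))
    return ace_name


stored = store_domain("m\u00fcller.de")
print(stored)
```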
My understanding is that the 255-character limit is to be considered after the IDNA conversion.
This is because DNS records have this character limit, and in general DNS records can only contain letters, numbers and hyphens (from Wikipedia). The DNS server therefore uses the Punycode version of the IDN for its record, not the Unicode version.