Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: Public Suffix API #676

Open
wants to merge 3 commits into
base: main
Choose a base branch
from
Open

Conversation

mckenfra
Copy link

This formalizes #231 into a concrete proposal.

Verified

This commit was created on github.com and signed with GitHub’s verified signature.
@Rob--W Rob--W requested review from oliverdunk and xeenon August 20, 2024 12:29
Copy link
Member

@oliverdunk oliverdunk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this! I will reach out to the PSL maintainers I have been in contact with to ask them to take a look. I'll also share this internally to get an overall opinion from Chrome.

@mckenfra mckenfra changed the title Add Public Suffix API proposal Proposal: Public Suffix API Aug 23, 2024
meets any of the following criteria:
* Contains a character that is invalid in an Internationalized Domain Name (IDN) - e.g. symbols, whitespace
* Is an IP address - IPv4 or IPv6
* Is a public suffix itself - including the case of it being a single-label suffix not explicitly matched in the PSL

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be good to think about the callers a bit more to determine what they want here.
If we assume the caller is something like isKnownPublicSuffix(...) then they would want a "true" for IPs and things that are already a public suffix but a false for empty labels or the root label. If we assume the caller is something like getHighlightParts(...) then an IP should probably look like a registrable domain.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have updated the API in c217156: getRegistrableDomain() is now a proxy for hasKnownPublicSuffix(), because it returns null in cases where a domain is valid but has an unknown eTLD (i.e. it no longer defaults to assuming a single-label eTLD). So if the promise returned by getRegistrableDomain() fufills with a nonnull value, then it can be inferred that the domain has an eTLD in the PSL.

The API does not currently have a getHighlightParts() method, but we could possibly add it in the Future Work section if we can identify an appropriate use case.

Copy link

@simon-friedberger simon-friedberger Feb 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if I was sufficiently clear here.

My point is there should be an API such that if I want to highlight the different URL parts in the address bar or find out what the current "site" is for applying some setting there needs to be an API that takes.

docs.google.com/... -> google.com - ICANN
foo.github.io -> foo.github.io - PRIVATE
10.0.0.1/login -> 10.0.0.1 - even though this is an IP
avocado.banana -> avocado.banana - even though this is not a valid TLD, because it can be made to resolve locally

On the other hand, if the extension is trying to determine if something is a domain or a search term we probably want
docs.google.com/... -> google.com - ICANN
foo.github.io -> foo.github.io - PRIVATE
10.0.0.1/login -> 10.0.0.1 - even though this is an IP
avocado.banana -> ERROR/searchTerm

I think having different functions to call depending on what people want will probably be a nicer API but just returning a tuple and adding additional information about the result like type:RegistrableDomain|IPAddress|UnknownTLD would also be an option.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I can see how that could be useful. Is it something we could address in a future version of this API?

Copy link

@simon-friedberger simon-friedberger Feb 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't seem like a lot of effort to me so I would say let's make sure to include it in the first version. (Otherwise people will always use the old version because "what if the new version isn't supported yet?")

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you could suggest the API method you have in mind? (I.e. method name, parameter(s), return type)

Copy link
Author

@mckenfra mckenfra Mar 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the current proposal, an error is thrown if the value passed to getRegistrableDomain(value) is invalid, e.g. if it contains invalid characters or is an IP address.

One way of supporting your use case would be to make it possible to determine if the error was thrown due to the value being an IP address. We could add a requirement that the cause property of the error should be set to { code: "IPAddress" }.

Then, my earlier example code to handle your use cases would be updated as follows:

async function getIPAddressOrRegistrableDomain(domain) {
  try {
    return await publicSuffix.getRegistrableDomain(domain);
  } catch (e) {
    if (e.cause && e.cause.code === "IPAddress") {
      return domain; // It's an IP address
    }
    throw e; // Not an IP address, so rethrow the error
  }
}

try {

  // Use case 1: what is the curent site?
  const currentSite = await getIPAddressOrRegistrableDomain(domain) || domain;

  // Use case 2: is this a known domain, or should I search?
  const isDomain = await getIPAddressOrRegistrableDomain(domain) !== null;

} catch (e) {
  // Invalid domain
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems good. I think this function should be part of the API. And calling this function export function hasSameRegistrableDomain(...domains: string[]) : Promise<boolean>; should also allow matching same IPs.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok sure, but we may need to think of a different function name than hasSameRegistrableDomain(), because an IP address is not quite the same as a registrable domain.

Also, the name of this API is publicSuffix, but we are now adding methods that explicitly support IP addresses, which do not really have anything to do with public suffixes. Are we happy to keep the name publicSuffix for the API?

I'm not saying we can't make the changes you're suggesting, I just want to make sure we're ok with the naming of the API and methods.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could make a higher level function like "getSite" and "hasSameSite" which just allows the things that make sense in a browser URL bar, like an IP or we just add an argument "allow IP". I guess the first one is nicer because it allows updating that function if we think of something else that needs to be allowed.

@w3c w3c deleted a comment from happyhelper188 Apr 6, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants