Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SocketsHttpHandler.ConnectionFactory usability with non-DNS endpoints #41149

Open
JamesNK opened this issue Aug 21, 2020 · 17 comments
Open

SocketsHttpHandler.ConnectionFactory usability with non-DNS endpoints #41149

JamesNK opened this issue Aug 21, 2020 · 17 comments
Labels
area-System.Net.Http documentation Documentation bug or enhancement, does not impact product or test code
Milestone

Comments

@JamesNK
Copy link
Member

JamesNK commented Aug 21, 2020

Today it feels like there is some friction when using a SocketsConnectionFactory that needs a non-DNS endpoint. The endpoint passed into SocketsConnectionFactory.ConnectAsync comes from HttpRequestMessage.RequestUri, and that must always be a valid URI and always is turned into a DnsEndPoint.

To specify a custom endpoint you need to implement a custom SocketsConnectionFactory and override the endpoint. One option is to hardcode the endpoint when the factory is created:

public class UnixDomainSocketConnectionFactory : SocketsConnectionFactory
{
    private readonly UnixDomainSocketEndPoint _endPoint;

    public UnixDomainSocketConnectionFactory(UnixDomainSocketEndPoint endPoint) : base(AddressFamily.Unix, SocketType.Stream, ProtocolType.Unspecified)
    {
        _endPoint = endPoint;
    }

    public override ValueTask<Connection> ConnectAsync(EndPoint? endPoint, IConnectionProperties? options = null, CancellationToken cancellationToken = default)
    {
        return base.ConnectAsync(_endPoint, options, cancellationToken);
    }
}

This works, but you're limited to a single endpoint for the SocketsHttpHandler. You can't have different HttpRequestMessage instances be sent to different UDS endpoints.

An alternative could be to put the endpoint in HttpRequestMessage.Option. Something like this (I haven't tried it):

public class UnixDomainSocketConnectionFactory2 : SocketsConnectionFactory
{
    public UnixDomainSocketConnectionFactory2() : base(AddressFamily.Unix, SocketType.Stream, ProtocolType.Unspecified)
    {
    }

    public override ValueTask<Connection> ConnectAsync(EndPoint? endPoint, IConnectionProperties? options = null, CancellationToken cancellationToken = default)
    {
        if (options.TryGet(typeof(HttpRequestMessage), out var request))
        {
            request.Options.TryGetValue(new HttpRequestOptionsKey<EndPoint>(nameof(EndPoint)), out endPoint);
        }

        return base.ConnectAsync(endPoint, options, cancellationToken);
    }
}

Now the endpoint can be different per-request. However I'm not sure what the rules are around deciding whether ConnectAsync should be called. Would the RequestUri need to change if the endpoint specified on HttpRequestMessage.Options changed?


Idea: Have SocketsHttpHandler check for a custom endpoint in HttpRequestMessage.Options automatically. If there is a request with an endpoint in the options then SocketsHttpHandler will use it to determine if a connection already exists for that endpoint, and it will pass that endpoint to SocketsConnectionFactory.ConnectAsync.

Usage:

var socketsHttpHandler = new SocketsHttpHandler
{
    ConnectionFactory = new SocketsConnectionFactory(AddressFamily.Unix, SocketType.Stream, ProtocolType.Unspecified);
};
var httpClient = new HttpClient(socketsHttpHandler);

// Create request and specify UDS endpoint in its options.
// SocketsHttpHandler knows about this option name and will use it to figure out if a connection exists
// and will pass this endpoint to ConnectAsync.
var request = new HttpRequestMessage(HttpMethod.Get, "http://localhost");
var endPointKey = new HttpRequestOptionsKey<EndPoint>(nameof(EndPoint));
request.Options.Set(endPointKey, new UnixDomainSocketEndPoint(SocketPath));

var response = await httpClient.SendAsync(request);
@ghost
Copy link

ghost commented Aug 21, 2020

Tagging subscribers to this area: @dotnet/ncl
See info in area-owners.md if you want to be subscribed.

@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added the untriaged New issue has not been triaged by the area owner label Aug 21, 2020
@geoffkizer
Copy link
Contributor

I would really prefer to avoid mucking around with the logic that associates requests to connection pools, as it is both perf critical and somewhat complex (mostly for corner cases).

It seems easy enough to achieve what you want by just making up hostnames. HttpClient will always use different connections for different hostnames.

@JamesNK
Copy link
Member Author

JamesNK commented Aug 24, 2020

It is simple only if you call one endpoint from a SocketsHttpHandler. Calling multiple non-DNS endpoints requires you to co-ordinate the HttpRequestMessage.RequestUri with the endpoint.

Something like:

_endpointAndUriMapping = new Dictionary<string, UnixDomainSocketEndpoint>();
_endpointAndUriMapping["http://localhost:80"] = new UnixDomainSocketEndpoint(@"c:\one.sock");
_endpointAndUriMapping["http://localhost:81"] = new UnixDomainSocketEndpoint(@"c:\two.sock");
_endpointAndUriMapping["http://localhost:82"] = new UnixDomainSocketEndpoint(@"c:\three.sock");

_socketsHttpHandler = new SocketsHttpHandler
{
    ConnectionFactory = new UnixDomainSocketConnectionFactory2(endpointAndUriMapping);
};
_httpClient = new HttpClient(socketsHttpHandler);

public class UnixDomainSocketConnectionFactory2 : SocketsConnectionFactory
{
    private readonly Dictionary<string, UnixDomainSocketEndpoint> _endpointMapping;

    public UnixDomainSocketConnectionFactory2(Dictionary<string, UnixDomainSocketEndpoint> endpointMapping) : base(AddressFamily.Unix, SocketType.Stream, ProtocolType.Unspecified)
    {
        _endpointMapping = endpointMapping;
    }

    public override ValueTask<Connection> ConnectAsync(EndPoint? endPoint, IConnectionProperties? options = null, CancellationToken cancellationToken = default)
    {
        // Map URL to socket endpoint
        if (options.TryGet(typeof(HttpRequestMessage), out var request))
        {
            endPoint = _endpointMapping[request.RequestUri.OriginalString];
        }

        return base.ConnectAsync(endPoint, options, cancellationToken);
    }
}

public async HttpResponseMessage MakeSocketsRequest(string socketPath)
{
    // Map socket path to URL
    var requestUri = _endpointAndUriMapping.Single(e => e.Value.Path == socketPath).Key;
    
    var response = await httpClient.SendAsync(new HttpRequestMessage(HttpMethod.Get, requestUri));
    return response
}

It is possible, but this feels pretty ugly. You need to understand that the host needs to be unique per connection. You need to track the non-DNS endpoints you are using and ensure each one has a unique host. You need to map between them on each side of a HTTP call (endpoint -> URI when creating HttpRequestMessage, URI -> endpoint inside the connection factory).

It would be nicer if there was a way for the client to do this work for you.

I looked at what you are doing to identify connections and found HttpConnectionKey. Couldn't Host and Port on it be replaced with EndPoint? By default the value would be a DnsEndpoint. If a custom endpoint is specified then it would be checked instead.

internal readonly struct HttpConnectionKey : IEquatable<HttpConnectionKey>
{
public readonly HttpConnectionKind Kind;
public readonly string? Host;
public readonly int Port;
public readonly string? SslHostName; // null if not SSL
public readonly Uri? ProxyUri;
public readonly string Identity;
public HttpConnectionKey(HttpConnectionKind kind, string? host, int port, string? sslHostName, Uri? proxyUri, string identity)
{
Kind = kind;
Host = host;
Port = port;
SslHostName = sslHostName;
ProxyUri = proxyUri;
Identity = identity;
}
// In the common case, SslHostName (when present) is equal to Host. If so, don't include in hash.
public override int GetHashCode() =>
(SslHostName == Host ?
HashCode.Combine(Kind, Host, Port, ProxyUri, Identity) :
HashCode.Combine(Kind, Host, Port, SslHostName, ProxyUri, Identity));
public override bool Equals(object? obj) =>
obj is HttpConnectionKey hck &&
Equals(hck);
public bool Equals(HttpConnectionKey other) =>
Kind == other.Kind &&
Host == other.Host &&
Port == other.Port &&
ProxyUri == other.ProxyUri &&
SslHostName == other.SslHostName &&
Identity == other.Identity;
}

@davidfowl
Copy link
Member

I sent this exact email to @geoffkizer and @scalablecory and I think the HttpRequestMessage options or properties is the most appropriate way to round trip this information from the caller to the transport.

@davidfowl
Copy link
Member

@JamesNK the problem is that you still need to represent the HTTP information when making a request (e.g What is the scheme? http or https)

Kestrel does this jank URI scheme for Kestrel that maybe we can clean up an formalize across HttpClient and server:

(http|https)://{transport scheme}/{url}

HTTPS over TCP - https://localhost:5000/someurl
HTTPS over UDS - https://unix://one.sock/someurl
HTTP over named pipe - http://pipe://namedpipe/someurl

Thoughts? (The nested URL Scheme is... something).

@scalablecory
Copy link
Contributor

My suggestion is to either use HttpRequestMessage.Options or (to avoid allocating the property bag every time) to encode whatever you need within the URI hostname.

@JamesNK
Copy link
Member Author

JamesNK commented Aug 24, 2020

@JamesNK the problem is that you still need to represent the HTTP information when making a request (e.g What is the scheme? http or https)

Kestrel does this jank URI scheme for Kestrel that maybe we can clean up an formalize across HttpClient and server:

(http|https)://{transport scheme}/{url}

HTTPS over TCP - localhost:5000/someurl
HTTPS over UDS - https://unix://one.sock/someurl
HTTP over named pipe - http://pipe://namedpipe/someurl

Thoughts? (The nested URL Scheme is... something).

http/https would still be used to determine whether TLS is used.

I tried squishing socket endpoint information into the URL like what Kestrel does (https://unix://one.sock/someurl), but they're not Uri formats that HttpClient will allow.

@scalablecory scalablecory changed the title SocketsConnectionFactory usability with non-DNS endpoints SocketsHttpHandler.ConnectionFactory usability with non-DNS endpoints Aug 24, 2020
@scalablecory
Copy link
Contributor

Now the endpoint can be different per-request. However I'm not sure what the rules are around deciding whether ConnectAsync should be called. Would the RequestUri need to change if the endpoint specified on HttpRequestMessage.Options changed?

It's keyed on the hostname or SNI. This is another reason to encode things within the hostname.

@halter73
Copy link
Member

FWIW, Kestrel's "jank URI scheme" was inspired by NGINX 😆

proxy_pass http://unix:/tmp/backend.socket:/uri/;

http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_pass

BTW, both NGINX and Kestrel expect <protocol>://unix:/<path> not <protocol>://unix://<path>. Kestrel doesn't support named pipes yet (dotnet/aspnetcore#14207). A custom transport could add support for this, but Kestrel's current address parsing would likely incorrectly treat "pipe" as a hostname rather than special syntax for named pipes. I don't know if NGINX supports named pipes. I doubt it.

Thoughts? (The nested URL Scheme is... something).

Agreed, but it serves its purpose.

@scalablecory scalablecory added documentation Documentation bug or enhancement, does not impact product or test code and removed untriaged New issue has not been triaged by the area owner labels Aug 30, 2020
@scalablecory scalablecory added this to the 5.0.0 milestone Aug 30, 2020
@scalablecory
Copy link
Contributor

I think the API here is adequate, but we should come up with a concrete recommendation and sample for our docs.

@davidfowl
Copy link
Member

davidfowl commented Aug 30, 2020

Maybe for the transports we have now. It forces callers to have a mapping between valid hostnames and their endpoint type.

What happens when there's a mismatch between what host name is supported vs what is valid for the transport endpoint?

@scalablecory
Copy link
Contributor

scalablecory commented Aug 31, 2020

Maybe for the transports we have now. It forces callers to have a mapping between valid hostnames and their endpoint type.

What happens when there's a mismatch between what host name is supported vs what is valid for the transport endpoint?

It's an HTTP client, it's perfectly reasonable to require a valid HTTP URI and to present the resulting DnsEndPoint to the connection factory. So long as there is some mechanism that works for the people to transform these into whatever is valid for their protocol, as have been outlined here, I think we are fine.

If we see a lot of people wanting to do this, or if there is a quantifiable perf issue, they yea, maybe it'll be worth some investment...

@davidfowl
Copy link
Member

@simonferquel I'd love your opinion here. When talking to the docker daemon over named pipes on windows or domain sockets on linux (similar to #28721), what syntax did you envision when making requests over the http client? Also you may have some go expertise here so drawing a comparison on what go does might be interesting.

@simonferquel
Copy link

simonferquel commented Aug 31, 2020

@davidfowl, the approach we have currently in Docker Desktop, is to create one custom HttpHandler per socket path / named pipe path (we have our own HttpHandler implementation that does very basic HTTP 1.0 handling only for now, and btw, we are very eager to be able to move to .net 5 and use this new stuff!).
We use the host-name part of the http URI solely for adding context in our logs (something like having our URIs be like http://gui-to-vm-lifecycle-server/some/endpoint).
But if we had the connection factory here, what I'd love to do is to have a single connection factory that we would initialize like:

var factory = new NamedPipeConnectionFactory(namesMap: new Dictionary<string, string>{
   {"gui-to-vm-lifecycle-server", @"\\.pipe\dockerVMLifecycleServer'},
});

or something similar.
It might be only personal taste, but I really dislike the weird http://<transport-protocol>:<transport params>:<path> thing. IMO it makes URIs completely unreadable with namedpipe/unix socket paths mixed with endpoint paths. For some cases where our unix socket paths are quite long, it would be very unpractical).

@simonferquel
Copy link

Also for the "why gui-to-vm-lifecycle-server and not lifecycle-server only" it is very handy for logging purposes on the server side, as this is automatically passed into the host header. So both on the client and server side, our log entries contain context about both caller and callee very easily (and that is something that we can generalize very easily cross-language)

@karelz karelz modified the milestones: 5.0.0, 6.0.0 Sep 1, 2020
@simonferquel
Copy link

@karelz I just saw the change of milestone on this issue. Does it mean we'll have to wait for 6.0.0 to see the integration of HttpClient / Connection Abstractions ? Or is it just this issue specifically ?

@scalablecory
Copy link
Contributor

@simonferquel we're implementing a callback-based approach in 5.0 to unblock people. The full Connections API will be revisited for 6.0.

@karelz karelz modified the milestones: 6.0.0, Future May 4, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-System.Net.Http documentation Documentation bug or enhancement, does not impact product or test code
Projects
None yet
Development

No branches or pull requests

8 participants