This pitfall burned me badly enough that I won't forget it. First, here is what the async methods of System.Net.Dns in corefx really look like. GetHostAddressesAsync() is built on the same FromAsync wrapper as the GetHostEntryAsync() shown here:
public static Task<IPHostEntry> GetHostEntryAsync(IPAddress address)
{
    NameResolutionPal.EnsureSocketsAreInitialized();
    return Task<IPHostEntry>.Factory.FromAsync(
        (arg, requestCallback, stateObject) => BeginGetHostEntry(arg, requestCallback, stateObject),
        asyncResult => EndGetHostEntry(asyncResult),
        address,
        null);
}
Next, here is what falling into the pit looks like on Linux and on Windows.
Linux:
Microsoft.AspNetCore.Server.Kestrel.Internal.Networking.UvException: Error -24 EMFILE too many open files
Windows: the process grows to about 13,000 threads.
The code that triggers the problem:
Task<IPAddress[]> task = System.Net.Dns.GetHostAddressesAsync(host);
task.Wait(5000);
var addresses = task.Result;
This code is called from a constructor, so it has to block synchronously; it cannot use await.
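One detail of the snippet above is worth spelling out: Task.Wait(int) returns false when the timeout elapses, and reading task.Result afterwards blocks again with no timeout at all, so the 5000 ms limit does not actually bound the call. A minimal sketch of honoring the timeout follows. BoundedWait and TryGetResult are hypothetical names introduced here for illustration, and this only bounds the wait; it does not address the underlying hang.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper for illustration; not part of the original code.
public static class BoundedWait
{
    // Returns true and the task's result if the task completes within timeoutMs.
    // Returns false on timeout, without ever touching task.Result.
    // If the task faults, Wait rethrows the failure as an AggregateException.
    public static bool TryGetResult<T>(Task<T> task, int timeoutMs, out T result)
    {
        if (task.Wait(timeoutMs))
        {
            // The task has already completed, so Result returns immediately.
            result = task.Result;
            return true;
        }

        result = default(T);
        return false;
    }
}
```

A caller would pass System.Net.Dns.GetHostAddressesAsync(host) as the task and decide what to do (fail fast, use a fallback address) when the helper returns false.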
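Conditions for hitting the problem: it only shows up once a certain number of requests arrive concurrently; with only a handful of requests it does not appear. So if, when deploying, we take the server out of the load balancer, kill the process, update the application, hit it once locally (which completes the host resolution), and then put it back behind the load balancer, the problem does not occur. If we kill the asp.net core process without taking the server out of the load balancer first, the newly started process runs into the problem.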
Now for the attempts at a fix.
1) Following "Synchronously waiting for an async operation, and why does Wait() freeze the program here", change the code above to:
var task = Task.Run(async () =>
{
    return await System.Net.Dns.GetHostAddressesAsync(host);
});
task.Wait(5000);
var addresses = task.Result;
The deadlock is still there.
2) Following the implementation in System.Data.SqlClient:
private static async Task<Socket> ConnectAsync(string serverName, int port)
{
    if (RuntimeInformation.IsOSPlatform(OSPlatform.Windows))
    {
        var socket = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
        await socket.ConnectAsync(serverName, port).ConfigureAwait(false);
        return socket;
    }

    // On unix we can't use the instance Socket methods that take multiple endpoints
    IPAddress[] addresses = await Dns.GetHostAddressesAsync(serverName).ConfigureAwait(false);
    return await ConnectAsync(addresses, port).ConfigureAwait(false);
}
(Note: on Windows, SqlClient does not call Dns.GetHostAddressesAsync.)
Wrap Dns.GetHostAddressesAsync in an async/await proxy method:
private static async Task<IPAddress[]> GetHostAddressesAsyncProxy(string host)
{
    return await System.Net.Dns.GetHostAddressesAsync(host);
}
The deadlock is still there.
3) Modify the System.Net.Dns source code, changing the asynchronous method
public static Task<IPAddress[]> GetHostAddressesAsync(string hostNameOrAddress)
{
    NameResolutionPal.EnsureSocketsAreInitialized();
    return Task<IPAddress[]>.Factory.FromAsync(
        (arg, requestCallback, stateObject) => BeginGetHostAddresses(arg, requestCallback, stateObject),
        asyncResult => EndGetHostAddresses(asyncResult),
        hostNameOrAddress,
        null);
}
into a synchronous one that returns an already completed task:
public static Task<IPAddress[]> GetHostAddressesAsync(string hostNameOrAddress)
{
    NameResolutionPal.EnsureSocketsAreInitialized();
    return Task.FromResult<IPAddress[]>(GetHostEntry(hostNameOrAddress).AddressList);
}
Problem solved!
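The reason the patched version cannot hang in Wait() is that Task.FromResult hands back a task that is already completed: the lookup runs on the calling thread, and nothing remains pending afterwards. The same idea can be sketched outside of corefx, assuming a target where the synchronous Dns.GetHostAddresses is available (for example .NET Framework or netstandard 2.0). SyncResolver and GetHostAddressesCompleted are hypothetical names, not part of System.Net.Dns.

```csharp
using System.Net;
using System.Threading.Tasks;

// Hypothetical wrapper for illustration; not part of System.Net.Dns.
public static class SyncResolver
{
    // Resolves the host on the calling thread, then wraps the finished result
    // in a completed task so that existing Task-based callers keep working.
    public static Task<IPAddress[]> GetHostAddressesCompleted(string hostNameOrAddress)
    {
        // Assumption: the synchronous API exists on the target framework.
        IPAddress[] addresses = Dns.GetHostAddresses(hostNameOrAddress);
        return Task.FromResult(addresses);
    }
}
```

Because the returned task is complete by the time the caller sees it, task.Wait(5000) returns true right away and task.Result does not block. The trade-off is that the caller's thread is occupied for the full duration of the lookup.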
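This shows that the deadlock really is caused by synchronously calling an asynchronous method in the constructor. At the moment System.Net.NameResolution offers only asynchronous APIs for host name resolution. The GetHostEntry() used above is synchronous, but it is only available for netstandard 2.0, while the System.Net.NameResolution package currently on nuget.org only goes up to netstandard 1.3.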
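Where the type's callers can be changed, another way to avoid blocking in a constructor is to move the lookup into an asynchronous factory method, so that the constructor itself only stores data. The following is a rough sketch under that assumption, not something the code in this post actually does. HostClient and CreateAsync are hypothetical names.

```csharp
using System.Net;
using System.Threading.Tasks;

// Hypothetical type for illustration; the real class is not shown in this post.
public sealed class HostClient
{
    public IPAddress[] Addresses { get; }

    // The constructor only stores already resolved data, so it never blocks.
    private HostClient(IPAddress[] addresses)
    {
        Addresses = addresses;
    }

    // Callers await the factory instead of invoking a constructor that waits on a task.
    public static async Task<HostClient> CreateAsync(string host)
    {
        IPAddress[] addresses =
            await Dns.GetHostAddressesAsync(host).ConfigureAwait(false);
        return new HostClient(addresses);
    }
}
```

This only helps if every call site can itself be asynchronous (var client = await HostClient.CreateAsync(host);). If the object is created by a container that requires a plain constructor, the approach does not apply directly.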