如果出现这样的提示,说明IP已经被拉黑了。
那么即使不是恶意的访问(几秒一次不算吧),也得上代理。
//创建无Chrome无头参数 ChromeOptions options=new ChromeOptions(); //chromeOptions.addArguments("-headless"); String proxyServer = "93.170.6.26:8080"; // proxy Proxy proxy = new Proxy().setHttpProxy(proxyServer).setSslProxy(proxyServer); options.setProxy(proxy); WebDriver wd = new ChromeDriver(options);好了,又可以愉快的访问了。
如果要对Chrome设置参数,比如下载路径。或者不下载图片。可以自定义参数
我自己的百度的感受就是,基本上搜索结果里面Python的占了主流。Java不是做爬虫的好方式。
还有个问题,如果发现代理IP失效了,怎么动态替换呢?
比如:Caused by: java.net.ConnectException: Failed to connect to localhost/0:0:0:0:0:0:0:1:6012 at okhttp3.internal.connection.RealConnection.connectSocket(RealConnection.java:247) at okhttp3.internal.connection.RealConnection.connect(RealConnection.java:165) at okhttp3.internal.connection.StreamAllocation.findConnection(StreamAllocation.java:257) at okhttp3.internal.connection.StreamAllocation.findHealthyConnection(StreamAllocation.java:135) at okhttp3.internal.connection.StreamAllocation.newStream(StreamAllocation.java:114) at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:42) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121) at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:93) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121) at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:126) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121) at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:200) at okhttp3.RealCall.execute(RealCall.java:77) at org.openqa.selenium.remote.internal.OkHttpClient.execute(OkHttpClient.java:105) at org.openqa.selenium.remote.HttpCommandExecutor.execute(HttpCommandExecutor.java:155) at org.openqa.selenium.remote.service.DriverCommandExecutor.execute(DriverCommandExecutor.java:83) ... 34 more Caused by: java.net.ConnectException: Connection refused at java.net.PlainSocketImpl.socketConnect(Native Method) at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)