Passing Jsoup a URL that redirects to a URL containing spaces leads to an
error. How can I resolve this?
Hello, I have to parse pages whose URI is resolved by a server redirect.
Example:
I have
http://www.juventus.com/wps/poc?uri=wcm:oid:91da6dbb-4089-49c0-a1df-3a56671b7020
which redirects to
http://www.juventus.com/wps/wcm/connect/JUVECOM-IT/news/primavera%20convocati%20villar%20news%2010agosto2013?pragma=no-cache
This is the URI of the page that I have to parse. The problem is that the
redirect URI contains spaces. Here's the code:
String url = "http://www.juventus.com/wps/poc?uri=wcm:oid:91da6dbb-4089-49c0-a1df-3a56671b7020";
Document doc = Jsoup.connect(url).get();
Element img = doc.select(".juveShareImage").first();
String imgurl = img.absUrl("src");
System.out.println(imgurl);
I get this error at the second line:
Exception in thread "main" org.jsoup.HttpStatusException: HTTP error fetching URL. Status=404, URL=http://www.juventus.com/wps/wcm/connect/JUVECOM-IT/news/primavera convocati villar news 10agosto2013?pragma=no-cache
The error message contains the redirected URL, which means that Jsoup does
follow the redirect to the correct URI. Is there a way to replace each ' '
with %20 so that I can parse the page without problems?
Thanks!
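One possible approach, sketched here under assumptions rather than tested against this site, is to stop Jsoup from following the redirect automatically, read the Location header yourself, encode the spaces, and then request the fixed URL. The helper below covers only the string handling and uses nothing but the JDK. The class name RedirectUrlFixer and its two methods are hypothetical names chosen for illustration; they are not part of Jsoup.

```java
import java.net.URI;

// Hypothetical helper (not part of Jsoup): repairs a redirect target
// that contains literal spaces so it can be requested again.
class RedirectUrlFixer {

    // Replaces each literal space with %20; everything else is left untouched.
    static String encodeSpaces(String rawUrl) {
        return rawUrl.trim().replace(" ", "%20");
    }

    // Encodes the spaces in a Location header value and resolves it against
    // the URL that produced it, in case the server sent a relative path.
    static String resolveLocation(String requestUrl, String location) {
        URI base = URI.create(requestUrl);
        return base.resolve(URI.create(encodeSpaces(location))).toString();
    }

    public static void main(String[] args) {
        String requestUrl =
            "http://www.juventus.com/wps/poc?uri=wcm:oid:91da6dbb-4089-49c0-a1df-3a56671b7020";
        // The redirect target as it appears in the exception message above.
        String location =
            "http://www.juventus.com/wps/wcm/connect/JUVECOM-IT/news/"
            + "primavera convocati villar news 10agosto2013?pragma=no-cache";
        System.out.println(resolveLocation(requestUrl, location));
    }
}
```

With Jsoup, the Location value would presumably come from Jsoup.connect(url).followRedirects(false).execute().header("Location"), and the string returned by resolveLocation could then be passed to Jsoup.connect(...).get() in place of the original URL. Whether the server returns the page for the encoded URL is an assumption based on the %20 form shown in the question.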