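  // Needs java.io.*, java.net.* and java.util.regex.* in scope.

  // Loads a page by address; "http://" is assumed when no scheme is given.
  public static String loadPage(String url) throws IOException {
    if (!url.contains("://"))
      url = "http://" + url;
    return loadPage(new URL(url));
  }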
  
  // Opens a connection to the URL and returns the page text.
  public static String loadPage(URL url) throws IOException {
    System.out.println("Loading: " + url.toExternalForm());
    URLConnection con = url.openConnection();
    return loadPage(con, url);
  }

  // Reads the whole response body, decoded with the charset guessed from the Content-Type header.
  public static String loadPage(URLConnection con, URL url) throws IOException {
    String contentType = con.getContentType();
    if (contentType == null)
      throw new IOException("Page could not be read: " + url);
    //Log.info("Content-Type: " + contentType);
    String charset = loadPage_guessCharset(contentType);
    // try-with-resources closes the reader (and the underlying stream) even if reading fails.
    try (Reader r = new InputStreamReader(con.getInputStream(), charset)) {
      StringBuilder buf = new StringBuilder();
      char[] chunk = new char[4096];
      int n;
      // Read in chunks instead of one char per call.
      while ((n = r.read(chunk)) >= 0) {
        //Log.info("Chars read: " + buf.length());
        buf.append(chunk, 0, n);
      }
      return buf.toString();
    }
  }
  
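  static String loadPage_guessCharset(String contentType) {
    // Whitespace after ';' is optional, matching is case-insensitive, and quotes around the charset name are dropped.
    Pattern p = Pattern.compile("text/html;\\s*charset=\"?([^\\s;\"]+)\"?.*", Pattern.CASE_INSENSITIVE);
    Matcher m = p.matcher(contentType);
    /* If the Content-Type doesn't match this pattern, fall back to the default and hope for the best. */
    return m.matches() ? m.group(1) : "ISO-8859-1";
  }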
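
Usage sketch (not part of the snippet): the charset-then-decode flow of loadPage can be tried offline by feeding it an in-memory stream instead of a URLConnection. The class name LoadPageSketch and the helpers guessCharset and readAll are hypothetical stand-ins written for this example; guessCharset uses a simplified, lenient pattern and is an assumption, not the snippet's exact regex.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LoadPageSketch {
  // Hypothetical stand-in for loadPage_guessCharset: picks the charset
  // parameter out of a Content-Type value, defaulting to ISO-8859-1.
  static String guessCharset(String contentType) {
    Pattern p = Pattern.compile("charset=\"?([^\\s;\"]+)", Pattern.CASE_INSENSITIVE);
    Matcher m = p.matcher(contentType);
    return m.find() ? m.group(1) : "ISO-8859-1";
  }

  // Hypothetical helper: decodes an entire stream with the given charset,
  // closing the reader when done.
  static String readAll(InputStream in, String charset) throws IOException {
    try (Reader r = new InputStreamReader(in, charset)) {
      StringBuilder buf = new StringBuilder();
      char[] chunk = new char[4096];
      int n;
      while ((n = r.read(chunk)) >= 0)
        buf.append(chunk, 0, n);
      return buf.toString();
    }
  }

  public static void main(String[] args) throws IOException {
    // Pretend these came from a server: a header value and a UTF-8 body.
    String contentType = "text/html; charset=UTF-8";
    byte[] body = "<p>Gr\u00fc\u00dfe</p>".getBytes(StandardCharsets.UTF_8);

    String charset = guessCharset(contentType);
    String page = readAll(new ByteArrayInputStream(body), charset);
    System.out.println(charset + ": " + page.length() + " chars");
  }
}
```

Decoding the same bytes with the ISO-8859-1 fallback would yield a longer, garbled string, which is why the header is checked before the reader is created.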
Began life as a copy of #2000483
Snippet is not live.
| Snippet ID: | #2000484 | 
| Snippet name: | loadPage | 
| Eternal ID of this version: | #2000484/1 | 
| Text MD5: | 464b3fff5dfdf23376e0b2ca029a6f32 | 
| Author: | stefan | 
| Category: | |
| Type: | New Tinybrain snippet | 
| Public (visible to everyone): | Yes | 
| Archived (hidden from active list): | No | 
| Created/modified: | 2015-08-02 14:56:51 | 
| Source code size: | 1369 bytes / 36 lines | 
| Pitched / IR pitched: | No / Yes | 
| Views / Downloads: | 875 / 2665 | 