BotCompany Repo | #1005070 // Try scraping medium.com - fixing

JavaX source code [tags: use-pretranspiled] - run with: x30.jar

Libraryless. A Pure Java version is available (2534L/17K/56K).

!752

p {
  S username = "stefanreich";
  Map<S, O> data = loadMediumJSON(username);
  printStructureLines(data);
  
  // Profile fields. pcall logs any exception and lets the program continue.
  pcall {
    Map user = (Map) data.get("user");
    print("Name: " + user.get("name"));
    print("Bio: " + user.get("bio"));
    print("Image ID: " + user.get("imageid"));
    // Could do more here, like Twitter name
  }
  
  // List each post with its title and a link guessed from its uniqueSlug.
  pcall {
    Map references = (Map) data.get("references");
    psl("References: ", references);
    Map<S, Map> posts = (Map) references.get("Post");
    psl("Posts: ", posts);
    for (S postID : keys(posts)) {
      print("Post " + postID);
      Map post = posts.get(postID);
      psl("Post: ", post);
      
      S uniqueSlug = getString(post, "uniqueSlug");
      S link = "https://medium.com/@" + username + "/" + uniqueSlug;
      S title = getString(post, "title");
      print("Title: " + title);
      print("Probable link: " + link);
      //psl(post);
      print();
    }
  }
}

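// Set to true to dump the raw HTML for debugging.
sbool loadMediumJSON_verbose;

// Loads a user's Medium profile page and decodes the JSON passed to window["obvInit"](...).
// Returns null if that call is not found in the page.
static Map<S, O> loadMediumJSON(S username) {
  S html = loadPageWithUserAgent("https://medium.com/@" + username, "Mac Safari");
  if (loadMediumJSON_verbose) print(html);
  
  S prefix = [[window["obvInit"](]];
  int i = indexOf(html, prefix);
  if (i < 0) ret null; // marker not found
  int j = indexOf(html, "\n", i);
  if (j < 0) ret null; // no end of line after the marker
  // The JSON runs from just after the prefix to the end of that line, minus the closing ")".
  S json = dropSuffix(")", trim(substring(html, i+l(prefix), j)));
  ret jsonDecodeMap(json);
}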
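
For readers without the JavaX helpers, the extraction step in loadMediumJSON can be sketched in plain Java roughly as follows. This is only an illustration under assumptions: the class name MediumJsonExtract, the method name extractJson, and the sample HTML are made up here, and the sketch assumes (as the snippet does) that the page contains a window["obvInit"]( call whose argument sits on a single line. It leaves out the page download and the JSON decoding.

```java
class MediumJsonExtract {
    // Marker taken from the snippet above; whether a given page contains it is an assumption.
    static final String PREFIX = "window[\"obvInit\"](";

    /** Returns the text between PREFIX and the end of that line, without a trailing ")", or null. */
    static String extractJson(String html) {
        int i = html.indexOf(PREFIX);
        if (i < 0) return null;          // marker not found
        int j = html.indexOf('\n', i);
        if (j < 0) return null;          // no end of line after the marker
        String s = html.substring(i + PREFIX.length(), j).trim();
        // Drop the closing parenthesis of the obvInit(...) call if present.
        return s.endsWith(")") ? s.substring(0, s.length() - 1) : s;
    }

    public static void main(String[] args) {
        // Hypothetical sample input, not real Medium output.
        String html = "<script>\nwindow[\"obvInit\"]({\"user\":{\"name\":\"Example\"}})\n</script>";
        System.out.println(extractJson(html));
    }
}
```

Returning null when the marker is missing mirrors the snippet's behavior, so a caller has to check for null before reading fields from the result.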

Author comment

Began life as a copy of #1004101

Snippet ID: #1005070
Snippet name: Try scraping medium.com - fixing
Eternal ID of this version: #1005070/1
Text MD5: 2aaba4723f055b121c24afb345b9150c
Transpilation MD5: 39b64bf31a6b959fb74618f5c9ea3d42
Author: stefan
Category: javax / networking
Type: JavaX source code
Public (visible to everyone): Yes
Archived (hidden from active list): No
Created/modified: 2016-10-04 21:59:11
Source code size: 1431 bytes / 50 lines
Pitched / IR pitched: No / No
Views / Downloads: 489 / 529