[help]httpwebrequest/HtmlAgilityPack html parse?


05-02-2014, 01:29 PM #1
ѕα3єкα
Junior Member
**
Posts: 37 Threads:12 Joined: Dec 2012 Reputation: 0

[help]httpwebrequest/HtmlAgilityPack html parse?
I want to grab text info from the page below into a ListView, in columns.

e.g.
first column = company name
second column = posting date
third column = post name/detail

Code:
http://jobsearch.naukri.com/jobs-in-delhi



I tried HtmlAgilityPack but don't know how to go about the rest...
Code:
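'needs Imports System.Net at the top of the file
        'create a web request (WebRequest.Create returns a WebRequest, so cast it)
        Dim wreq As HttpWebRequest = DirectCast(WebRequest.Create("http://jobsearch.naukri.com/jobs-in-delhi"), HttpWebRequest)

        'set the agent to mimic a browser
        wreq.UserAgent = "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.5) Gecko/20091102 Firefox/3.5.5"

        'how you're getting the page (HTTP method names are upper case)
        wreq.Method = "GET"

        'create the html doc (HtmlWeb is not needed when the request is made by hand)
        Dim document As New HtmlAgilityPack.HtmlDocument

        'start a response and make sure it gets closed afterwards
        Using res As HttpWebResponse = DirectCast(wreq.GetResponse(), HttpWebResponse)
            'get a stream from the response and load the html from it
            Using stream As System.IO.Stream = res.GetResponseStream()
                document.Load(stream, True)
            End Using
        End Using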
This post was last modified: 05-02-2014, 01:32 PM by ѕα3єкα.

05-02-2014, 01:52 PM #2
Florin
Junior Member
Team Reboot
Posts: 456 Threads:71 Joined: Dec 2011 Reputation: 14

RE: [help]httpwebrequest/HtmlAgilityPack html parse?
Maybe this is your start: http://static.naukimg.com/s/1/109/j/srp_...4022014.js

There are interesting functions on that website. It uses JS to make an AJAX request. Maybe you need to find the parameters and make the AJAX request with .NET.

Code:
//AJAX FUNCTIONS START
function makeRequest(url,callBack,datatype,params,method)
{
  var locdomain=window.location.hostname+"";
  if(locdomain.indexOf("corp.naukri.com")!=-1)
    url=url.replace("jobsearch","corp");
  if(method)
  {
    $n(document).ajaxReq({
      type: method?method:'GET',
      url : url,
      datatype : datatype,
      data :  params,
      success : function callB(resp){
        if(callBack)
          callBack(resp);
      },
      error : function(){
        //alert('Custom Error Message');
      }
    });
  }
  else{
    $n(document).ajaxReq({
      url : url,
      datatype : datatype,
      data :  params,
      success : function callB(resp){
        if(callBack)
          callBack(resp);
      },
      error : function(){
        //alert('Custom Error Message');
      }
    });
  }
}

Or try to parse that page source.
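
For what it's worth, the .NET side of such a request could look roughly like the sketch below. The parameter names ("location", "page") and the X-Requested-With header are assumptions made up for illustration; the real ones would have to be read out of the site's script or the browser's network tab. BuildQuery and GetText are hypothetical helper names, and Main only prints the query string so nothing is sent over the network.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Linq
Imports System.Net

Module AjaxSketch
    ' Builds "key=value&key=value" with each part URL-escaped.
    Function BuildQuery(query As IDictionary(Of String, String)) As String
        Return String.Join("&", query.Select(
            Function(p) Uri.EscapeDataString(p.Key) & "=" & Uri.EscapeDataString(p.Value)))
    End Function

    ' Sends a GET roughly the way the page's makeRequest() would and returns the body.
    Function GetText(url As String, query As IDictionary(Of String, String)) As String
        Dim req = DirectCast(WebRequest.Create(url & "?" & BuildQuery(query)), HttpWebRequest)
        req.Method = "GET"
        ' Many AJAX endpoints look for this header; whether this site does is an assumption.
        req.Headers.Add("X-Requested-With", "XMLHttpRequest")
        Using res = DirectCast(req.GetResponse(), HttpWebResponse)
            Using reader As New StreamReader(res.GetResponseStream())
                Return reader.ReadToEnd()
            End Using
        End Using
    End Function

    Sub Main()
        ' Hypothetical parameters, for illustration only.
        Dim query As New Dictionary(Of String, String) From {{"location", "delhi"}, {"page", "1"}}
        Console.WriteLine(BuildQuery(query))
    End Sub
End Module
```

Once the real endpoint and parameters are known, GetText(endpoint, query) would return whatever the server sends back (HTML or JSON), which can then be parsed like any other string.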

05-02-2014, 02:50 PM #3
ѕα3єкα
Junior Member
**
Posts: 37 Threads:12 Joined: Dec 2012 Reputation: 0

RE: [help]httpwebrequest/HtmlAgilityPack html parse?
I have no clue about this code.

Can I do this in VB.NET?

05-02-2014, 09:41 PM #4
AceInfinity
Developer
*******
Administrators
Posts: 9,733 Threads:1,026 Joined: Jun 2011 Reputation: 76

RE: [help]httpwebrequest/HtmlAgilityPack html parse?
Yeah, you'll have to write the .NET equivalent there, since there's no really elegant way to execute native JS code from .NET without using some kind of 3rd party library... I've seen some interesting stuff in C# for executing JS code, but it's not really practical. Can you provide an exact example of what information you're trying to parse? Where that information is located is not very clear; I can only make a good guess.

Here's a single job entry for instance:
Code:
<div class="jRes">
  <a id="1" target="_blank" title="View &amp; Apply" href="http://jobsearch.naukri.com/job-listings-BPO-Call-Center-Executive-fresher-Sec-6-Noida-DELITES-SERVICES-PRIVATE-LIMITED-Noida-0-to-1-030514000064?xz=1_0_69&amp;xo=&amp;xp=1&amp;xid=139908449559910600&amp;qjt=&amp;qp=&amp;id=&amp;f=-030514000064" class="l_j aK">
    <strong>
      <b>
      </b>
      BPO
      <b>
      </b>
      /
      <b>
      </b>
      Call
      <b>
      </b>
      
      <b>
      </b>
      Center
      <b>
      </b>
      
      <b>
      </b>
      Executive
      <b>
      </b>
      (
      <b>
      </b>
      fresher
      <b>
      </b>
      ) -
      <b>
      </b>
      Sec
      <b>
      </b>
      -
      <b>
      </b>
      6
      <b>
      </b>
      ,
      <b>
      </b>
      Noida
      <b>
      </b>
      <span>
        (0-1 yrs.)
      </span>
    </strong>
    <dfn>
      <span class="" title="">
      </span>
      DELITES SERVICES PRIVATE LIMITED
      <i>
        hiring for
      </i>
      B-124, 2nd &amp; 3rd Floor, Sector-6, Noida
    </dfn>
    <i>
      Noida
    </i>
    <em>
      ~~
      <b>
      </b>
      5
      <b>
      </b>
      
      <b>
      </b>
      Days
      <b>
      </b>
      
      <b>
      </b>
      Shift
      <b>
      </b>
      (
      <b>
      </b>
      Sat
      <b>
      </b>
      &amp;
      <b>
      </b>
      Sunday
      <b>
      </b>
      
      <b>
      </b>
      Fixed
      <b>
      </b>
      
      <b>
      </b>
      Off
      <b>
      </b>
      )
      ~~
      <b>
      </b>
      NO
      <b>
      </b>
      
      <b>
      </b>
      TARGET
      <b>
      </b>
      |
      <b>
      </b>
      NO
      <b>
      </b>
      
      <b>
      </b>
      SALES
      <b>
      </b>
      |
      <b>
      </b>
      NO
      <b>
      </b>
      
      <b>
      </b>
      COLLECTION
      <b>
      </b>
      
      ~~
      <b>
      </b>
      Very
      <b>
      </b>
      ...
    </em>
    <em>
      <span class="f12">
        Keyskills:
      </span>
      
      <b>
      </b>
      CCE
      <b>
      </b>
      ,
      <b>
      </b>
      Customer
      <b>
      </b>
      
      <b>
      </b>
      Service
      <b>
      </b>
      ,
      <b>
      </b>
      Customer
      <b>
      </b>
      
      <b>
      </b>
      Service
      <b>
      </b>
      
      <b>
      </b>
      Executive
      <b>
      </b>
      ,
      <b>
      </b>
      CSA
      <b>
      </b>
      ,
      <b>
      </b>
      CSE
      <b>
      </b>
      ,
      <b>
      </b>
      BPO
      <b>
      </b>
      ...
    </em>
  </a>
  <div id="actRow030514000064" class="actRow">
    <script>
      LnK('030514000064','Just Now','','Mr. Deepak Singh','');
    </script>
    <div class="fl f11">
      
      <a id="030514000064" class="l_vs">
        View Similar Jobs
      </a>
    </div>
    <span class="pbD f11">
      Posted by&nbsp;Mr. Deepak Singh, Just Now
    </span>
  </div>
</div>

There are HTML entities in there; this is where you'll use HttpUtility.HtmlDecode() to get more human-readable plain text.
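
As a rough sketch of how one entry like the above could be turned into the three columns: the markup in the string below is a trimmed-down copy of the sample, and the XPath expressions are guesses based on that one sample only, so they may well need adjusting against the live page. WebUtility.HtmlDecode (System.Net, .NET 4 and later) is swapped in for HttpUtility.HtmlDecode here so that no System.Web reference is needed. Clean and TextOf are hypothetical helper names.

```vbnet
Imports System
Imports System.Linq
Imports System.Net
Imports System.Text.RegularExpressions
Imports HtmlAgilityPack

Module JobEntrySketch
    ' Decodes entities (&amp; becomes &) and collapses runs of whitespace into one space.
    Function Clean(raw As String) As String
        Return Regex.Replace(WebUtility.HtmlDecode(raw), "\s+", " ").Trim()
    End Function

    ' Cleaned text of a node, or an empty string when the node was not found.
    Function TextOf(node As HtmlNode) As String
        Return If(node Is Nothing, "", Clean(node.InnerText))
    End Function

    Sub Main()
        ' Trimmed-down copy of the sample entry; the real page has many of these.
        Dim html As String =
            "<div class=""jRes""><a href=""#""><strong>BPO / Call Center Executive <span>(0-1 yrs.)</span></strong>" &
            "<dfn><span></span>DELITES SERVICES PRIVATE LIMITED <i>hiring for</i> B-124, 2nd &amp; 3rd Floor</dfn></a>" &
            "<span class=""pbD f11"">Posted by&nbsp;Mr. Deepak Singh, Just Now</span></div>"

        Dim doc As New HtmlDocument()
        doc.LoadHtml(html)

        Dim entries = doc.DocumentNode.SelectNodes("//div[@class='jRes']")
        If entries Is Nothing Then Return ' SelectNodes gives Nothing when nothing matches

        For Each entry As HtmlNode In entries
            Dim title As String = TextOf(entry.SelectSingleNode(".//a/strong"))
            Dim posted As String = TextOf(entry.SelectSingleNode(".//span[contains(@class,'pbD')]"))

            ' The company name is assumed to be the first non-empty text directly inside <dfn>.
            Dim company As String = ""
            Dim dfn As HtmlNode = entry.SelectSingleNode(".//dfn")
            If dfn IsNot Nothing Then
                company = dfn.ChildNodes.
                    Where(Function(n) n.NodeType = HtmlNodeType.Text).
                    Select(Function(n) Clean(n.InnerText)).
                    FirstOrDefault(Function(s) s.Length > 0)
            End If

            Console.WriteLine(company & " | " & posted & " | " & title)
        Next
    End Sub
End Module
```

In a WinForms project the Console.WriteLine line could be replaced by something like ListView1.Items.Add(New ListViewItem(New String() {company, posted, title})), where ListView1 is a hypothetical control in Details view with three columns.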

If you ask me, the HTML is a damn disaster lol. Incredibly ugly. It may be better to see if you can parse the information from the URL at the top for anything you can find:
Code:
http://jobsearch.naukri.com/job-listings-BPO-Call-Center-Executive-fresher-Sec-6-Noida-DELITES-SERVICES-PRIVATE-LIMITED-Noida-0-to-1-030514000064?xz=1_0_69&xo=&xp=1&xid=139908449559910600&qjt=&qp=&id=&f=-030514000064
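
A minimal sketch of that idea, assuming the path always follows the pattern job-listings-(words)-(min)-to-(max)-(job id), which is a guess from this single URL. Note the limits: the URL does not mark where the job title ends and the company name begins, and it carries no posting date, so this can only supplement the HTML parsing. The query string is shortened in the sample below.

```vbnet
Imports System
Imports System.Net
Imports System.Text.RegularExpressions

Module UrlSketch
    Sub Main()
        ' href as it appears in the markup, entities included (query string shortened here).
        Dim href As String = "http://jobsearch.naukri.com/job-listings-BPO-Call-Center-Executive-fresher-Sec-6-Noida-DELITES-SERVICES-PRIVATE-LIMITED-Noida-0-to-1-030514000064?xz=1_0_69&amp;xp=1"

        ' Decode &amp; back to & before treating the text as a URL.
        Dim url As New Uri(WebUtility.HtmlDecode(href))

        ' Assumed layout of the path: /job-listings-<words>-<min>-to-<max>-<job id>
        Dim m As Match = Regex.Match(url.AbsolutePath,
            "^/job-listings-(?<words>.+)-(?<min>\d+)-to-(?<max>\d+)-(?<id>\d+)$")

        If m.Success Then
            Console.WriteLine("Words:      " & m.Groups("words").Value.Replace("-", " "))
            Console.WriteLine("Experience: " & m.Groups("min").Value & " to " & m.Groups("max").Value & " yrs")
            Console.WriteLine("Job id:     " & m.Groups("id").Value)
        Else
            Console.WriteLine("URL did not match the assumed pattern")
        End If
    End Sub
End Module
```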


Microsoft MVP .NET Programming - (2012 - Present)
®Crestron DMC-T Certified Automation Programmer

Development Site: aceinfinity.net


05-03-2014, 07:35 AM #5
ѕα3єкα
Junior Member
**
Posts: 37 Threads:12 Joined: Dec 2012 Reputation: 0

RE: [help]httpwebrequest/HtmlAgilityPack html parse?

this info from there to here




Why is there not much help for VB.NET on this complex problem? I only see info for C#...
This post was last modified: 05-03-2014, 07:42 AM by ѕα3єкα.

05-03-2014, 12:01 PM #6
sh@rp
Member
**
Posts: 199 Threads:11 Joined: Feb 2012 Reputation: 10

RE: [help]httpwebrequest/HtmlAgilityPack html parse?
(05-03-2014, 07:35 AM)ѕα3єкα Wrote:  Why is there not much help for VB.NET on this complex problem? I only see info for C#...

VB.NET and C# are both .NET languages, so you can use an online converter to understand one from the other.

;)

05-03-2014, 01:09 PM #7
AceInfinity
Developer
*******
Administrators
Posts: 9,733 Threads:1,026 Joined: Jun 2011 Reputation: 76

RE: [help]httpwebrequest/HtmlAgilityPack html parse?
Developer Fusion is junk; Telerik's code converter is much better. This problem is easy once you understand WebRequests and string parsing methods. It's really not too difficult, and you shouldn't have to look for existing source code if you read up on some documentation.
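
As a small illustration of the string parsing side, the LnK(...) call in the sample entry earlier in the thread appears to carry the job id, the posting time and the poster's name. The meaning of each argument is an assumption inferred from that one sample and its "Posted by" text, and this sketch does not handle escaped quotes inside a value.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module LnKSketch
    Sub Main()
        ' The script line from the sample job entry.
        Dim script As String = "LnK('030514000064','Just Now','','Mr. Deepak Singh','');"

        ' Capture whatever sits between each pair of single quotes.
        Dim args As MatchCollection = Regex.Matches(script, "'([^']*)'")

        If args.Count >= 4 Then
            ' Assumed argument order: job id, posted time, (unknown), poster name.
            Dim jobId As String = args(0).Groups(1).Value
            Dim posted As String = args(1).Groups(1).Value
            Dim poster As String = args(3).Groups(1).Value
            Console.WriteLine(jobId & " | " & posted & " | " & poster)
        End If
    End Sub
End Module
```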



05-03-2014, 02:00 PM #8
ѕα3єкα
Junior Member
**
Posts: 37 Threads:12 Joined: Dec 2012 Reputation: 0

RE: [help]httpwebrequest/HtmlAgilityPack html parse?
I found this valuable post, but it turned out to be in C#...

HtmlAgilityPack is written in C#.

Can you explain it in VB.NET?

Code:
HtmlAgilityPack.HtmlNodeCollection divTags = htmlDocument.DocumentNode.SelectNodes("//div[@id='inline-list']");

HtmlAgilityPack.HtmlNode aTags = divTags.FirstOrDefault();
var manufacturerList = from hyperlink in aTags.SelectNodes(".//a[@href]")
                       where hyperlink != null
                       select hyperlink.InnerText;

foreach (var model in manufacturerList)
    Console.WriteLine(model);
This post was last modified: 05-03-2014, 02:10 PM by ѕα3єкα.

05-03-2014, 02:21 PM #9
AceInfinity
Developer
*******
Administrators
Posts: 9,733 Threads:1,026 Joined: Jun 2011 Reputation: 76

RE: [help]httpwebrequest/HtmlAgilityPack html parse?
I shouldn't have to; the methods are the same. The main differences here are the var keyword (an implicitly typed, but still strongly typed, variable declaration), the semicolons, and the For Each ... Next block syntax. The semantics of the rest of this code are nearly identical to its VB.NET equivalent.

C#:
Code:
foreach (var model in manufacturerList)
  Console.WriteLine(model);

VB.NET:
Code:
For Each model As String In manufacturerList
  Console.WriteLine(model)
Next

Because manufacturerList is an IEnumerable(Of String) (a sequence of strings returned by that LINQ query), each item produced by the enumerator is a string. As I pointed out, the C# code is really not different from the VB.NET equivalent.
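
For reference, a near line-by-line translation of the whole C# snippet might look like the sketch below. The markup passed to LoadHtml is made up so the sketch runs on its own, the variable is named doc rather than htmlDocument because VB.NET is case-insensitive and that name would read the same as the HtmlDocument type, and the two Nothing checks are an addition, since SelectNodes returns Nothing when no node matches.

```vbnet
Imports System
Imports System.Linq
Imports HtmlAgilityPack

Module TranslationSketch
    Sub Main()
        ' Stand-in markup so the sketch is self-contained; the links are invented.
        Dim doc As New HtmlDocument()
        doc.LoadHtml("<div id=""inline-list""><a href=""/a"">Alpha</a> <a href=""/b"">Beta</a></div>")

        Dim divTags As HtmlNodeCollection = doc.DocumentNode.SelectNodes("//div[@id='inline-list']")
        If divTags Is Nothing Then Return ' no matching <div> found

        Dim aTags As HtmlNode = divTags.FirstOrDefault()
        Dim links As HtmlNodeCollection = aTags.SelectNodes(".//a[@href]")
        If links Is Nothing Then Return ' the <div> contains no links

        ' Same LINQ query as the C# version; only the keywords change.
        Dim manufacturerList = From hyperlink In links
                               Where hyperlink IsNot Nothing
                               Select hyperlink.InnerText

        For Each model As String In manufacturerList
            Console.WriteLine(model)
        Next
    End Sub
End Module
```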

Converters will never be completely accurate 100% of the time, but they'll give you a good starting point in your translation from C# code to VB.NET code and/or vice versa.

http://converter.telerik.com/
This post was last modified: 05-03-2014, 02:25 PM by AceInfinity.



05-04-2014, 07:57 AM #10
ѕα3єкα
Junior Member
**
Posts: 37 Threads:12 Joined: Dec 2012 Reputation: 0

RE: [help]httpwebrequest/HtmlAgilityPack html parse?
Thanks a lot, guys, but the code isn't working and I'm frustrated...



