Coming from a solid open source/PHP background, any interaction I have with .NET usually leaves me wondering just how those guys at Microsoft were able to stay that high for that long.
Maybe they’d made a wish on a cursed monkey hand where they wanted the web to be stateful. Or made a pact with the devil. One of those “always backfires” kind of deals, like when you wish for hot babes and they’re literally one million degrees centigrade.
I recently created an application that would interact with a .NET search form. I looked into the face of insanity and remained (I think) sane. To do this you need to not question why it is the way it is, but accept it and treat it as a puzzle, which I promise does have a solution, even if it is a little nefarious.
The basic outline is this:
1. .NET serves you a page. It sets a cookie and some hidden input variables that it submits via a POST request to essentially make the stateless web stateful.
2. You set the values of inputs, e.g. a text search box, and potentially also some other POST variable, like which button you clicked. Because of insanity.
3. You send this data back to the .NET server, cast a short incantation and wait for that sudden chill in the room to pass.
4. You get results.
So essentially the two steps every time are: get the form, then submit the form. You can’t just submit the form directly (*sigh*).
Down to the technical stuff: I used PHP’s native curl library to get the page data and simplehtmldom to access the form elements.
What I did was use Chrome’s web developer console (F12). In the Elements tab I found the following:
<div class="aspNetHidden"> <input type="hidden" name="__EVENTTARGET" id="__EVENTTARGET" value=""> <input type="hidden" name="__EVENTARGUMENT" id="__EVENTARGUMENT" value="">
I don’t think the first two are essential but I’m pretty sure the __VIEWSTATEFIELDCOUNT is.
I also noticed 17 of these:
<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="/wEPD...(really long data string)...">
I have no desire to understand any of this. My strategy is just to use any hidden elements on this page when I submit the form.
With Chrome’s web developer console still open, fill out some fields in your evil .NET form and click the submit button.
In web developer, click the “Network” tab. Click the very first item and on the “Headers” subtab you should see “Form data”. This will show you the truly crazy amount of data it has to send to process one simple form. You’ll also see the “name” attributes of each “input” element that was sent. This is what we want to replicate. If we send this data, we’ll get the page we want.
//basic curl options, because there's 2 curl calls we reuse this.
$options = array(
    CURLOPT_RETURNTRANSFER => 1,
    CURLOPT_FOLLOWLOCATION => 1,
    CURLOPT_REFERER => '',
    CURLOPT_USERAGENT => 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.1) Gecko/20061204 Firefox/2.0.0.1',
    CURLOPT_COOKIEFILE => 'cookiefile.txt', //make sure this file is writable, same directory as the php script
    CURLOPT_COOKIEJAR => 'cookiefile.txt'
);

include("simple_html_dom.php");

$ch = curl_init('http://www.evilvoodooclosedsourcewebsite.com/form');
curl_setopt_array($ch, $options);
$curl_html = curl_exec($ch);
$html = str_get_html($curl_html); //str_get_html is a simplehtmldom function
We now have $html, which is an object that allows us to traverse the DOM in a similar way to jQuery. We want all the hidden inputs, and we want to set our own values under the input names the .NET form expects, ready to actually get some search results.
$post_data = array();
$inputs = $html->find("input[type='hidden']");
foreach( $inputs as $i )
    $post_data[$i->name] = $i->value;

//fill out the form
$post_data['body_0$searchoptions_0$ddlCrazyName'] = 'search term';
$post_data['body_0$searchoptions_0$ddlNamedLikeThisBecauseEffYou'] = -1;
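If you'd rather not pull in simplehtmldom, the same "grab every hidden input" step can probably be done with PHP's bundled DOMDocument and DOMXPath classes. This is only a sketch of that alternative, not what I actually ran: the sample markup, its field values and the collect_hidden_inputs() helper are all made up for illustration, and it assumes the dom extension is enabled.

```php
<?php
// Hypothetical markup standing in for the page a .NET server returns.
// A nowdoc ('HTML') is used so the $ in field names is not interpolated.
$sample_html = <<<'HTML'
<form method="post" action="/search-results">
  <div class="aspNetHidden">
    <input type="hidden" name="__EVENTTARGET" id="__EVENTTARGET" value="">
    <input type="hidden" name="__VIEWSTATEFIELDCOUNT" id="__VIEWSTATEFIELDCOUNT" value="2">
    <input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="/wEPDexample">
    <input type="hidden" name="__VIEWSTATE1" id="__VIEWSTATE1" value="morestate">
  </div>
  <input type="text" name="body_0$searchoptions_0$ddlCrazyName" value="">
</form>
HTML;

// Hypothetical helper: returns name => value for every hidden input.
function collect_hidden_inputs($html)
{
    $doc = new DOMDocument();
    // Real pages are rarely valid HTML, so keep parser warnings quiet.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $fields = array();
    foreach ($xpath->query('//input[@type="hidden"]') as $input) {
        $fields[$input->getAttribute('name')] = $input->getAttribute('value');
    }
    return $fields;
}

$post_data = collect_hidden_inputs($sample_html);

// Fill out the visible part of the form, same as before.
$post_data['body_0$searchoptions_0$ddlCrazyName'] = 'search term';

print_r(array_keys($post_data));
```

In real use you would pass $curl_html to the helper instead of the sample string.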
From here we’re ready to submit the form data. This might have a different address to where the form was actually displayed initially.
$ch_search = curl_init('http://www.evilvoodooclosedsourcewebsite.com/search-results');
curl_setopt_array($ch_search, $options);
curl_setopt($ch_search, CURLOPT_POST, true); //tell curl to send post data
curl_setopt($ch_search, CURLOPT_POSTFIELDS, $post_data); //tell curl what that data is
$curl_search_html = curl_exec($ch_search);
$search_html = str_get_html($curl_search_html);
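One wrinkle worth knowing: when CURLOPT_POSTFIELDS is given an array, PHP's curl sends the body as multipart/form-data, whereas a browser submitting an ordinary form usually sends application/x-www-form-urlencoded. It worked for me as-is, but if your .NET form is pickier, a possible workaround is to encode the array yourself with http_build_query() and pass the resulting string instead. The field values below are invented purely to show what the encoding does to them.

```php
<?php
// Hypothetical post data, shaped like the array built from the form.
$post_data = array(
    '__EVENTTARGET' => '',
    '__VIEWSTATE' => '/wEPD+example=',
    'body_0$searchoptions_0$ddlCrazyName' => 'search term',
    'body_0$searchoptions_0$ddlNamedLikeThisBecauseEffYou' => -1,
);

// URL-encode names and values and join them with "&", the way a
// browser encodes a regular form submission.
$encoded_body = http_build_query($post_data, '', '&');

echo $encoded_body, "\n";

// Passing the string (not the array) makes curl send it as
// application/x-www-form-urlencoded:
// curl_setopt($ch_search, CURLOPT_POSTFIELDS, $encoded_body);
```

Note how the $ in the field names and the /, + and = in the view state all get percent-encoded, which should match what the “Form data” view shows when you switch it to the URL-encoded source.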
And that’s it! $search_html should now be a simplehtmldom object that you can traverse to get your search results. More info on using simplehtmldom can be found in my previous post.
Now you can step back, maybe have a shower or ten (the dirt comes off eventually) and relax safe in the knowledge that good has once again defeated evil and the people of the village can rejoice once again (okay maybe this did make me slightly insane).