cURL Login And Cookies and Regex

Status
Not open for further replies.

Tau_Zero

First off, I'm on the intermediate PHP plan, so cURL functions work.
Objective: Take a username and password and log in to another site. This other site sets and works off of cookies, so keep track of the cookies set. Then, go to another page on the site, passing along the cookies so it still registers as logged in and grab information from that page.
Code:
//location of cookie storage file (chmod 777)
$cookie_loc = "/home/tauzero/tmp/cookies/hypLogin";
$login_url = "http://thesite?login=$login&pwd=$pass"; //Main Login url	
$APIgen_url = "http://thesite/servlet/Preferences?genhapikey"; //Key generating url

//create the cURL handle for main login and set some options
$ch = curl_init();
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_loc);
curl_setopt($ch, CURLOPT_URL, $login_url);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10); //Wait a max of 10 seconds for connection to fubar
curl_setopt($ch, CURLOPT_TIMEOUT, 10); //Wait a max of 10 seconds for response

//execute the query and close the connection
ob_start(); // start buffer to prevent output
curl_exec($ch);
//ob_end_clean();
curl_close($ch);
unset($ch);
	
//create the cURL handle for key generation and set some options
$ch = curl_init();
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_loc);
curl_setopt($ch, CURLOPT_URL, $APIgen_url);
		
//execute the query and close the connection
ob_start();
$pref_page = curl_exec($ch);
ob_end_clean();
curl_close($ch);
unset($ch);
		
echo "<PRE>".htmlentities($pref_page); //for debug purposes

//extract the key from the page
ereg("(?<=<b>)[a-f0-9]{16,18}(?=</b>)",$pref_page,$key);//extract the API key

Now, the first cURL thing works fine. It logs in, gathers the cookies, and writes the cookies to the specified file.

The second part, however, doesn't work. More specifically, the cookies don't seem to be passed along. The page that it retrieves is not the page that you get when logged on normally in a browser, but rather the page you get if you either weren't logged on or were and just deleted your browser's cookies. So, I think it's safe to assume that it's not seeing the cookies. I really don't know what needs to be changed to get it to work.

Also, I'm getting a regex error:
Warning: ereg() [function.ereg]: REG_BADRPT in pathtothefile on line 71
I can't figure out what's going wrong. The pattern is supposed to pick out a hexadecimal key (characters in the a-f or 0-9 ranges) that is between 16 and 18 characters long and bordered by bold tags. It uses lookahead and lookbehind so that the tags themselves are not matched and I only extract the key. I don't know why this is throwing warnings.

Thanks in advance for any help
 

Slothie

You may need to set the cookies twice.

I'm not sure how the site you're dealing with works, but for remote logins to PayPal I have to:

1. Visit the site normally so a cookie gets dropped.
2. Log in via POST (then grab the redirection URL).
3. Visit the redirection URL so they register my session/cookie.
4. Finally visit the main PayPal page.
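
The steps above could be sketched with PHP's cURL functions roughly as follows. This is only a sketch under assumptions: the helper names `build_options()` and `fetch()`, the example.com URLs, and the form field names `login` and `pwd` are placeholders, not PayPal's or the other site's real interface. The point it illustrates is that every request reads from and writes to the same cookie file.

```php
<?php
// Hypothetical helper: builds the cURL options shared by every request in one session.
function build_options(string $url, string $cookieFile, ?array $post = null): array
{
    $opts = [
        CURLOPT_URL            => $url,
        CURLOPT_COOKIEJAR      => $cookieFile, // cookies are written here when the handle is freed
        CURLOPT_COOKIEFILE     => $cookieFile, // cookies stored earlier are sent from here
        CURLOPT_RETURNTRANSFER => true,        // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,        // follow HTTP redirects
        CURLOPT_CONNECTTIMEOUT => 10,
        CURLOPT_TIMEOUT        => 10,
    ];
    if ($post !== null) {
        // Send the login fields in a POST body rather than in the query string.
        $opts[CURLOPT_POST] = true;
        $opts[CURLOPT_POSTFIELDS] = http_build_query($post);
    }
    return $opts;
}

// Hypothetical helper: performs one request and returns the body (or false on failure).
function fetch(string $url, string $cookieFile, ?array $post = null)
{
    $ch = curl_init();
    curl_setopt_array($ch, build_options($url, $cookieFile, $post));
    $body = curl_exec($ch);
    unset($ch); // freeing the handle lets libcurl flush cookies to the jar
    return $body;
}

// Usage sketch (placeholder URLs and field names, not run here):
//   $jar = '/path/to/cookies.txt';
//   fetch('http://example.com/login', $jar);                                   // 1. plain visit
//   fetch('http://example.com/login', $jar, ['login' => 'u', 'pwd' => 'p']);   // 2-3. POST, follow redirect
//   $page = fetch('http://example.com/account', $jar);                         // 4. target page
```

Setting both `CURLOPT_COOKIEJAR` and `CURLOPT_COOKIEFILE` on each request means cookies set at any step are carried into the next one.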


For your regexes, I haven't used ereg in some time, but you can check your preg_match syntax at http://bluefrogx.exofire.net/ (live validator).
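
As a rough sketch of the preg_match route: ereg() uses POSIX regular expressions, which have no lookahead or lookbehind, so the `(?` at the start of the pattern is most likely what triggers REG_BADRPT (a `?` with nothing to repeat). preg_match() accepts the same pattern once it is wrapped in delimiters. The sample HTML below is made up for illustration; the real page's markup may differ.

```php
<?php
// Made-up sample of the preferences page; the real markup is an assumption.
$pref_page = 'Your key: <b>0123456789abcdef01</b>';

// Same pattern as the ereg() call, wrapped in # delimiters so the "/" in </b> needs no escaping.
if (preg_match('#(?<=<b>)[a-f0-9]{16,18}(?=</b>)#', $pref_page, $m)) {
    $key = $m[0]; // the lookarounds keep the <b> tags out of the match
} else {
    $key = null;  // no key found on the page
}
```

Note that preg_match() fills `$m[0]` with the whole match, so no capture group is needed here.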
 

Tau_Zero

If it helps, the login normally seems to work like this:
You go to login page. It has a form where you type in your information and hit the login button.
For a split second, you're on some other page where it says something along the lines of "if you're not redirected in a few seconds click HERE".
Then I end up on the account home page.

So what you're saying is:
Visit the login page and log in with cURL (like I do in my first part).
Visit that intermediate page using the cookiejar and cookiefile options so it registers everything with the site.
Then finally visit the page I want to get.

Right?

Is that intermediate step the redirection URL you were talking about? If so, how exactly do I grab that? (I'm totally new to cURL.)
 

Slothie

The URL should have a general format, or sit in some kind of container. If so, extract it with a regex.

I meant visit the login page once and THEN use cURL to log in.
 

Tau_Zero

I still don't know if I'm quite understanding you. You mean to first use cURL to access the base login page, without sending any login variables. Then access it again, sending the login information?

Also, I logged in normally and looked at the forms and noticed this:
step 1) Login page-->form action:Login (the same page)
step 2) Intermediate page--> ".../Home?fromlogin=" This starts out saying you'll be redirected and then the main page loads. Perhaps it starts reloading so fast that the url in the address bar changes before I can see it, I don't know.

Am I supposed to extract that intermediate page (if it is in fact a different page) from the headers somewhere? I tried to output the headers after running curl_exec for the login by commenting out the ob_start and ob_end_clean, but then it loads the page and redirects on my site, leading to an obvious page not found. Is there a way of running the query and saving the headers, or whatever I need, to a variable?

(Sorry if I'm missing something. I've spent a few hours now going through the cURL documentation and searching for similar situations on Google, but haven't found anything that works.)
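
One way to get the headers into a variable, as asked above, is to set CURLOPT_RETURNTRANSFER together with CURLOPT_HEADER, so curl_exec() returns the headers and body as one string instead of printing them. The sketch below only covers pulling a Location header out of such a string; the helper name `extract_location()` and the sample response are hypothetical. The "if you're not redirected" page may also redirect with a meta refresh or JavaScript rather than an HTTP header, in which case the target URL would be in the body instead.

```php
<?php
// Hypothetical helper: returns the Location header value from a raw HTTP response, or null.
function extract_location(string $response): ?string
{
    // Headers end at the first blank line; only search that part.
    $parts = preg_split('/\r?\n\r?\n/', $response, 2);
    $headers = $parts[0];
    if (preg_match('/^Location:\s*(.+?)\s*$/mi', $headers, $m)) {
        return $m[1];
    }
    return null;
}

// With a live request the string would come from cURL, for example:
//   curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the response instead of printing it
//   curl_setopt($ch, CURLOPT_HEADER, 1);         // include the headers in that string
//   $response = curl_exec($ch);
// Made-up sample response for illustration:
$sample = "HTTP/1.1 302 Found\r\nLocation: http://example.com/Home?fromlogin=\r\n\r\n";
$redirect = extract_location($sample);
```

Because CURLOPT_RETURNTRANSFER stops curl_exec() from printing anything, the ob_start()/ob_end_clean() calls are not needed with this approach.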
Edit:
I figured it out. I first ran the cURL query to log in. I then ran the query to the home page with the log in as the referrer. Then I could finally go and get information from other places.
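
For anyone finding this later, the home-page request described above might look roughly like this. It is a reconstruction, not the exact code used: the helper name `referer_options()` and the example.com URLs are placeholders, and it assumes the site checks the Referer header on the home-page request.

```php
<?php
// Hypothetical helper: options for the home-page request that follows the login request.
function referer_options(string $homeUrl, string $loginUrl, string $cookieFile): array
{
    return [
        CURLOPT_URL            => $homeUrl,
        CURLOPT_REFERER        => $loginUrl,   // present the login page as the referrer
        CURLOPT_COOKIEFILE     => $cookieFile, // send the cookies saved by the login request
        CURLOPT_COOKIEJAR      => $cookieFile, // save any cookies the home page sets
        CURLOPT_RETURNTRANSFER => true,        // return the body instead of printing it
    ];
}

// Usage sketch (placeholder URLs, not run here because it needs network access):
//   $ch = curl_init();
//   curl_setopt_array($ch, referer_options(
//       'http://example.com/Home', 'http://example.com/Login', '/path/to/cookies.txt'));
//   $home = curl_exec($ch);
//   unset($ch); // freeing the handle writes the cookie jar
```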

Thanks for your help.
 