Reverse-engineer auth and cookies
Background
This exercise, using firefox to log into a website and generate a cookie that curl can use to fetch pages from that website automatically, would have been much easier if I had realized that firebug generates cookie files that curl doesn't know how to read. I eventually found that a chrome cookie plugin generates cookie files that curl understands. Perhaps a different firefox cookie plugin also generates curl-compatible cookie files.
Also, I had to learn more about curl syntax. The manpage says this:
NOTE that the file specified with -b, --cookie is only used as input. No cookies will be stored in the file. To store cookies, use the -c, --cookie-jar option
And this:
-c, --cookie-jar <file name> (HTTP) Specify to which file you want curl to write all cookies after a completed operation. Curl writes all cookies previously read from a specified file as well as all cookies received from remote server(s).
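That left me wondering whether -c both reads from and writes to the cookie jar, since the description mentions reading and writing. But the evidence shows that -c is for writing only; the cookies "previously read from a specified file" are the ones loaded with -b.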
log in with firefox and export cookies to cookies.txt
This exports cookies for the given site only, not for all domains. That could matter, because another domain handled the authentication and has some cookies of its own.
Curl options
-b  read cookies from this file. Don't store any cookies here.
-c  write all cookies here
-v  be verbose
-D  dump headers to this file
-I  just show the headers, not the body

rday@ferret:~$ ls -l incookies.txt outcookies.txt
ls: cannot access outcookies.txt: No such file or directory
-rw-r--r-- 1 rday rday 643 Feb 1 11:25 incookies.txt
use those cookies in curl
rday@ferret:~$ curl -vD headers -b incookies.txt -c outcookies.txt http://christianscience.com/bible-lessons/ebiblelesson/love
* Adding handle: conn: 0x1f03ef0
* Adding handle: send: 0
* Adding handle: recv: 0
* Curl_addHandleToPipeline: length: 1
* - Conn 0 (0x1f03ef0) send_pipe: 1, recv_pipe: 0
* About to connect() to christianscience.com port 80 (#0)
* Trying 174.129.17.231...
* Connected to christianscience.com (174.129.17.231) port 80 (#0)
> GET /bible-lessons/ebiblelesson/love HTTP/1.1
> User-Agent: curl/7.32.0
> Host: christianscience.com
> Accept: */*
>
< HTTP/1.1 200 OK
< Date: Sat, 01 Feb 2014 19:41:33 GMT
* Server Apache/2.2.22 (Ubuntu) is not blacklisted
< Server: Apache/2.2.22 (Ubuntu)
< X-Powered-By: eZ Publish
< Expires: Mon, 26 Jul 1997 05:00:00 GMT
< Last-Modified: Sat, 01 Feb 2014 19:41:33 GMT
< Cache-Control: no-cache, must-revalidate
< Pragma: no-cache
< Served-by: christianscience.com
< Content-language: en-US
< Vary: Accept-Encoding
< Transfer-Encoding: chunked
< Content-Type: text/html; charset=utf-8
<
<!DOCTYPE html>
The site thinks I'm not authenticated.
try to use those same cookies with a different browser
Can chrome incognito windows load cookies from a file? Hmm. It isn't apparent how to load a cookie file in chrome. Using another firefox window, I can see that it has the same cookies for this domain, but still can't log in.
I launched a firefox private window and viewed the cookies for this domain. They looked the same. I exported and diffed them, and they were exactly the same. The cookies for the main third-party domain were slightly different.
Then I logged in through the private window and exported the cookies for this domain. Still the same.
So the domains I've looked at so far are
- christianscience.com
- buysub.com
- w1.buysub.com
Looking in the cookie database directly with a sqlite client, I can see there is another domain I need to check:
- christianscience.buysub.com
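A query along these lines can list the hosts in the cookie database. This is only a sketch: it assumes the sqlite3 command-line client is installed and that the database is Firefox's cookies.sqlite with its moz_cookies table. The helper name list_cookie_hosts and the file path in the usage line are made up for illustration. Firefox may keep the live file locked, so querying a copy is safer.

```shell
# List every distinct cookie host matching a pattern in a Firefox cookie database.
# Assumptions: the sqlite3 CLI is installed and the database has a moz_cookies
# table with a "host" column. list_cookie_hosts is a hypothetical helper name.
list_cookie_hosts() {
    db=$1        # path to a copy of cookies.sqlite
    pattern=$2   # SQL LIKE pattern, e.g. '%buysub.com'
    # The pattern is pasted into the SQL text, so only pass trusted input.
    sqlite3 "$db" "SELECT DISTINCT host FROM moz_cookies WHERE host LIKE '$pattern' ORDER BY host;"
}
```

Usage might look like: list_cookie_hosts /tmp/cookies-copy.sqlite '%buysub.com'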
So I create a new private window in firefox and start by verifying that I am not authenticated. And I am not. In another tab, I visit the new domain. The private window picks up 5 cookies. I think these are bleeding through from my normal firefox window which is authenticated properly.
I come back to the main url and refresh to find that I am still unauthenticated. I force a refresh in the new domain tab and get many more cookies. I force a refresh in the main url tab and I'm still unauthenticated.
I'm beginning to suspect that firefox and firebug have an unexpected behavior. Or are lying to me.
watch cookies accumulate in chrome
I start with chrome, logged in to christianscience.com
- clear the cookies in christianscience.com
- clear the cookies in buysub.com
- clear the cookies in w1.buysub.com
- clear the cookies in christianscience.buysub.com
Refresh the tab in christianscience.com and I'm now logged out.
- the only cookie is __ff_prevReqData in christianscience.com
I click to log in and am redirected to w1.buysub.com
I now have cookies in
- w1.buysub.com
- ws.sharethis.com
- seg.sharethis.com
I log in and am redirected to christianscience.com
But chrome doesn't let me export or import cookies as far as I can tell. Ahh, I just had to install a chrome plugin called Edit This Cookie.
I exported the cookies for each of the domains in json. Then I opened a new incognito window, but that didn't have the plugin so I couldn't import the cookies back in.
Instead, I cleared all chrome history and closed all tabs. Then I went to cs.com (my shorthand for christianscience.com) and I was not authenticated. Then I imported the json for just the three cookies for this domain and I was authenticated. Simple. Why doesn't this work for curl or firefox?
I changed the cookie export format to Netscape cookies.txt format and exported the three cookies.
import cookies into firefox
I installed a cookie import/export plugin for firefox and was able to clear all history, visit cs.com and be unauthenticated, and then import the three cookies and be authenticated. Simple. Now to make it work for curl.
back to curl
Now that I know which cookies give me an authenticated session, the goal is to get curl to use them.
$ curl -c blah.txt http://christianscience.com/bible-lessons/ebiblelesson/love
No authentication.
$ curl -b blah.txt http://christianscience.com/bible-lessons/ebiblelesson/love
Holy crow, that worked. Looking at the diff of the cookie files, I see a difference in the number of fields exported by firebug and chrome's Edit This Cookie. Firebug inserts a field that is always 'undefined' between the cookie timestamp and the cookie value. Edit This Cookie doesn't have this 'undefined' field.
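The field-count difference can be checked mechanically. Below is a minimal sketch, assuming the usual Netscape cookies.txt layout of seven tab-separated fields per cookie (domain, subdomain flag, path, secure flag, expiry, name, value). The function names are made up for illustration. The repair step assumes the stray field is the literal text "undefined" somewhere after the expiry field, which is how the diff looked to me; it would also drop a cookie name or value that happened to be exactly "undefined" on an 8-field line.

```shell
# Report cookie lines whose tab-separated field count is not 7.
# Exits non-zero if any such line is found.
# check_cookie_fields is a hypothetical helper name, not part of curl.
check_cookie_fields() {
    awk -F '\t' '
        $0 == "" { next }                      # skip empty lines
        /^#/ && !/^#HttpOnly_/ { next }        # skip comments; #HttpOnly_ lines are cookies
        NF != 7 { printf "line %d: %d fields\n", NR, NF; bad = 1 }
        END { exit bad + 0 }
    ' "$1"
}

# Print the file with the first literal "undefined" field after the expiry
# removed from every 8-field line. Other lines pass through unchanged.
# strip_undefined_field is a hypothetical helper name.
strip_undefined_field() {
    awk -F '\t' '
        /^#/ && !/^#HttpOnly_/ { print; next }
        NF == 8 {
            out = ""; sep = ""; dropped = 0
            for (i = 1; i <= NF; i++) {
                if (!dropped && i > 5 && $i == "undefined") { dropped = 1; continue }
                out = out sep $i
                sep = "\t"
            }
            print out
            next
        }
        { print }
    ' "$1"
}
```

Usage might look like: check_cookie_fields incookies.txt, and if it complains, strip_undefined_field incookies.txt > fixed.txt followed by curl -b fixed.txt with the url.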
This also tells me that -c is *only* for writing cookies (all the cookies curl knows about when it finishes, according to the manpage) and -b is *only* for reading cookies.
manual procedure to make a good auth cookie in chrome
The auth cookie will expire, so I'll need a procedure for getting a new cookie. Next, I'll script the login process to generate the auth cookie automatically.
- visit http://christianscience.com/bible-lessons/ebiblelesson
- log out if you are authenticated
- log in, visiting buysub.com, and returning to cs.com
- in Edit This Cookie settings, choose preferred export format for cookies: Netscape HTTP Cookie File
- export cookies for domain cs.com. They land in the clipboard.
- paste into a file called cookies.txt
- run curl: $ curl -b cookies.txt http://christianscience.com/bible-lessons/ebiblelesson/love
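Since the auth cookie expires, a script could check the exported file before using it. This is a sketch under a few assumptions: the file is in the seven-field Netscape format, the cookie that matters is the eZSESSID one seen in the browser, and date +%s is available to get the current Unix time. The helper name cookie_is_fresh is made up for illustration.

```shell
# Succeed (exit 0) if the named cookie exists in a Netscape-format cookie file
# and has not expired. An expiry of 0 marks a session cookie and is treated
# as still valid. cookie_is_fresh is a hypothetical helper name.
cookie_is_fresh() {
    file=$1
    name=$2
    now=${3:-$(date +%s)}   # current Unix time; the third argument overrides it
    awk -F '\t' -v name="$name" -v now="$now" '
        /^#/ && !/^#HttpOnly_/ { next }        # skip comments; #HttpOnly_ lines are cookies
        NF == 7 && $6 == name && ($5 + 0 == 0 || $5 + 0 > now + 0) { found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$file"
}
```

Usage might look like: cookie_is_fresh cookies.txt eZSESSID || echo "time to export a new cookie"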
scriptable procedure to get an auth cookie
The login form has these fields:
- cds_email
- cds_gk_password
- the submit button is named "send"
So the curl invocation might look like this:
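$ curl -vc cookiesin.txt -b cookiesin.txt -d "cds_email=$user&cds_gk_password=$password&send=submit" 'https://w1.buysub.com/servlet/CSGateway?cds_mag_code=CAV&cds_page_id=58314'
(The url is quoted because an unquoted & would make the shell cut the command short and run it in the background.)
And then check the contents of cookiesin.txt.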
But then there is the redirect to the other domain which has to set the real cookie.
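To check what a request left in the jar, something like the following prints one line per cookie. It is a sketch that assumes the jar is in the Netscape format that curl -c writes, where curl marks HttpOnly cookies with a #HttpOnly_ prefix on the domain. The helper name list_jar_cookies is made up for illustration.

```shell
# Print "domain<TAB>name<TAB>expiry" for each cookie in a Netscape-format jar.
# list_jar_cookies is a hypothetical helper name.
list_jar_cookies() {
    awk '
        BEGIN { FS = OFS = "\t" }
        /^#HttpOnly_/ { sub(/^#HttpOnly_/, "", $1) }   # keep HttpOnly cookies, drop the marker
        /^#/ { next }                                  # skip ordinary comment lines
        NF == 7 { printf "%s\t%s\t%s\n", $1, $6, $5 }
    ' "$1"
}
```

Usage might look like: list_jar_cookies cookiesin.txt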
closer analysis of manual login process
- fill in form on CAV_login.jsp
- post to servlet/CSQuery
- 302 redirect to CAV_access.jsp
- get CAV_access.jsp
- 200 OK (this page contains a javascript redirect to domain cs.com and displays a log in page for browsers that allow javascript)
Switch domains from buysub.com to cs.com
- get cs.com/biblelesson/login/eBibleLesson (send cookie: eZSESSID)
- 301 redirect to cs.com/biblelesson/login/eBibleLesson
- get cs.com/biblelesson/login/eBibleLesson
- 302 redirect to cs.com/bible-lessons/ebiblelesson/love (set cookie: eZSESSID, expires in 30 days)
- get cs.com/bible-lessons/ebiblelesson/love
- 200 OK
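The two halves of that flow might be scripted roughly as follows. This is an untested sketch, not a working login: both urls are passed in by the caller because I have not confirmed the exact endpoints, the credentials come from two made-up environment variables (LOGIN_USER and LOGIN_PASSWORD), and the function name scripted_login is also made up. curl does not run javascript, so the second request stands in for the javascript redirect; whether the site accepts it without whatever the script passes along is an open question.

```shell
# Two-step login sketch. All urls are supplied by the caller; nothing here is
# confirmed to match the site's real endpoints.
# scripted_login is a hypothetical helper name.
scripted_login() {
    login_url=$1    # form POST target on the subscription domain
    return_url=$2   # login url on the main domain that should set the session cookie
    jar=$3          # cookie jar file, read (-b) and rewritten (-c) by each request

    # Step 1: submit the form. --data-urlencode escapes characters such as
    # "&" or "=" in the credentials. -L follows the HTTP 302.
    curl -sS -L -b "$jar" -c "$jar" \
        --data-urlencode "cds_email=$LOGIN_USER" \
        --data-urlencode "cds_gk_password=$LOGIN_PASSWORD" \
        --data-urlencode "send=submit" \
        -o /dev/null "$login_url" || return 1

    # Step 2: request the page the javascript would have redirected to,
    # following the 301/302 chain so the final cookie lands in the jar.
    curl -sS -L -b "$jar" -c "$jar" -o /dev/null "$return_url"
}
```

After a run, the jar can be passed to curl -b for the lesson page in the same way as the manually exported cookies.txt.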
Lots of small properties follow.
I logged out and started the login process again, but with javascript disabled for buysub.com, and examined the page that resulted.
Hmm, I guess it gets too embarrassing if I write anymore about how this works. I'll have to stop here. I should have stopped sooner.