Decode Google Recaptcha with C#
The Google Recaptcha system is one of the most popular Captcha systems in use. To beat it, you’ll need to subscribe to a Human Captcha API. I used FastTypers.org (also known as HumanCoders or ExpertDecoders). To test this, I used the standard Recaptcha setup under ASP.NET CLR4, using the code downloaded from http://code.google.com/p/recaptcha/source/browse/trunk/recaptcha-plugins/
I set up a Windows Forms application with a button called btnReCaptcha, and ran the Recaptcha website under the virtual folder /Recaptcha.Test-CLR4/. Here is the code. Note that 6Lf9udYSAAAAAGF0LkIu3QsmMPfanZH3T8EXs9fA is the public key, which is contained in the HTML of the page hosting the captcha.
// Requires: using System.Net; using System.Text.RegularExpressions; using System.Web;
private void btnReCaptcha_Click(object sender, EventArgs e)
{
    var wc = new WebClient();

    // Ask Google for a challenge key, using the site's public key.
    var strHtml = wc.DownloadString("http://www.google.com/recaptcha/api/challenge?k=6Lf9udYSAAAAAGF0LkIu3QsmMPfanZH3T8EXs9fA&hl=&");
    const string strChallengeRegex = @"challenge.{4}(?<Challenge>[\w-]+)";
    var strChallenge = Regex.Match(strHtml, strChallengeRegex).Groups["Challenge"].Value;

    // Download the captcha image for that challenge and pass it to the solver.
    var bImage = wc.DownloadData("http://www.google.com/recaptcha/api/image?c=" + strChallenge);
    // CaptchaSolver is not defined in this post; presumably the human captcha API's client class.
    var solver = new CaptchaSolver();
    solver.SolveCaptcha(bImage);
    var strImageText = solver.LastResponseText;

    // Load the test page and capture its ASP.NET state fields (both come back URL-encoded).
    strHtml = wc.DownloadString("http://localhost/Recaptcha.Test-CLR4/");
    var strViewstate = GetViewStateFromHtml(strHtml, true);
    var strEventValidation = GetEventValidationFromHtml(strHtml);

    // Build the form post. The solved text is URL-encoded because it may contain spaces.
    var strPostData = "__EVENTTARGET=";
    strPostData += "&__EVENTARGUMENT=";
    strPostData += "&__VIEWSTATE=" + strViewstate;
    strPostData += "&__EVENTVALIDATION=" + strEventValidation;
    strPostData += "&recaptcha_challenge_field=" + strChallenge;
    strPostData += "&recaptcha_response_field=" + HttpUtility.UrlEncode(strImageText);
    strPostData += "&RecaptchaButton=Submit";
    wc.Headers[HttpRequestHeader.ContentType] = "application/x-www-form-urlencoded";
    string HtmlResult = wc.UploadString("http://localhost/Recaptcha.Test-CLR4/", strPostData);
}

/// <summary>
/// Gets an ASP.NET Viewstate from an aspx page.
/// </summary>
/// <param name="strHtml">The HTML to extract the viewstate string from.</param>
/// <param name="urlEncode">Whether the result should be URL-encoded.</param>
/// <returns>The viewstate value, or an empty string if none was found.</returns>
public static string GetViewStateFromHtml(string strHtml, bool urlEncode)
{
    const string strViewStateRegex = @"__VIEWSTATE.*value..(?<viewstate>[/\w\+=]+)";
    var strViewState = Regex.Match(strHtml, strViewStateRegex, RegexOptions.Compiled).Groups["viewstate"].Value;
    if (urlEncode) { strViewState = HttpUtility.UrlEncode(strViewState); }
    return strViewState;
}

/// <summary>
/// Gets an ASP.NET EventValidation from an aspx page. The result is already URL-encoded.
/// </summary>
/// <param name="strHtml">The HTML to extract the event validation string from.</param>
/// <returns>The URL-encoded event validation value, or an empty string if none was found.</returns>
public static string GetEventValidationFromHtml(string strHtml)
{
    var strEventValidationRegex = @"__EVENTVALIDATION.{32}(?<EventValidation>[/\w\+=]+)";
    var strEventValidation = Regex.Match(strHtml, strEventValidationRegex, RegexOptions.Compiled).Groups["EventValidation"].Value;
    // The value is base64, so its length should be a multiple of 4.
    if (strEventValidation.Length % 4 != 0)
    {
        // Invalid capture, try another regex.
        strEventValidationRegex = @"__EVENTVALIDATION..value..(?<EventValidation>[/\w\+=]+)";
        strEventValidation = Regex.Match(strHtml, strEventValidationRegex, RegexOptions.Compiled).Groups["EventValidation"].Value;
    }
    strEventValidation = HttpUtility.UrlEncode(strEventValidation);
    return strEventValidation;
}
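The two extraction helpers differ mainly in the field name they look for, so they could be folded into one. Here is a minimal sketch of that idea, not the code I actually ran: `HiddenFields` and `GetHiddenField` are hypothetical names, and the pattern assumes the `name` attribute comes before a double-quoted `value` attribute, which is how ASP.NET normally renders its hidden inputs.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HiddenFields
{
    // Hypothetical helper (not part of the original listing).
    // Returns the value attribute of the <input> whose name equals fieldName,
    // or an empty string if no such input is found.
    // Assumes name="..." appears before value="..." and both use double quotes.
    public static string GetHiddenField(string html, string fieldName)
    {
        var pattern = "<input[^>]*name=\"" + Regex.Escape(fieldName)
                    + "\"[^>]*value=\"(?<value>[^\"]*)\"";
        var match = Regex.Match(html, pattern, RegexOptions.IgnoreCase);
        return match.Success ? match.Groups["value"].Value : string.Empty;
    }
}
```

Because the field name is matched together with its closing quote, a request for `__VIEWSTATE` should not be confused by a similarly named field such as `__VIEWSTATEGENERATOR`. The returned value is not URL-encoded.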
Basically, the code requests a challenge key from Google, uses the challenge key to fetch the image, and passes the image to the Human OCR API. It then captures the Viewstate and Event Validation values from the page and posts them, together with the decoded text and the challenge key, back to the web server.
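Every value in the posted form needs to be URL-encoded: the base64 Viewstate contains characters such as `+`, `/` and `=`, and the typed captcha text may contain spaces. Here is a minimal sketch of a helper that does the encoding in one place. `FormEncoding` and `BuildFormBody` are hypothetical names, not part of the listing above, and the sketch uses `Uri.EscapeDataString` rather than `HttpUtility.UrlEncode`, so a space becomes `%20` instead of `+`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FormEncoding
{
    // Hypothetical helper (not part of the original listing).
    // Joins name/value pairs into an application/x-www-form-urlencoded body,
    // percent-encoding both the names and the values. Null values become empty.
    public static string BuildFormBody(IEnumerable<KeyValuePair<string, string>> fields)
    {
        return string.Join("&", fields.Select(f =>
            Uri.EscapeDataString(f.Key) + "=" + Uri.EscapeDataString(f.Value ?? string.Empty)));
    }
}
```

With a helper like this, the raw (not yet encoded) Viewstate, Event Validation, challenge key and response text would be passed in as pairs, and the result would be sent with the same `application/x-www-form-urlencoded` content type.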
I am doing exactly the same, but the reply I get is the initial page with the captcha.
Technical note: if a cookie called "NID" is set, the captcha will be much easier to solve.
http.RequestCookies.Add(new Cookie("NID", "68=TjXaruAXaJsiU4x8ffZpT0eKPEJHcnDHPGkkole0nLN_cZ4KhuO9dPXE8ASjUiJnGDy5RsDPKYriIf3JaNIaD10tk6of4qHYcOL3uIdUu2rsOny_dQ66-1hfjQTqjeI7-EGpGJtz9usFPsKaOYkeVltXpjuSmDgnbHnSAjfERQIKysFN4qjhjZTOArf_3bWIp_m_ycs", "/", ".google.com"));
I added exactly the same parameters as I see in Fiddler. I’m using DeathByCaptcha and it seems to solve the captcha, but when I post everything, instead of getting the token I get back the initial page. Are you still managing to get the token? By the way, we are looking for a Xamarin developer, so if you are interested let me know.