Translate a webpage server-side using CURL and PHP for added #SEO
A great way to make your website accessible to more users is to provide it in multiple languages, not just English. But what happens if you have dynamic content, or simply too much content to justify translating manually?
You can simply link out to Google Translate, but then you lose the SEO advantage of having your content served from your domain, so doing it server-side is key to maintaining your traffic. You “should” use the Google Translate API for this, but it’s a paid-for service, and if you’re not feeling that generous, you can hack your way through Google’s page translation service, until such time as they block your IP 🙂
So, here’s my case in point, from a project I was working on: http://guide.universaltravelsearch.com . When a user navigates to an “attraction page”, such as http://guide.universaltravelsearch.com/attraction/33634/%22Sylvan%20Beach%20Resort%22 , there are links at the foot of the page which link to http://guide.universaltravelsearch.com/translate/33634/ru/%22Sylvan+Beach+Resort%22 for the Russian version, for example. The user is kept on the same domain, but the content is in Russian.
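To make that URL scheme concrete, here is a minimal sketch of how a /translate/{id}/{lang}/{name} path could be split into its parts before any translation request is built. The function name parseTranslatePath and the list of allowed language codes are assumptions for illustration only; the site’s real routing code isn’t shown in this post.

```php
<?php
/**
 * Split a path such as /translate/33634/ru/%22Sylvan+Beach+Resort%22
 * into its id and language parts.
 * Hypothetical helper, not the site's actual router.
 * Returns null when the path doesn't match or the language isn't allowed.
 */
function parseTranslatePath(string $path, array $allowedLangs): ?array
{
    // Expect: /translate/<numeric id>/<two-letter language>[/<anything>]
    if (!preg_match('#^/translate/(\d+)/([a-z]{2})(?:/.*)?$#', $path, $m)) {
        return null;
    }
    // Only accept languages we explicitly support (assumed list)
    if (!in_array($m[2], $allowedLangs, true)) {
        return null;
    }
    return ['id' => (int) $m[1], 'lang' => $m[2]];
}

$allowed = ['ru', 'fr', 'de', 'es']; // assumed set of target languages
$route = parseTranslatePath('/translate/33634/ru/%22Sylvan+Beach+Resort%22', $allowed);
```

Checking the language against a fixed list means only known values ever end up in the outgoing request, rather than whatever arrives in the URL.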
So, here’s how I did it:
<?php
// Build the URL of the page to translate.
$url = "http://yoururl/yourpage.php?id=" . urlencode($_GET["id"]);

// Ask Google's page translation service for that page in the requested language.
$url = "http://translate.google.com/translate_p?hl=en&sl=en&tl=" . urlencode($_GET["lang"]) . "&u=" . urlencode($url);
$output = HttpGet($url);

// The response contains a "Translating..." link that points at the translated page.
if (!preg_match('/href..(.*).>Translating../', $output, $m)) {
    exit("Translation unavailable");
}
$url = html_entity_decode($m[1]);

// Fetch the translated page and send it to the visitor.
$output = HttpGet($url);
echo $output;

function HttpGet($url) {
    $ch = curl_init($url);
    // Return the transfer as a string instead of printing it
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    // Fake a browser user agent and referrer
    curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.14');
    curl_setopt($ch, CURLOPT_REFERER, 'http://www.google.com/');
    // $output contains the response body
    $output = curl_exec($ch);
    // Close the cURL resource to free up system resources
    curl_close($ch);
    return $output;
}
?>
This makes a call to translate_p, extracts the link that Google returns, then makes a call to that URL. The cURL request has to fake the user agent and referrer, otherwise Google will block the request.
There’s no guarantee that this will keep working. In fact, it will probably be blocked before too long, but it should be fine for low-traffic websites.
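The link-extraction step is the most fragile part, since it depends on markup that Google can change at any time. As a rough sketch, it could be isolated in a helper that returns null when no link is found, so the caller can fall back to the untranslated page. The helper name extractTranslatedUrl and the sample HTML are made up for illustration; the real response from translate_p may look different.

```php
<?php
/**
 * Pull the target of the "Translating..." link out of an HTML response.
 * Hypothetical helper; returns null when no such link is present.
 */
function extractTranslatedUrl(string $html): ?string
{
    // Match a quoted href immediately followed by the link text;
    // [^"']+ stops at the closing quote, so only the URL is captured.
    if (!preg_match('/href=["\']([^"\']+)["\']>Translating/', $html, $m)) {
        return null;
    }
    // The href is HTML-escaped (&amp; etc.), so decode it before requesting it
    return html_entity_decode($m[1]);
}

// Made-up sample response, only to exercise the helper
$sample = '<a href="http://example.com/t?sl=en&amp;tl=ru">Translating...</a>';
$target = extractTranslatedUrl($sample);

if ($target === null) {
    // Fall back gracefully, e.g. by serving the untranslated page
    echo "Translation unavailable\n";
} else {
    echo $target, "\n";
}
```

Keeping this in one small function means that when the markup changes, there is a single pattern to update and a single place where the failure is handled.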
Just as an update to this: it looks like Google views this as a spider trap, so it doesn’t work.
Sorry.