Retrieving web pages with foreign characters
When you make a request to a web page using code such as:
HttpWebRequest httprequest = (HttpWebRequest)WebRequest.Create(requestURI);
HttpWebResponse httpresponse = (HttpWebResponse)httprequest.GetResponse();
Stream responsestream = httpresponse.GetResponseStream();
StreamReader httpstream = new StreamReader(responsestream);
string bodytext = httpstream.ReadToEnd();
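You may find that certain characters are missing or garbled in the returned string, such as the copyright character (©) or accented characters such as é (e acute). This happens because StreamReader decodes the bytes as UTF-8 by default, and a page served in a single-byte Latin encoding contains byte values that are not valid UTF-8. If the page is served as Latin-1 (ISO 8859-1), you can get around this by passing that encoding to the StreamReader, thus: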
StreamReader httpstream =
    new StreamReader(responsestream, Encoding.GetEncoding("iso8859-1"));
… had me stumped for ages!
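Hard-coding Latin-1 only helps when the page really is served in that encoding. A possible refinement is to ask the response which character set the server declared and fall back to ISO 8859-1 only when nothing usable is reported. The sketch below is just one way to do that: the class name PageFetcher and the helper ResolveEncoding are made up for illustration, and it assumes the server's Content-Type header is accurate, which is not always the case.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

public static class PageFetcher
{
    // Hypothetical helper (not part of .NET): turns a charset name into an
    // Encoding, falling back to ISO 8859-1 when the name is missing or unknown.
    public static Encoding ResolveEncoding(string charset)
    {
        Encoding fallback = Encoding.GetEncoding("iso-8859-1");
        if (string.IsNullOrWhiteSpace(charset))
            return fallback;
        try
        {
            // Some servers wrap the charset value in quotes, so strip them.
            return Encoding.GetEncoding(charset.Trim().Trim('"'));
        }
        catch (ArgumentException)
        {
            // Thrown when the name is not a recognised code page name.
            return fallback;
        }
    }

    public static string Fetch(string requestURI)
    {
        HttpWebRequest httprequest = (HttpWebRequest)WebRequest.Create(requestURI);
        using (HttpWebResponse httpresponse = (HttpWebResponse)httprequest.GetResponse())
        using (Stream responsestream = httpresponse.GetResponseStream())
        {
            // CharacterSet is taken from the response's Content-Type header.
            Encoding encoding = ResolveEncoding(httpresponse.CharacterSet);
            using (StreamReader httpstream = new StreamReader(responsestream, encoding))
            {
                return httpstream.ReadToEnd();
            }
        }
    }
}
```

Calling PageFetcher.Fetch(requestURI) then replaces the five lines at the top of this post. The using blocks also make sure the response and its stream are closed once the body has been read.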