Archive

Archive for April, 2005

Browser Helper Objects C#

Can anybody help tell my why this code isn’t working. Its a Browser Helper Object for Internet Explorer, Simply designed to Write the Current Time at the bottom of each page. It compiles ok, and it registers as a COM object with regasm ok, and I put its GUID in the correct place in the registry (under Explorer/Browser Helper Objects). I can’t even get VS.NET to attach to the IEXPLORE process.

using System;

using

SHDocVw;

using

mshtml;

namespace BrowserHelperObject

{

using System;

using System.Runtime.InteropServices;

[ComVisible(true),

InterfaceType(ComInterfaceType.InterfaceIsIUnknown),

Guid("FC4801A3-2BA9-11CF-A229-00AA003D7352")]

public interface IObjectWithSite

{

void SetSite ([MarshalAs(UnmanagedType.IUnknown)]object site);

void GetSite (ref Guid guid, out object ppvSite);

}

/// <summary>

/// Summary description for CustomBrowserHelperObject

/// </summary>

[ComVisible(

true),

ClassInterface(ClassInterfaceType.None),

Guid("9E6F3590-52D0-406d-A7ED-DAE87BED989F")]

public class CustomBrowserHelperObject : IObjectWithSite

{

const int E_FAIL = unchecked((int)0x80004005);

const int E_NOINTERFACE = unchecked((int)0x80004002);

private IWebBrowser2 iwb2InternetExplorer;

public void SetSite ([MarshalAs(UnmanagedType.IUnknown)]object site)

{

iwb2InternetExplorer = (IWebBrowser2)site;

if (iwb2InternetExplorer != null)

{

WebBrowser wbInternetExplorer = (WebBrowser)iwb2InternetExplorer;

wbInternetExplorer.DocumentComplete +=

new DWebBrowserEvents2_DocumentCompleteEventHandler(

this.OnDocumentComplete);

}

}

public void GetSite (ref Guid guid, out object ppvSite)

{

ppvSite=

null;

if (iwb2InternetExplorer != null)

{

IntPtr ipSite = IntPtr.Zero;

IntPtr ipUnknown = Marshal.GetIUnknownForObject(iwb2InternetExplorer);

Marshal.QueryInterface(ipUnknown, ref guid, out ipSite);

Marshal.Release(ipUnknown);

Marshal.Release(ipUnknown);

if (!ipSite.Equals(IntPtr.Zero))

{

ppvSite = ipSite;

}

else

{

Release();

Marshal.ThrowExceptionForHR(E_NOINTERFACE);

}

}

else

{

Release();

Marshal.ThrowExceptionForHR(E_FAIL);

}

}

private void OnDocumentComplete (object frame, ref object urlObj)

{

HTMLDocument hDoc = (HTMLDocument)iwb2InternetExplorer.Document;

hDoc.body.innerHTML = hDoc.body.innerHTML + "<hr><br>Page viewed at " + DateTime.Now;

}

protected void Release()

{

if (iwb2InternetExplorer != null)

{

Marshal.ReleaseComObject(iwb2InternetExplorer);

iwb2InternetExplorer = null;

}

}

}

}

Categories: Uncategorized

“Send to a Friend” Menu extension for Internet Explorer

I recently was using the Full Source utility by ThunderMain, and I was curious to see if I could make my own context menu for Internet Explorer. A useful utility I thought of using was a "send to a friend" utility.

I created the following registry key

HKEY_CURRENT_USERSoftwareMicrosoftInternet ExplorerMenuExtSend to a friend

and set the default value to

c:windowswebsendToAFriend.html

Which contains the following

<script language="JavaScript">
window.open("http://www.pop3webmail.info/reply.aspx?url=" + external.menuArguments.document.URL);
</script>

From the page on www.pop3webmail.info, it automatically fills in a message saying "Please Visit: (whatever url)"

 

Categories: Uncategorized

Javascript for copying a drop down list

I found this nifty piece of javascript handy for copying the contents of one drop down list into another

function copyDropDownList(source,destination)

 for(i=0;i<source.options.length;i++)
 {
  optionText = source.options[i].text;
  optionValue = source.options[i].value;
  destination.add(new Option(optionText,optionValue));
 }
}

Categories: Uncategorized

Grumbles about web service endpoints in SQL 2005

I was interested in trying out the new feature in SQL 2005, whereby it can host web services directly within the database engine, rather than via IIS.

So following a few online guides I created a simple stored procedure

create procedure FindPub
 @city varchar(100)
as
select * from pubs where city=@city

Then mapped this to an endpoint with the following code:

CREATE ENDPOINT SQLEP_FindPub
    STATE = STARTED
AS HTTP
(
    PATH = ‘/FinfPub’,
    AUTHENTICATION = (BASIC),
    PORTS = (CLEAR),
    SITE = ‘listofpubs’
)
FOR SOAP
(
    WEBMETHOD ‘FindPub’
        (NAME=’dinfo.dbo.FindPub’),
    BATCHES = DISABLED,
    WSDL = DEFAULT,
    DATABASE = ‘dinfo’,
    NAMESPACE = ‘http://www.listofPubs.info/FindPub&#8217;
)

On running this code – I got an error saying the "CREATE ENDPOINT’ statment is not supported in this edition of SQL server. (note the typo in the error message! :)). Fair enough, it’s a free product. But what really struck me was that it forced me to enter an Authentication option. To me this is counter intuitive for a web service. – I have to add the caveat here that I’m no expert in SQL server webserices, so there may be a way around this.

Forcing the user to provide authentication seems logical, as in, the web service needs to run under some credentials so that it can authenticate itself against the database, but this should be coded into the webservice, not provided by the caller. After all, in order to supply Windows authentication (NTLM) over the internet, We’d need a VPN or similar. Then if we were to use Basic authentication, this is too insecure, it would need to be provided over SSL.

Therefore in order to supply a web service directly from SQL server to the general public over the Internet, you’d need (a) to get a SSL certificate and install it (b) create a new windows login, since, you don’t want mr Joe public logging in under Administrator rights (c) map the new windows login to a sql login, (d) Include the login details in a public place so people can find it easily. – This seems much more difficult then hammering out an equivalent stored procedure in Visual Studio, with hard-coded credentials.

To me, the ethos behind web services, was that they were designed to be publicly viewable and useable resources, not coveted entry points to fulll control of a company’s data.

Anyway, enough grumbling. I just realized this morning that my website www.openmerchantaccount.info just got a PR5 in google. Unfortunately it hasn’t yet re-positioned itself in google yet, I guess I have to wait for the next index shuffle. It’s got over 1,000 backlinks from my own personal ring of data-heavy websites, of which I just launched two more, www.listofhauliers.info and www.listofcardealers.info – I’m quite an expert at SEO !.

 

 

 

Categories: Uncategorized

Reading an unknown database format (.BDB)

I was given the task of extracting data from an unknown database format – given the extension .BDB. From research on the Internet, I narrowed this down to two database formats, either Microsoft Works or Berkely database.

I tried importing into Works, but only got a cryptic "BTLDB2.0" as an output. I then downloaded Berkly Db from SleepyCat.com, to find, to my dissapointment no binary distribution, just a bulk of C++ code, which when I tried to compile in VS 2005 gave errors like failure during conversion to COFF: file invalid or corrupt , duplicate resource — type: type , name: name , language: language , flags: flags , size: size etc. Which I didn’t have the time or patience to fix. I tried also their Java version but that gave an error too.

I then read that MySQL supported Berkely databases, so I downloaded MySQL, and used the mySQLImport utility to insert it into a table, Unfortunately, you need to have the table created already, in order for mySQLImport to work, apparently it doesn’t import table schema. I created a generic table with just one column of type blob. The import succeded, but the data was as vague as if I opened it up in notepad.

I then decieded to write a C# app to read through the text, splitting on any english phrases (i.e. strings over 3 characters long with ASCII values in the range 32 to 127 and 13) thus:

int iRead = 0;
byte bRead = 0;
ArrayList alStringCollection =
new ArrayList();
ArrayList alByteCollection =
new ArrayList();
bool isReadingWord = true;
string strWord = "";
byte[] bWord;
int iWordCounter = 0;
frmUI.tbStatus.Text += "rnThread started";
FileStream fsIn =
new FileStream(inFile,FileMode.Open);
FileStream fsOut = new FileStream(outFile,FileMode.Create);
StreamWriter swOut =
new StreamWriter(fsOut);
while(true)
{
iRead = fsIn.ReadByte();
if (iRead==-1) break;
bRead = Convert.ToByte(iRead);
if (isAlphaNumeric(bRead))
{
isReadingWord =
true;
alByteCollection.Add(bRead);
}
if (!isAlphaNumeric(bRead) && isReadingWord)
{
if (alByteCollection.Count>3)
{
bWord = (
byte[])alByteCollection.ToArray(bRead.GetType());
strWord = Encoding.UTF8.GetString(bWord);
alStringCollection.Add(strWord);
iWordCounter ++;
swOut.WriteLine("[" + iWordCounter.ToString() + "] " + strWord);
}
alByteCollection =
new ArrayList();
isReadingWord=
false;
}
}
frmUI.tbStatus.Text += "rnRead " + alStringCollection.Count.ToString() + " words";
fsOut.Close();

I then opened the resultant file in notepad, and tried to figure out the schema from the debug into. And luckily with a little study I got it, for the specific database I was working on. Unfortunately, I can’t give an exact schema of BDB files here, since I don’t 100% understand them. But It serves as an interesting example of reading a non-standard database.

The result of my work should be soon visible on www.listofpubs.info

 

Categories: Uncategorized

Full text searching in MSDE (without MSSEARCH)

One of the drawbacks of MSDE is that Full Text Searching (FTS) us unavailable. However, you can put together your own version of FTS with a stored procedure and a few tables.

The benifit of FTS is that it can match queries against english text without great difficullty. Say we had some textual data "The cat sat on the mat" and we wanted to make a select statement for text containing the word "cat" we would use: select from sometable where text like ‘% cat %’. This has two knock on effects – (a) text "like" comparisons are relatively slow. and (b) since we have no idea how many records the query will return, we cannot use select top to speed up the query. Furthermore, if we searched for a collection of words such as "cat", "mat" and "floor", the above text should still come with a match, albeit with a lower degree of certainty, since the word "floor" is not in the data.

Instead, if we create a table conatining all the words in our table, joined on another table containing a link to each word, and a link to the document’s primary key. we can then do queries such as

select ID into #phraseIDs from phrases where phrase = ‘ cat’
select DocumentID into #DocIDs from links L join #phraseIDs P on P.ID = L.ID
select * from documents d join #DocIDs d2 on d.ID = D2.DocumentID

Although this looks more complex, since we are only using exact string comparison and joins on integers (assuming all the relevant columns are properly indexed). The query should work out to execute quicker.

The trick to doing this starts with a nice user defined function called fn_Split (http://www.windowsitpro.com/Windows/Articles/ArticleID/21071/pg/2/2.html)

Then I created two temporary tables in a seperate database

create database dinfoFTS
go
create table dinfoFTS..Phrases
( id int identity(1,1) not null,
  Phrase varchar(1000)
)
go
create table dinfoFTS..Links
( id int identity(1,1) not null,
  PhraseID int,
  DocumentID int
)
go

I then created a stored procedure to populate these tables, making use of fn_split

alter procedure CreateFTS as
SET NOCOUNT on
delete from  dinfoFTS..Phrases
delete from dinfoFTS..Links
create table #phrases (phrase varchar(1000),minID int)
DECLARE @name varchar(1000)
DECLARE @id int
DECLARE @lastInserted int
DECLARE companies_cursor CURSOR FOR
SELECT name + ‘ , ‘ + address,id
FROM companies
OPEN companies_cursor
FETCH NEXT FROM companies_cursor INTO @name,@id
WHILE @@FETCH_STATUS = 0
BEGIN
 print @id 
 — parse the company name and address
 insert into #phrases (phrase)
 select value from dbo.fn_Split(@name,’ ‘)
  
 — set the minID for any duplicates
 update #phrases
 set #phrases.minID = dinfoFTS..Phrases.id
 from dinfoFTS..Phrases 
 where dinfoFTS..Phrases.Phrase = #phrases.phrase
 
 set @lastInserted = scope_identity()

 — insert any non-duplicates
 insert into dinfoFTS..Phrases (phrase)
 select phrase from #phrases where minID is null
 
 — set the minID for recently inserted phrases
 update #phrases
 set #phrases.minID = dinfoFTS..Phrases.id
 from dinfoFTS..Phrases  
 where dinfoFTS..Phrases.Phrase = #phrases.phrase
 and #phrases.minID is null
 and dinfoFTS..Phrases.id>@lastInserted
 

 — Then insert #phrases into the dinfoFTS..Links table
 insert into dinfoFTS..Links (PhraseID,DocumentID)
 select minID,@id from #phrases
 
 — clear out #phrases
 delete from #phrases
 FETCH NEXT FROM companies_cursor INTO @name,@id
END
CLOSE companies_cursor
DEALLOCATE companies_cursor
drop table #phrases

As a benchmark it took 20 seconds to read 1000 rows, and I’m now running it on a 11M row database!.

If anybody has any optimization hints, just post a comment.

Categories: Uncategorized

Screen scraping windows applications

I’ve often found the need for extracting text from other windows applications, for example to extract text from MSN messenger, or the Internet explorer address bar or whatever. Instead of opening up Spy++ to get the Hwnd, and working with scraps of Win32 API calls I decided to write a generic class which recursively extracts text from every window on the screen, and stores it in an XmlDocument.

I used a VB6 version of this class before, so I decided to port it to .NET to modernize the code a bit, hense it’s written in VB.NET.

First I declared a few API functions thus:

Private Declare Function GetWindow Lib "user32.dll" (ByVal hWnd As Integer, ByVal wCmd As Integer) As Integer
Private Declare Function GetDesktopWindow Lib "user32.dll" () As Integer
Private Declare Function SendMessage Lib "user32.dll" Alias "SendMessageA" (ByVal hWnd As Integer, ByVal Msg As Integer, ByVal wParam As Int32, ByVal lParam As Int32) As Integer
Private Declare Function SendMessage Lib "user32.dll" Alias "SendMessageA" (ByVal hWnd As Integer, ByVal Msg As Integer, ByVal wParam As Int32, ByVal lParam As String) As Integer
Private Const GW_HWNDNEXT As Short = 2 ‘The window below the given window in the Z-order.
Private Const GW_CHILD As Short = 5 ‘The topmost of the given window’s child windows. This has the same effect as using the GetTopWindow function.
Private Const WM_GETTEXT As Short = &HDS ‘ Get Window text
Private Const WM_GETTEXTLENGTH As Short = &HES ‘ Get the length of window text

Notice that SendMessage is declared twice, with an overloaded parameter lParam. I found this to be a better approach to using "Object" as the type of the parameter, as it kept causing Ole errors.

The enty function was simply

Public Function Analyze() As System.Xml.XmlDocument
      
Return Analyze(GetDesktopWindow())
End Function

This then calls the Analyze(hWnd) function thus:

Private

Function Analyze(ByVal hWnd As Integer) As System.Xml.XmlDocument
         
Dim childHwnd As Integer
         
Dim XMLRootElement As Xml.XmlElement
         
Dim XMLRoot As Xml.XmlElement
          xmlDoc =
New XmlDocument
          childHwnd = GetWindow(hWnd, GW_CHILD)
          XMLRootElement = xmlDoc.CreateElement("window")
          XMLRoot = xmlDoc.AppendChild(XMLRootElement)
          AppendAnalysis(XMLRoot, hWnd,
Nothing,0)
          recurseChildren(XMLRoot, childHwnd, hWnd)
         
Return xmlDoc
End Function

Which then calls the recursive function recurseChildren

Private Sub recurseChildren(ByRef XMLParent As System.Xml.XmlElement, ByRef childHwnd As Integer, ByRef parentHwnd As Integer)
         
Dim grandChildHwnd As Integer
         
Dim ZCount As Integer = 1
         
Dim XMLChildElement As XmlElement
         
Dim XMLChild As XmlElement
          Do Until childHwnd = 0
                     XMLChildElement = xmlDoc.CreateElement("window")
                    
XMLChild = XMLParent.AppendChild(XMLChildElement)
                    
AppendAnalysis(XMLChild, childHwnd, parentHwnd, ZCount)
                    
ZCount = ZCount + 1
                    
grandChildHwnd = GetWindow(childHwnd, GW_CHILD)
                    
recurseChildren(XMLChild, grandChildHwnd, childHwnd)
                    
childHwnd = GetWindow(childHwnd, GW_HWNDNEXT)
         
Loop
End Sub

This calls AppendAnalysis to build up the XmlDocument

Private Sub AppendAnalysis(ByRef Parent As System.Xml.XmlElement, ByRef childHwnd As Integer, ByRef parentHwnd As Integer, ByRef ZCount As Integer)
        
Dim Description As String
        
Dim XMLNodeWindowText As XmlElement
         XMLNodeWindowText = xmlDoc.CreateElement("Text")
         XMLNodeWindowText.InnerText = getTextFromHwnd(childHwnd)
         Parent.AppendChild(XMLNodeWindowText)
        
If XMLNodeWindowText.InnerText = "" Then
           
Description = "#" & ZCount
        
Else
          
Description = XMLNodeWindowText.InnerText
        
End If
End Sub

And finally, we get down and dirty with the Win32 API

Private Function getTextFromHwnd(ByVal hwnd As Integer) As String
   
‘ use <?xml version="1.0" encoding="UTF-8"?> as a header.
    
Dim wintext As String ‘ receives the copied text from the target window
   
Dim slength As Integer ‘ length of the window text
   
Dim retval As Integer ‘ return value of message
   
slength = SendMessage(hwnd, WM_GETTEXTLENGTH, 0, 0) + 1
    wintext =
New String(ChrW(0), slength)
    retval = SendMessage(hwnd, WM_GETTEXT, slength, wintext)
    wintext = wintext.Substring(0, retval)
   
Return wintext
End Function

Here, I used SendMessage WM_GETTEXT rather than getWindowText, since I found that the SendMessage technique works better accross process boundaries.

The resultant XML looks like this

<window><Text></Text><window><Text></Text></window><window><Text>AppAnalyze.NET – Microsoft Visual Basic .NET [run] – Appanalyze.vb [Read Only]</Text>…

Categories: Uncategorized

Importing a malformed flat file using DTS

If you ever had the pleasure of importing a large text file into SQL server using DTS Import and Export data, and had it fail on row 10.000, just to drop you back to the start again, you might find this a life saver.

You can get a DTS import to ignore malformed data using this process:

When using DTS import and export data, select "Save as DTS package". Run the task as usual and when it fails, open Enterprise manager, select your database, Select Data Transformation services > Local packages. Right click on the DTS package, and select "Design package"

Go to menu > package > Disconnected Edit then go to Tasks and click on the task that is failing due to errors. Located within this task properties is MaximumErrorCount and change that to a large number. Click the close button. Press package then execute to run the query

see: http://sqlservercentral.com/forums/shwmessage.aspx?forumid=92&messageid=173561 for more info…

Categories: Uncategorized

Using Response.Filter in ASP.NET

http://www.webtropy.com/projects/textonly/

I was once asked if it were possible to easily create "Text only" versions of websites, so that they wouldn’t cost a fortune to look at when using a smart phone or PDA – which can cost 2 euros per megabyte, or more. I had forgotten about it for a while until I picked up the september 2004 edition of MSDN, which was sitting under my coffee table, and came across an article about Response.Filter, and it’s application for custom ASP.NET traces.

I developed this in VS.NET 2005 / .NET 2.0, but the code is the same for previous versions.
I activate the filter from a button click thus:

void Button1_Click(object sender, EventArgs e)
{
     Response.Filter =
new TextOnlySteam(Response.Filter);
}

And now, I define my custom stream, TextOnlyStream

public class TextOnlySteam : MemoryStream
{
     private Stream strmOutput;
    /// <summary>
    /// Constructor, initializes the output stream
    /// </summary>
    public TextOnlySteam(Stream strmConstructor)
    {
        strmOutput = strmConstructor;
    }
    /// <summary>
    /// Replaces occurrances of src="whatever.jpg" with src=""
    /// </summary>
    public override void Write(byte[] buffer, int offset, int count)
    {
        Encoding enc = HttpContext.Current.Response.ContentEncoding;
        string strHtml = enc.GetString(buffer, offset, count);
        strHtml =
Regex.Replace(strHtml, "src=\"\w.*\"", "src=""",
                      RegexOptions
.IgnoreCase);
        byte[] bHtml = enc.GetBytes(strHtml);
        strmOutput.Write(bHtml, 0, strHtml.Length);
    }
}

The code should be improved to support other types of image inclusion such as table background and the CSS background url tag. Also, HTML does not dictate that inverted commas need to delimit tag parameters, apostrophes, or spaces can suffice. For this I would expect that I would use the HTML Agility pack to normalize the HTML to XML before parsing.

Oh, and I also set live two new sites www.listOfTaxis.info and www.listOfBanks.info (the DNS mightn’t propogate until monday morning)

 

Categories: Uncategorized

set IDENTITY_INSERT is connection specific?

Just noticed a glitch in SQL server 2000. Which is quite ironic, since this blog is more than likely hosted on SQL 2000, being with Msn spaces.

use Database1

set IDENTITY_INSERT Database1.dbo.operator on

use Database2

set IDENTITY_INSERT Database2.dbo.operator on

IDENTITY_INSERT is already ON for table Database1.dbo.Operator’. Cannot perform SET operation for table Database2.dbo.operator’.

When I opened a new connection to server, it allowed me set the identity insert on database2. (the names of the tables involved have been changed for client privacy). So does this mean that SET variables are connection specific, and that the IDENTITY_INSERT variable can only be applied to one table at a time? – I can’t really think of a reason for making a multi-table insert requiring idenity inserts within the same connection, but I have to complain about something….

Anyway, just bought myself a new laptop today, off micromart, After my last one died when I spilt coffee on it!. If anybody would like a dead toshiba laptop for spare parts or to repair it, please post a comment on this blog, and I’ll get back to you.

 

 

 

Categories: Uncategorized