Reading an unknown database format (.BDB)

I was given the task of extracting data from an unknown database format with the extension .BDB. From research on the Internet, I narrowed this down to two candidates: Microsoft Works or Berkeley DB.

I tried importing the file into Works, but only got a cryptic "BTLDB2.0" as output. I then downloaded Berkeley DB from SleepyCat.com and found, to my disappointment, no binary distribution, just a bulk of C++ code. When I tried to compile it in VS 2005, it gave errors like "failure during conversion to COFF: file invalid or corrupt" and "duplicate resource. type: type, name: name, language: language, flags: flags, size: size", which I didn’t have the time or patience to fix. I also tried their Java version, but that gave an error too.
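A quick way to see what a string like "BTLDB2.0" corresponds to is to dump the first few bytes of the file as hex next to their printable characters. The sketch below is only an illustration: `HeaderPeek` and `Describe` are made-up names, and the post does not say whether "BTLDB2.0" is actually the file's opening signature.

```csharp
using System;
using System.Text;

static class HeaderPeek
{
    // Formats the first `count` bytes as hex, followed by a printable-ASCII view.
    // Bytes outside the printable range 32..126 are shown as '.'.
    public static string Describe(byte[] data, int count)
    {
        int n = Math.Min(count, data.Length);
        StringBuilder hex = new StringBuilder();
        StringBuilder text = new StringBuilder();
        for (int i = 0; i < n; i++)
        {
            hex.Append(data[i].ToString("X2")).Append(' ');
            text.Append(data[i] >= 32 && data[i] < 127 ? (char)data[i] : '.');
        }
        return hex.ToString().TrimEnd() + "  |" + text + "|";
    }
}
```

For example, `Console.WriteLine(HeaderPeek.Describe(System.IO.File.ReadAllBytes(inFile), 16));` would print the first 16 bytes of the input file. Reading the whole file just to look at its header is wasteful for large files, but it keeps the sketch short.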

I then read that MySQL supported Berkeley databases, so I downloaded MySQL and used the mysqlimport utility to insert the file into a table. Unfortunately, the table has to exist already for mysqlimport to work; apparently it doesn’t import the table schema. I created a generic table with just one column of type blob. The import succeeded, but the data was as vague as if I had opened it in Notepad.

I then decided to write a C# app to read through the file, splitting out any English phrases (that is, runs of more than 3 bytes with ASCII values in the range 32 to 127, or 13 for carriage return), thus:

    // Requires: using System; using System.Collections; using System.IO; using System.Text;
    // isAlphaNumeric() tests for a printable ASCII byte (32 to 127, or 13).
    int iRead = 0;
    byte bRead = 0;
    ArrayList alStringCollection = new ArrayList();
    ArrayList alByteCollection = new ArrayList();
    bool isReadingWord = false;
    string strWord = "";
    byte[] bWord;
    int iWordCounter = 0;
    frmUI.tbStatus.Text += "\r\nThread started";
    FileStream fsIn = new FileStream(inFile, FileMode.Open);
    FileStream fsOut = new FileStream(outFile, FileMode.Create);
    StreamWriter swOut = new StreamWriter(fsOut);
    while (true)
    {
        iRead = fsIn.ReadByte();
        if (iRead == -1) break; // end of file; a run still open here is discarded
        bRead = Convert.ToByte(iRead);
        if (isAlphaNumeric(bRead))
        {
            // Printable byte: extend the current run
            isReadingWord = true;
            alByteCollection.Add(bRead);
        }
        else if (isReadingWord)
        {
            // The run has ended: keep it only if it is longer than 3 characters
            if (alByteCollection.Count > 3)
            {
                bWord = (byte[])alByteCollection.ToArray(typeof(byte));
                strWord = Encoding.UTF8.GetString(bWord);
                alStringCollection.Add(strWord);
                iWordCounter++;
                swOut.WriteLine("[" + iWordCounter.ToString() + "] " + strWord);
            }
            alByteCollection = new ArrayList();
            isReadingWord = false;
        }
    }
    frmUI.tbStatus.Text += "\r\nRead " + alStringCollection.Count.ToString() + " words";
    swOut.Close(); // flushes the buffered text and closes fsOut as well
    fsIn.Close();
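The listing calls isAlphaNumeric() without showing it. A minimal sketch of what it might look like, reconstructed only from the description above (bytes in the range 32 to 127, plus 13), follows. The wrapper class `ByteFilter` is a made-up name for illustration; in the original app the method presumably sat in the same class as the loop.

```csharp
using System;

static class ByteFilter
{
    // Hypothetical reconstruction: the post never shows this helper.
    // Returns true for bytes 32..127 and for 13 (carriage return),
    // which is the range the text describes as an "English phrase" character.
    public static bool isAlphaNumeric(byte b)
    {
        return (b >= 32 && b <= 127) || b == 13;
    }
}
```

Despite its name, this test accepts spaces and punctuation as well as letters and digits, which is what lets whole phrases survive as a single run. Note that 127 is the DEL control character; narrowing the upper bound to 126 would exclude it.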

I then opened the resulting file in Notepad and tried to figure out the schema from the debug info. Luckily, with a little study, I got it for the specific database I was working on. Unfortunately, I can’t give an exact schema of BDB files here, since I don’t 100% understand them, but it serves as an interesting example of reading a non-standard database.

The result of my work should soon be visible on www.listofpubs.info

 
