Archive

Archive for December, 2024

Storing data directly in GPU memory with #CLOO in C#

Although I’m not entirely sure of a practical application for this, this application, written in C# with CLOO, can store arbitrary data in GPU memory. In this case, I’m picking a large file off the disk and putting it in GPU memory.

In the case of this NVIDIA GeForce card, the memory is dedicated to the GPU and is not ordinarily shared with the system.

TL;DR: The GitHub repo is here – https://github.com/infiniteloopltd/GpuMemoryDemo

The core program is here:

using System;
using System.IO;
using System.Linq;
using System.Runtime.InteropServices;
using Cloo;

class Program
{
    static void Main()
    {
        // Select the first OpenCL platform and the first GPU device on it
        var platform = ComputePlatform.Platforms[0];
        var device = platform.Devices.FirstOrDefault(d => d.Type.HasFlag(ComputeDeviceTypes.Gpu));
        var context = new ComputeContext(ComputeDeviceTypes.Gpu, new ComputeContextPropertyList(platform), null, IntPtr.Zero);
        var queue = new ComputeCommandQueue(context, device, ComputeCommandQueueFlags.None);

        const string largeFilePath = "C:\\Users\\fiach\\Downloads\\datagrip-2024.3.exe";
        var contents = File.ReadAllBytes(largeFilePath);

        // Copy the file contents into a buffer in GPU memory...
        var clBuffer = Store(contents, context, queue);

        // ...and read them back again to prove the round trip works
        var readBackBytes = Retrieve(contents.Length, clBuffer, queue);

        Console.WriteLine($"Original String: {contents[0]}");
        Console.WriteLine($"Read Back String: {readBackBytes[0]}");
        Console.WriteLine($"Strings Match: {contents[0] == readBackBytes[0]}");

        // Memory leak here: the unmanaged host buffers allocated in Store and
        // Retrieve are never freed. Note that readBackPtr and buffer are local
        // to those methods, so the calls below would need to live inside Store
        // and Retrieve rather than here.
        //Marshal.FreeHGlobal(readBackPtr);
        //Marshal.FreeHGlobal(buffer);
    }

    public static ComputeBuffer<byte> Store(byte[] stringBytes, ComputeContext context, ComputeCommandQueue queue)
    {
        // Allocate unmanaged host memory and copy the managed array into it
        var buffer = Marshal.AllocHGlobal(stringBytes.Length);
        Marshal.Copy(stringBytes, 0, buffer, stringBytes.Length);

        // Create an OpenCL buffer on the device and write the host memory to it (blocking write)
        var clBuffer = new ComputeBuffer<byte>(context, ComputeMemoryFlags.ReadWrite, stringBytes.Length);
        queue.Write(clBuffer, true, 0, stringBytes.Length, buffer, null);

        return clBuffer;
    }

    public static byte[] Retrieve(int size, ComputeBuffer<byte> clBuffer, ComputeCommandQueue queue)
    {
        // Allocate unmanaged host memory to receive the data from the device
        var readBackPtr = Marshal.AllocHGlobal(size);

        // Blocking read from the OpenCL buffer back into host memory
        queue.Read(clBuffer, true, 0, size, readBackPtr, null);

        // Copy from unmanaged host memory into a managed byte array
        var readBackBytes = new byte[size];
        Marshal.Copy(readBackPtr, readBackBytes, 0, size);

        return readBackBytes;
    }
}

Below, we’ll walk through a C# program that demonstrates using OpenCL to store data in, and retrieve it from, GPU memory, which can be useful in data-heavy applications. Here’s a breakdown of the code:

1. Setting Up OpenCL Context and Queue

The program begins by selecting the first available compute platform and choosing a GPU device from the platform:

var platform = ComputePlatform.Platforms[0];
var device = platform.Devices.FirstOrDefault(d => d.Type.HasFlag(ComputeDeviceTypes.Gpu));
var context = new ComputeContext(ComputeDeviceTypes.Gpu, new ComputeContextPropertyList(platform), null, IntPtr.Zero);
var queue = new ComputeCommandQueue(context, device, ComputeCommandQueueFlags.None);
  • ComputePlatform.Platforms[0]: Selects the first OpenCL platform on the machine (typically corresponds to a GPU vendor like NVIDIA or AMD).
  • platform.Devices.FirstOrDefault(...): Finds the first GPU device available on the platform (this returns null if no GPU is present – see the note after this list).
  • ComputeContext: Creates an OpenCL context for managing resources like buffers and command queues.
  • ComputeCommandQueue: Initializes a queue to manage commands that will be executed on the selected GPU.
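One caveat worth noting: the original program does not guard against FirstOrDefault returning null when the platform exposes no GPU device. A minimal sketch of such a check (an assumption for illustration, not code from the repo):

// Hypothetical guard, not in the original repo: fail fast if no GPU device exists.
if (device == null)
{
    Console.Error.WriteLine("No GPU device found on the selected OpenCL platform.");
    return;
}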

2. Reading a Large File into Memory

The program then loads the contents of a large file into a byte array:

const string largeFilePath = "C:\\Users\\fiach\\Downloads\\datagrip-2024.3.exe";
var contents = File.ReadAllBytes(largeFilePath);

This step reads the entire file into memory, which will later be uploaded to the GPU.

3. Storing Data on the GPU

The Store method is responsible for transferring the byte array to the GPU:

var clBuffer = Store(contents, context, queue);
  • It allocates unmanaged host memory using Marshal.AllocHGlobal to hold the byte array.
  • The byte array is then copied into this allocated buffer.
  • A ComputeBuffer<byte> is created on the GPU, and the byte array is written to it using the Write method of the ComputeCommandQueue.

Note: The Store method uses Marshal.Copy to copy the managed byte array into the unmanaged host buffer; the actual transfer into GPU memory happens when queue.Write is called on the ComputeBuffer.

4. Retrieving Data from the GPU

The Retrieve method is responsible for reading the data back from the GPU into a byte array:

var readBackBytes = Retrieve(contents.Length, clBuffer, queue);
  • The method allocates memory using Marshal.AllocHGlobal to hold the data read from the GPU.
  • The Read method of the ComputeCommandQueue is used to fetch the data from the GPU buffer back into the allocated memory.
  • The memory is then copied into a managed byte array (readBackBytes).

5. Verifying the Data Integrity

The program prints the first byte of the original and retrieved byte arrays, comparing them to verify if the data was correctly transferred and retrieved:

Console.WriteLine($"Original String: {contents[0]}");
Console.WriteLine($"Read Back String: {readBackBytes[0]}");
Console.WriteLine($"Strings Match: {contents[0] == readBackBytes[0]}");

This checks whether the first byte of the file content remains intact after being transferred to and retrieved from the GPU.
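Of course, comparing a single byte is a fairly weak check. A fuller comparison of the two arrays (a small sketch using LINQ’s SequenceEqual, which is already available via the existing using System.Linq) would be:

Console.WriteLine($"Full contents match: {contents.SequenceEqual(readBackBytes)}");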

6. Memory Management

The program has a commented-out section for freeing unmanaged memory:

//Marshal.FreeHGlobal(readBackPtr);
//Marshal.FreeHGlobal(buffer);

These Marshal.FreeHGlobal calls are what would release the unmanaged buffers allocated with Marshal.AllocHGlobal, but they are commented out, and the pointers they refer to are local to Store and Retrieve, so the program currently leaks unmanaged host memory, leaving room for improvement.
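One way to address this (a minimal sketch rather than the repo’s implementation) is to free the host buffer inside Store itself, once the blocking write has finished:

public static ComputeBuffer<byte> Store(byte[] stringBytes, ComputeContext context, ComputeCommandQueue queue)
{
    var buffer = Marshal.AllocHGlobal(stringBytes.Length);
    try
    {
        Marshal.Copy(stringBytes, 0, buffer, stringBytes.Length);
        var clBuffer = new ComputeBuffer<byte>(context, ComputeMemoryFlags.ReadWrite, stringBytes.Length);
        queue.Write(clBuffer, true, 0, stringBytes.Length, buffer, null);
        return clBuffer;
    }
    finally
    {
        // Safe to free here: the write above is blocking, so the data has
        // already been copied to the device by the time we reach this point.
        Marshal.FreeHGlobal(buffer);
    }
}

Retrieve could free readBackPtr in the same way, after the blocking Read and the Marshal.Copy back into the managed array.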

Potential Improvements and Issues

  • Memory Leaks: The program never frees the unmanaged memory allocated via Marshal.AllocHGlobal, so it leaks host memory for the lifetime of the process (see the sketch in the memory management section above).
  • Error Handling: The program lacks error handling for situations like missing GPU devices or file read errors.
  • Large File Handling: Reading the whole file into a byte array and a single unmanaged buffer may run into memory constraints for very large files; chunked transfers would be more efficient (see the sketch after this list).
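On that last point, a chunked upload might look something like the following – a minimal sketch that streams the file in fixed-size pieces; StoreChunked and the 64 MB chunk size are illustrative assumptions, not part of the repo:

public static ComputeBuffer<byte> StoreChunked(string path, ComputeContext context, ComputeCommandQueue queue)
{
    const int chunkSize = 64 * 1024 * 1024; // illustrative 64 MB chunk
    var fileLength = new FileInfo(path).Length;

    // One device buffer big enough for the whole file
    var clBuffer = new ComputeBuffer<byte>(context, ComputeMemoryFlags.ReadWrite, fileLength);

    var managedChunk = new byte[chunkSize];
    var unmanagedChunk = Marshal.AllocHGlobal(chunkSize);
    try
    {
        using (var stream = File.OpenRead(path))
        {
            long offset = 0;
            int bytesRead;
            while ((bytesRead = stream.Read(managedChunk, 0, chunkSize)) > 0)
            {
                // Copy the chunk into unmanaged host memory, then blocking-write
                // it to the device buffer at the current offset
                Marshal.Copy(managedChunk, 0, unmanagedChunk, bytesRead);
                queue.Write(clBuffer, true, offset, bytesRead, unmanagedChunk, null);
                offset += bytesRead;
            }
        }
    }
    finally
    {
        Marshal.FreeHGlobal(unmanagedChunk);
    }

    return clBuffer;
}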

In summary, this program demonstrates how to work with OpenCL in C# to transfer data between the host system and the GPU. While it shows the core functionality, handling memory leaks and improving error management should be considered for a production-level solution.

Cost-Effective SQL Server Database Restore on Microsoft #Azure: Using SMB Shares

1) Motivation Behind the Process

Managing costs efficiently on Microsoft Azure is a crucial aspect for many businesses, especially when it comes to managing resources like SQL Server databases. One area where I found significant savings was in the restoration of SQL Server databases.

Traditionally, to restore databases, I was using a managed disk. The restore process involved downloading a ZIP file, unzipping it to a .bak file, and then restoring it to the main OS disk. However, there was a significant issue with this setup: the cost of the managed disk.

Even when database restores happened only once every six months, I was still paying for the full capacity of the managed disk – 500GB of provisioned space. This meant I was paying for unused storage space for extended periods, which was a significant waste of resources and money.

To tackle this issue, I switched to using Azure Storage Accounts with file shares (standard, not premium), which provided a more cost-effective approach. By restoring the database from an SMB share, I could pay only for the data usage, rather than paying for provisioned capacity on a managed disk. Additionally, I could delete the ZIP and BAK files after the restore process was complete, further optimizing storage costs.

2) Issues and Solutions

While the transition to using an Azure Storage Account for database restores was a great move in terms of cost reduction, it wasn’t without its challenges. One of the main hurdles I encountered during this process was SQLCMD reporting that the .bak file did not exist, even though it clearly did.

Symptoms of the Problem

The error message was:

Msg 3201, Level 16, State 2, Server [ServerName], Line 1
Cannot open backup device '\\<UNC Path>\Backups\GeneralPurpose.bak'. Operating system error 3(The system cannot find the path specified.)
Msg 3013, Level 16, State 1, Server [ServerName], Line 1
RESTORE DATABASE is terminating abnormally.

This was perplexing because I had confirmed that the .bak file existed at the UNC path and that the path was accessible from my system.

Diagnosis

To diagnose the issue, I started by enabling xp_cmdshell in SQL Server. This extended stored procedure allows the execution of operating system commands, which is very helpful for troubleshooting such scenarios.

First, I enabled xp_cmdshell by running the following commands:

-- Enable advanced options
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

-- Enable xp_cmdshell
EXEC sp_configure 'xp_cmdshell', 1;
RECONFIGURE;

Once xp_cmdshell was enabled, I ran a simple DIR command to verify if SQL Server could access the backup file share:

EXEC xp_cmdshell 'dir \\<UNC Path>\Backups\GeneralPurpose.bak';

The result indicated that the SQL Server service account did not have proper access to the SMB share, and that’s why it couldn’t find the .bak file.

Solution

To resolve this issue, I had to map the network share explicitly within SQL Server using the net use command, which allows SQL Server to authenticate to the SMB share.

Here’s the solution I implemented:

EXEC xp_cmdshell 'net use Z: \\<UNC Path> /user:localhost\<user> <PASSWORD>';

Explanation

  1. Mapping the Network Drive:
    The net use command maps the SMB share to a local drive letter (in this case, Z:), which makes it accessible to SQL Server.
  2. Authentication:
    The /user: flag specifies the username and password needed to authenticate to the share. In my case, I used an account (e.g., localhost\fsausse) with the correct credentials.
  3. Accessing the Share:
    After mapping the network drive, I could access the .bak file on the SMB share via its mapped path (Z:), and SQL Server could then restore the database without the “file not found” error (a sketch of the restore command follows below).
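For completeness, the restore itself would then reference the mapped drive. The database name and WITH options below are illustrative assumptions rather than the exact command from my setup:

-- Illustrative restore from the mapped SMB share (database name and options are assumptions)
RESTORE DATABASE [GeneralPurpose]
FROM DISK = 'Z:\Backups\GeneralPurpose.bak'
WITH REPLACE, STATS = 10;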

Once the restore was completed, I could remove the drive mapping with:

EXEC xp_cmdshell 'net use Z: /delete';

This approach ensured that SQL Server had the necessary permissions to access the file on the SMB share, and I could restore my database efficiently, only paying for the data usage on Azure Storage.

Conclusion

By transitioning from a managed disk to an SMB share on Azure Storage, I significantly reduced my costs during database restores. The issue with SQL Server not finding the .bak file was quickly diagnosed and resolved by enabling xp_cmdshell, mapping the network share, and ensuring proper authentication. This process allows me to restore databases in a more cost-effective manner, paying only for the data used during the restore, and avoiding unnecessary storage costs between restores.

For businesses looking to optimize Azure costs, this method provides an efficient, scalable solution for managing large database backups with minimal overhead.