The Hidden Cost of ORDER BY NEWID()
Fetching a random row from a table is a surprisingly common requirement — random banner ads, sample data, rotating API credentials. The instinctive solution in SQL Server is elegant-looking but conceals a serious performance trap.
-- Looks innocent. Isn't.
SELECT TOP 1 * FROM LARGE_TABLE ORDER BY NEWID()
What SQL Server actually does
NEWID() generates a fresh GUID for every single row in the table. SQL Server must then sort the entire result set by those GUIDs before it can hand back the top one. On a table with a million rows you are generating a million GUIDs, sorting a million rows, and discarding 999,999 of them.
The problem: On large tables, ORDER BY NEWID() performs a full table scan and a full sort — O(n log n) work — regardless of how many rows you need. It cannot use any index for ordering.
A faster alternative: seek, don’t sort
The key insight is to convert the “random sort” into a “random seek”. If we can generate a random Id value cheaply and then let the clustered index do the work, we avoid scanning the table entirely.
DECLARE @Min INT = (SELECT MIN(Id) FROM LARGE_TABLE)
DECLARE @Max INT = (SELECT MAX(Id) FROM LARGE_TABLE)

SELECT TOP 1 *
FROM LARGE_TABLE
WHERE Id >= @Min + ABS(CHECKSUM(NEWID()) % (@Max - @Min + 1))
ORDER BY Id ASC
MAX(Id) and MIN(Id) are single index seeks on the primary key. CHECKSUM(NEWID()) generates a random integer without sorting anything. The WHERE Id >= clause then performs a single index seek from that point forward, and ORDER BY Id ASC TOP 1 picks up the very next row.
The result: two index seeks to get the range and one more to find the row. Each seek is O(log n) on the clustered index, so the cost stays effectively constant regardless of table size.
Performance at a glance
| Approach | Reads | Sort | Scales with table size? |
|---|---|---|---|
| ORDER BY NEWID() | Full scan | Full sort | O(n log n) |
| CHECKSUM seek | 3 index seeks | None | O(log n) |
Three caveats to know
1. Id gaps cause mild bias. If rows have been deleted, gaps in the Id sequence mean rows immediately after a gap are slightly more likely to be selected. For most use cases — sampling, rotation, A/B testing — this is an acceptable trade-off.
2. Ids may not start at 1. This is why we use @Min rather than hardcoding zero. If your identity seed started at 1000, CHECKSUM(NEWID()) % MAX(Id) could generate values 0–999, all smaller than every Id; the Id >= filter would then match the entire table and you'd disproportionately get the first row.
3. CHECKSUM can return INT_MIN. ABS(-2147483648) raises an arithmetic overflow error in SQL Server, because +2147483648 does not fit in an INT. The fix is to apply the modulo before the ABS, keeping the intermediate value safely within range.
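To make the third caveat concrete, here is a small illustration (the literal 1000 is just a demonstration modulus, not a value from the query above):

```sql
-- Unsafe: ABS() is applied to the raw CHECKSUM value. If
-- CHECKSUM(NEWID()) happens to return -2147483648 (INT_MIN),
-- ABS() cannot represent +2147483648 as an INT and raises an
-- arithmetic overflow error.
-- SELECT ABS(CHECKSUM(NEWID())) % 1000;

-- Safe: the modulo runs first, so the value passed to ABS()
-- is already within the range (-1000, 1000).
SELECT ABS(CHECKSUM(NEWID()) % 1000) AS RandomOffset;
```

The failure is rare (one value in about 4.3 billion), which makes it exactly the kind of bug that passes testing and surfaces in production.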
When you don’t need randomness at all
For round-robin rotation across a fixed set of rows — such as alternating between API credentials or cookie sessions — true randomness is unnecessary overhead. A deterministic slot based on the current second is even cheaper:
-- Rotates across N accounts, one per second, no writes required
WHERE Slot = DATEPART(SECOND, GETUTCDATE()) % TotalAccounts
This resolves to a constant integer comparison — effectively a single index seek — and scales to any number of accounts automatically. No tracking table, no writes, no contention.
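Concretely, the rotation might look like this; the ApiAccounts table and its Slot column are hypothetical, with Slot assumed to be a contiguous 0-based index:

```sql
-- Pick the account whose slot matches the current UTC second,
-- wrapped around the total number of accounts.
SELECT TOP 1 *
FROM ApiAccounts
WHERE Slot = DATEPART(SECOND, GETUTCDATE())
           % (SELECT COUNT(*) FROM ApiAccounts);
```

One design note: DATEPART(SECOND, ...) only spans 0–59, so with more than 60 accounts some slots would never be selected; a minute- or millisecond-based counter would be needed for larger sets.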
The takeaway: whenever you reach for ORDER BY NEWID(), ask whether you actually need true randomness or just approximate distribution. In most production scenarios, a cheap seek beats an expensive sort by several orders of magnitude.
Cost-Effective SQL Server Database Restore on Microsoft Azure: Using SMB Shares

1) Motivation Behind the Process
Managing costs efficiently on Microsoft Azure is a crucial aspect for many businesses, especially when it comes to managing resources like SQL Server databases. One area where I found significant savings was in the restoration of SQL Server databases.
Traditionally, I restored databases using a managed disk. The process involved downloading a ZIP file, unzipping it to a .bak file, and then restoring it to the main OS disk. However, there was a significant issue with this setup: the cost of the managed disk.
Even though database restores happened only once every six months, I was still paying for the full capacity of the managed disk: 500GB of provisioned space. That meant paying for unused storage for extended periods, a significant waste of resources and money.
To tackle this issue, I switched to using Azure Storage Accounts with file shares (standard, not premium), which provided a more cost-effective approach. By restoring the database from an SMB share, I could pay only for the data usage, rather than paying for provisioned capacity on a managed disk. Additionally, I could delete the ZIP and BAK files after the restore process was complete, further optimizing storage costs.
2) Issues and Solutions
While the transition to using an Azure Storage Account for database restores was a great move in terms of cost reduction, it wasn’t without its challenges. One of the main hurdles I encountered during this process was SQLCMD reporting that the .bak file did not exist, even though it clearly did.
Symptoms of the Problem
The error message was:
Msg 3201, Level 16, State 2, Server [ServerName], Line 1
Cannot open backup device '\\<UNC Path>\Backups\GeneralPurpose.bak'. Operating system error 3(The system cannot find the path specified.)
Msg 3013, Level 16, State 1, Server [ServerName], Line 1
RESTORE DATABASE is terminating abnormally.
This was perplexing because I had confirmed that the .bak file existed at the UNC path and that the path was accessible from my system.
Diagnosis
To diagnose the issue, I started by enabling xp_cmdshell in SQL Server. This extended stored procedure allows the execution of operating system commands, which is very helpful for troubleshooting such scenarios.
First, I enabled xp_cmdshell by running the following commands:
-- Enable advanced options
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
-- Enable xp_cmdshell
EXEC sp_configure 'xp_cmdshell', 1;
RECONFIGURE;
Once xp_cmdshell was enabled, I ran a simple DIR command to verify if SQL Server could access the backup file share:
EXEC xp_cmdshell 'dir \\<UNC Path>\Backups\GeneralPurpose.bak';
The result indicated that the SQL Server service account did not have proper access to the SMB share, and that’s why it couldn’t find the .bak file.
Solution
To resolve this issue, I had to map the network share explicitly within SQL Server using the net use command, which allows SQL Server to authenticate to the SMB share.
Here’s the solution I implemented:
EXEC xp_cmdshell 'net use Z: \\<UNC Path> /user:localhost\<user> <PASSWORD>';
Explanation
- Mapping the Network Drive: The net use command maps the SMB share to a local drive letter (in this case, Z:), which makes it accessible to SQL Server.
- Authentication: The /user: flag specifies the username and password needed to authenticate to the share. In my case, I used an account (e.g., localhost\fsausse) with the correct credentials.
- Accessing the Share: After mapping the network drive, I could access the .bak file on the SMB share via its mapped path (Z:). SQL Server would then be able to restore the database without the "file not found" error.
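With the drive mapped, the restore itself runs against Z:. This is a sketch only; the database name and WITH options are assumptions based on the .bak filename in the error message, not the exact command from my environment:

```sql
-- Restore from the mapped SMB share (Z: was mapped via net use).
-- WITH REPLACE overwrites an existing database of the same name;
-- STATS = 10 reports progress every 10 percent.
RESTORE DATABASE GeneralPurpose
FROM DISK = 'Z:\Backups\GeneralPurpose.bak'
WITH REPLACE, STATS = 10;
```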
Once the restore was completed, I could remove the drive mapping with:
EXEC xp_cmdshell 'net use Z: /delete';
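Since xp_cmdshell is a security-sensitive feature, it is also worth turning it back off once the restore is done, mirroring the enable steps above:

```sql
-- Disable xp_cmdshell and hide advanced options again
EXEC sp_configure 'xp_cmdshell', 0;
RECONFIGURE;
EXEC sp_configure 'show advanced options', 0;
RECONFIGURE;
```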
This approach ensured that SQL Server had the necessary permissions to access the file on the SMB share, and I could restore my database efficiently, only paying for the data usage on Azure Storage.
Conclusion
By transitioning from a managed disk to an SMB share on Azure Storage, I significantly reduced my costs during database restores. The issue with SQL Server not finding the .bak file was quickly diagnosed and resolved by enabling xp_cmdshell, mapping the network share, and ensuring proper authentication. This process allows me to restore databases in a more cost-effective manner, paying only for the data used during the restore, and avoiding unnecessary storage costs between restores.
For businesses looking to optimize Azure costs, this method provides an efficient, scalable solution for managing large database backups with minimal overhead.