Archive for April, 2026

Taiwan motorcycle plate lookup via #API

Our vehicle API network expands into Taiwan with full motorcycle registration lookups — make, age, engine size, and emissions test history, all in one call.


Taiwan has one of the highest motorcycle densities in the world, with over 14 million registered two-wheelers on its roads. Today, we’re pleased to announce that the /CheckTaiwan endpoint is live — bringing motorcycle registration data from Taiwan’s national vehicle database into our global API network.

What the endpoint returns

A single call to /CheckTaiwan with a Taiwanese plate number returns structured vehicle data covering identity, registration history, and emissions compliance records:

  • Make — Manufacturer name (Chinese & romanised)
  • Age / Registration year — Year of manufacture and license issue date
  • Engine size — Displacement in cc and engine cycle type
  • Inspection records — Full emissions test history with HC, CO, and CO₂ readings

Sample lookup: MWN-0076

Here’s what a real response looks like for a 2018 Kymco (光陽) motorcycle:

{
  "Description": "光陽",
  "RegistrationYear": "2018",
  "CarMake": { "CurrentTextValue": "光陽" },
  "EngineSize": "149",
  "ManufactureDate": "01/08/2018",
  "LicenseIssueDate": "05/09/2018",
  "EngineCycle": "四行程",
  "TestRecords": [
    {
      "LicensePlate": "MWN-0076",
      "InspectionType": "定期檢驗",
      "HC_ppm": "102",
      "CO_pct": "0.1",
      "CO2_pct": "14.8",
      "Result": "合格",
      "TestDate": "20240822"
    },
    {
      "LicensePlate": "MWN-0076",
      "InspectionType": "定期檢驗",
      "HC_ppm": "315",
      "CO_pct": "0.1",
      "CO2_pct": "14.9",
      "Result": "合格",
      "TestDate": "20240718"
    }
  ]
}

The TestRecords array is particularly valuable — it provides a complete emissions test history, with pass/fail status (合格 = pass), hydrocarbon and carbon monoxide readings, and the date and type of each inspection. This supports fleet compliance monitoring, insurance underwriting, and second-hand vehicle verification use cases.
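In C#, the response can be mapped onto plain types with System.Text.Json. A minimal sketch covering a few of the fields — the class shapes here are inferred from the sample above, not taken from official documentation:

```csharp
using System.Text.Json;

// Shapes inferred from the sample response; fields not needed here are
// omitted, and the deserializer ignores extra JSON properties by default.
public class TestRecord
{
    public string LicensePlate { get; set; }
    public string InspectionType { get; set; }
    public string Result { get; set; }
    public string TestDate { get; set; }
}

public class TaiwanVehicle
{
    public string Description { get; set; }
    public string RegistrationYear { get; set; }
    public string EngineSize { get; set; }
    public TestRecord[] TestRecords { get; set; }
}

public static class TaiwanLookup
{
    public static TaiwanVehicle Parse(string json) =>
        JsonSerializer.Deserialize<TaiwanVehicle>(json);
}
```

Property names match the JSON keys exactly, so no attribute annotations are needed.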

API endpoint

The endpoint is live now at:

https://www.chepaiapi.tw/api/reg.asmx?op=CheckTaiwan

Full documentation and interactive testing are available at chepaiapi.tw.

Expanding Chinese-language coverage

This launch also deepens our Chinese-language vehicle data coverage. Alongside Taiwan, our mainland China vehicle lookup service at chepaiapi.cn continues to serve customers requiring PRC plate data — together forming a comprehensive Chinese-language API offering across both sides of the strait.

Use cases

The Taiwan endpoint is well-suited to:

  • Insurers pricing two-wheeler policies
  • Logistics platforms operating scooter fleets
  • Used vehicle marketplaces verifying provenance
  • KYC and compliance workflows touching Taiwanese vehicle assets

The inclusion of emissions test records is a differentiator that goes beyond simple registration confirmation — providing genuine due diligence depth for any platform that needs it.

Taiwan joins our network of 55+ country vehicle lookup APIs. We’ll continue expanding coverage across Asia Pacific throughout 2026.

Visit chepaiapi.tw to get started →


Batch AI Processing: Why Multithreading is the Wrong Instinct

When developers first encounter a large-scale AI classification job — say, two million records that each need to be sent to an LLM for analysis — the instinct is immediately familiar: spin up threads, parallelise the work, saturate the API. It’s the same pattern that works for database processing, file I/O, HTTP scraping. More threads, more throughput.

With LLM APIs, that instinct leads you straight into a wall. And the wall has a name: TPM.


The Problem with Multithreading LLM Calls

Most LLM APIs — OpenAI included — impose a Tokens Per Minute (TPM) limit. This is a rolling window, not a per-request limit. Every token you send in a prompt, and every token the model returns, counts against it.

The naive multithreaded approach burns through this budget in a way that’s both wasteful and hard to control:

The system prompt repeats on every request. If your prompt is 700 tokens and you’re running 20 threads firing one request each, you’re spending 14,000 tokens per second just on prompt overhead — before the model has classified a single record. With a 200,000 TPM limit, that pace exhausts a full minute’s budget in roughly 14 seconds.

Burst behaviour triggers rate limits unpredictably. The TPM limit is a rolling window. Twenty threads firing simultaneously create a spike that can exceed the per-minute budget in seconds, even if your average rate would be well within limits. The API returns 429 errors, your retry logic kicks in, those retries themselves consume tokens, and the situation compounds.

Thread count is a blunt instrument. Dialling concurrency up and down doesn’t map cleanly to token consumption because request latency varies. A batch that takes 500ms doesn’t consume the same tokens as one that takes 1,500ms, but both hold a thread slot for their duration.


The Better Model: Semantic Batching

The insight that changes everything is this: the system prompt is a fixed overhead, and you should amortise it across as many classifications as possible per API call.

Instead of:

Thread 1: [system prompt 700 tokens] + [address 1: 15 tokens] → [result: 15 tokens]
Thread 2: [system prompt 700 tokens] + [address 2: 15 tokens] → [result: 15 tokens]
...× 20 threads
Total: 14,000 tokens for 20 classifications

You send:

[system prompt 700 tokens] + [addresses 1-20: 300 tokens] → [results 1-20: 100 tokens]
Total: 1,100 tokens for 20 classifications

That’s a 12× reduction in token consumption for the same work. Suddenly your 200,000 TPM budget — which could only sustain ~270 single-record requests per minute — supports ~3,600 classifications per minute. No extra threads needed.
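The packing step is trivial to implement. A sketch in the spirit of the article's C# examples (the type and method names are illustrative):

```csharp
using System.Collections.Generic;
using System.Text;

// Pack many records into one user message so the fixed system-prompt
// overhead is paid once per API call instead of once per record.
public static class BatchBuilder
{
    public static string BuildUserMessage(IReadOnlyList<(long Id, string Text)> records)
    {
        var sb = new StringBuilder();
        foreach (var (id, text) in records)
            sb.Append("id=").Append(id).Append(": ").AppendLine(text);
        return sb.ToString();
    }
}
```

The `id=` prefix on each line sets up the ID-keyed matching described next.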


Key Implementation Details

1. Include an ID in Both Request and Response

The most important correctness rule in batch processing is this: never rely on positional alignment.

If you send 20 addresses and ask the model to return 20 results, it might return 19. Now you don’t know which one it dropped. If you’re matching by position, records from item 7 onwards get silently misclassified.

The fix is to include a unique identifier in both directions:

User message:
id=548033: product X
id=548034: product Y
...
System prompt format instruction:
Reply ONLY with a JSON array. Format: [{"id":548033,"c":"E"}, ...]

Now you build a dictionary from the response keyed on id, and match each input item explicitly. A missing id means that specific record gets skipped and retried on the next run. Everything else classifies correctly regardless of what the model dropped.
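That matching step can be sketched in C# as follows (type and method names are illustrative):

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

public class BatchItem
{
    public long id { get; set; }   // matches the short JSON keys in the format above
    public string c { get; set; }
}

public static class ResultMatcher
{
    // Key the model's reply on id; anything it dropped goes into Missing,
    // so that specific record can be retried on the next run.
    public static (Dictionary<long, string> Matched, List<long> Missing)
        Match(string jsonArray, IEnumerable<long> sentIds)
    {
        var matched = JsonSerializer.Deserialize<List<BatchItem>>(jsonArray)
                                    .ToDictionary(r => r.id, r => r.c);
        var missing = sentIds.Where(i => !matched.ContainsKey(i)).ToList();
        return (matched, missing);
    }
}
```

A dropped item surfaces as an entry in `Missing` rather than as a silent off-by-one across the rest of the batch.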

2. Resolve Labels Locally

The model doesn’t need to return the full label text. "Prime City Professionals" costs tokens on every response item. A single letter costs one token.

Keep a static dictionary in your code:

csharp

private static readonly Dictionary<string, string> Labels = new()
{
    { "A", "Prime Product" },
    { "B", "Budget Product" },
    // ...
};

The model returns "c":"A", you look up the label locally. This also eliminates a class of hallucination errors where the model invents a label name slightly different from your taxonomy.

Note: even "category" vs "c" matters at scale. In the OpenAI tokenizer, "category" costs more tokens than "c", and the key appears in every response item. Saving two tokens on each of 100,000 occurrences is 200,000 tokens — small, but free.

3. Track TPM with a Rolling Window, Not Concurrency

Rather than trying to infer safe concurrency from trial and error, measure what you’re actually consuming and throttle directly on that signal.

csharp

// On each successful response, record tokens used with a timestamp
tokenWindow.Enqueue((t: DateTime.UtcNow, tok: inputTokens + outputTokens));

// Before each request, prune entries older than 60 seconds and sum the rest
var cutoff = DateTime.UtcNow.AddSeconds(-60);
while (tokenWindow.Count > 0 && tokenWindow.Peek().t < cutoff)
    tokenWindow.Dequeue();
long tpmUsed = tokenWindow.Sum(x => x.tok);

// Throttle in graduated steps as usage approaches the limit
if (tpmUsed > tpmLimit * 0.98) Thread.Sleep(2000);
else if (tpmUsed > tpmLimit * 0.95) Thread.Sleep(800);
else if (tpmUsed > tpmLimit * 0.85) Thread.Sleep(300);

This gives you automatic, self-correcting throttling that responds to real consumption rather than guessing from thread counts. If a batch of records happens to have longer addresses, the window fills faster and the delay kicks in sooner. No manual tuning required.

4. Resumability via Cursor Pagination

For a job that takes hours or days, stopping and restarting must be safe and cheap. The key is two things working together:

Write results immediately after each batch, not at the end of a page. If you crash mid-page, you’ve lost one batch (20 records), not a thousand.

Use a NULL-check filter combined with cursor pagination. The query for unclassified records looks like:

sql

WHERE segment_category IS NULL AND id > {lastId} ORDER BY id LIMIT 1000

On restart, lastId resets to 0, but the IS NULL filter automatically skips everything already classified. The cursor (id > lastId) keeps the query fast on large tables — OFFSET pagination slows to a crawl at millions of rows because the database still has to scan all preceding rows to find the offset position.
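The restart behaviour can be sketched end to end. Here the SQL query is stood in for by an in-memory filter so the control flow is visible; the names are illustrative:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Row { public long Id; public string Category; }

public static class ResumableJob
{
    // Mirrors the SQL above: NULL-check filter plus id cursor, with results
    // written back immediately after each page rather than at the end.
    public static void Run(List<Row> table, int pageSize, Func<Row, string> classify)
    {
        long lastId = 0; // safe to reset on restart: classified rows are filtered out
        while (true)
        {
            var page = table
                .Where(r => r.Category == null && r.Id > lastId) // IS NULL + cursor
                .OrderBy(r => r.Id)
                .Take(pageSize)
                .ToList();
            if (page.Count == 0) break;      // nothing left unclassified
            foreach (var r in page)
                r.Category = classify(r);    // persist immediately in the real job
            lastId = page[^1].Id;            // advance the cursor
        }
    }
}
```

Killing the process at any point and calling `Run` again picks up exactly where it left off, because already-classified rows never match the filter.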

5. Handle Partial Batches Gracefully with Skip vs Error

Not all failures are equal. Distinguish between:

  • Error: something went wrong that warrants logging (HTTP 500, persistent 429 after retries, DB connection failure). These need attention.
  • Skip: the record wasn’t returned in this batch response. Leave it NULL in the database, it will be picked up automatically on the next run. No log noise needed.

This distinction keeps your error output meaningful. If every missing batch item logs as an error, a run with 0.1% skip rate produces thousands of error lines that mask real problems.
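One compact way to encode that split (the enum and names are illustrative, not from the original implementation):

```csharp
using System;
using System.Collections.Generic;

public enum Outcome { Saved, Skipped, Errored }

public static class FailureHandling
{
    // Skips are silent and retried on the next run; only genuine failures are logged.
    public static Outcome Handle(long id, IReadOnlyDictionary<long, string> matched,
                                 Action<long, string> save, Action<string> logError)
    {
        if (!matched.TryGetValue(id, out var code))
            return Outcome.Skipped;          // dropped by the model: leave NULL
        try
        {
            save(id, code);
            return Outcome.Saved;
        }
        catch (Exception ex)                 // e.g. DB failure: warrants attention
        {
            logError($"id={id}: {ex.Message}");
            return Outcome.Errored;
        }
    }
}
```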


The Result

What started as a job estimated at 16–67 days with a naive multithreaded approach settled to around 7 hours using semantic batching — processing two million records through a rate-limited API without a single configuration change to the API account.

The throughput improvement didn’t come from more concurrency. It came from being smarter about what gets sent in each request.

The general principle applies beyond LLM classification: whenever you have a fixed overhead per API call (authentication, context, schema), the correct optimisation is to amortise that overhead across as much work as possible per call, not to fire more calls in parallel.


Summary of Patterns

| Pattern | Naive approach | Better approach |
| --- | --- | --- |
| Throughput | More threads | Larger batches |
| Rate limiting | Catch 429, retry | Track TPM rolling window, throttle proactively |
| Result matching | Positional array index | ID-keyed dictionary |
| Label resolution | Ask model for full text | Return code, resolve locally |
| Resumability | Track page offset | NULL-check filter + cursor pagination |
| Failure handling | All failures are errors | Skip vs Error distinction |
| DB resilience | Crash on connection drop | Exponential backoff retry |

The instinct to parallelise is correct in principle — you want to keep the API busy. But with token-limited LLM APIs, the right parallelism is within a single request, not across many simultaneous ones.