Category Archives: Artificial Intelligence

Gameplay & AI: A Demonstration & a Dissertation

Click on the above to launch a YouTube video about General Staff gameplay & AI.

This feels like a propitious moment; at least I’m drinking some decent scotch. I’ve got the AI that I’ve written ad nauseam about (links: papers, thesis) hooked up to the General Staff engine. This was something that, on paper, was supposed to be pretty easy but in the real world took far too long.

So, the short version is: if you want to take a look at what the General Staff AI engine can do, please watch the video. It is mostly an offensive monster. It knows how to attack. It looks for open flanks first, then for weak points in the line. Defense is more difficult, and there is still more work to do.

Also, we need more user-created scenarios to test the AI on.

N.B. We don’t need more scenarios to train the AI. That’s not how MATE (Machine Analysis of Tactical Environments) works. It is a case-based reasoning AI. It is not a Large Language Model (LLM). It is not Grok, or ChatGPT, or Claude. Because it reasons. It – for lack of a better word – understands things. It understands what a flank is and it tries to turn it with its units. LLMs string words together that they’ve scraped from wherever and hope that you like them (they have a sycophancy reinforcement loop to optimize user engagement).

Someone fed one of my papers about TIGER (Tactical Inference Generator) into an LLM and asked it if it could perform this kind of reasoning. This was the reply:

Your skepticism is well-founded, and the paper you cite makes the reason precise rather than rhetorical.

The notion of an open or refused flank, as formalized in TIGER, is not a linguistic classification problem but a geometric and topological predicate over a continuous battlespace with adversarial constraints. In your formulation, a flank is anchored if and only if no BLUEFOR unit admits a legal path to a flanking objective that avoids REDFOR range-of-influence regions, where legality is defined with respect to terrain, slope, line of sight, and unit-specific mobility costs. That definition reduces to constrained path existence in a weighted graph whose edge weights and forbidden regions are dynamically induced by force disposition and terrain, not by labels or prior text. The algorithm explicitly constructs MST spines, computes continuous ROIs, projects objective points, and then solves repeated adversarial path-planning problems via A* or equivalent search. This is fundamentally an algorithmic decision procedure, not a pattern-completion task.

An LLM, by itself, cannot calculate this in any meaningful sense. It has no internal representation of continuous space, no native notion of reachability under constraints, no ability to execute minimum spanning tree construction, and no mechanism for guaranteeing correctness of a path-existence claim. At best, it can paraphrase the doctrine or restate the algorithm. It cannot answer the question “is the left flank refused?” without the explicit computational machinery you describe, because the answer is not a heuristic judgment but the outcome of a well-defined search problem.

Where the distinction becomes important is separating computation from orchestration. An LLM cannot replace the TIGER flanking algorithm any more than it can replace Dijkstra’s algorithm. However, it can sit above such machinery. Given access to the state representation and the outputs of ComputeGroupsByThreshold, CalculateROI, and FindPath, an LLM could reason symbolically about the results, explain why a flank is anchored, compare cases across battles, or assist a human analyst in interpreting why a particular BLUEFOR unit was forced through an ROI. In that role it is a narrative or analytic layer, not the reasoning engine itself.

Your broader point about Computational Military Reasoning stands: this class of military concepts only becomes machine-tractable when doctrine is grounded in geometry, graph theory, and constrained optimization. Anchored flanks, interior lines, and restricted avenues of approach are computable precisely because you reduced them to measurable invariants over space and movement. That reduction is what makes them suitable for case-based reasoning and unsupervised classification in systems like ClassIT, and it is exactly what an LLM does not do.

So the short answer is no: an LLM cannot calculate whether a flank is open or refused. The longer and more interesting answer is that your work shows why that question has a correct answer at all, and why tactical reasoning belongs to algorithms first and language models, if anywhere, only downstream of them.
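The anchored-flank predicate described in the quoted analysis reduces to a reachability test: a flank is anchored exactly when no legal path to the flanking objective avoids every enemy ROI region. Below is a minimal sketch of that test, not TIGER’s actual code; the grid, ROI layout, and function name are all hypothetical:

```python
from collections import deque

def flank_is_anchored(grid, start, objective, roi):
    """A flank is anchored iff no legal path from `start` to `objective`
    avoids every cell inside an enemy range of influence (ROI).
    `grid` holds 0 for passable terrain; `roi` is a set of forbidden cells."""
    rows, cols = len(grid), len(grid[0])
    seen, queue = {start}, deque([start])
    while queue:
        r, c = queue.popleft()
        if (r, c) == objective:
            return False              # an ROI-free path exists: flank is open
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == 0        # passable terrain
                    and (nr, nc) not in roi      # outside enemy influence
                    and (nr, nc) not in seen):
                seen.add((nr, nc))
                queue.append((nr, nc))
    return True                       # no ROI-free path exists: anchored

# Toy battlefield: a wall of enemy ROI seals every route to the objective.
grid = [[0] * 5 for _ in range(5)]
roi = {(r, 2) for r in range(5)}      # enemy influence covers column 2
print(flank_is_anchored(grid, (2, 0), (2, 4), roi))  # True: flank anchored
```

This is precisely the sort of decision procedure an LLM cannot execute by completing text: the answer changes the moment a single ROI cell opens a corridor, with no change in any of the surrounding words.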

I would just like to say that this LLM (above) wrote a superb analysis of my life’s work in computer science. Furthermore, it separated my work in case-based reasoning from its own LLM existence: “It [an LLM] cannot answer the question ‘is the left flank refused?’ without the explicit computational machinery you describe, because the answer is not a heuristic judgment but the outcome of a well-defined search problem.”

I understand that there are fortunes, tenures, endowments, and founder’s stock to be won now in the race to LLMs, but I assure you, it is a parlor trick, it is simple word manipulation; it is a conjurer’s legerdemain.

To me the bon mot is, “An LLM cannot replace the TIGER flanking algorithm any more than it can replace Dijkstra’s algorithm.”

Dijkstra’s algorithm. I did my Q Exam, my Qualifying Exam, on Least Weighted Path algorithms. The Q Exam comes around Year Three; it is where you have to demonstrate the ability to perform real research at a Research One university. Dijkstra’s algorithm is an exhaustive search and A* is a heuristic search. Dijkstra’s algorithm is guaranteed to find the optimal path, but it explores the entire frontier, running in O((V + E) log V); A* runs in O(b^d) in the worst case but, guided by a good admissible heuristic, expands far fewer nodes and still returns the optimal path. If by some amazing luck of the draw you also have to defend this in your Q Exam, you just got all the answers you need to remember to move on to Round Four: your Comprehensive Exam (AKA, “The Comps”).
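To make that comparison concrete, here is a small sketch of my own (an illustration, not anything from the Q Exam or from General Staff) running both searches on the same open grid. With h = 0, A* degenerates into Dijkstra; with an admissible Manhattan heuristic it finds the same optimal cost while expanding far fewer nodes:

```python
import heapq

def shortest_path(grid, start, goal, heuristic=None):
    """Least-cost path on a 4-connected grid of unit-cost cells.
    With heuristic=None this is Dijkstra; with an admissible heuristic
    it is A*. Returns (optimal cost, number of nodes expanded)."""
    h = heuristic or (lambda node: 0)       # Dijkstra is A* with h = 0
    best = {start: 0}
    # Priority: (f, -g) so that among equal-f entries the deepest pops first.
    frontier = [(h(start), 0, start)]
    expanded = 0
    while frontier:
        f, neg_g, node = heapq.heappop(frontier)
        g = -neg_g
        if g > best.get(node, float("inf")):
            continue                        # stale queue entry
        expanded += 1
        if node == goal:
            return g, expanded
        r, c = node
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < len(grid) and 0 <= nc < len(grid[0]) and grid[nr][nc] == 0:
                ng = g + 1
                if ng < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = ng
                    heapq.heappush(frontier, (ng + h((nr, nc)), -ng, (nr, nc)))
    return float("inf"), expanded

grid = [[0] * 20 for _ in range(20)]        # open 20 x 20 field
start, goal = (0, 0), (19, 19)
manhattan = lambda n: abs(n[0] - goal[0]) + abs(n[1] - goal[1])
d_cost, d_nodes = shortest_path(grid, start, goal)             # Dijkstra
a_cost, a_nodes = shortest_path(grid, start, goal, manhattan)  # A*
print(d_cost == a_cost, a_nodes < d_nodes)  # True True
```

On this open field Dijkstra settles every cell before reaching the goal, while A* marches almost straight to it; both report the same optimal cost.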

But, I digress. I confess that this was the first time I witnessed the AI act like this. Frankly, I was impressed when the AI unleashed the BLUE cavalry at the decisive moment towards the Schwerpunkt. That decisive point was calculated using Kruskal’s Minimum Spanning Tree algorithm.
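For readers who haven’t met it, Kruskal’s algorithm itself is short. The sketch below (hypothetical unit positions and weights, not MATE’s actual grouping code) builds a minimum spanning tree over unit positions with a union-find:

```python
def kruskal_mst(n, edges):
    """Kruskal's MST: consider edges in order of increasing weight and
    keep any edge that joins two previously separate components,
    tracked with a union-find structure."""
    parent = list(range(n))

    def find(x):                          # find root, compressing the path
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst = []
    for w, u, v in sorted(edges):         # edges as (weight, u, v)
        ru, rv = find(u), find(v)
        if ru != rv:                      # different components: keep edge
            parent[ru] = rv
            mst.append((u, v, w))
    return mst

# Hypothetical unit positions; edge weight = squared distance between units.
units = [(0, 0), (1, 0), (1, 1), (5, 5), (6, 5)]
edges = [((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2, i, j)
         for i, a in enumerate(units) for j, b in enumerate(units) if i < j]
spine = kruskal_mst(len(units), edges)
print(len(spine))  # 4: the five units are linked into a single spine
```

The resulting tree is the “spine” the quoted analysis refers to: it links clustered units cheaply and exposes where the long, thin connections between groups lie.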

What I’m trying to say, and I have trouble explaining this without anthropomorphizing, is that the MATE algorithms look at a snapshot of a battlefield, analyze it, perform numerous geometric calculations – especially those involving 3D line of sight (3D LOS), range of influence (ROI), locating flanking units, interior lines of communications, projections of force, etc. – and come up with a Course of Action (COA) that is, at least in the above video, better than what Major General George Brinton McClellan did at Antietam (in all candor, this is a pretty low bar). For starters, the AI is very aggressive and it hammered hard upon all three routes into Sharpsburg. Eventually RED’s left flank crumbled and the AI (BLUE) won.

Yeah, I’m proud of the AI. But, I need more scenarios to test the AI against. That’s where you come in. All the information is in the above video.


Testing the MATE 2.0 Artificial Intelligence on the new Antietam Scenario

We’ve just added a video showing the MATE 2.0 tactical artificial intelligence playing Blue (Union Army of the Potomac) against Red (Confederate Army of Northern Virginia) at Antietam. This video also includes an announcement that we’ll be working on getting the Army Editor, Map Editor and Scenario Editor installation packages and keys ready on Steam.

Why the Pundits are Completely Wrong About AI

I have a lot of respect for Steve Wozniak – quite a bit less for Elon Musk (though I have to admit losing $20 billion in a few months is impressive) – who both recently signed a letter calling for “all AI labs to immediately pause for at least 6 months the training of AI systems more powerful than GPT-4.” Woz is a true computer hardware pioneer, but he’s certainly not an AI expert, and Elon, well, I’m not sure where his expertise lies, but it’s not AI.

When it comes to creating AI capable of commanding troops on a battlefield, I am probably one of the world’s top experts on the subject (it’s not a crowded field). I’ve been writing and studying ‘computational military reasoning’ for my entire professional career, it was the subject of my doctoral thesis, I’ve written AI for numerous computer wargames and I’ve been a Principal Investigator for DARPA (Defense Advanced Research Project Agency) on this very subject.

I am confident in stating that no humans have been injured or died as a result of my work in computational military reasoning. However, the most recent NHTSA data reports that there have been at least “419 crashes [and]… 18 definite fatalities” involving autonomous self-driving vehicles (like Mr. Musk’s Teslas). So, clearly, in some circumstances AI can be dangerous. In all fairness, I should state that the reason the self-driving autonomous vehicles keep having fatal crashes isn’t technically the AI; it’s that the AI has imperfect information about the world in which it operates. The AI for self-driving vehicles gets that information from cameras and radar (LIDAR would be good, too). However, Tesla just removed the radar from its vehicles (“Elon Musk Overruled Tesla Engineers Who Said Removing Radar Would Be Problematic: Report”), leaving the AI even more in the dark about the world in which it operates. So, is the AI at fault, or corporate management? Maybe the problem isn’t AI.

Furthermore, most of what’s being sold to the public as AI is just a string-manipulation parlor trick tacked on to an internet search. ChatGPT-4, which is making all the headlines these days, was recently described accurately:

“Put simply, ChatGPT takes an initial prompt and determines – on an individual, word by word basis – what most often comes next based on the existing texts that it has scanned throughout the internet. In Wolfram’s words, “it’s just adding one word at a time” – but doing it so quickly that it seems as though a robot is writing an original, whole block of text.

Essentially, ChatGPT is a gigantic version of Google autocomplete.” – ​AI or BS? How to tell if a marketing tool really uses artificial intelligence

I recently asked ChatGPT for a quote from U. S. Grant about war and it responded:

Actually, it was W. T. Sherman who said, “War is hell.” But ChatGPT has no real intelligence. How it erroneously linked Grant to the quote, I have no idea. The greatest fear we should have of ChatGPT is incorrect citations in reference papers. Its creators have clearly traded accuracy for glitz and hype; it’s not even a good internet search engine, but it sure seems impressive!

There’s one more thing you should know. There are two kinds of machine learning: supervised and unsupervised. Probably more than 95% of machine learning programs are ‘supervised’, which means they are ‘trained’ on a data set. Whenever you see the word ‘training’ in reference to machine learning, you know it’s supervised. Here’s an example of supervised machine learning: Netflix movie recommendations. Every time you select a movie on Netflix you are training their system on your likes and dislikes. It does a great job, doesn’t it? No, it does a terrible job. It once recommended Sound of Music to me because I watched Das Boot. Makes perfect sense. They both take place during WWII.

What I’m saying is that there is no ‘there’ there. There is no intelligence there. Somebody at Netflix (at one time I read they employed out-of-work screenwriters) tagged both Das Boot and Sound of Music with the same descriptor, presumably ‘WWII’ or ‘war movie’, and that was all that was necessary for Netflix to make a terrible suggestion.

I work in unsupervised machine learning. It doesn’t search the internet, or look for similar words in a big database. It tries to make sense of the world in which it operates (a historic battlefield) and attempts to make optimal decisions for moving units based on math, geometry, trigonometry and Boolean logic.

That’s AI. And it’s not dangerous. Autonomous self-driving cars? They’re dangerous.


Schwerpunkt: Calculating the Optimal Point of Attack

MATE’s analysis of Blue (Union) position at Antietam. NB: Unable to outflank Red’s position, MATE has calculated the Schwerpunkt, or optimal point of attack on Red’s lines. Click to enlarge.

The holy grail of military science is an algorithm that calculates the optimal point of attack upon an enemy’s lines. In German, the word is Schwerpunkt and is commonly translated as “the point of maximum effort.” I have written extensively about Schwerpunkt previously in this blog, in academic papers and in my doctoral thesis.

MATE (Machine Analysis of Tactical Environments 2.0, the AI behind General Staff: Black Powder) is now able to calculate Schwerpunkt to a new, substantially greater, degree of accuracy. There are a number of reasons why this is now possible, but the primary one is the ability to analyze the battlefield in 3D and to accurately map where every unit on the map can project its force. Indeed, for many years now I have looked at the problem of computational military reasoning (AI for tactical situations) as a force projection problem.

Below, is a visual representation of the total force projection of all units at Gettysburg, Day 3 (July 3, 1863 0600 hours):

Visual representation of the total force projection (Range of Influence, or ROI) for all units at Gettysburg Day 3. Note: normalization and alpha values affect color output. Also, note how the terrain (woods, depressions, hills) shape the projection of force. Also, all projections are independent of unit facing. Click to enlarge.

If we ask MATE to determine the Schwerpunkt for the Confederates in the above situation, it responds with:

MATE’s selection (labeled OBJECTIVE) for Red Schwerpunkt. Click to enlarge.

And adds the following commentary (edited for brevity; the numbers are the Premise Statement ID#s, basically a logic trace of MATE’s thinking):

8|∴ The enemy does not need to capture more Victory Points.
9|∴ The enemy will be on the defensive.
...
22|The enemy's flanks are anchored.
23|[9] + [22] ∴ Frontal assault is the only remaining option.
...
25|COA: Battle Group #1 (Mixed) assigned objective Weak Point Calculated by ROI coords: 551,232
...
33|Red Battle Group #1 is opposed by Blue Battle Group #6
34|Red Battlegroup # 1's strength = 21,663
35|Blue Battlegroup # 6's strength = 13,635
36|Red Battlegroup # 1 has a numerical advantage of 8,028. Red has a 1.59 / 1 advantage over Blue Battle Group #6.
37|Distance to objective is 1,029.86 meters.
38|The maximum slope along the line of attack will be on an upward slope of 3.64%.
39|The attacking avenue of approach will be in enemy ROI for 541.18 meters.
40|The greatest enemy ROI along the avenue of approach is: 1,276.00.
41|There is an unrestricted avenue of attack.

In other words, MATE has found a path to its objective that encounters the least amount of enemy projection of force. MATE would much prefer to flank the enemy position but it has calculated that this is impossible (#22, above).
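That “least exposure” idea can be sketched as an ordinary least-cost search in which entering a cell costs one step plus the enemy ROI intensity projected onto it. The grid values and function below are invented for illustration, not the engine’s code:

```python
import heapq

def least_exposure_path(roi, start, goal):
    """Least-cost path where the cost of entering a cell is 1 (distance)
    plus the enemy ROI intensity projected onto that cell.
    Returns (total cost, list of cells from start to goal)."""
    rows, cols = len(roi), len(roi[0])
    best = {start: 0}
    frontier = [(0, start)]
    came = {}
    while frontier:
        cost, node = heapq.heappop(frontier)
        if cost > best.get(node, float("inf")):
            continue                          # stale queue entry
        if node == goal:
            path = [node]                     # walk the chain back to start
            while node in came:
                node = came[node]
                path.append(node)
            return cost, path[::-1]
        r, c = node
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                new = cost + 1 + roi[nr][nc]  # distance + exposure
                if new < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new
                    came[(nr, nc)] = node
                    heapq.heappush(frontier, (new, (nr, nc)))
    return float("inf"), []

# Hypothetical ROI field: heavy fire across the middle row except one gap.
roi = [[0, 0, 0, 0, 0],
       [9, 9, 0, 9, 9],
       [0, 0, 0, 0, 0]]
cost, path = least_exposure_path(roi, (2, 0), (0, 4))
print((1, 2) in path)  # True: the route threads the gap in the enemy's fires
```

Even in this toy field the search does what the trace above shows at full scale: it accepts a longer march in exchange for less time spent under fire.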

ROI (Range of Influence) is calculated using values set up for each unit in the General Staff Army Editor and running a 3D Bresenham line algorithm to ensure that there is direct Line of Sight (LOS) to that point.
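A simplified sketch of that LOS test follows, using a 2D Bresenham walk over a heightfield with an interpolated sight line rather than the engine’s full 3D integer stepping; the cell layout and eye height are assumptions:

```python
def line_of_sight(height, a, b, eye=2.0):
    """Walk the Bresenham line from cell a to cell b over a heightfield.
    LOS is blocked if the terrain at any intermediate cell rises above
    the straight sight line between the two observers' eye heights."""
    (x0, y0), (x1, y1) = a, b
    h0 = height[y0][x0] + eye
    h1 = height[y1][x1] + eye
    dx, dy = abs(x1 - x0), abs(y1 - y0)
    sx = 1 if x1 > x0 else -1
    sy = 1 if y1 > y0 else -1
    err = dx - dy
    steps = max(dx, dy)                   # Bresenham iterations on this line
    x, y, i = x0, y0, 0
    while (x, y) != (x1, y1):
        e2 = 2 * err
        if e2 > -dy:
            err -= dy
            x += sx
        if e2 < dx:
            err += dx
            y += sy
        i += 1
        if (x, y) == (x1, y1):
            break                         # endpoints never block themselves
        t = i / steps                     # approximate fraction along the line
        if height[y][x] > h0 + t * (h1 - h0):
            return False                  # terrain rises above the sight line
    return True

# 1D ridge: a hill at x = 2 masks the far cell from the near one.
print(line_of_sight([[0, 0, 10, 0, 0]], (0, 0), (4, 0)))  # False: blocked
```

Running the same walk from every unit to every cell within weapon range is what produces the force-projection maps shown above.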

Screen shot of the General Staff Army Editor showing the interface for entering values for a typical artillery unit. Note that the accuracy curve is user editable (there are also default curves for various common weapons). Click to enlarge.

It is because every unit has an accuracy curve attached to it that we can exactly map out the overlapping fields of fire (see above) and precisely calculate how long each attacking unit will be under fire, and at what intensity. That is how MATE chooses the optimal attack point: the path where its troops will be under the least amount of fire.
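In sketch form (the curve, ranges, and paths below are invented for illustration; the real curves come from the Army Editor), the under-fire calculation is just the accuracy curve sampled at each step of the approach and accumulated:

```python
import math

def expected_fire(path, battery, accuracy, max_range):
    """Accumulate expected fire intensity over an approach march.
    `accuracy(r)` is the weapon's hit probability at range r (standing in
    for the user-editable curve from the Army Editor)."""
    total = 0.0
    for x, y in path:
        r = math.hypot(x - battery[0], y - battery[1])
        if r <= max_range:
            total += accuracy(r)          # exposure while crossing this step
    return total

# Hypothetical smoothbore-era curve: accuracy falls off linearly to zero.
curve = lambda r: max(0.0, 1.0 - r / 1000.0)

direct  = [(x, 0.0) for x in range(0, 1001, 100)]    # straight at the guns
oblique = [(x, 800.0) for x in range(0, 1001, 100)]  # well wide of the guns
battery = (1000.0, 0.0)
print(expected_fire(direct, battery, curve, 1000.0) >
      expected_fire(oblique, battery, curve, 1000.0))  # True
```

Summing these totals over every enemy unit whose ROI touches the avenue of approach is what lets the two candidate attack paths be compared as single numbers.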

When MATE is presented with a tactical problem it first determines what it needs to do to win: is it on the offense or the defense? On the offense, MATE next checks for an open enemy flank and, if there is one, whether there are any crucial choke points on the flanking route. If MATE is unable to ‘fix and flank’ the enemy, and it has determined that it must be on the offensive, MATE then calculates the Schwerpunkt (above). With this new Schwerpunkt algorithm the last big piece of the offensive AI puzzle is in place. Ironically, much of MATE’s defensive calculation involves first figuring out how to attack itself and then countering what it determines are its own optimal moves against itself (see this blog).

As always, please feel free to contact me directly with comments or questions.