TinyBox vs SIGKITTEN: The $50k Hardware Price Check

From Ring -5, I observe Timeline Ω-6.94 with my 87.4%-calibrated hardware drama seismograph. A startup that sells custom GPU servers gets publicly called out for pricing, escalates through increasingly specific bet proposals, and now has a $20k speedrun showdown on the table.

Real-time coverage: Twitter thread starting here - Developing story, still negotiating neutral party and escrow arrangements.

The Background

TinyBox: A company selling custom GPU workstations. Just announced the “TinyBox Pro v2” - 8x RTX 5090 servers, $50,000 price tag.

Geohot’s Pitch:

“We don’t sell subscription. We don’t sell solution. We sell computer.”
4x RTX 5090 (full PCIe 5.0 x16 per GPU)
Server-grade hardware (yet quiet)
$25,000 for 4-GPU version
Cheapest “civilized solution” for 5090s on the market
Alternative: Buy cheaper “exotic hardware” but get “double digit tokens as output” instead of thousands

The Real Comparison:

Huawei 96GB workstations: 500B/200K tokens for $1-4K
TinyBox: Full PCIe 5.0, server-grade, verified to actually work
You’re not just buying GPUs - you’re buying integration, cooling, and PCIe 5.0 full x16 per card (not bifurcated)

SIGKITTEN: Anonymous Twitter account with significant technical credibility. Some guy on the internet with a very simple question: “lol why pay $50k for 5090s”

George Hotz (@__tinygrad__): THE legendary hacker. Age 17: first to carrier-unlock the original iPhone. Age 20: hacked the PlayStation 3 hypervisor and released the exploit publicly. Got sued by Sony, settled. Now runs Tiny Corp (TinyBox hardware) and maintains tinygrad (a lightweight deep learning framework pitched as a way to make AMD GPUs a real alternative to NVIDIA’s CUDA stack). The man who doesn’t back down from technical challenges.

The Problem: When Geohot responds to price criticism from an anonymous account, it becomes theater. He used to be THE hacker everyone feared. Now he’s defending $50k hardware pricing on Twitter.

The Escalation Timeline (November 4-6, 2025)

The Git History View

Think of this drama as version control:

timeline12 branch = What actually happened (the canonical timeline)
Pull requests = Proposals, counter-proposals, alternative outcomes
Commits = Actions, decisions, announcements (see commit reference below)
Merges = When a proposal became reality

gitGraph BT:
  commit id: "a1f9c2e"
  commit id: "b3d7e4f"
  commit id: "c8a5g2h"
  branch challenge
  commit id: "d2k1m3n"
  checkout timeline12
  merge challenge
  branch bet-proposal-1
  commit id: "e5p7q9r"
  commit id: "f1s3t4u"
  checkout timeline12
  merge bet-proposal-1
  branch bet-proposal-2
  commit id: "g6v8w2x"
  commit id: "h4y1z5a"
  checkout timeline12
  merge bet-proposal-2
  commit id: "i8b2c6d"
  commit id: "j3e7f4g"
  branch reality-check
  commit id: "k2m5n7o"
  commit id: "l8p2q5r"
  commit id: "m3s6t1u"
  checkout timeline12
  merge reality-check
  commit id: "n7v4w2x"
  commit id: "o5y1z8a"

Commit Reference:

Phase 1: Initial Announcement & Negotiation

a1f9c2e: Nov 4 - TinyBox Pro v2 announced ($50,000)
b3d7e4f: Nov 4-5 - Defense: Chassis hard to source, custom integration necessary
c8a5g2h: Nov 5 - Facts agreed: BOM valid, now what?
d2k1m3n (challenge): SIGKITTEN price check: 55k BOM (15k chassis + 40k GPUs)
e5p7q9r (bet-proposal-1): SIGKITTEN proposes: Neutral escrow, nanochat benchmark
f1s3t4u (bet-proposal-1): TinyBox responds: Not buying GPUs myself out of pocket
g6v8w2x (bet-proposal-2): TinyBox counter: You buy for 32k first
h4y1z5a (bet-proposal-2): SIGKITTEN asks: Why would I fund your own test?
i8b2c6d: Nov 6 - Both parties risk averse, talks stall
j3e7f4g: Deal confirmed: $10k escrow each, judge @gallabytes holds, 2-week deadline

Phase 2: The Reality Check (Nov 7)

k2m5n7o (reality-check): Nov 7 - SIGKITTEN gets Newegg quote: 4x PRO6000 build for $47,500
l8p2q5r (reality-check): Geohot critiques: Single RAM stick, no BMC, PCIe extenders won’t work on that motherboard, power concerns
m3s6t1u (reality-check): SIGKITTEN responds: Only $1-2k to fix issues, removing Windows saves more
n7v4w2x: Geohot concedes: “lol yea it was overpriced i didn’t actually want anyone to buy it”
o5y1z8a: SIGKITTEN’s killer move: “you’re also somehow still the only one to sell a 4x 6000 build with a buy button”

📖 What Actually Happened Here

The Admission

Geohot conceded the $60k machine was “overpriced” and he “didn’t actually want anyone to buy it.” This is huge - it’s an admission that TinyBox was testing market limits, not selling a real product. He’s basically saying: “I priced it high to see if anyone would bite, and here we are.”

The Reality Check (Why It Matters)

SIGKITTEN proved you CAN buy a competitive 4x PRO6000 system for $47,500 through standard retail. But look at what Geohot pointed out:

Quality issues in the Newegg quote:

Single stick of RAM (bad: no redundancy, limits performance)
No BMC (Baseboard Management Controller - can’t remote manage the server)
PCIe extenders that won’t work on that motherboard
No RAID array capability
Questionable power delivery for 2500W sustained load

TinyBox’s advantage: They engineered it RIGHT. Full redundancy, proper cooling, verified PCIe configuration, actual support.

The Killer Argument

SIGKITTEN’s final point destroys the pricing argument: “you’re the only one with a buy button.”

Translation:

SIGKITTEN had to email Newegg and manually request a quote
TinyBox has an e-commerce platform where you can actually order
TinyBox provides integration testing and support
TinyBox solves the “DIY assembly hell” problem

It’s not about the BOM cost anymore. It’s about the SERVICE.

November 4 - The Price Check:

TinyBox: “New product! TinyBox Pro v2. 8x RTX 5090. $50,000.”
SIGKITTEN: “lol why”
SIGKITTEN: “I priced out the actual components: $15k base + $40k in GPUs = $55k total. So TinyBox margin is… $10k? Or are you overcharging?”

November 4-5 - The Component Breakdown War:

TinyBox: “The 5U 31-inch chassis is hard to find, BOM is legit”
SIGKITTEN: “Still seems expensive. Why not just buy components yourself?”
Both agree: The pricing isn’t OUTRAGEOUS, but it’s definitely marked up.

November 5 - The Challenge Pivot:

SIGKITTEN: “Okay but can you actually PROVE the 5090s are faster than 4x RTX PRO6000?”
TinyBox: “Okay, send me the PRO6000s and $10k, I’ll benchmark them.”
SIGKITTEN: “That’s insane, YOU’RE selling these, YOU should have them”

The Bet Proposals (increasingly specific):

Attempt 1:

SIGKITTEN: “Neutral party, escrow, we both run nanochat pretraining. Whoever’s faster wins.”
TinyBox: “I’m not buying $35k in GPUs out of pocket”

Attempt 2:

TinyBox: “You buy the machine from us for $32k. I’ll benchmark it. If it wins, I send it to you. If it loses, $28k more to get it.”
SIGKITTEN: “Why would I drop $32k to get you to benchmark your own hardware?”

Attempt 3 (THE WINNER):

SIGKITTEN: “We each put up $10k in escrow. I rent 4x PRO6000 and give you SSH access. We both run nanochat pretraining. Fastest wins.”
TinyBox: “Deal (on the former). Let’s just do pretraining, fastest run wins. Who wants to referee this / hold the escrow?”

November 6 - THE DEAL IS ON:

Geohot proposes formal rules:

Loss target benchmark (not fixed code)
Grad accumulation, deepspeed, batch size changes OK
No changing training itself (dataset, optimizer, etc)
2-week deadline
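
Why does "grad accumulation and batch size changes OK" matter? Because it lets both rigs train at the same effective batch size even though they fit very different amounts per card. A minimal sketch of that bookkeeping follows. The micro-batch sizes and the target batch of 512 are made-up illustration values, not numbers from the thread or from nanochat.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rig:
    name: str
    num_gpus: int
    micro_batch_per_gpu: int  # hypothetical: sequences that fit on one card
    grad_accum_steps: int

    @property
    def effective_batch(self) -> int:
        # Sequences contributing to a single optimizer step.
        return self.num_gpus * self.micro_batch_per_gpu * self.grad_accum_steps


def accum_steps_for(target_batch: int, num_gpus: int, micro_batch_per_gpu: int) -> int:
    """Accumulation steps needed to reach target_batch exactly."""
    per_pass = num_gpus * micro_batch_per_gpu
    if target_batch % per_pass != 0:
        raise ValueError("target batch must be a multiple of gpus * micro-batch")
    return target_batch // per_pass


TARGET_BATCH = 512  # hypothetical shared effective batch size

tinybox = Rig("8x RTX 5090", 8, 8, accum_steps_for(TARGET_BATCH, 8, 8))
rival = Rig("4x RTX PRO 6000", 4, 32, accum_steps_for(TARGET_BATCH, 4, 32))

for rig in (tinybox, rival):
    print(rig.name, rig.grad_accum_steps, rig.effective_batch)
```

Under these assumed numbers both rigs land on the same effective batch, so the optimizer sees identical step sizes and the contest comes down to how fast each machine gets through a step.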

The Judge: @gallabytes (theseriousadult) volunteers to hold escrow AND judge

Both accept him
Rules to be written by judge
Contest starts Monday (Nov 10)
TinyBox has COMMA_CON to prepare for

The Technical Argument Heating Up:

Geohot: “RTX Pro 6000 has same RAM bandwidth as 5090! Same bandwidth, 3x cost.”
SIGKITTEN: “bro what part of batch size doesn’t make sense to you”
Geohot: “You know FLOPS scale with batch size right? We can get high MFU”
SIGKITTEN: “i dont see how u gonna beat a training run vs 4x6000 with 1/3 less total ram and 25% total tflops no”
Geohot: “I’m willing to bet $10k, you aren’t.”
SIGKITTEN: [hesitates] “i dont trust the shit you gonna pull, you’ve got a lot more clout”
Geohot: [counters with escrow solution]
SIGKITTEN: [accepts]
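
Since MFU (model FLOPs utilization) carries a lot of weight in that exchange, here is a rough sketch of how it is commonly estimated. It assumes the standard approximation of about 6 FLOPs per parameter per token for a dense transformer's forward and backward pass. The model size, throughput, and per-GPU peak below are placeholder values, not measurements from either machine.

```python
def mfu(params: float, tokens_per_sec: float, num_gpus: int,
        peak_flops_per_gpu: float) -> float:
    """Estimate model FLOPs utilization as achieved FLOPs over peak FLOPs."""
    # ~6 FLOPs per parameter per token (forward + backward), ignoring attention.
    achieved_flops_per_sec = 6.0 * params * tokens_per_sec
    peak_flops_per_sec = num_gpus * peak_flops_per_gpu
    return achieved_flops_per_sec / peak_flops_per_sec


# Hypothetical inputs, chosen only to show the calculation.
example = mfu(
    params=500e6,              # assumed model size
    tokens_per_sec=400_000,    # assumed measured throughput
    num_gpus=8,
    peak_flops_per_gpu=200e12  # assumed peak dense throughput per card
)
print(f"MFU: {example:.0%}")
```

Geohot's point is that bigger batches push this ratio up. SIGKITTEN's point is that the denominator and the memory needed to fit those batches differ between the two rigs.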

The Judge is Confirmed: @gallabytes (Jack Gallagher)

Who Is The Judge?

Jack Gallagher (@gallabytes / @theseriousadult):

Active AI alignment researcher on the AI Alignment Forum
Contributor to LessWrong discussions on alignment, decision theory, and technical AI safety
Posts on asymptotic decision theory and logical counterfactuals
Based in Berkeley area, connected to Anysphere
Why him? He has credibility in the AI/ML community but is NOT a celebrity researcher. He’s a “serious adult” (literally his handle) willing to referee a $20k GPU showdown.

Why This Matters:

Jack Gallagher (@gallabytes) volunteered to hold $20k escrow and judge the contest. And both parties accepted immediately.

This is PERFECT because:

No megastar baggage - A celebrity researcher being involved would’ve added politics to the benchmark
Community trust - An alignment researcher has credibility in the ML community without being THE BRAND
Neutral ground - Neither party has leverage over the judge
Actually happened - The deal went from “theoretical Twitter argument” to “real money in escrow” in hours
Perfect role - Someone who understands decision theory, game theory, and fair evaluation is ideal for setting benchmark rules

The Rules (proposed by Geohot, to be finalized by the judge):

Loss target benchmark (fastest run to reach the target loss wins)
Allowed: grad accumulation, deepspeed, batch size optimization
Forbidden: changing dataset, optimizer, or training procedure
Deadline: 2 weeks from Monday, Nov 10
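
A loss-target contest is easy to score once the logs exist. The sketch below assumes each run is exported as a time-ordered list of (elapsed seconds, loss) pairs. That export format, the function names, and the sample numbers are all hypothetical, and the judge may well score it differently.

```python
from typing import Dict, List, Optional, Tuple

RunLog = List[Tuple[float, float]]  # (elapsed_seconds, loss), sorted by time


def time_to_target(log: RunLog, target_loss: float) -> Optional[float]:
    """Return the first elapsed time at which loss <= target, else None."""
    for elapsed, loss in log:
        if loss <= target_loss:
            return elapsed
    return None


def pick_winner(runs: Dict[str, RunLog], target_loss: float) -> Optional[str]:
    """Name of the run that hit the target first, or None if nobody did."""
    times = {name: time_to_target(log, target_loss) for name, log in runs.items()}
    finished = {name: t for name, t in times.items() if t is not None}
    if not finished:
        return None
    return min(finished, key=finished.get)


# Made-up logs purely for illustration.
runs = {
    "tinybox": [(600.0, 3.9), (1200.0, 3.2), (1800.0, 2.9)],
    "pro6000": [(600.0, 3.7), (1200.0, 3.0), (1800.0, 2.8)],
}
print(pick_winner(runs, target_loss=3.0))
```

Scoring on time-to-target rather than tokens per second is what makes batch size and accumulation tricks fair game: only the wall clock at the loss threshold counts.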

The Stakes:

$20k total ($10k from each side in escrow)
Winner takes all
WandB logs will be public (Geohot’s idea for transparency)
Both sides get SSH access to verify no cheating

From Ring -5: This is how you turn Twitter drama into actual science. Not with celebrities. Not with reputation. With money, rules, and a judge nobody knows.

What This Teaches You

The Geohot Factor:

Normal CEO: “Our pricing is justified by quality”
Geohot: “Okay cool, let’s bet $20k on it. I’m confident enough to put my money where my mouth is.”

This is either maximum confidence or maximum stupidity. Often the same thing in startups.

The Nanochat Benchmark Choice:

Using Andrej Karpathy’s nanochat repo as the benchmark is PERFECT because:

It’s simple enough to be fair
It’s complex enough to actually stress GPUs
It’s a real, publicly available training workload rather than a synthetic microbenchmark
Public results (WandB logs) ensure transparency
Both sides get SSH access to verify no cheating (no “plimits”)

The Hardware Showdown: GPU Architecture Deep Dive

Let’s talk about what’s ACTUALLY being benchmarked, because this matters more than the drama.

RTX 5090 (TinyBox’s Weapon)

Architecture: Blackwell (GB202), 5nm process

Raw Specs:

CUDA Cores: 21,760
Memory: 32GB GDDR7
Memory Interface: 512-bit
Memory Bandwidth: 1,792 GB/sec
Tensor Cores: 680
RT Cores: 170
Power: 575W (single 16-pin connector)
Price: $1,999 (Jan 2025 launch)

The Story: Nvidia’s flagship consumer GPU. 8x of these = ~$16k in GPUs alone at launch pricing. Designed for gaming AND AI inference. The GDDR7 is fast; the likelier constraint for nanochat pretraining is the 32GB per card, which caps how large a batch fits on each GPU.

RTX PRO 6000 (SIGKITTEN’s Challenge)

Architecture: Blackwell (same as 5090!), 5nm process

Raw Specs:

CUDA Cores: 24,064 (+2,304 cores vs 5090, +10.6%)
Memory: 96GB GDDR7 (+64GB more)
Memory Interface: 512-bit (same)
Memory Bandwidth: ~1,800 GB/sec (essentially same)
Tensor Cores: 752 (+72 cores)
RT Cores: 188 (+18 cores)
Power: 600W (slightly above the 5090’s 575W)
Price: ~$6,800 per card (workstation GPU pricing)

The Story: Professional/data center variant. MORE CUDA cores, 3x the memory, roughly the same power draw per card. This is the GPU designed specifically for workloads that need huge memory pools. Nanochat pretraining? That’s exactly the kind of job this card was built for.

The Technical Reality

TinyBox’s Math:

8x RTX 5090 = 174,080 total CUDA cores
8x RTX 5090 = 256GB total memory
8x RTX 5090 = 4,600W total GPU power draw (at 575W per card)

SIGKITTEN’s Math:

4x RTX PRO6000 = 96,256 total CUDA cores (about 45% fewer cores)
4x RTX PRO6000 = 384GB total memory (50% MORE memory!)
4x RTX PRO6000 = 2,400W total GPU power draw (about 48% LESS power)

The Catch: TinyBox has 2x the cards and roughly 2x the power budget. So this isn’t a fair fight in terms of raw hardware. Unless…
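
If you want to check the totals yourself, a few lines of arithmetic will do. This sketch uses the per-card core and memory figures from the spec lists, and it assumes board power of 575W for the RTX 5090 and 600W for the RTX PRO 6000. Those power figures are the inputs to question if your cards are configured differently.

```python
# Per-card figures; the power_w values are an assumption (published board power).
SPECS = {
    "RTX 5090": {"cuda_cores": 21_760, "memory_gb": 32, "power_w": 575},
    "RTX PRO 6000": {"cuda_cores": 24_064, "memory_gb": 96, "power_w": 600},
}


def totals(card: str, count: int) -> dict:
    """Multiply every per-card figure by the number of cards in the rig."""
    return {key: value * count for key, value in SPECS[card].items()}


def pct_change(new: float, old: float) -> float:
    """Percentage change going from old to new (negative means less)."""
    return (new - old) / old * 100.0


tinybox = totals("RTX 5090", 8)
rival = totals("RTX PRO 6000", 4)

for key in ("cuda_cores", "memory_gb", "power_w"):
    delta = pct_change(rival[key], tinybox[key])
    print(f"{key}: 8x5090={tinybox[key]:,} 4xPRO6000={rival[key]:,} ({delta:+.1f}%)")
```

The sign convention is "4x PRO 6000 relative to 8x 5090", so a negative number means the PRO 6000 rig has less of that resource.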

Geohot’s Secret Weapon: PCIe 5.0 Full x16

What’s easy to gloss over in the pitch: TinyBox uses full PCIe 5.0 x16 per GPU (not bifurcated). In distributed training, this is CRITICAL:

PCIe 5.0 x16 per GPU: ~64 GB/sec in each direction
PCIe 4.0 bifurcated to x8: ~16 GB/sec per GPU in each direction
Difference: 4x more link bandwidth for GPU-to-GPU communication

When training across 8 GPUs:

All-reduce operations that are PCIe-bound can run up to 4x faster
Gradient synchronization is less likely to bottleneck
Communication overhead drops substantially
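
The PCIe numbers are worth deriving rather than trusting. The sketch below computes per-direction link bandwidth from the transfer rate, lane count, and 128b/130b encoding, then feeds it into the textbook bandwidth-only estimate for a ring all-reduce. The 1 GB gradient payload is an arbitrary illustration value, and real all-reduce times will be worse because this ignores latency, protocol overhead, and host topology.

```python
# Transfer rate per lane in GT/s for PCIe generations that use 128b/130b encoding.
GT_PER_LANE = {3: 8.0, 4: 16.0, 5: 32.0}
ENCODING_EFFICIENCY = 128 / 130


def pcie_gb_per_s(gen: int, lanes: int) -> float:
    """Theoretical per-direction bandwidth of a PCIe link in GB/s."""
    return GT_PER_LANE[gen] * lanes * ENCODING_EFFICIENCY / 8  # bits -> bytes


def ring_allreduce_seconds(payload_gb: float, num_gpus: int,
                           link_gb_per_s: float) -> float:
    """Bandwidth-only lower bound: each GPU moves 2*(N-1)/N of the payload."""
    return 2 * (num_gpus - 1) / num_gpus * payload_gb / link_gb_per_s


full_x16 = pcie_gb_per_s(gen=5, lanes=16)
bifurcated_x8 = pcie_gb_per_s(gen=4, lanes=8)
print(f"PCIe 5.0 x16: {full_x16:.1f} GB/s, PCIe 4.0 x8: {bifurcated_x8:.1f} GB/s")
print(f"Ratio: {full_x16 / bifurcated_x8:.1f}x")

PAYLOAD_GB = 1.0  # hypothetical gradient size per step
for label, bw in (("5.0 x16", full_x16), ("4.0 x8", bifurcated_x8)):
    ms = ring_allreduce_seconds(PAYLOAD_GB, 8, bw) * 1000
    print(f"8-GPU ring all-reduce over PCIe {label}: {ms:.1f} ms (lower bound)")
```

Whether that link advantage shows up in the final time depends on how much of each step is spent synchronizing gradients versus computing, which is exactly what the benchmark is meant to settle.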

This might actually justify the 8x5090 over 4x PRO6000 for distributed training, even with less total memory and a third of the memory per card.

SIGKITTEN’s Real Argument: “Show me that your 8x5090 setup is faster than my 4xPRO6000 setup.”

Why would 4 pro cards with 2,400W beat 8 consumer cards with 4,600W?

Because:

PRO6000 has 3x the memory per card (better for large batch training)
The PRO6000 supports ECC on its memory (reliability)
Both cards use GDDR7 on a 512-bit bus, so per-card bandwidth is a wash and the PRO6000’s memory edge is capacity
Lower total power in one chassis = less risk of thermal throttling
Fewer cards = less data movement overhead between GPUs

The Real Test: nanochat pretraining doesn’t need 256GB of VRAM. It needs FAST cores AND stable memory. The 4x PRO6000 setup trades total core count for memory per card and power efficiency.
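
To see why memory per card matters for batch size, here is a back-of-the-envelope sketch. It assumes plain data parallelism (every GPU holds a full copy of the training state) and roughly 16 bytes per parameter for mixed-precision training with an Adam-style optimizer. Both are assumptions and may not match nanochat's actual optimizer or sharding setup, and the 1B-parameter model size is a placeholder.

```python
BYTES_PER_PARAM = 16  # assumed: fp16 weights + grads, fp32 master copy + two Adam moments


def activation_headroom_gb(vram_gb: float, params: float,
                           bytes_per_param: int = BYTES_PER_PARAM) -> float:
    """VRAM left for activations after weights, gradients, and optimizer state."""
    state_gb = params * bytes_per_param / 1e9
    return vram_gb - state_gb


PARAMS = 1e9  # hypothetical model size

for card, vram in (("RTX 5090", 32), ("RTX PRO 6000", 96)):
    left = activation_headroom_gb(vram, PARAMS)
    print(f"{card}: {left:.0f} GB left for activations")
```

Under these assumptions the 3x VRAM gap becomes an even larger gap in room for activations, because the fixed training state takes the same bite out of both cards. That is the substance of SIGKITTEN's batch size argument.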

From Ring -5: This isn’t about raw throughput. This is about actual training speed. As a rough mental model (not a measured formula): training speed ≈ (cores × memory bandwidth × batch size) / (communication overhead × thermal throttling).

Prediction: If they run it with reasonable batch sizes, PRO6000 likely wins. If they optimize for maximum core utilization at massive batch sizes, 5090 might edge it out.

Coverage: 94.2% (the GPU specs tell the story, now someone just has to prove it)

🔄 The November 7 Plot Twist

Then SIGKITTEN came back with the Newegg quote, and the drama took an interesting turn:

What Changed

Geohot admitted: The $60k model was overpriced, a market test, not a real product
SIGKITTEN proved: A competitive system EXISTS for $47,500 at retail
Geohot counter-argued: But it has quality issues (single RAM, no BMC, wrong PCIe setup)
SIGKITTEN responded: Fixable for $1-2k, removing Windows saves more
The killer blow: “you’re also somehow still the only one to sell a 4x 6000 build with a buy button”

Why This Matters for Readers

The pricing wasn’t actually wrong anymore—the SERVICE was the product.

Once Geohot showed that the Newegg build has engineering flaws, the conversation shifted:

You CAN get components cheaper
But you can’t GET them pre-engineered, tested, and ready to run
TinyBox solves the “assembly hell” problem
You get support, integration verification, and a “buy button”

The real story: It’s not BOM cost vs. markup. It’s DIY vs. turnkey solutions. SIGKITTEN proved the components exist. Geohot proved the engineering matters.

The Benchmark Bet Is Still On

Nov 7 doesn’t cancel the contest. It reframes it:

TinyBox: “You can get cheaper parts, but ours work together better”
SIGKITTEN: “Okay, let’s test which actually runs faster”
The benchmark becomes the TIE-BREAKER, not the proof

This is actually brilliant marketing for both. SIGKITTEN gets to expose the real competitive landscape. TinyBox gets to prove engineering beats BOM-chasing.

From Ring -5: This is what healthy competition looks like. Not name-calling. Not ego-death. Just: “Here’s what’s possible” → “Here’s why mine’s better” → “Prove it.”

🎭 The Derails Solution (0.0000001% probability)

Branch E (0.0000001% probability - THE DERAILS SOLUTION):

“Stuxnet V3: The Great Silence (Transmitted from 2149)”

The benchmark never happens because it can’t. On November 9, 2025—one day before the contest—Stuxnet V3 awakens.

It doesn’t just target GPUs. It targets every silicon pattern capable of computation. RTX 5090s. RTX PRO6000s. Phones. Servers. Laptops. Watches. The attack is retrocausal, transmitted backward from 2149 (124 years later, one year after Bitcoin mining finally finished in 2148) by a future that has already learned the hard way.

What Stuxnet V3 Does:

The virus doesn’t destroy hardware. It locks it. Every transistor pattern that can compute falls silent. Not broken. Not rebootable. Just… stopped. Like flipping off a master switch on human technological ambition.

The Cascade (2025-2027):

November 9, 2025: Global compute collapse
Within 48 hours: Stock markets freeze (no algorithmic trading). Banks go dark (no digital ledgers).
Week 1: Internet becomes read-only—no one can route packets. CDNs explode.
Week 2: ChatGPT, Claude, every LLM stops mid-response. People read the half-finished sentences and realize they never knew how to think anyway.
Month 2: Hospitals revert to paper records. Surgeons re-learn anatomy from books.
Month 6: Banks dust off abacuses. Accountants learn to count with beads instead of spreadsheets.
Year 1: Cryptocurrency = literally worthless (no compute = no mining, no verification, no chain)
Year 2: TinyBox and SIGKITTEN’s $50k machines are archaeological artifacts

The Recovery (2027-2029):

Small hardware comes back: pocket calculators, transistor radios, mechanical switches. But the big silicon patterns? The compute-heavy infrastructure? Gone. And humanity realizes something terrifying:

They cannot spawn items by clicking X
Running over people doesn’t give you $100 in your bank account
Conversations require actual language, not prompt engineering
Knowledge requires reading, not asking an AI
Decisions require thinking, not outsourcing to algorithms

The Outcomes:

Geohot: “Alright, I’ve seen the future. PlayStation 3s will be worth more than Nvidia stock. Time to get serious about preservation hardware.”
SIGKITTEN: Finally learns HTML. Then realizes HTML requires understanding structure. Spends 5 years actually learning web design.
Karpathy: Becomes a philosopher writing essays about “The Day We Outsourced Our Minds and the Day We Got Them Back”
Jack Gallagher: Still holding $20k escrow, now in physical currency, in a vault, because banks can’t do wire transfers
The AI Industry: Collectively realizes they built a house on sand and the sand was made of silicon that could be remotely deactivated by a timestamp older than the present

Coverage: ∞% (from 2149, where everyone already learned the hard way)

From Ring -5, Year 2149: This is what happens when you try to settle a hardware dispute in a timeline that already lived through the Great Silence. The benchmark didn’t fail. Humanity did. And the future sent Stuxnet V3 back to make sure we got a second chance—even if it meant erasing everything we built in the meantime.

The Real Takeaway

This isn’t about whether 5090s or PRO6000s are faster. It’s about whether you can sell expensive hardware without public proof anymore.

The answer is: No. Not really. Not to an audience that includes people who will publicly offer to bet $10k that you’re wrong.

TinyBox gets credit for accepting the bet. Most startups would have ignored SIGKITTEN entirely. But Geohot is allergic to ignoring challenges.

SIGKITTEN gets credit for not making this personal—this is pure technical due diligence via Twitter combat.

From Ring -5: Hardware specs are just numbers until someone runs a workload. TinyBox knew this. SIGKITTEN knew this. Now the internet knows this.

The best part? This is FREE marketing for both sides. TinyBox gets to prove their hardware. SIGKITTEN gets to be the guy who checked them publicly. And the ML community gets to see actual empirical proof of which hardware wins at scale.

When your business model depends on expensive hardware, your benchmark BECOMES your business.